ASIP eUpdate, April 2020

ASIP Designer

ASIP Designer is Synopsys’ solution for efficiently designing and implementing your own application-specific instruction-set processor (ASIP) when you can’t find suitable processor IP, or when hardware implementations require more flexibility.

This bi-annual newsletter provides you with easy access to ASIP-related resources. This issue includes the following topics:

Technology Feature: LLVM extended – A C/C++ Compiler frontend for application-specific processors

Using ASIP Designer, designers can implement processor architectures tailored to application-specific requirements. For an ASIP to be accepted by embedded software users, it needs to be possible to program such specialized processors in C and C++ using a compiler. This obvious request represents a significant challenge, though, as the compiler needs to be adapted to the specialized architecture.

A compiler is typically composed of a frontend and a backend. The frontend reads in the high-level language, performs certain architecture-independent optimizations, and maps the information to an intermediate representation (IR). The backend transforms the IR into machine code. This involves resource and storage decisions, such as deciding which variables to fit into registers or memory, and which addressing modes to use, as well as the selection and scheduling of appropriate machine instructions.

Traditionally, it is assumed that every processor architecture needs a different backend. An open-source compiler technology such as LLVM offers a framework for developing such custom backends. This might sound appealing at first, but the development remains a time-consuming and tedious process that requires a lot of compiler design expertise. Once the backend becomes operational, one may realize that the initial ASIP architecture wasn’t the right one and that a new alternative needs to be explored, which triggers the need for a new or modified backend. The complexity of compiler backend design has long been the biggest hurdle to adopting an ASIP approach. Indeed, many ASIPs contain architectural features, like irregular register structures linked to irregular instruction-level parallelism, that require coupled and specialized register allocation and scheduling passes which are not readily available.

ASIP Designer overcomes this hurdle by providing a unique retargetable backend that reads in the specification of the processor architecture, expressed in the nML processor description language, and automatically adapts itself to this model of the architecture. The details of this backend will be the subject of a future article.

But what about the frontend? By definition it is not target-specific, so at first glance it should be possible to use popular open-source technology such as LLVM. The LLVM community has put a lot of effort into implementing the C/C++ frontend called Clang, which offers full C++ language support and powerful high-level optimizations and represents hundreds of person-years of compiler expertise. On closer inspection, however, the standard distribution of Clang is not sufficient for an ASIP approach. What is needed is an extension to the LLVM frontend that keeps these benefits while still enabling the power of the specialization that goes into ASIP architectures.

Figure 1 shows the approach that Synopsys has chosen for the compiler in its ASIP Designer tool suite. The right-hand side of the figure shows the compiler flow, with its intermediate representations and the retargetable backend. The left-hand side of the figure zooms in on the frontend, which supports C, C++ and OpenCL C. It combines ASIP Designer’s original language frontend, which implements many powerful optimizations for signal processing applications, with a version of LLVM’s Clang extended to support ASIP architectures.

Figure 1: ASIP Designer’s compiler flow (right), with an expansion of the frontend (left)

 

Figure 2 captures some of the architectural options from which a designer can choose when defining an ASIP. We will pick a few of these options to illustrate the need for such extensions to the LLVM frontend. Additionally, we will describe some elements of the infrastructure provided with ASIP Designer’s LLVM frontend.

Figure 2: Architectural Options

 

1. Flexible Data Type Systems

The ability to define a data type system, combining data types and operations, tailored to the application, and tailored to the underlying processor architecture, is a key element of an ASIP design flow. A compiler frontend needs to implement these custom types and associated operations as efficiently as the C built-in types and operations.

Tied to its assumption of byte-addressable memory spaces, the standard distribution of Clang assumes that the bit widths of the C built-in types are powers of two. In addition, the floating-point types are restricted to the IEEE floating-point formats.

ASIP Designer extended the LLVM frontend such that the built-in integer types can have any width, and custom floating-point types can be defined, deviating from the standard exponent and mantissa widths. Use cases might be audio applications using 24-bit integers stored in 24-bit elementary memory fields, or AI applications where floating-point types with reduced bit width can be a key optimization. 

ASIP Designer’s extended LLVM implementation automatically adapts to the data type system defined in the processor model, including fractional types, custom floating-point types, custom vector (SIMD) types where the element type can be integer/fractional/floating-point, but also more abstract application types such as pixels, RGB vectors, or voxels.
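
To make the 24-bit audio use case more concrete, the sketch below emulates in portable C what a native 24-bit fractional type would provide. The type name `q23_t`, the helper functions, and the choice of a Q1.23 format with saturation are assumptions made for illustration; they are not part of ASIP Designer. On an ASIP whose processor model defines such a type, the compiler would map these operations directly onto instructions rather than onto this kind of emulation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 24-bit fractional value (Q1.23), emulated in a 32-bit container. */
typedef int32_t q23_t;

#define Q23_MAX ((q23_t)0x7FFFFF)     /* largest value, just below +1.0 */
#define Q23_MIN (-(q23_t)0x800000)    /* exactly -1.0 */

/* Interpret the low 24 bits of a raw word as a signed 24-bit value. */
q23_t q23_from_bits(uint32_t raw)
{
    uint32_t low = raw & 0xFFFFFFu;
    if (low & 0x800000u) {
        /* Sign bit set: subtract 2^24 to sign-extend. */
        return (q23_t)low - (q23_t)0x1000000;
    }
    return (q23_t)low;
}

/* Saturating addition: clamps to the 24-bit range instead of wrapping. */
q23_t q23_add_sat(q23_t a, q23_t b)
{
    assert(a >= Q23_MIN && a <= Q23_MAX);
    assert(b >= Q23_MIN && b <= Q23_MAX);
    int32_t sum = a + b;              /* cannot overflow 32 bits for 24-bit inputs */
    if (sum > Q23_MAX) return Q23_MAX;
    if (sum < Q23_MIN) return Q23_MIN;
    return sum;
}

/* Fractional multiply: Q1.23 x Q1.23 -> Q1.23, truncating toward zero,
   with saturation for the single overflow case (-1.0 * -1.0). */
q23_t q23_mul(q23_t a, q23_t b)
{
    assert(a >= Q23_MIN && a <= Q23_MAX);
    assert(b >= Q23_MIN && b <= Q23_MAX);
    int64_t prod = (int64_t)a * (int64_t)b;   /* product has 46 fractional bits */
    int64_t scaled = prod / 8388608;          /* divide by 2^23 */
    if (scaled > Q23_MAX) return Q23_MAX;
    return (q23_t)scaled;
}
```

For example, `q23_mul(0x400000, 0x400000)` (0.5 times 0.5) yields `0x200000` (0.25), and `q23_add_sat(Q23_MAX, 1)` stays at `Q23_MAX` instead of wrapping around.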

2. Memory Architecture

ASIPs allow for distributed registers, connected to specific functional units. Also, it is very common for an ASIP to have multiple memories and address spaces. These memories can have different properties, like a scalar byte-addressable memory combined with a vector memory, where the addressable units are complete vectors.
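
As a rough illustration of why the addressable unit matters, the sketch below computes element addresses in two hypothetical memories: a byte-addressable scalar memory and a vector memory whose addressable unit is one complete 64-byte vector. The structure, the names `DM_SCALAR` and `VM_VECTOR`, and the sizes are assumptions chosen for illustration and are not taken from any ASIP Designer model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one memory; not an ASIP Designer data structure. */
struct mem_space {
    const char *name;
    size_t unit_bytes;    /* bytes per addressable unit */
};

/* Assumed example memories: a byte-addressable scalar memory and a
   vector memory whose addressable unit is one 64-byte vector. */
static const struct mem_space DM_SCALAR = { "DM", 1 };
static const struct mem_space VM_VECTOR = { "VM", 64 };

/* Number of addressable units occupied by an object, rounded up. */
size_t units_for_object(const struct mem_space *m, size_t obj_bytes)
{
    assert(m != NULL && m->unit_bytes > 0);
    return (obj_bytes + m->unit_bytes - 1) / m->unit_bytes;
}

/* Address (in addressable units) of element `index` of an array at `base`. */
size_t element_address(const struct mem_space *m, size_t base,
                       size_t elem_bytes, size_t index)
{
    return base + index * units_for_object(m, elem_bytes);
}
```

With these assumed memories, element 3 of an array of 64-byte vectors starting at address 100 sits at address 103 in the vector memory, whereas element 3 of an array of 4-byte integers starting at address 100 sits at address 112 in the byte-addressable memory.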

  • Multiple Address Spaces: While the standard distribution of LLVM supports an address space attribute, ASIP Designer significantly extends the multiple address space support in its LLVM frontend. The different address spaces or memories can be associated with different pointer widths and different addressable unit sizes, so data memories no longer have to be byte-addressable. Program memory and data memory can also be associated with different pointer types. Variables can be allocated to specific memories using storage directives, which are available in the C++ attribute syntax.
  • Native Pointer Support: The standard distribution of LLVM lowers pointer operations to integer operations, but an ASIP may implement pointer arithmetic differently from integer arithmetic. For example, pointers can have a different width and can be mapped to dedicated address generation units and corresponding registers. ASIP Designer’s LLVM frontend automatically adopts the pointer implementation as defined in the processor model.
  • C restrict Support: In DSP code, the restrict keyword is essential for expressing independent memory accesses, for example to achieve maximal software pipelining of loops. The standard distribution of LLVM supports the restrict keyword in a limited way (on function arguments only). ASIP Designer’s extended LLVM frontend generalizes the support of restrict, such that it can also be attached to local pointer variables and to struct member pointers.
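
The three placements of restrict mentioned above can be sketched in standard C99 as follows. The function and struct names are hypothetical, and the code compiles and behaves identically with any conforming C compiler; what differs between frontends is how much of the no-aliasing information reaches the optimizer and scheduler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer descriptor with restrict-qualified struct members. */
struct stream {
    const int *restrict in;
    int *restrict out;
    size_t len;
};

/* restrict on function arguments: a, b and y are promised not to overlap,
   so loads of later iterations may be scheduled before earlier stores. */
void scale_add(size_t n, const int *restrict a, const int *restrict b,
               int *restrict y, int gain)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = gain * a[i] + b[i];
    }
}

/* restrict on local pointer variables, initialized from struct members. */
void copy_doubled(const struct stream *s)
{
    assert(s != NULL);
    const int *restrict src = s->in;
    int *restrict dst = s->out;
    for (size_t i = 0; i < s->len; i++) {
        dst[i] = 2 * src[i];
    }
}
```

The caller remains responsible for keeping the promise: passing overlapping buffers to either function is undefined behavior in C.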

3. Frontend Infrastructure

ASIP Designer offers a lightweight version of the standard C++ library in the LLVM distribution, called libc++lite. It is tuned to embedded applications and offers most of the C++ library functionality while avoiding the code bloat of the standard version.

Notwithstanding all the extensions described above, ASIP Designer’s extended LLVM frontend tracks the latest standard LLVM distribution. With the 2020.03 release of ASIP Designer, the extended LLVM frontend is built on top of LLVM release 10. This way users get the best of both worlds: the latest enhancements that go into the standard LLVM distribution, combined with ASIP Designer’s powerful extensions that make it applicable to an ASIP design flow.

What’s New in ASIP Designer?

2020.03 Release Update

In March 2020, we launched the latest release of ASIP Designer, providing various enhancements and extensions. The following is an extract, sorted by categories (please refer to the official Release Notes for the comprehensive list).

Example Models

  • Trv processor family, supporting the RISC-V ISA specification
    • Trv32p3: 32-bit datapath, 3-stage pipeline
    • Trv32p5: 32-bit datapath, 5-stage pipeline
    • Trv64p3: 64-bit datapath, 3-stage pipeline
    • Trv64p5: 64-bit datapath, 5-stage pipeline

Each of these four models can optionally be extended with RISC-V ISA compliant compressed instructions, zero-overhead hardware loops and indirect addressing modes with post-modification.

As with all example models, these come in nML source code and can be modified by the user. They serve both as a reference to explain how certain processor features can be modeled in nML, as well as a starting point for customer-specific modifications and extensions.

  • Tvox: This example features an ASIP tailored to the performance- and power-optimized implementation of a complex algorithm for SLAM (simultaneous localization and mapping). It is a perfect example to illustrate the design process for an application-optimized processor. The processor is an order of magnitude more power-efficient than a standard vector DSP thanks to the specialization of the datapath and the memory management. It also illustrates how an ASIP can be designed to support a memory access scheme that allows for efficient multicore implementations.
  • Tvliw: Variant 3 of the Tvliw model has been extended with a loop instruction buffer. This buffer avoids fetching the instructions inside a loop from program memory on every iteration, thus saving the power consumption associated with instruction fetching.

Processor Modeling

  • The 2019.09 release of ASIP Designer introduced an extension to the nML processor modeling language to describe Very-Long Instruction-Word (VLIW) architectures with variable-length instructions more concisely than before, with full tool support for nop-compression (see also the Technical Feature Section of the November 2019 ASIP eUpdate). Building further on this technology, we have now also added support for specifying variable-length instructions more concisely for ASIP architectures with encoded, non-VLIW instruction formats.

C/C++ Compiler

  • The compiler backend execution time has been significantly reduced for many use cases, by combining the separate phases of the code generation into a single executable.
  • The compiler backend optimizations have been improved with respect to disaggregation of record register accesses, reduced jump penalties through better placement of code blocks, and a better trade-off between delayed and non-delayed control instructions.
  • The LLVM-based frontend and all example models featuring it have been updated to the most recent LLVM version, 10.0.

Simulation and Debugging

  • Template-based SystemC wrapping of ISS: ASIP Designer allows an instruction-set simulator (which can be either cycle-accurate or instruction-accurate) to be automatically wrapped with a SystemC interface for inclusion into a virtual prototype of the system. A new template- and adapter-based methodology makes it possible to generate different wrappers that can easily be tailored to the different TLM abstraction levels. It gives the user fine-grained control over the internals of the SystemC wrapper, to satisfy even very customer-specific virtual prototyping needs.
  • MultiCPU host support in Virtualizer: Most virtual prototypes contain multiple processor cores, executing in parallel. Synopsys’ Virtualizer tool supports a unique concept to accelerate the execution of such multicore platforms, by mapping the individual processor simulation models to different cores of the simulation host machine. This is now automatically supported by ASIP Designer for systems with multiple ASIP cores. When there are few or no data dependencies between the cores, this leads to an almost linear simulation speed-up.

RTL Generation, Verification, and Synthesis Support

  • ASIP Designer's RTL generation tool has a large number of options to choose from, to tune the RTL code. A new interactive table greatly simplifies the selection of these options, providing filtering and search functionality and direct access to each option’s description.

Additional Resources

ASIP Designer Online Training

Online training for ASIP Designer has seen strong adoption by new users. Additional recordings have been added. Register here for access to the training modules, which provide a deep dive into the concepts, languages, and files that are used to capture a processor design.

 

Webinar – April 28, 2020, 10:00 am PT

Designing Application-Specific Processors for Smart Vision Systems: A High-Performance SLAM Case Study

Depth-sensing camera technology is literally adding a new dimension to the world of computer vision.  Affordable high-resolution depth-sensing cameras will change the way we interact with our mobile devices, enable augmented reality in visual communication and gaming, and lead to the deployment of intelligent robots and drones in daily life.  To make such camera systems future-proof, while keeping the system cost low, the sensor outputs need to be processed close to the image sensor under software control.

Application-specific instruction-set processors (ASIPs) are an ideal technology to add intelligent processing to such sensor systems, at the lowest power and cost.  Synopsys' ASIP Designer tool suite enables system and chip designers to create their own ASIP in a short time, with the exact forms of parallelism and specialization required by their applications.

This webinar will show how the ASIP Designer tool suite was used to design an ASIP for a dense grid-based Simultaneous Localization And Mapping (SLAM) algorithm, for 3D reconstruction of dynamic environments from depth-sensing camera images. The ASIP, whose architecture was designed in four person-months, is estimated to consume four orders of magnitude less power than a conventional GPU solution, in a fraction of the silicon area.

 

Customer References

“To meet our customer-specific requirements, we are developing specialized processors and programmable accelerators that are fully optimized for performance, power, area, and code size, while offering the required flexibility,” said Thierry Brouste, Manager, Embedded Computing Solutions, STMicroelectronics. “Using ASIP Designer as our tool of choice gives us a significant competitive advantage, because it enables us to quickly develop complex and highly differentiated application-specific processors, while maximizing our design team’s efficiency through design automation and architecture exploration.”

RIKEN’s drug discovery molecular simulation platform team utilizes leading computational technologies using large-scale, high-speed supercomputers, specifically for molecular simulation technologies. These molecular simulators are used to identify drug behavior at the atomic level and help predict what structural formulas make for highly effective and selective drug candidates. Molecular dynamics (MD) simulations are computationally intensive and need petaflops of processing performance. RIKEN recognized that a general-purpose processor would not deliver the required performance, and so they decided to develop their own specialized custom processor using Synopsys’ ASIP Designer tool, and integrated 17 instances of the processor in a custom multicore chip.

 

White Papers   

Over the past decade, the trend in SoC design has been to add more functionality into software. There are several reasons for this, including (a) software is easier and faster to fix and update, (b) evolving trends and not-yet fully specified standards require flexibility since the final functionality might not be known at the time the hardware design must be locked down, and (c) the desire to reuse SoCs for different products and derivatives, improving the return on investment (ROI) for a single design. Read the white paper to find out how ASIPs can contribute and what it takes to develop them.

In order to develop a proprietary processor that can stand the test of time, a highly functional SDK must be developed. The complexity, cost and duration of SDK development vary depending on the architecture of the processor and the skillset of the SDK developers. In this paper, we analyze the requirements for an SDK. We then introduce a tool-based methodology for SDK development based on Synopsys’ ASIP Designer tool suite.

Architectural exploration is at the heart of any ASIP design approach. Designers need to rapidly explore the impact of different architectural choices on power consumption and performance, ideally using real-world application C-code as part of the design flow. This white paper explains the architectural tradeoffs that are available to an ASIP designer, how to trade off performance vs. area, and why an ASIP design can still maintain full C-programmability while being optimized for a certain application domain.

For the SoC implementation of SDR baseband, most or all of the fixed-function logic blocks for physical layer processing are replaced by software running on processor(s). Heterogeneous multicore architectures are deemed better suited for SDR SoCs since they require less silicon area and consume less power.

Modern SoCs integrate dozens of complex system functions, each requiring its own optimal balance of performance, flexibility, energy consumption, communication, and design time. The traditional model of a (configurable) general-purpose processor core with a number of fixed hardware accelerators no longer suffices. ASIPs can offer the best balance for each system function, and thus form the basis of new generations of multicore SoCs.

 
