ASIP eUpdate, October 2020

Synopsys’ solution for efficiently designing and implementing your own application-specific instruction-set processor (ASIP) when you can’t find suitable processor IP, or when hardware implementations require more flexibility.

Technology Feature: RISC-V ISA Extensions Made Easy, Using Simple Datapath eXtensions (SDX)

The slowing down of Moore’s law and Dennard scaling triggered an increased awareness of ASIPs (also referred to as domain-specific processors), with those processors implementing a specialized instruction set architecture (ISA), often starting from a baseline ISA such as RISC-V.

Designing such specialized processors involves a number of tasks including determining:

  • Which custom extensions make a difference for a specific application
  • How to get to a compiler and a simulator for the specialized architecture in time
  • How to confirm the target performance can be reached
  • Whether energy consumption will stay within the boundaries
  • How to verify the processor

Synopsys’ ASIP Designer is the industry-leading tool to design, program, implement, and verify custom-designed processors. Starting from a single processor specification that efficiently describes the ISA and the micro-architecture, designers immediately get a fully featured software development kit (SDK) including a cycle-accurate simulator, a debugger, and an optimizing C/C++ compiler, all supporting the specialized architecture. This allows compiler-in-the-loop tuning of the processor specification, using real application code to benchmark the performance. From the same specification, the RTL code is generated, with synthesis-in-the-loop to measure gate count and power consumption, and to identify and optimize critical paths in the design.

ASIP Designer comes with an extensive library of example ASIP models. These include a wide range of models supporting the RISC-V ISA specification, featuring both 32-bit and 64-bit datapath implementations and various pipeline depths. Other example models demonstrate how to model DSP, VLIW and SIMD architectures, as well as different forms of architectural specialization for specific application domains like wireless communication or image processing, often using machine learning.

As the models are provided in nML source code, users are free to modify the processors to customize the ISA and the microarchitecture to their application-specific needs.

For a certain class of instructions, it is now easier than ever to implement such customization of the RISC-V ISA, thanks to ASIP Designer’s SDX approach. The guiding principles are as follows:

  • Trv-SDX is a new example processor model that implements the RISC-V ISA, and additionally contains templates for extension instructions. These templates are encoded using the RISC-V custom-2 opcode space, which has been reserved by the standard to enable custom ISA extensions.
  • The user only has to define the behavior of the desired extension instructions in bit-accurate C code. The user does not need to edit the nML processor model (or even be familiar with the nML processor description language). This makes ISA extensions simple to define and use.
  • All focus can be on the behavior of the extension. The user is free to define any functionality that operates on the register configuration that is predefined for each instruction template. A variety of register configurations is offered.
  • The compiler is ready-to-use, as intrinsics targeting the extension instructions are already provided.
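
To make the custom-2 opcode space concrete, the following minimal C sketch packs a RISC-V R-type instruction word with the custom-2 major opcode (binary 1011011 in the RISC-V base opcode map). The R-type field layout is standard RISC-V. Which instruction format and which funct3/funct7 values the Trv-SDX templates actually use is not stated here, so the helper names and the field values in the usage note are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Major opcode of the RISC-V custom-2 space: bits [6:0] = 1011011. */
#define RV_OPCODE_CUSTOM2 0x5Bu

/* Pack a standard RISC-V R-type instruction word:
 * funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0].
 * Hypothetical helper, not part of ASIP Designer. */
uint32_t rv_encode_rtype(uint32_t opcode, uint32_t rd, uint32_t funct3,
                         uint32_t rs1, uint32_t rs2, uint32_t funct7)
{
    /* Each field must fit its bit width. */
    assert(opcode < 128u && funct7 < 128u && funct3 < 8u);
    assert(rd < 32u && rs1 < 32u && rs2 < 32u);
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}

/* Returns 1 if the instruction word lies in the custom-2 opcode space. */
int rv_is_custom2(uint32_t insn)
{
    return (insn & 0x7Fu) == RV_OPCODE_CUSTOM2;
}
```

For example, `rv_encode_rtype(RV_OPCODE_CUSTOM2, 10, 0, 11, 12, 0)` yields the word `0x00C5855B`, which `rv_is_custom2()` recognizes because its low seven bits are `0x5B`.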

Let us illustrate the SDX concept with an example, where we define an instruction that performs multiply-add and multiply-sub:

  • Choose sdx0a, a predefined template for a 32-bit three register instruction with accumulation
  • Define the behavior in bit-accurate C code (referred to as “PDG code” in ASIP Designer). For the sdx0a template, the behavior must be specified in the C function psdx0a_user(). The immediate argument selects between add and sub. Defining the behavior is the only edit the user has to make.
  • The compiler intrinsic is readily available to the software programmer under the name fsdx0a(). There is nothing for the user to modify, unless they want to rename the intrinsic to make it more readable and easier for the software programmer to use.
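
As a rough illustration of the kind of behavior described in these steps, the sketch below models a 32-bit multiply-add / multiply-sub in plain C. It is not PDG code: the real signature and data types of psdx0a_user() are fixed by the sdx0a template and are not shown in this article. The function name, the argument order, and the convention that an immediate of 0 means add and 1 means sub are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Plain-C model of a three-register instruction with accumulation.
 * Hypothetical stand-in for the behavior a user would write in psdx0a_user().
 * Assumed convention: imm == 0 -> acc + a*b, imm == 1 -> acc - a*b. */
int32_t sdx0a_behavior_model(int32_t acc, int32_t a, int32_t b, unsigned imm)
{
    assert(imm <= 1u);
    /* Use unsigned arithmetic so that 32-bit wrap-around is well defined. */
    uint32_t prod = (uint32_t)a * (uint32_t)b;
    uint32_t res  = imm ? (uint32_t)acc - prod : (uint32_t)acc + prod;
    /* Conversion back to int32_t wraps on mainstream compilers. */
    return (int32_t)res;
}
```

With these assumptions, `sdx0a_behavior_model(10, 3, 4, 0)` returns 22 and `sdx0a_behavior_model(10, 3, 4, 1)` returns -2.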

With this, ASIP Designer knows about the new instruction and the behavior linked to it. It automatically updates the entire SDK, including the simulator, the compiler, and the graphical debugger. In addition, it generates RTL featuring the new data path.

With the immediate update of the SDK, the embedded software developer can use the intrinsic function right away, and the compiler takes care of the correct scheduling and optimizations. In the following example code, the fsdx0a() intrinsic function is called to perform a multiply-add operation using the newly defined instruction. 
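
A minimal sketch of such a call is given here as a dot-product loop. The actual prototype of fsdx0a() comes from the compiler header generated for the Trv-SDX model and is not reproduced in this article, so the argument order, the types, and the meaning of the immediate (0 for multiply-add) are assumptions. A host-side stand-in with the same name is defined so that the sketch compiles without the ASIP Designer SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for the target intrinsic (assumed prototype).
 * imm == 0 selects multiply-add, imm == 1 selects multiply-sub. */
static int32_t fsdx0a(int32_t acc, int32_t a, int32_t b, unsigned imm)
{
    uint32_t prod = (uint32_t)a * (uint32_t)b;
    uint32_t res  = imm ? (uint32_t)acc - prod : (uint32_t)acc + prod;
    return (int32_t)res;
}

/* Dot product in which each iteration maps to one multiply-add call. */
int32_t dot_product(const int32_t *x, const int32_t *y, int n)
{
    assert(n >= 0);
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc = fsdx0a(acc, x[i], y[i], 0);
    return acc;
}
```

On the target, the stand-in would be dropped and the call would resolve to the intrinsic provided by the compiler, leaving scheduling and register allocation to the tool.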

The latest ASIP Designer release comes with three examples that illustrate the use of SDX:

  1. FFT accelerator: two SDX instructions, the first featuring fixed-point multiplication & scaling using complex data types, and the second featuring butterfly operations
  2. SHA-256 accelerator for secure hashing, with an SDX instruction that computes the hash from bitwise AND, OR, and XOR operations, shift operations, and additions. As the algorithm has state variables, it uses an SDX variant that supports 8 additional register reads and writes
  3. Keyword spotting, using a neural network: it features a vector-MAC instruction implementing packed SIMD operations, working on 8-bit data types, and using register pairing to enable 64-bit accesses
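
To give a feel for the third example, here is a plain-C model of a packed 8-bit vector-MAC that reads its operands through register pairs. It is only a sketch: the lane arithmetic of the shipped instruction (for instance whether it reduces all lanes into one accumulator, as done here, or keeps per-lane accumulators) is not described in this article, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Register pairing: two 32-bit registers form one 64-bit operand. */
uint64_t pair64(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Extract lane i (0..7) of a packed operand as a signed 8-bit value. */
static int32_t lane_s8(uint64_t v, int i)
{
    assert(i >= 0 && i < 8);
    int32_t byte = (int32_t)((v >> (8 * i)) & 0xFFu);
    return byte >= 128 ? byte - 256 : byte;
}

/* Multiply the eight signed 8-bit lanes of a and b pairwise and add the
 * sum of the products to acc (assumed reduction behavior). */
int32_t vmac8(int32_t acc, uint64_t a, uint64_t b)
{
    for (int i = 0; i < 8; i++)
        acc += lane_s8(a, i) * lane_s8(b, i);
    return acc;
}
```

A single call processes eight 8-bit multiply-accumulates, which is where the speedup over scalar code comes from in a neural-network inner loop.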

Thanks to SDX, extending the RISC-V ISA with custom instructions has never been easier. Engineers start from an available RISC-V baseline model and can directly focus on defining the functionality of the desired instruction extension. With the compiler-in-the-loop, they can use real application code to profile the impact of the extensions on the performance. And with synthesis-in-the-loop they can check if those extensions will stay within the available area, power and timing budget.

Of course, ASIP Designer also supports the customization of ISAs beyond the scope of SDX. In this case the designer turns to full nML coding instead of instantiating predefined instruction stubs, which offers maximum flexibility. The following table covers some of the parameters a designer can choose from. ASIP Designer comes with a rich set of example models, including but also extending beyond RISC-V ISA models, all provided in source code. Thanks to its unique capabilities to automatically generate the SDK and the RTL as explained above, ASIP Designer is the most powerful and versatile approach to design application-optimized processors.

What’s New: ASIP Designer 2020.09 Release Update

In September 2020, we launched the latest release of ASIP Designer, providing various enhancements and extensions. The following is an extract, sorted by categories (customers can refer to the official Release Notes for a comprehensive list).

Example Models

Designers can choose from an extensive library of example ASIP models provided as nML source code. In combination with ASIP Designer, these models can be used as a starting point for architectural exploration, and customer-specific production designs.

  • Trv-SDX: SDX is a mechanism to add simple extension instructions to the RISC-V ISA. The concept is illustrated by a number of examples, including Secure Hashing, FFT, and keyword spotting based on a neural network
  • Tmoby is an accelerator tailored to the execution of MobileNet v3. It is an educational example demonstrating how to implement a complex AI application. It features a RISC-V ISA for scalar operations, and accelerates the MobileNet v3 execution through the use of 64-way SIMD, 4-way instruction-level parallelism and specialized operations
  • High-performance SHA-256 acceleration on Trv32p3: a workshop describing the design of an accelerator for SHA-256 secure hashing targeting a data rate of 100 MB/s. Starting from a plain RISC-V ISA using the Trv32p3 model, the workshop follows a step-wise approach to arrive at an architecture that executes the SHA-256 compression loop in only 1 cycle per iteration. The architectural optimizations go beyond the scope of SDX that was discussed earlier. The example illustrates how to use the compiler-in-the-loop and synthesis-in-the-loop methodologies, uniquely provided by ASIP Designer
  • Template-based SystemC models: new template-based OSCI and Virtualizer models for the Trv family, enabling a rapid integration of the instruction-set simulator into virtual platforms at different levels of timing accuracy
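
For context on the SHA-256 workshop above, the sketch below shows what one iteration of the SHA-256 compression loop computes, following the round definition in FIPS 180-4. It is a reference model in plain C, not the workshop's code; the function names are chosen for this example. On a plain RISC-V ISA each iteration expands into many rotate, logic, and add instructions, which is the work the workshop architecture collapses into a single cycle.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word right by n bits, 0 < n < 32. */
static uint32_t rotr32(uint32_t x, unsigned n)
{
    assert(n > 0u && n < 32u);
    return (x >> n) | (x << (32u - n));
}

/* One SHA-256 compression round on the working state s = {a,b,c,d,e,f,g,h},
 * with round constant k and message-schedule word w (FIPS 180-4). */
void sha256_round(uint32_t s[8], uint32_t k, uint32_t w)
{
    uint32_t a = s[0], b = s[1], c = s[2], d = s[3];
    uint32_t e = s[4], f = s[5], g = s[6], h = s[7];

    uint32_t sigma1 = rotr32(e, 6) ^ rotr32(e, 11) ^ rotr32(e, 25);
    uint32_t ch     = (e & f) ^ (~e & g);
    uint32_t t1     = h + sigma1 + ch + k + w;

    uint32_t sigma0 = rotr32(a, 2) ^ rotr32(a, 13) ^ rotr32(a, 22);
    uint32_t maj    = (a & b) ^ (a & c) ^ (b & c);
    uint32_t t2     = sigma0 + maj;

    /* Shift the state down by one position and insert the new a and e. */
    s[7] = g; s[6] = f; s[5] = e; s[4] = d + t1;
    s[3] = c; s[2] = b; s[1] = a; s[0] = t1 + t2;
}
```

The round has two independent chains (t1 and t2) that only meet in the final additions, which is one reason a dedicated datapath can evaluate the whole iteration in parallel.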

C/C++ Compiler

ASIP Designer comes with a unique and patented compiler solution, with the compiler automatically retargeting itself to the processor architecture. This eliminates any need for compiler backend customization by the user. Release 2020.09 brings the following enhancements:

  • Compiler options, including optimization levels, can be annotated to individual functions in the application code (as opposed to only at the file level previously)
  • The LLVM-based compiler front-end has been further enhanced to address the specific requirements of ASIP architectures (see also the technical article in the April 2020 Newsletter). It now supports compile-time initialization of custom vector types, supporting broadcast and element-wise initialization. It is now also aware of the memory layout of types as defined in the compiler header, including promotion and right-padding
  • The interaction between register assignment and scheduling, which was previously restricted to better software pipelining of leaf loops, has been generalized to obtain more effective scheduling of the code blocks around the inner loops, for example resulting in better overlaps with the pre/post ambles of software pipelined loops
  • The LLVM-based front-end, and all example models featuring the LLVM-based front-end, have been updated to the most recent LLVM version, 11.0

RTL Generation, Verification, and Synthesis Support

  • Simplified integration of a SystemC-wrapped ISS for use in HDL test-benches
  • Further enhancements supporting FPGA prototyping with HAPS and emulation with ZeBu. For HAPS, additional memory types are available to be mapped automatically on FPGA memory resources as available on the board. For ZeBu, a new script is available to efficiently map the code, including the generation of register-change dumps. It also supports the ZeBu JTAG transactor server.

Additional Resources

ASIP Designer Online Training

Online training for ASIP Designer has seen strong adoption by new users. Additional recordings have been added. Register here for access to the training modules, which provide a deep dive into the concepts, languages, and files that are used to capture a processor design.

 

Designing Application-Specific Processors for Smart Vision Systems: A High-Performance SLAM Case Study

This Synopsys webinar shows how the ASIP Designer tool suite was used to design an ASIP for a dense grid-based Simultaneous Localization And Mapping (SLAM) algorithm, for 3D reconstruction of dynamic environments from depth-sensing camera images. The ASIP, whose architecture was designed in four person-months, is estimated to consume four orders of magnitude less power than a conventional GPU solution, in a fraction of the silicon area.

 

Customer References

“We were on a tight schedule to develop five complex custom processor models for our multicore data flow processor. By using ASIP Designer and the RISC-V processor models provided with the tool as a starting point, we were able to meet functionality and performance requirements while reducing development time by 50%.” Sadahiro Kimura, Manager, Semiconductor IP R&D Unit, Advanced Technology Development Section at NSITEXE, a group company of DENSO Corporation

“To meet our customer-specific requirements, we are developing specialized processors and programmable accelerators that are fully optimized for performance, power, area, and code size, while offering the required flexibility,” said Thierry Brouste, Manager, Embedded Computing 草榴社区, STMicroelectronics. “Using ASIP Designer as our tool of choice gives us a significant competitive advantage, because it enables us to quickly develop complex and highly differentiated application-specific processors, while maximizing our design team’s efficiency through design automation and architecture exploration.”

RIKEN’s drug discovery molecular simulation platform team utilizes leading computational technologies using large-scale, high-speed supercomputers, specifically for molecular simulation technologies. These molecular simulators are used to identify drug behavior at the atomic level and help predict what structural formulas make for highly effective and selective drug candidates. Molecular dynamics (MD) simulations are computationally intensive and need petaflops of processing performance. RIKEN recognized that a general-purpose processor would not deliver the required performance, so they decided to develop their own specialized custom processor using Synopsys’ ASIP Designer tool, and integrated 17 instances of the processor in a custom multicore chip.

 

White Papers   

Over the past decade, the trend in SoC design has been to add more functionality into software. There are several reasons for this, including (a) software is easier and faster to fix and update, (b) evolving trends and not-yet fully specified standards require flexibility since the final functionality might not be known at the time the hardware design must be locked down, and (c) the desire to reuse SoCs for different products and derivatives, improving the return on investment (ROI) for a single design. Read the white paper to find out how ASIPs can contribute and what it takes to develop them.

To develop a proprietary processor that can stand the test of time, a highly functional SDK must be developed. The complexity, cost, and duration of SDK development vary depending on the architecture of the processor and the skillset of the SDK developers. In this paper, we analyze the requirements for an SDK. We then introduce a tool-based methodology for SDK development based on Synopsys’ ASIP Designer tool suite.

Architectural exploration is at the heart of any ASIP design approach. Designers need to rapidly explore the impact of different architectural choices on power consumption and performance, ideally using real-world application C-code as part of the design flow. This white paper explains the architectural tradeoffs that are available to an ASIP designer, how to trade off performance vs. area, and why an ASIP design can still maintain full C-programmability while being optimized for a certain application domain.

Modern SoCs integrate dozens of complex system functions, each requiring its own optimal balance of performance, flexibility, energy consumption, communication, and design time. The traditional model of a (configurable) general-purpose processor core with a number of fixed hardware accelerators no longer suffices. ASIPs can offer the best balance for each system function, and thus form the basis of new generations of multicore SoCs.

 
