ASIP eUpdate October 2023

ASIP Designer

Synopsys' solution to efficiently design and implement your own application-specific instruction-set processor (ASIP) when you can't find suitable processor IP, or when hardware implementations require more flexibility.

This bi-annual newsletter provides you with easy access to ASIP-related resources. This issue includes the following topics:

Technology Feature: Example Processor Models
What's New: ASIP Designer U-2023.06 Release
Additional Resources: Webinars, Customer Successes, White Papers

Technology Feature: Example Processor Models

Designers can choose from an extensive library of example processor models provided as nML source code. In combination with ASIP Designer, these models can be used as a starting point for architectural exploration and customer-specific production designs.

A new Example Processor Models web page is now available. It provides a concise overview of the example processor models available with ASIP Designer and their features.

Tsec: An ASIP for Post-Quantum Cryptography

In this section we elaborate on a new example processor model that is introduced with the 2023.06 release of ASIP Designer. It is called "Tsec" and implements an accelerator for post-quantum cryptography.

Kyber, the first standardized key-encapsulation mechanism designed to withstand attacks by powerful future quantum computers, is computationally very demanding, in part because of its extensive use of hashing. The Tsec example is an ASIP optimized for accelerating Kyber. It evolved from a RISC-V base model, to which custom application-specific instructions were added, along with architectural specializations that go beyond simple RISC-V extension mechanisms, such as heterogeneous storage.

The underlying base model is Trv32p5x, a previously existing example processor model with a RISC-V scalar instruction set (RV32IM) and 5 pipeline stages, enhanced with DSP-type extensions including:

- A zero-overhead looping mechanism that efficiently implements loops iterating over arrays

- Load and store instructions with a post-modify addressing mode, which update pointers without instruction overhead

- 2-way instruction-level parallelism to support the simultaneous execution of a compute operation and a memory access

Using the rich profiling capabilities of ASIP Designer, an open-source software implementation of the Kyber algorithm was simulated and profiled on the baseline model. Two main computational kernels were identified as the dominating bottlenecks: modular finite-field operations such as "Montgomery reduction" and "Barrett reduction", and a hashing mechanism called "Keccak state permutation".

The Montgomery and Barrett reduction functions could be accelerated by fusing them into single instructions. These fused instructions operate just like a custom scalar ALU instruction on the central register file X.
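To make the two reduction kernels concrete, here is a minimal Python sketch modeled on the arithmetic of the public Kyber reference code. The constants (q = 3329 and a 16-bit Montgomery radix) come from the Kyber specification rather than from this newsletter, and the helper names are illustrative. It is a software model only and says nothing about how the Tsec hardware is built. Each function is the kind of multi-step computation that a fused instruction could perform as a single operation on register operands.

```python
Q = 3329        # Kyber prime modulus (assumption: taken from the Kyber spec)
QINV = -3327    # q^-1 mod 2^16, written as a signed 16-bit value


def to_int16(x):
    """Wrap an integer to signed 16-bit, as a C int16_t cast would."""
    x &= 0xFFFF
    return x - 0x10000 if x >= 0x8000 else x


def montgomery_reduce(a):
    """Return r with r * 2^16 == a (mod Q), for -Q*2^15 <= a < Q*2^15."""
    t = to_int16(a * QINV)       # low 16 bits of a * q^-1
    return (a - t * Q) >> 16     # a - t*Q is an exact multiple of 2^16


def barrett_reduce(a):
    """Return the centered representative of a mod Q for a 16-bit input."""
    v = ((1 << 26) + Q // 2) // Q        # precomputed round(2^26 / Q)
    t = (v * a + (1 << 25)) >> 26        # approximates round(a / Q)
    return a - t * Q


def fqmul(a, b):
    """Multiply, then Montgomery-reduce (result carries a 2^-16 factor)."""
    return montgomery_reduce(a * b)
```

Both reductions replace a division by q with multiplications, shifts, and subtractions, which is why they map naturally onto a small dedicated arithmetic unit.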

Figure 1: Arithmetic unit for Montgomery reduction

Figure 1 depicts a custom hardware resource as needed for a single fused instruction performing Montgomery reduction. A resource for Barrett reduction looks very similar, so both were merged and shared between the instructions. Furthermore, multiple instances of the Barrett reduction block, along with adders and finite-field multipliers, were combined into a larger butterfly-like hardware block as depicted in Figure 2, which is triggered by even more specialized single instructions.

Figure 2: Custom butterfly unit with Barrett reduction logic and finite-field multipliers

The debugger snapshot in Figure 3 shows how the specialized butterfly instructions are utilized by the compiler in the innermost loop of the number-theoretic transform (NTT) function.

Figure 3: Software-pipelined NTT function

The innermost loop is implemented as a hardware loop (zlp). Its six-instruction body is software-pipelined and combines butterfly instructions, finite-field multiplications, and additions, with memory accesses scheduled in parallel.
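For readers unfamiliar with the NTT, the following Python sketch shows what one butterfly computes and how an innermost loop applies it across an array. It is a plain software model under stated assumptions: the modulus q = 3329 is the Kyber value, ordinary `%` arithmetic stands in for the Barrett and Montgomery reduction logic, and the names `butterfly` and `ntt_layer` as well as the twiddle factor used below are hypothetical.

```python
Q = 3329  # Kyber prime modulus (assumption: taken from the Kyber spec)


def butterfly(a, b, zeta):
    """Cooley-Tukey butterfly: returns (a + zeta*b, a - zeta*b) mod Q."""
    t = (zeta * b) % Q           # finite-field multiplication
    return (a + t) % Q, (a - t) % Q


def ntt_layer(r, start, length, zeta):
    """Apply one butterfly per pair (r[j], r[j + length]) in a block, in place.

    This loop over j plays the role of the innermost loop: each iteration
    needs two loads, one butterfly, and two stores.
    """
    for j in range(start, start + length):
        r[j], r[j + length] = butterfly(r[j], r[j + length], zeta)


coeffs = list(range(8))
ntt_layer(coeffs, start=0, length=4, zeta=17)
```

Because every iteration is independent of the others, loads and stores for one pair can overlap with the arithmetic for another, which is what makes the loop a good fit for software pipelining on a dual-issue machine.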

For the Keccak permutation function, the situation is a bit more complicated. The bit-level logic operations of the hashing mechanism can still be fused into one big logic cloud. The interface of the function, however, takes an entire array of 25 64-bit state variables as an argument, which results in extensive load/store traffic on the general-purpose register file. The general-purpose register file of the baseline processor (32 x 32-bit) is simply not big enough to hold 25 64-bit values simultaneously, and it would be too expensive to add the number of parallel ports required by the Keccak operation.

Instead, we created a dedicated register file "S" with 25 fields of 64 bits, and with dedicated 64-bit load/store access to the data memory. In addition, each register field has a direct port to the Keccak logic, which can thus access all 25 fields in parallel, as depicted in Figure 4.

Figure 4: Keccak Unit with dedicated register file
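The size argument can be checked with simple arithmetic: the Keccak state is 25 x 64 = 1600 bits, while a 32 x 32-bit register file holds only 1024 bits. The Python sketch below models the Keccak-f[1600] state permutation as defined in the public SHA-3 standard (FIPS 202), to show that every round reads and writes all 25 lanes. It is a software reference under that assumption, not a description of the Tsec logic, and the function names are illustrative.

```python
STATE_BITS = 25 * 64     # 1600 bits of Keccak state
X_FILE_BITS = 32 * 32    # 1024 bits in a 32 x 32-bit register file
MASK64 = (1 << 64) - 1


def rol64(value, shift):
    """Rotate a 64-bit lane left by shift bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def keccak_f1600(state):
    """Apply the 24-round permutation to 25 lanes (lane index is x + 5*y)."""
    lanes = [[state[x + 5 * y] for y in range(5)] for x in range(5)]
    lfsr = 1  # generates the round constants for the iota step
    for _ in range(24):
        # theta: mix each lane with the parities of two neighboring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to a new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: XOR a round constant into lane (0, 0)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return [lanes[x][y] for y in range(5) for x in range(5)]
```

In software, each of the 24 rounds costs many loads, XORs, and rotates per lane. When all 25 lanes are visible to the logic at once, that per-lane traffic disappears, which is the motivation for giving each register field its own port.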

The debugger snapshot in Figure 5 shows how the compiler schedules a single-cycle instruction triggering the Keccak logic, embedded in a single-instruction hardware loop, which is surrounded by memory load/store instructions to the special S register file.

Figure 5: Single-cycle Keccak instruction scheduled in a single-instruction hardware loop

Figure 6 is a screenshot of the nML viewer, a utility to graphically inspect the hierarchy of the instruction set. It shows how the custom Keccak instruction and the special finite-field instructions (grouped under "kyber_instrs") are integrated both in the single-issue 32-bit instruction format and in the parallel dual-issue 64-bit instruction format.

Figure 6: Graphical view of the Tsec instruction set (partially expanded)

The new Tsec example model illustrates how ASIP Designer can be used to extend a RISC-V baseline architecture for higher performance. The specializations for the Keccak state permutation and the reduction functions result in an 8.3x speed-up of the Kyber algorithm compared to the original RISC-V baseline implementation with DSP extensions, at a moderate gate-count increase of 1.8x.

What's New: ASIP Designer U-2023.06 Release

Since the last edition of this newsletter, we launched a new feature release of ASIP Designer in June 2023, providing various enhancements and extensions. The following is an extract, sorted by category (customers can refer to the official Release Notes for a comprehensive list).

Example Processor Models

In the 2023.06 release the following updates were made to the library of example processor models:

- A new example model "Tsec" has been added, demonstrating an accelerator for the Kyber post-quantum cryptography algorithm. More details can be found in the Technology Feature section in this eUpdate.
- A new educational example model "Matmul" has been added, included in a workshop that demonstrates the successive extension of a RISC-V baseline model with vector SIMD instructions to accelerate matrix multiplication.
- The "Trv" model family (RISC-V processors) has been updated:
  - CSR instructions and interrupt support have been added to all 32-bit variants.
  - All floating-point variants now implement the Zfinx ISA, which means that the floating-point instructions access the general-purpose register file X. There is no longer a separate floating-point register file F.
- All models of the "Tvec" model family (generic SIMD processors) have been unified and cleaned up.

Processor Modeling

- Enhanced hierarchical instantiation of I/O modules inside other I/O modules and I/O interfaces.
- Generalized bit selection in nML image attributes, for more convenient modeling of split encodings.

C/C++ Compiler

ASIP Designer comes with a unique and patented compiler solution, with the compiler automatically retargeting itself to the processor architecture. This eliminates any need for compiler backend customization by the user. Release 2023.06 offers the following enhancements:

- Support for multiple software stacks in the LLVM-based front-end.
- Under the hood, a new "node-based" list scheduling algorithm has been introduced in the compiler. It assigns operations to time steps in a more balanced way than the original list scheduling algorithm. The new algorithm better exploits the following features of advanced ASIP architectures: negative dependency lengths (exposed pipeline), delay slots, and combinations of ASAP and ALAP preferences.
- The LLVM-based front-end has been updated to the more recent LLVM version 16.0.
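As background on what "assigning operations to time steps" means, the sketch below implements a textbook cycle-by-cycle list scheduler in Python. It is not the node-based algorithm mentioned above, whose details are not given here, and it does not model exposed pipelines, delay slots, or ASAP/ALAP preferences. The function name, the data layout, and the alphabetical tie-break are assumptions chosen for brevity, and the dependency graph is assumed to be acyclic.

```python
def list_schedule(latency, deps, issue_width):
    """Greedy list scheduling.

    latency: dict mapping operation name -> latency in cycles (>= 1)
    deps: dict mapping operation name -> list of predecessor names
    issue_width: maximum number of operations issued per cycle
    Returns a dict mapping operation name -> issue cycle.
    """
    finish = {}                  # op -> first cycle in which its result is usable
    schedule = {}                # op -> cycle in which it is issued
    remaining = set(latency)
    cycle = 0
    while remaining:
        # An operation is ready once all predecessors have produced results.
        ready = sorted(
            op for op in remaining
            if all(p in finish and finish[p] <= cycle for p in deps.get(op, []))
        )
        # Issue as many ready operations as the machine allows this cycle.
        for op in ready[:issue_width]:
            schedule[op] = cycle
            finish[op] = cycle + latency[op]
            remaining.remove(op)
        cycle += 1
    return schedule


ops = {"load_a": 2, "load_b": 2, "mul": 1, "add": 1, "store": 1}
edges = {"mul": ["load_a", "load_b"], "add": ["mul"], "store": ["add"]}
dual_issue = list_schedule(ops, edges, issue_width=2)
single_issue = list_schedule(ops, edges, issue_width=1)
```

With two issue slots the two loads share cycle 0 and the chain completes one cycle earlier than on a single-issue machine. Production schedulers differ mainly in how they prioritize the ready list and in how much of the pipeline they model.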

ChessDE GUI, Instruction-Set Simulation and Debugging

- The language server support, which was introduced to the ChessDE editor in Release 2022.12, has been further enhanced and extended with additional functionality.
- Support for parallel compilation of batch projects. Different subsidiary projects of a batch project can now be compiled in parallel.
- Integration of pretty-printing in the GDB debug flow.

RTL Generation, Verification, and Synthesis Support

- Basic support for partitioned synthesis.
- Support for unit testing of PDG primitive functions, using annotations in the PDG code.

Additional Resources

- Video: Intro to ASIP Designer. The efficient way to design, implement, program and verify your custom processor.
- Training: ASIP Designer Online Training. Deep dive into the concepts, languages, and files used to capture a processor design.
- New Videos: ASIP Designer Tutorial Videos. Learn from experts: demo videos of ASIP Designer methodology and case studies.

Events and Webinars

- On-Demand Event: ASIP Virtual Seminar (English). Case Studies in Low-Power Smart Vision and Cryptography Applications.
- On-Demand Event: Webinar. Taking the Risk out of Developing Your Own RISC-V Processor with Fast Architecture-Driven PPA Optimization.
- On-Demand Event: ASIP Virtual Seminar (Chinese). Case Study in Post-Quantum Cryptography Application.

White Papers & Articles

- White Paper: Designing Application-Specific Processors for Wireless 5G SoCs
- Article: Under the Hood of ASIP Designer - Application-Specific Processor Design Made Possible by Tool Automation
- White Paper: Designing ASIPs with Confidence: A Perspective on the Verification of Application-Specific Instruction Set Processors
- White Paper: Softening Hardware: Using ASIPs to Optimize Modern SoC Designs
- White Paper: SDKs for Proprietary Processors - Why They Matter, What It Takes to Develop Them
- White Paper: Rapid Architectural Exploration in Designing ASIPs
- White Paper: Designing ASIPs in Multicore SoCs

Customer References

"ASIP Designer accelerated the design process of Viettel's first 5G digital front-end SoC to save time and to achieve the desired performance. With ASIP Designer, we can easily adapt designs to new versions of our 5G algorithms with minor changes in the architecture. (...) The compiler-in-the-loop and synthesis-in-the-loop design flows that come with ASIP Designer save us months in multiple iterations of change and optimization. Furthermore, ASIP Designer opens multiple rooms and options for us to optimize the 5G signal processing stream during the SoC development phase and after silicon readiness. We wouldn't have been able to achieve this flexibility with hard-wired designs."

Le-Thai Ha

Ph.D., Principal Engineer at Viettel High Tech
