Synopsys

AI-Accelerated: Migrating Synaptics’ Quad-Display SoC to ARC HS58x3 with Synopsys QIK and DSO.ai “Warm Start”

Rob van Blommestein

May 21, 2024 / 5 min read

In this fast-paced digital age where speed, performance, and time-to-market are king, chip designers are under pressure to deliver high-performance computing that doesn’t compromise power efficiency. The constant demand for instantaneous data processing and sharing keeps pushing the boundaries of chip design. With that context, we revisit insights from the Synopsys Users Group (SNUG) Silicon Valley event to explore how AI-driven electronic design automation (EDA) is making chip design and migration more efficient and cost-effective.

We’ve already shared how AI has enabled digital designs to be retargeted to more advanced process nodes, limiting or removing the need for redesign. But what about migrating existing designs that are still viable to newer, more powerful processors? Can the same AI technology be applied to this challenge? Synaptics did just that.

Synaptics develops DisplayLink technology, which makes it simple to connect any display to any computer that supports USB or Wi-Fi. It provides universal solutions for corporate, home, and embedded applications where easy display connectivity enhances productivity. The company’s latest DisplayLink DL-7400 is a universal display dock-on-a-chip that supports ultra-high resolutions and refresh rates of up to 4K @ 144 Hz, with four simultaneous display outputs from a single PC. Its listed features include 2x 8K, 4x 4K, and 5K/6K output over any USB connection, even with older GPUs; one dock for all IT needs; 2.5G Ethernet with an IoT engine; and signed, encrypted firmware.

To meet this demand for high-performance computing, Synaptics designed DL-7400 to run on the ARC HS58, a 32-bit processor based on the ARCv3 instruction set architecture (ISA), featuring a coherent cluster of up to 12 cores with up to 16 hardware accelerators; a coherent, high-bandwidth interconnect (800 GB/s); and 150+ DSP instructions. Previous generations of DL-7400, which required less processing power, ran on the ARC HS38, a 32-bit processor based on the ARCv2 ISA that offers a single-issue, 10-stage pipeline and dual- or quad-core implementations.


Optimizing for ARC Processor Migration

The upgrade to ARC HS58 provided significant improvements in several areas necessary for DL-7400:

  • HS58 came with a 2.5X gain in memory bandwidth. 
  • Compression speed was measured at more than 1.25X faster while decompression speed clocked in at 1.12X faster. 
  • For compute-intensive work on large data arrays stored in memory, the processor delivered 50% better performance when executing the same number of instructions.
  • Finally, the HS58 provided an average of 48% better performance when measured on silicon.
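
One way to read the speed figures in this list: on a fixed workload, a speedup of S means the work takes 1/S of the original time. The short Python sketch below applies that conversion to the numbers quoted above. Treating "50% better" and "48% better" as 1.50X and 1.48X speedups is an assumption made for illustration, and the function name is hypothetical.

```python
def relative_time(speedup):
    """Return the fraction of the original run time that remains after a speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


# Figures quoted in the list above. Reading "N% better performance" as a
# (1 + N/100)X speedup is an assumption, not something the source states.
reported_speedups = {
    "compression": 1.25,
    "decompression": 1.12,
    "large-array computation": 1.50,
    "average on silicon": 1.48,
}

for name, speedup in reported_speedups.items():
    saved_pct = (1.0 - relative_time(speedup)) * 100.0
    print(f"{name}: {speedup:.2f}X faster, roughly {saved_pct:.0f}% less time")
```

The memory bandwidth gain is left out on purpose, since a bandwidth multiplier does not translate directly into run time without knowing how memory-bound the workload is.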

For the migration to succeed, several software and implementation challenges had to be met. First, the software needed to be preserved. Minimizing changes was crucial because Synaptics had invested heavily in developing unique and complex software. On the implementation side, the footprint had to match that of the HS38, including the physical area, pins, location, and power distribution. The I/O timing and clock latency needed to match the existing SoC, while dynamic and leakage power needed to be reduced. To achieve all the benefits of the ARC HS58 processor without a complete redesign, Synaptics turned to the Synopsys QuickStart Implementation Kit (QIK), Fusion Compiler, and DSO.ai.
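
The implementation constraints in this paragraph amount to a drop-in replacement checklist. The Python sketch below shows one way such a checklist might be expressed. Everything in it is hypothetical: the `BlockSpec` fields, the latency tolerance, and the numbers are illustrative assumptions, not Synaptics data or any Synopsys tool format.

```python
from dataclasses import dataclass


@dataclass
class BlockSpec:
    """Hypothetical summary of a hardened processor block (illustrative fields)."""
    area_um2: float
    pins: dict  # pin name -> (x_um, y_um) location on the block boundary
    io_delay_ps: float
    clock_latency_ps: float
    total_power_mw: float


def drop_in_violations(old, new, latency_tol_ps=5.0):
    """List the ways `new` fails to be a drop-in replacement for `old`."""
    issues = []
    if new.area_um2 > old.area_um2:
        issues.append("area exceeds the existing footprint")
    if new.pins != old.pins:
        issues.append("pin names or locations differ")
    if new.io_delay_ps > old.io_delay_ps:
        issues.append("I/O timing is slower than the SoC expects")
    if abs(new.clock_latency_ps - old.clock_latency_ps) > latency_tol_ps:
        issues.append("clock latency does not match")
    if new.total_power_mw >= old.total_power_mw:
        issues.append("power was not reduced")
    return issues


# Made-up numbers for illustration only.
pins = {"clk": (0.0, 120.0), "bus_m0": (350.0, 0.0)}
existing = BlockSpec(1_000_000.0, pins, 250.0, 400.0, 100.0)
candidate = BlockSpec(990_000.0, dict(pins), 245.0, 402.0, 97.0)
oversized = BlockSpec(1_100_000.0, dict(pins), 245.0, 402.0, 97.0)

print(drop_in_violations(existing, candidate))
print(drop_in_violations(existing, oversized))
```

A check like this only flags mismatches. Closing them is the job of the implementation flow described next.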

The Synopsys QIK is a complete solution consisting of best-in-class IP, libraries, tools, methodologies, services, and support. QIKs are built for implementing Synopsys IP and are created in close collaboration with IP design, R&D, and methodology experts. Each kit provides a fully worked example that meets QoR targets; includes a recommended flow with scripts (implementation, ECO, signoff, formal verification, core configuration, and constraints); supports either flat or hierarchical flows; and is easy to customize for project-specific needs.

Synopsys DSO.ai autonomously explores multiple design spaces to optimize power, performance, and area (PPA) metrics while minimizing tradeoffs for the target application. It uses AI to navigate the design-technology solution space by automatically adjusting or fine-tuning the inputs to the design (e.g., settings, constraints, process, flow, hierarchy, and library) to find the best PPA targets.
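
To make the idea of a design space concrete, here is a deliberately tiny Python sketch that enumerates every combination of three knobs and keeps the lowest-cost one. It is not how DSO.ai works internally: the knob names, their values, and the cost model are all invented for illustration, and a real search space is far too large to enumerate, which is why an AI-guided search is used instead.

```python
from itertools import product

# Hypothetical knobs and values; a real flow exposes far more settings.
SEARCH_SPACE = {
    "target_utilization": [0.60, 0.70, 0.80],
    "clock_uncertainty_ps": [20, 40, 60],
    "vt_mix": ["low_leakage", "balanced", "high_speed"],
}

# Made-up penalty tables standing in for leakage and timing effects.
LEAKAGE_COST = {"low_leakage": 0.10, "balanced": 0.15, "high_speed": 0.30}
SPEED_COST = {"low_leakage": 0.20, "balanced": 0.05, "high_speed": 0.00}


def toy_ppa_cost(cfg):
    """Made-up stand-in for a full implementation run; lower is better."""
    area = 1.0 - cfg["target_utilization"]
    congestion = 0.15 if cfg["target_utilization"] > 0.75 else 0.0
    timing = abs(cfg["clock_uncertainty_ps"] - 40) / 100.0
    return (area + congestion + timing
            + LEAKAGE_COST[cfg["vt_mix"]] + SPEED_COST[cfg["vt_mix"]])


def explore(space):
    """Evaluate every combination; return (best config, best cost, runs)."""
    names = list(space)
    best_cfg, best_cost, runs = None, float("inf"), 0
    for values in product(*(space[name] for name in names)):
        cfg = dict(zip(names, values))
        cost = toy_ppa_cost(cfg)
        runs += 1
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost
    return best_cfg, best_cost, runs


best_cfg, best_cost, runs = explore(SEARCH_SPACE)
print(f"evaluated {runs} configurations, best: {best_cfg}")
```

Even with three values per knob, three knobs already mean 27 runs. Each added knob multiplies that count, which is the scaling problem an automated explorer is meant to address.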



For new designs, Synopsys DSO.ai takes these inputs in what is called a “cold start” to identify those prime targets. However, the solution learns from the initial design optimization and can apply what it learned to derivative designs. Rather than starting “cold,” the AI engine gets a “warm start” on finding the best optimization strategies to meet target specifications, which reduces compute resources by 5-10X. This technique was used in the migration to the new processor: the HS58 configuration was compared against the original HS38 implemented in the SoC, and the HS58 design was then run through the QIK flow and through the Synopsys DSO.ai “warm start” to achieve the best targets for that processor.
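
The intuition behind a warm start can be shown with a toy search. The Python sketch below is not DSO.ai's learning method: it is a plain greedy hill climb on an invented cost function, run once from a corner of the grid ("cold") and once from a point assumed to be the best result of an earlier, similar design ("warm"). The target, the starting points, and the cost function are all hypothetical, and the evaluation counts it prints are not meant to reproduce the 5-10X figure above.

```python
def hill_climb(cost, start, lo=0, hi=9):
    """Greedy search on an integer grid; returns (best point, cost, evaluations)."""
    seen = {}  # cache, so each point counts as one "tool run" at most

    def evaluate(point):
        if point not in seen:
            seen[point] = cost(point)
        return seen[point]

    current = tuple(start)
    while True:
        # All points one step away along a single knob, kept inside the grid.
        neighbors = [
            current[:i] + (current[i] + step,) + current[i + 1:]
            for i in range(len(current))
            for step in (-1, 1)
            if lo <= current[i] + step <= hi
        ]
        best = min(neighbors, key=evaluate)
        if evaluate(best) >= evaluate(current):
            return current, evaluate(current), len(seen)
        current = best


TARGET = (6, 3, 7)  # hypothetical best knob settings for the derivative design


def toy_cost(point):
    """Invented cost: squared distance from the (unknown to the search) target."""
    return sum((p - t) ** 2 for p, t in zip(point, TARGET))


cold = hill_climb(toy_cost, (0, 0, 0))  # no prior knowledge
warm = hill_climb(toy_cost, (5, 3, 8))  # assumed best point from the earlier design

print("cold start:", cold)
print("warm start:", warm)
```

Both runs end at the same point, but the warm run reaches it with fewer evaluations because it begins close to the answer. That only helps when the derivative design really does resemble the original.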

Synopsys DSO.ai flow for “cold start” vs. “warm start.”

Using Synopsys DSO.ai not only dramatically reduced turnaround time but also significantly reduced timing violations, while improving total and leakage power beyond what the Synopsys QIK would have produced on its own. Synopsys DSO.ai reduced worst negative slack (WNS) by 23%, total negative slack (TNS) by 61%, and hold TNS by 92%. Total power improved by 2.2% and leakage power by 19.6%.

Synaptics’ results from using Synopsys QIK with DSO.ai

By leveraging AI-driven optimization from Synopsys Fusion Compiler and DSO.ai, the team significantly reduced design turnaround time, improved power consumption, and minimized timing violations. Migrating existing designs to more advanced processors can be a viable way to meet the increasing demands and shrinking market windows of high-performance computing.

For a deeper dive into Synaptics’ journey, download the SNUG presentation.
