From the data center to the edge and deep within the web of smart everything, today’s advanced multi-die systems are achieving previously unheard-of levels of performance. Instead of one-size-fits-all monolithic silicon, multi-die systems are composed of an array of heterogeneous dies (or “chiplets”), each optimized for its functional component. But while multi-die systems offer new levels of flexibility and achievement in system power and performance, they also introduce a high degree of design complexity.
The Universal Chiplet Interconnect Express (UCIe) standard was introduced in March 2022 to help standardize die-to-die connectivity in multi-die systems. UCIe can streamline interoperability between dies on different process technologies from various suppliers. But while your UCIe-compliant multi-die system may work well through development, testing, and manufacturing, how will you ensure that its die-to-die connectivity remains robust, secure, and tested while it’s operating in the field? Read on to explore the requirements for enabling multi-die system reliability with UCIe, through IP, test, emulation, and beyond.
Imagine your multi-die system contains one die from one vendor and another die from a different vendor. Furthermore, imagine each die is on a different process technology, say 7nm and 3nm. You want these two dies to talk to one another seamlessly and support industry-standard protocols such as PCI Express (PCIe) and CXL, among others. Keep in mind that each die you add to your design also adds latency, which slows the whole system down.
Making it all work and surmounting the latency hurdles will require adhering to the right standard. Here are just a couple of benefits of choosing the UCIe standard for your multi-die systems:
Beyond choosing the UCIe standard, the long-term success of your multi-die system also requires that you ensure high quality out of the gate. Due to the complexity of multi-die systems, it’s important to achieve greater levels of quality in your SoCs not only through development and at the time of manufacture, but also long after your design is operating in the field. Doing this right will require high-quality building blocks (the dies and IP), along with emulation and verification tools, ongoing testing, and in-field monitoring and repair, so you can fix any problems proactively.
In addition to the controller and PHY IP, here are three additional requirements for ensuring success in your UCIe-based multi-die system:
Protocol verification IP solutions that run over software simulators can provide a head start in ensuring high-quality UCIe components and interface layers, including protocol layers over the Flit-Aware Die-to-Die Interface (FDI), PHY interfacing over the Raw Die-to-Die Interface (RDI), intermediate shim layers, or Die-to-Die Adapter implementations.
As your design scope widens to the full stack with multi-module chipset configurations and complex multi-die systems, you will need to move beyond software-only simulations to verify the whole system or dies. Hardware-assisted verification (HAV) platforms, such as the Synopsys ZeBu emulation system and the Synopsys HAPS prototyping system, are essential in carrying out realistic verification for large, multi-die systems. Multi-MHz cycle performance, optimized UCIe protocol solutions (transactors, speed adaptors, hardware interface cards), and system-level debug abstraction are necessary to cover all verification use cases, from early RTL development to interoperability and hardware compliance.
Testing is an important part of any silicon design process. In multi-die systems, the interconnects between the dies are often based on interfaces such as UCIe. To perform as intended, these interconnects must have no stuck-at faults, opens, or shorts. Signal integrity must also be measured so that any degradation can be detected. The UCIe standard mandates extra interconnects for redundancy, and post-bond testing can uncover interconnect-level problems that trigger a switch to those redundant lanes. Algorithmic tests, developed with an understanding of fault models, can also assess for interconnect defects.
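To make the idea of an algorithmic interconnect test concrete, here is a minimal Python sketch. It is an illustration only, not the UCIe test procedure or any vendor tool: the function names, the wired-AND short model, and the treatment of an open as a lane stuck at a constant value are all assumptions made for the example. It drives walking-one and walking-zero patterns across the lanes and flags every lane whose received bit differs from the bit that was sent.

```python
from typing import Callable, Dict, List, Tuple

Pattern = List[int]  # one bit per lane


def walking_patterns(n_lanes: int) -> List[Pattern]:
    """Walking-one patterns followed by walking-zero patterns."""
    ones = [[1 if i == k else 0 for i in range(n_lanes)] for k in range(n_lanes)]
    zeros = [[0 if i == k else 1 for i in range(n_lanes)] for k in range(n_lanes)]
    return ones + zeros


def find_faulty_lanes(n_lanes: int, link: Callable[[Pattern], Pattern]) -> List[int]:
    """Send every pattern through `link` and collect lanes that ever mismatch."""
    faulty = set()
    for sent in walking_patterns(n_lanes):
        received = link(sent)
        for lane, (s, r) in enumerate(zip(sent, received)):
            if s != r:
                faulty.add(lane)
    return sorted(faulty)


def make_faulty_link(
    stuck_at: Dict[int, int], shorts: List[Tuple[int, int]]
) -> Callable[[Pattern], Pattern]:
    """Hypothetical fault model used only to exercise the test.

    - shorts: pairs of lanes tied together, modeled as a wired-AND.
    - stuck_at: lane -> constant value (also stands in for an open lane
      that floats to a fixed level).
    """
    def link(sent: Pattern) -> Pattern:
        out = list(sent)
        for a, b in shorts:
            value = sent[a] & sent[b]
            out[a] = out[b] = value
        for lane, value in stuck_at.items():
            out[lane] = value
        return out

    return link


if __name__ == "__main__":
    link = make_faulty_link(stuck_at={2: 0}, shorts=[(5, 6)])
    print(find_faulty_lanes(8, link))
```

In a real flow, the list of failing lanes would feed the decision to switch traffic onto the redundant interconnects.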
A UCIe interface is the primary interface for functional communication between dies within a multi-die system. Since the interface operates at very high speeds and is a critical pathway for communication, its health has to be monitored and managed throughout its lifecycle. Health monitoring of the UCIe interface can be a lifesaver for safety-critical applications from automotive to medical, and beyond. For instance, in a self-driving car, health monitoring can enable a proactive repair or give the owner a heads-up that a trip to the shop is warranted before a breakdown occurs on the interstate.
The Synopsys Silicon Lifecycle Management (SLM) Family actively monitors the UCIe interface during operation, and if the signal quality of a lane is degrading, it can repair the lane before it fails. There is also provision for built-in self-test (BIST), which can detect soft or hard errors for corrective action.
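The monitor-then-repair loop described above can be sketched in a few lines of Python. This is not the SLM product's API: the class name, the normalized "margin" metric, the threshold, and the window size are all hypothetical values chosen for illustration. The sketch keeps a short history of signal-margin samples per lane and, when the moving average drops below the threshold, remaps the lane to a spare before it fails outright.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional


@dataclass
class LaneRepairMonitor:
    """Hypothetical in-field lane monitor with spare-lane remapping."""

    spare_lanes: List[int]
    margin_threshold: float = 0.30  # assumed normalized signal margin
    window: int = 4                 # samples in the moving average
    history: Dict[int, Deque[float]] = field(default_factory=dict)
    remap: Dict[int, int] = field(default_factory=dict)

    def record(self, lane: int, margin: float) -> Optional[int]:
        """Store one sample; return the spare lane if a repair was triggered."""
        samples = self.history.setdefault(lane, deque(maxlen=self.window))
        samples.append(margin)
        # Skip lanes already repaired and lanes without a full window yet.
        if lane in self.remap or len(samples) < self.window:
            return None
        average = sum(samples) / len(samples)
        if average < self.margin_threshold and self.spare_lanes:
            spare = self.spare_lanes.pop(0)
            self.remap[lane] = spare
            return spare
        return None


if __name__ == "__main__":
    monitor = LaneRepairMonitor(spare_lanes=[64, 65])
    for margin in (0.5, 0.4, 0.3, 0.2, 0.1):
        repaired_to = monitor.record(lane=7, margin=margin)
        if repaired_to is not None:
            print(f"lane 7 degraded, remapped to spare lane {repaired_to}")
```

Averaging over a window rather than reacting to a single sample is a design choice that avoids spending a spare lane on a one-off noisy reading.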
Silicon design is transforming in front of our eyes. Choosing the UCIe standard for your multi-die system is only the first step in seamless connectivity and interoperability. Adhering to these requirements is a critical part of navigating the complexities of advanced multi-die system design. If you would like to learn more about UCIe and how Synopsys can ease your multi-die system journey, check out Multi-Die SoCs Gaining Strength with Introduction of UCIe.