Cloud native EDA tools & pre-optimized hardware platforms
No longer a curiosity or even a fad, cloud is now deeply entrenched in the operations of an array of industries. With its scalability, flexibility, and elasticity, it simply makes good business sense to embrace cloud computing, especially in the semiconductor industry. Last year, when I looked into the crystal ball, I envisioned a number of events taking shape this year. From the mainstreaming of the cloud for peak-use models to increased adoption by chip designers, expanded use by semiconductor companies with large data center investments, diminished supply chain issues, and the emergence of verification as a key cloud workload for EDA, these predictions have largely been realized in 2023.
What will 2024 bring to the cloud landscape?
Looking ahead, there are three key trends taking shape now that will likely grow in prevalence in 2024:

- AI-driven chip design and verification moving to the cloud
- Cost optimization of cloud EDA workloads, including the use of spot virtual machines
- Closer ecosystem collaboration for multi-vendor interoperability in the cloud
Let’s take a deeper look at each of these trends to understand why they are poised to make a bigger impact in the coming year.
AI is everywhere these days, from the edge devices you might have in your home to large-scale modeling systems used in science, medicine, finance, and many other sectors. The AI and machine learning models that generate valuable insights require highly complex chips to deliver the high bandwidth, low latency, and low power that make many of these applications feasible. Designing and verifying these chips in the cloud with pay-per-use models enables engineers to tap into EDA tools, compute resources, and storage options they need when they need them. This flexibility can lower overall costs and also simplify the process for getting these chip design and verification solutions up and running. As cloud-based EDA adoption rose in 2023, we can only see this trend accelerate in the new year, particularly as AI continues to demand more from the underlying chips.
Increasingly, AI capabilities are being integrated into EDA tools, enabling them to take on not only the repetitive tasks within massive workloads but also those that are impossible for humans to accomplish in the timeframes needed. This enables engineers to focus on more value-added tasks like product differentiation, while delivering better quality-of-results, time-to-results, and cost-of-results. At the same time, this also drives further multiplicative requirements for flexible access to compute resources.
In 2023, we started to see more instances of AI-driven EDA tools running in the cloud. Cloud-based solutions for tasks such as design space exploration demonstrated their ability to increase power, performance, and area (PPA) exploration productivity while meeting or exceeding PPA goals. For example, at SNUG Silicon Valley 2023, STMicroelectronics discussed how 草榴社区 DSO.ai covered an entire search space of 180 permutations within 3,000 runs to achieve their targeted frequency, power, and floorplan dimensions, while increasing productivity by 3x. In this case, ST used DSO.ai on Microsoft Azure cloud, saving significant compute resources and infrastructure set-up time. We can expect that in 2024, adoption of cloud-based, AI-driven EDA solutions will continue to rise.
As more chipmakers design and verify on the cloud, their interest in optimizing costs will continue to grow. For companies without the resources to maintain an on-premises infrastructure for EDA workflows, the cloud presents an attractive option. For companies at the other end of the spectrum, those with substantial data center investments, cost-optimized cloud solutions can still make the move worthwhile. The predominant approaches for cloud deployment already offer high levels of cost flexibility.
However, there’s always room for additional cost optimizations. One effective avenue to lower costs is deploying spot virtual machines. A spot virtual machine (VM) stems from excess capacity of specific compute VMs that cloud providers make available at heavily discounted prices when demand at a given moment does not meet their capacity projections. The challenge with running EDA workloads on spot VMs is that they can be reclaimed on short notice. As such, EDA workloads need to be able to recover from a spot VM termination signal to avoid losing processing time when a job has been running for a while. Checkpoint-restore functionality built into the tools mitigates this challenge. Then there are EDA jobs with high-memory workloads, such as physical verification or an RTL-to-gate implementation, where the runtime state can be several hundred gigabytes. In these cases, since the time needed to checkpoint can be much longer than the cloud provider’s spot warning window, the job would get terminated without the ability to restore and without saving the runtime state. AI-driven technologies that utilize termination signal predictions to manage EDA workloads between spot and on-demand VMs have emerged to address this challenge.
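The checkpoint-restore pattern described above can be sketched in a few lines. This is a toy illustration only, not any vendor's actual checkpointing API: the `CheckpointedJob` class, the checkpoint path, and the `preempted` callback are all invented for the example. The idea is simply to snapshot progress when a termination notice arrives so that a resubmitted job on a fresh VM can resume where it left off rather than starting over.

```python
import os
import pickle
import tempfile

# Hypothetical checkpoint location; a real flow would use durable shared storage.
CHECKPOINT_PATH = os.path.join(tempfile.gettempdir(), "eda_job.ckpt")


class CheckpointedJob:
    """Toy stand-in for an EDA job that can snapshot and restore its state."""

    def __init__(self, total_steps):
        self.total_steps = total_steps
        self.step = 0  # how far the job has progressed

    def run(self, preempted):
        """Advance until done; on a preemption notice, checkpoint and stop.

        `preempted` is a callable polled each step, standing in for the
        cloud provider's spot termination warning.
        """
        while self.step < self.total_steps:
            if preempted():
                self.checkpoint()
                return False  # caller resubmits the job on another VM
            self.step += 1  # one unit of work
        return True

    def checkpoint(self):
        """Persist runtime state so a later run can resume from it."""
        with open(CHECKPOINT_PATH, "wb") as f:
            pickle.dump(self.step, f)

    @classmethod
    def restore(cls, total_steps):
        """Rebuild a job from the last checkpoint, if one exists."""
        job = cls(total_steps)
        try:
            with open(CHECKPOINT_PATH, "rb") as f:
                job.step = pickle.load(f)
        except FileNotFoundError:
            pass  # no checkpoint yet: start from scratch
        return job
```

For a high-memory job, the time spent in `checkpoint()` is exactly what can exceed the provider's warning window, which is why purely reactive schemes break down and predictive migration becomes attractive.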
The AI-driven 草榴社区 ChipSpot solution is becoming increasingly popular as it helps address growing demands for cost optimization. To reduce workload terminations, the solution taps into termination signal predictions from its AI-driven algorithm to migrate a running EDA workload from a spot VM to an on-demand VM in a VM array. When spot capacity becomes available again, the running state is migrated back to the spot VM in the array. 草榴社区 ChipSpot is powered by Exostellar X-Spot technology and can provide on-demand compute cost savings of up to 75%.
While EDA vendors may like to think that their customers use their flows exclusively, the reality is that customers choose the solutions they feel are ideally suited to their designs. Often, this means they have a mix of solutions from multiple EDA vendors, IP providers, and foundries in their flow. As chip designers move their work to the cloud, they don’t want to get locked into a pre-defined tool flow. Instead, they want to enable the same flows they have implemented and perfected over time, which may consist of an array of different chip design and verification solutions and IP from different vendors, in their cloud environment. These dynamics are pushing the players in the EDA ecosystem to collaborate more closely than ever, in the interest of optimizing the customer experience. For example, 草榴社区 recently launched the 草榴社区 Cloud OpenLink program, the industry’s first multi-vendor environment of EDA, IP, and foundry providers on the 草榴社区 Cloud platform. Through the OpenLink program, customers can securely and seamlessly access a diverse set of design tools and IP in the cloud. Ecosystem providers participating in the program work out time-consuming legal and technical interoperability details with 草榴社区 in advance, simplifying matters for customers and enabling them to get their design projects underway faster. The OpenLink program is an example of the increased ecosystem interoperability that will only grow in demand in 2024 and beyond.
As the cloud becomes increasingly integral to the semiconductor industry, the key players will continue seeking ways to optimize its implementation in all the ways that will help drive business success. From the use of AI to design and verify AI chips to the deployment of spot virtual machines for cost savings and increased ecosystem collaboration to drive seamless interoperability on the cloud, the future is shaping up to be a bright one. Bottom line, anything that can help make it a little easier to develop today’s complex chips holds promise for the innovation that this industry needs to thrive.