
Optimizing Your Hardware IP Verification in the Cloud

Meghana Bellumori

Mar 14, 2023 / 8 min read


To deliver your IP hardware project, you will need a hardware verification campaign that systematically executes verification workloads against a comprehensive verification plan. How do you reach those goals in the shortest time and at the lowest cost, while delivering the highest possible quality? These are the long-established challenges of hardware verification.

Hardware verification engineers are familiar with a range of day-to-day challenges. Over the years, the EDA industry and engineering communities have together evolved effective verification workflows and strategies that enable teams to deliver high-quality results within time and cost budgets. But the challenge of delivering quality-of-results (QOR) within both time-to-results (TTR) and cost-of-results (COR) targets is becoming ever harder to meet. Exponential growth in chip design size and complexity, matched by escalating compute and storage demands, pushes the successful delivery of complex products out of reach for smaller and some medium-sized hardware IP developers. The dominant cost is tied to hardware verification, with simulation being the dominant hardware verification workflow.

In this article, I’ll explore these challenges and then illustrate how a cloud-based hardware verification flow can help you manage key aspects of the process.

What are the challenges involved in a switch to the cloud as the main execution platform for these highly compute-intensive workflows? What are the key capabilities that must be supported in a cloud environment and what are the new opportunities that arise from that transition? How do you exploit those to improve QOR while reducing TTR and COR?

Let’s start by considering where hardware IP verification engineers spend most of their engineering time and effort. How productive is this effort, and can the cloud both enable and enhance the experience?


Mapping Hardware Verification Best Practices into a Cloud-Based Execution Environment

The top four hardware verification challenges are widely recognized as:

  • Planning (and signoff)
  • Coverage
  • Execution
  • Debug

Challenge No. 1: Test Planning & Signoff

Hardware verification can be a very open-ended process and so it is vital that a highly measurable approach is adopted. This invariably begins with test planning and ends with signoff. It’s an iterative process of measurement, analysis, and refinement, not an A-to-B journey. Verification results must meet signoff requirements and quality targets at key stages before a product can progress to the next stages of the development lifecycle. This discipline of planning and signoff is at the heart of what verification engineers do day to day. A transition to a cloud-based operating environment does not change this requirement, so your cloud environment must support this.

The 草榴社区 Cloud Verification Instance (VI) supports an integrated test planning and signoff workflow. Users can create and manage verification and coverage plans from the VI Home Window and subsequently analyze and report on testing results and coverage thanks to the integration of the 草榴社区 Verification Planner. Verification Planner enables full visualization of test and coverage progress throughout the development lifecycle, management of waivers and exclusion files, and analytics and reporting that support the eventual signoff of results versus plans.

[Figure: Verification Instance Home Window]

Challenge No. 2: Coverage Analysis & Closure

Coverage is the key metric by which verification progress is measured and verification signoff is achieved. It is not the only metric, of course, but a coverage-driven approach to verification is essential and widely accepted as best practice.

Meeting your coverage goals does not always mean that you are done and that no further bugs remain to be found. For example, it does not tell you that your checkers are correct, or that random testing has exhaustively exercised all possible paths; it does, however, demonstrate that the verification stimulus goals are met. Much valuable verification engineering effort is therefore consumed by coverage specification, implementation, extraction, and analysis.
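To make that point concrete, here is a toy functional coverage model as a hedged Python sketch (in practice coverage is written as SystemVerilog covergroups and collected by the simulator; the class and bin names here are purely illustrative). It shows that reaching 100% coverage only proves the stimulus goal was met, not that any checker is correct:

```python
class Covergroup:
    """Toy stand-in for a functional covergroup: named bins with hit counts."""

    def __init__(self, bins):
        self.bins = {b: 0 for b in bins}

    def sample(self, value):
        # Count a hit when the sampled value matches a declared bin.
        if value in self.bins:
            self.bins[value] += 1

    def coverage(self):
        # Percentage of bins hit at least once.
        hit = sum(1 for count in self.bins.values() if count > 0)
        return 100.0 * hit / len(self.bins)

cg = Covergroup(["ADD", "SUB", "MUL", "DIV"])
for op in ["ADD", "SUB", "MUL", "DIV"]:
    cg.sample(op)

# 100.0: the stimulus goal is met, yet nothing here validates the checkers
# that decide whether each operation produced the right result.
print(cg.coverage())
```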

In a cloud environment, these needs do not change, and the collection, presentation, and analysis of coverage results are essential for any cloud-based verification environment.

The 草榴社区 Cloud Verification Instance supports the collection of coverage. 草榴社区 VC Execution Manager schedules jobs and lets the user control the collection of suitable coverage metrics, such as code and functional coverage, and even enable SystemVerilog testbench line coverage so that coverage of the testbench code itself can be analyzed. It automatically merges coverage results with back-annotation into the verification plans. By default, coverage merges will fail if there are any failing tests in the run, unless the user selects “Skip Merging Failed Tests,” as shown below.

[Figure: Verification Tasks: Enabling Coverage Metrics]
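That default merge policy can be sketched in a few lines of Python (illustrative only, not the VC Execution Manager API): the merge aborts if any test failed, unless the skip option is set, in which case failed tests are simply excluded from the merged result:

```python
def merge_coverage(results, skip_failed_tests=False):
    """results: list of (test_name, passed, covered_bins) tuples,
    where covered_bins is a set of coverage bins hit by that test."""
    failed = [name for name, passed, _ in results if not passed]
    if failed and not skip_failed_tests:
        # Default behavior: refuse to merge while the run contains failures.
        raise RuntimeError(f"Coverage merge aborted; failing tests: {failed}")

    merged = set()
    for name, passed, bins in results:
        if passed:  # with skip_failed_tests, failures contribute nothing
            merged |= bins
    return merged
```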

Challenge No. 3: Testing Execution Management

This is where verification teams can spend vast effort simply managing the necessarily high volumes of testing required to achieve QOR and meet test plan and coverage goals.

Ask any seasoned verification engineer how much time they spend setting up regressions, and aligning the compute and storage needs to the job types, and then nursing the regressions through to completion, often with many out-of-hours interventions. This is followed by time-consuming triaging of the results to identify the root causes of failure, analyzing failure signatures, and identifying those jobs that need to be re-run for full debug and investigation.

A lot of compute-intensive resource can be wasted, with a direct impact on verification cost. When submitting large batch regressions to the job queue, any failed jobs that are not recognized early enough may unnecessarily consume compute hours or storage capacity. In addition, regressions that are repeatedly run on the same codebase under the same conditions are unlikely to stimulate new paths and expose new bugs.
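One common mitigation, sketched here in hedged, tool-agnostic Python (the threshold and policy are assumptions for illustration, not a described 草榴社区 feature), is to sample early results and cancel the remainder of a regression when a systemic failure, such as a broken build, would otherwise burn compute across every queued job:

```python
def prune_regression(queued_jobs, early_results, fail_threshold=0.9,
                     min_sample=10):
    """Cancel remaining jobs when an early sample is overwhelmingly failing.

    queued_jobs:   jobs not yet dispatched
    early_results: statuses ('PASS'/'FAIL') of jobs finished so far
    """
    done = [r for r in early_results if r is not None]
    if len(done) >= min_sample:
        fail_rate = sum(1 for r in done if r == "FAIL") / len(done)
        if fail_rate >= fail_threshold:
            return []  # systemic failure: stop spending compute hours
    return queued_jobs  # otherwise let the regression continue
```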

Sometimes, with an on-prem environment, verification engineers become infrastructure and resource managers. They need a detailed understanding of their compute and storage environments to ensure they are using the best resources for each job type, tuning the requirements to optimize turnaround time and infrastructure costs. They are often competing with other teams for a finite, shared pool of resources.

With fixed resources, your TTR depends on the width of your execution pipeline and the speed and efficiency of the compute resources. The turnaround time may be predictable, but what difference would it make to your development lifecycle and the effectiveness of your engineering team if you could significantly reduce TTR with a wider and faster execution pipeline? The total resource consumption may be the same or less (because of improved infrastructure efficiencies), but engineers gain a productivity uplift as they can respond to and validate design changes more quickly, enabling more iterations. This, in turn, enables the engineering team to deliver a higher quality product. It could be a game-changer.
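The trade-off is easy to quantify with a back-of-envelope model (the job counts below are invented for illustration): with N equal jobs on a pipeline of W parallel slots, turnaround is roughly ceil(N / W) × t_job, while total compute consumed stays N × t_job regardless of pipeline width:

```python
import math

def turnaround_hours(n_jobs, pipeline_width, job_hours):
    # Wall-clock time: jobs run in waves of pipeline_width at a time.
    return math.ceil(n_jobs / pipeline_width) * job_hours

def compute_hours(n_jobs, job_hours):
    # Total consumption is independent of how wide the pipeline is.
    return n_jobs * job_hours

# 10,000 one-hour jobs: a 200-slot fixed farm vs 2,000 elastic cloud slots.
print(turnaround_hours(10_000, 200, 1))    # 50 hours of wall-clock time
print(turnaround_hours(10_000, 2_000, 1))  # 5 hours: a 10x TTR reduction
print(compute_hours(10_000, 1))            # 10000 compute hours either way
```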

The 草榴社区 Cloud Verification Instance integrates VC Execution Manager to deliver efficient regression automation and high productivity in the cloud. This fully integrated environment ensures that the user can focus on the verification task and is not burdened with cloud job management. The Cloud Verification Instance takes care of all job execution details, such as executing simulations, merging coverage, invoking the 草榴社区 Verdi® Automated Debug System for debug, and generating reporting and analytics. The 草榴社区 cloud portal also provides an option to provision dynamic clusters, which stop automatically after a pre-set idle time (5 minutes) and start automatically when jobs are submitted to their queue.
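That auto-start/auto-stop behavior reduces to a small state machine; this hedged Python sketch models it (the 5-minute idle window comes from the article, everything else is illustrative, and the real provisioning calls are cloud-specific):

```python
IDLE_LIMIT_SECONDS = 5 * 60  # pre-set idle time mentioned in the article

def next_cluster_state(state, queue_depth, idle_seconds):
    """Decide whether a dynamic cluster should be 'running' or 'stopped'."""
    if state == "stopped":
        # Auto-start as soon as jobs are submitted to the queue.
        return "running" if queue_depth > 0 else "stopped"
    # Auto-stop once the queue has been empty past the idle limit.
    if queue_depth == 0 and idle_seconds >= IDLE_LIMIT_SECONDS:
        return "stopped"
    return "running"
```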

Challenge No. 4: Time-To-Debug (TTD)

It is well understood that debug can be a significant bottleneck in making verification progress. Time-to-debug depends largely on the capabilities of the engineering team, their familiarity with the codebase (both the design and the verification environment), their ability to reproduce failures, the effectiveness of the verification failure reporting, the ability to diagnose waveforms, and the effectiveness of the debug tooling.

“Verification progress” is determined as much by debug turnaround time as by regression throughput. If you are stalled by a problematic debug situation, the engineering priority shifts to debug productivity rather than testing throughput.

Debug efficiency can be dramatically improved when using a fully integrated source-level debug environment, especially one that assists the user with automated debug reruns and failure analysis. Whether running in the cloud or on-prem, the debug need is the same, so any cloud-based verification environment must provide debug capabilities that enable debug to take place in the cloud environment. Flip-flopping between a cloud environment and on-prem for debug will have a negative effect on productivity.

Using the 草榴社区 Cloud Verification Instance, regressions are launched from VC Execution Manager and can be monitored through to completion. Failed tests are automatically rerun to capture debug data, so that interactive source-level debug and waveform analysis can be performed fully inside the Cloud Verification Instance environment. Debugging tasks can be launched from buttons that invoke the debugger and waveform viewer, or perform root-cause analysis tasks such as tracing back X-propagation errors. No flip-flopping is necessary: all execution and debug tasks are handled seamlessly from within the user’s cloud environment. In addition, the cloud instance provides design-under-test root-cause analysis (DUT RCA), which automatically compares a failed test with a test that passed in a previous session, analyzes all suspect points that are making the regression fail, and finds possible root causes.
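The rerun-on-failure flow can be sketched generically (the function names here are hypothetical; the actual rerun is driven by VC Execution Manager with tool-specific debug options): failed tests are re-executed with debug data capture enabled, yielding one debug session per failure for interactive analysis:

```python
def rerun_failures(results, run_test):
    """results:  dict mapping test name -> 'PASS' | 'FAIL'
    run_test: callable(test, debug=...) that re-executes a test; with
              debug=True it would dump waveforms and debug data."""
    debug_sessions = {}
    for test, status in results.items():
        if status == "FAIL":
            # Rerun only the failures, with full debug visibility enabled.
            debug_sessions[test] = run_test(test, debug=True)
    return debug_sessions
```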

[Figure: 草榴社区 Verification Instance]

An Integrated Verification Environment in the Cloud

For smaller organizations that do not have a specific team managing an engineering platform, building an integrated tools environment with defined workflows from scratch is a monumental task. The advantage of a cloud instance is the up-front investment by 草榴社区 to tackle this integration and workflow definition task and present a “verification factory” in the cloud for design verification engineers.

[Figure: Cloud Verification Diagram]

Summary

We started by outlining a verification campaign that systematically executes verification workloads against a comprehensive verification plan, reaching those goals within the shortest time and the lowest cost, while delivering the highest possible quality IP.

Much time, effort, and cost are consumed in provisioning compute and storage resources, infrastructure, and EDA tool licenses. A software-as-a-service (SaaS) approach can solve many of these problems and transform the productivity of verification engineering. A key benefit of this SaaS approach is that engineers can reduce overall chip design cycles by leveraging cloud-optimized tools and compute through a one-stop, browser-based experience. Engineers can then focus on the intricacies of their design, not on infrastructure configuration, compute selection, or license management.

The 草榴社区 Cloud Verification Instance offers a scalable, integrated environment that by its very nature avoids excessive platform and workflow development by design verification engineers. With the cloud instance, engineers can spend more time on their primary objective: the efficient set-up and execution of their verification campaign.
