How Qualcomm Accelerated Coverage Closure with AI-Driven Verification

Malay Ganai, Will Chen

Aug 24, 2023 / 4 min read

Of all the steps in the chip development process, verification may be one of the most time-consuming and labor-intensive. As chips grow in size and complexity, there’s an almost endless number of design state spaces that must be evaluated. The goal, of course, is to ensure that the end design will work as intended and to avoid costly respins. The faster that coverage closure can happen, the faster you can get to market.

Manual analysis of design state spaces can only get you so far in terms of actionable insights. Reaching 100% coverage gives you confidence that every scenario you set out to verify has been exercised. But how do you know exactly what to write in the coverage definition for your testbench? Did you establish the right coverage goals? And how can you be sure which tests contribute the most to the coverage? As a result, the cycle involves a lot of repetition: find the bugs, fix the bugs, and repeat until you've achieved full coverage and can finally sign off on the RTL.
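To make the question of which tests contribute the most to coverage concrete, here is a minimal Python sketch of one common way to rank tests: a greedy ordering by how many not-yet-covered bins each test adds. This is an illustration only and not how any Synopsys tool works; the function name, test names, and bin names are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def rank_tests_by_contribution(hits: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """Greedily order tests by the number of new coverage bins each one adds.

    `hits` maps a (hypothetical) test name to the set of bins it covered.
    Tests that add nothing beyond the tests already picked are left out.
    """
    covered: Set[str] = set()
    remaining = dict(hits)
    ranking: List[Tuple[str, int]] = []
    while remaining:
        # Iterate in sorted order so ties are broken deterministically by name.
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        gain = len(remaining[best] - covered)
        if gain == 0:
            break  # every remaining test is redundant for coverage purposes
        ranking.append((best, gain))
        covered |= remaining.pop(best)
    return ranking


# Hypothetical regression data: which bins each test hit.
hits = {
    "test_smoke": {"bin_a", "bin_b"},
    "test_random_1": {"bin_a", "bin_b", "bin_c"},
    "test_corner": {"bin_d"},
    "test_random_2": {"bin_b", "bin_c"},
}
ranking = rank_tests_by_contribution(hits)
for name, gain in ranking:
    print(f"{name}: +{gain} new bins")
```

In this toy data set, two of the four tests add no new bins once the others have run, which is the kind of redundancy that manual analysis tends to miss as regressions grow.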

As Qualcomm highlighted at this spring's DVCon US 2023 conference, the process gets even trickier for graphics processing unit (GPU) architectures due to their high levels of parallelism. The chipmaker's engineers found that they were putting substantial manual effort into writing semi-directed test cases to hit corner cases at the block and cluster levels.

Fortunately, automation and intelligence are converging to accelerate verification coverage closure. Read on to learn how Qualcomm sped up its time to market using AI-driven chip verification. 

Too Much Time on Testbench Tuning

Verification engineers report spending 60% of their functional verification time on testbench development and debug and up to 25% of their time on testbench bring-up and coverage closure. Given the complexities of today’s chips, chip verification can consume thousands of CPU hours, with closure taking longer than ever.

In an ideal world, verification engineers would be able to shift-left their functional coverage and benefit from three key improvements to the traditional process:

  • Faster stabilization and maturity of their testbench
  • Accelerated coverage and bug detection rate
  • Faster closure, with time saved in writing directed tests

In this world, challenges related to the testbench would be resolved. In reality, testbench tuning can be quite laborious and error-prone. Problems such as missing stimuli, illegal stimuli, over-bias, and under-bias are not uncommon, and any of them can lead to missed bugs, longer cycles, or both.
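As a rough illustration of the stimulus problems named above, the following Python sketch classifies the values a testbench generated for one randomized field as missing, illegal, over-biased, or under-biased relative to a uniform target. It is a simplified sketch under stated assumptions: the function name, the opcode values, and the 50% tolerance are hypothetical, and a real testbench may intentionally use non-uniform distributions.

```python
from collections import Counter
from typing import Dict, List


def classify_stimulus_bias(samples: List[str], legal_values: List[str],
                           tolerance: float = 0.5) -> Dict[str, str]:
    """Label each value as ok, missing, illegal, over-biased, or under-biased.

    Assumes a uniform distribution over `legal_values` is the target and
    flags counts that deviate from it by more than `tolerance` (a fraction).
    """
    if not legal_values:
        raise ValueError("legal_values must not be empty")
    counts = Counter(samples)
    legal_total = sum(counts[v] for v in legal_values)
    expected = legal_total / len(legal_values)
    report: Dict[str, str] = {}
    for value in legal_values:
        n = counts[value]
        if n == 0:
            report[value] = "missing"
        elif n < expected * (1 - tolerance):
            report[value] = "under-biased"
        elif n > expected * (1 + tolerance):
            report[value] = "over-biased"
        else:
            report[value] = "ok"
    # Anything generated outside the legal set is an illegal stimulus.
    for value in counts:
        if value not in legal_values:
            report[value] = "illegal"
    return report


# Hypothetical opcode field sampled from 100 randomized transactions.
legal = ["READ", "WRITE", "ATOMIC", "FLUSH"]
samples = ["READ"] * 70 + ["WRITE"] * 25 + ["ATOMIC"] * 4 + ["NOP"] * 1
report = classify_stimulus_bias(samples, legal)
for value, label in report.items():
    print(f"{value}: {label}")
```

A check like this only reports symptoms for a single field. Finding the constraint that causes the skew is still the laborious part of testbench tuning.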

Bringing intelligence into the verification process can alleviate these concerns by enhancing the quality of the testbench stimulus and providing valuable testbench analytics. Ultimately, this can lead to detection of more bugs, greater coverage, lower costs, and faster turnaround times. 

AI-Driven Chip Verification Saves Time and Effort

At DVCon US 2023, Qualcomm discussed its use of AI-driven verification based on Synopsys VCS® Intelligent Coverage Optimization (ICO) technology. VCS ICO uses reinforcement learning and can be deployed at all stages of testbench development, accelerating and improving coverage, exposing more bugs, and reducing regression turnaround time.

At the conference, Qualcomm highlighted an intricate GPU architecture for which finding corner-case bugs and achieving full functional coverage were particularly challenging. The architecture relies on a high degree of parallelism to achieve its performance. That parallelism, however, makes it much harder to find bugs in the small corners of complex logic and to close functional and code coverage at the block level on time.

The Qualcomm team had to manually create semi-directed test cases to hit corner cases at the block and cluster levels. To achieve a shift left on functional coverage closure, Qualcomm explored various functional verification solutions, with these criteria in mind:

  • Faster and earlier discovery of corner-case RTL, testbench, and constraint issues
  • Accelerated, automated functional or code coverage
  • Reduction in manual effort to write directed test cases for hard-to-hit scenarios
  • Elimination of manual effort to rewrite or change functional coverage models

Ideally, the solution would fit seamlessly into the organization's regression environment. Qualcomm deployed VCS ICO technology in multiple real-life scenarios across multiple blocks and analyzed the data over a two-year period, comparing bug rate and coverage rate results with and without the VCS ICO solution. The chipmaker saw benefits at the early, mid, and late stages of its projects. Here's an overview of the results from three key projects:

  • Project A at late stage: An improved testbench resulted in the discovery of 30% more testbench issues, discovery of an RTL issue, and full visibility into the testbench. There were also fewer regressions, saving several days.
  • Project B at early to mid stage: Even while the testbench and design-under-test (DUT) were deemed stable, there was a big surge in bugs detected. Ultimately, all DUT bugs were exposed in fewer iterations.
  • Project C at mid stage: The team achieved a 10-15% reduction in grid usage per block, while also saving time in writing directed tests.

“After enabling VCS ICO in random regressions for this block, we witnessed dramatic improvement, a shift left, in functional coverage closure by 1.5 weeks,” said Srikanth Vadanaparthi, a senior staff engineer at Qualcomm. “We observed that most of the complex bins which need tweaking of constraints or the delay profile are now getting hit quickly. Consequently, we now need to spend very minimal effort (maybe for less than 100 bins instead of 200 to 500 bins) on directed/constraint tweaking or delay profile tweaking.”

Increasing Verification Productivity and Efficiency

As Qualcomm and other design houses have experienced, AI-driven verification with the VCS ICO solution can increase verification productivity and efficiency. At the early stages of a design, the technology exposes testbench issues and gaps, supports root-cause analysis, and provides testbench insights. At the mid stage, it yields better coverage, optimizes regressions, and helps detect hard-to-hit bugs (and more of them). At the late stage, the solution exposes omission bugs, improves diversity, and enables signoff confidence.

AI is being integrated into more of the electronic design automation flow, enhancing productivity as well as power, performance, and area (PPA) for chips that are becoming increasingly complex. Compute-intensive applications such as hyperscale data centers, automotive and, yes, AI, are driving up bandwidth and performance demands. AI offers a way forward for design teams tasked with ever-higher levels of achievement. As Qualcomm found, AI-driven verification can shift verification left from day one. 