Issue 3, 2012
Achieving Faster Design Closure with Early RTL Exploration
RTL exploration can accelerate the development of high-quality RTL and constraints, enable rapid what-if analysis, and facilitate a better starting point for RTL synthesis. In this article, Girish Prabhakara, member of technical staff, AMD, Shivaramakrishna Uddanti, engineering contractor, AMD, and Ramakrishna R, senior application consultant, Synopsys, discuss early RTL exploration and how Synopsys’ DC Explorer has helped AMD converge on QoR goals faster than previously possible. This article is condensed from a SNUG India 2012 paper entitled, “Productivity Improvements and Faster Design Closure with Early Exploration.” To access the original paper and presentation, please visit the SNUG India 2012 Proceedings.
Meeting aggressive design goals is quite a challenge at 32 nm and below, especially for large designs with complex floorplans. At AMD, we need to be able to generate—as soon as possible—a netlist for early what-if analysis and physical exploration to help us determine if the design can meet our QoR goals. Performing these steps in the early phase of the design cycle leads to faster design closure since it provides us the opportunity to make the necessary modifications to the RTL before extensive time and resources are committed to physical implementation.
However, during the early design stage the RTL is constantly changing, the constraints are not well-defined, and the libraries are often incomplete. Since RTL synthesis requires these inputs to be complete and consistent, designers are compelled to delay what-if analysis and physical exploration for weeks until all the design issues are resolved. They could be much more productive during this period if there were a way to quickly explore the design, perform what-if analysis, and get a head-start on floorplanning and physical feasibility analysis—even while the RTL and constraints are still being developed.
RTL designers at AMD have improved productivity and achieved faster design closure by performing these early analyses with the help of Synopsys’ DC Explorer. The tool enables early RTL exploration that tolerates incomplete or inconsistent design data and constraints, yet it is tightly correlated with DC Ultra with topographical technology and runs significantly faster.
RTL Exploration Flow
Figure 1 highlights the basic RTL exploration flow in the context of the implementation flow. The design data inputs to the tool are the RTL, constraints, logic library, and, optionally, the physical library and floorplan. Because DC Explorer runtimes are very fast, even for large designs, it is feasible to iterate through the entire RTL exploration flow multiple times and perform what-if analysis to determine a design’s “sweet spot” for optimal QoR. In addition, designers can use DC Explorer to close in on potential timing exceptions, access floorplanning in IC Compiler, and generate an early netlist for physical design exploration.
Figure 1: DC Explorer is used by RTL designers in the early phase of the design cycle to converge more quickly on an optimal design.
Tightly Correlated but Faster than Synthesis
When we evaluated DC Explorer at AMD, we compared timing and area QoR for DC Explorer against DC Ultra with topographical technology for various designs with cleaned-up RTL and constraints. As shown in Figures 2 and 3, the timing and area correlations were within 10% for most designs. In addition, DC Explorer’s runtimes were typically 2x-5x faster than RTL synthesis, as shown in Figure 4.
Figure 2: Worst negative slack correlation of DC Explorer with DC Ultra (Topographical)
Figure 3: Area correlation of DC Explorer with DC Ultra (Topographical)
Figure 4: DC Explorer runtime improvement versus DC Ultra (Topographical)
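One way to express the correlation and speed-up figures above is as simple ratios. The sketch below is illustrative only: the article does not state how the percentages were computed, so the relative-difference formula, the function names, and the sample numbers are all assumptions.

```python
def relative_difference(explorer_value, reference_value):
    """Return |explorer - reference| / |reference| (assumed correlation metric)."""
    if reference_value == 0:
        raise ValueError("reference value must be non-zero")
    return abs(explorer_value - reference_value) / abs(reference_value)


def is_correlated(explorer_value, reference_value, tolerance=0.10):
    """True when the two results differ by no more than `tolerance` (10% by default)."""
    return relative_difference(explorer_value, reference_value) <= tolerance


def speedup(reference_runtime, explorer_runtime):
    """Return how many times faster the exploration run was than the reference run."""
    if explorer_runtime <= 0:
        raise ValueError("explorer runtime must be positive")
    return reference_runtime / explorer_runtime


if __name__ == "__main__":
    # Hypothetical area (in library units) and runtime (in hours) values.
    print(is_correlated(1_050_000, 1_000_000))
    print(speedup(10.0, 2.5))
```

A relative metric like this suits area. For worst negative slack, which can be close to zero, normalizing against the clock period may be a more stable choice.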
Tolerance of Incomplete Design Data
When the RTL for a recent design at AMD was being integrated at the top level, some of the modules and IP blocks still had missing or inconsistent design elements. These issues resulted in LINT-xxx errors or warnings, and RTL synthesis failed to proceed further. Some of these design mismatch scenarios are summarized in Table 1.
Table 1: Summary of some design mismatch scenarios.
In contrast, DC Explorer ran through the entire design without stopping, tolerating the design mismatches and generating summary reports of all the issues it encountered.
For example, the tool was capable of resolving case mismatches in the interface port names of the RTL instantiations. Although a module port named “DataBus” was instantiated with an inconsistent case name (“dataBus”), DC Explorer still accepted it and proceeded with mapping the interface nets properly. Similarly, when “DataBus” had a port width mismatch (declared as “input DataBus [15:0]” in the module but mapped to the “input DataBus [25:0]” signal at the top level), DC Explorer accepted it and proceeded with a proper mapping. For missing blocks, DC Explorer created a black box and continued with the dirty linking.
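The tolerant behavior described above can be pictured with a small sketch. This is not DC Explorer's algorithm, which the article does not describe; it is a hypothetical `bind_ports` helper that matches port names case-insensitively, connects the overlapping bits on a width mismatch (an assumption), and records every issue instead of stopping.

```python
def bind_ports(declared, connections):
    """Bind instantiation connections to declared ports, tolerating mismatches.

    declared:    dict mapping declared port name -> port width in bits
    connections: dict mapping name used in the instantiation -> signal width in bits
    Returns (bindings, issues): bindings maps declared port -> connected bit count,
    issues is a list of human-readable messages for a summary report.
    """
    by_lower = {name.lower(): name for name in declared}
    bindings = {}
    issues = []
    for used_name, signal_width in connections.items():
        port = by_lower.get(used_name.lower())
        if port is None:
            issues.append(f"no port matching '{used_name}'; left unconnected")
            continue
        if port != used_name:
            issues.append(f"case mismatch: '{used_name}' bound to '{port}'")
        port_width = declared[port]
        if port_width != signal_width:
            issues.append(
                f"width mismatch on '{port}': port is {port_width} bits, "
                f"signal is {signal_width} bits"
            )
        # Assumption: connect only the bits both sides have in common.
        bindings[port] = min(port_width, signal_width)
    return bindings, issues


if __name__ == "__main__":
    # The DataBus example from the text: [15:0] declared, [25:0] connected.
    bindings, issues = bind_ports({"DataBus": 16}, {"dataBus": 26})
    print(bindings)
    for message in issues:
        print(message)
```

Collecting issues in a list rather than raising on the first one mirrors the run-to-completion, report-at-the-end behavior the authors observed.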
Tolerance of Incomplete Constraints
Our design constraints were also under-developed at this early phase of the schedule, to the extent that we had specified only the clock definitions and the approximate I/O budgets. Timing exceptions (multi-cycle paths and functional false paths) were not completely defined and clock domain relationships were not defined at all. Despite the incomplete constraints, during compilation DC Explorer inferred infeasible paths and reported which paths in the design should be set as timing exceptions.
We have observed that infeasible paths inferred by DC Explorer closely match actual timing exceptions used in clean designs, so it’s possible to continue with RTL exploration and perform what-if analysis even without further refinement of the timing constraints. However, you can tell the tool to generate an HTML-based categorized timing report to analyze the critical timing paths and specify the timing constraints in great detail. Clicking the “Create Exceptions” button generates a corresponding entry in the Synopsys design constraints (SDC) file.
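For readers unfamiliar with SDC, the sketch below shows roughly what timing-exception entries look like. `set_false_path` and `set_multicycle_path` are standard SDC commands, but the exact text DC Explorer writes is not given in the article; the helper names and pin paths here are hypothetical.

```python
def false_path(from_pin, to_pin):
    """Format an SDC false-path exception between two pins."""
    return f"set_false_path -from [get_pins {from_pin}] -to [get_pins {to_pin}]"


def multicycle_path(cycles, from_pin, to_pin):
    """Format an SDC multi-cycle (setup) exception between two pins."""
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return (
        f"set_multicycle_path {cycles} -setup "
        f"-from [get_pins {from_pin}] -to [get_pins {to_pin}]"
    )


if __name__ == "__main__":
    # Hypothetical pin names, for illustration only.
    print(false_path("u_cfg/mode_reg/CK", "u_core/sel_reg/D"))
    print(multicycle_path(2, "u_mul/a_reg/CK", "u_mul/p_reg/D"))
```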
Faster What-if Analysis
We typically sweep the frequency during synthesis to determine the highest frequency at which a design can reliably operate. Doing so reveals a “sweet spot” above which RTL synthesis produces sub-optimal results even at the expense of additional runtime. The problem is, with standard RTL synthesis the runtimes for large designs are too long and such experiments cease to be practical after a while.
Using DC Explorer, we could iterate through the same what-if analysis much faster and this gave us the flexibility to experiment multiple times. To illustrate, Figure 5 shows the runtime comparison between DC Explorer and DC Ultra with topographical technology on a 600K-instance design. We started with a fairly achievable frequency and then iterated through the design with the frequency increasing in steps of 10%. Runtime versus frequency was essentially flat using DC Explorer, whereas the RTL synthesis runtimes increased substantially.
Figure 5: Frequency sweeps: Runtime comparison of DC Explorer with DC (Topographical)
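A frequency sweep like the one described above can be set up with a few lines of scripting. The sketch below is a minimal illustration, assuming each step adds 10% of the starting frequency (the article does not say whether the steps compound) and using worst negative slack of zero or better as a simplified pass criterion; the function names and sample results are hypothetical.

```python
def sweep_points(start_mhz, runs, step=0.10):
    """Return (frequency_mhz, clock_period_ns) pairs for a frequency sweep.

    Each run raises the target by `step` times the starting frequency.
    """
    if start_mhz <= 0 or runs < 1:
        raise ValueError("start_mhz must be positive and runs at least 1")
    points = []
    for i in range(runs):
        freq_mhz = start_mhz * (1.0 + step * i)
        period_ns = 1000.0 / freq_mhz  # period in ns = 1000 / frequency in MHz
        points.append((freq_mhz, period_ns))
    return points


def highest_passing_frequency(results):
    """Return the highest frequency whose worst negative slack is >= 0, or None.

    results: iterable of (frequency_mhz, wns_ns) pairs from completed runs.
    """
    passing = [freq for freq, wns in results if wns >= 0.0]
    return max(passing) if passing else None


if __name__ == "__main__":
    for freq_mhz, period_ns in sweep_points(500.0, 4):
        print(f"{freq_mhz:.1f} MHz -> {period_ns:.3f} ns clock period")
    # Hypothetical slack results for three of the runs.
    print(highest_passing_frequency([(500.0, 0.05), (550.0, 0.0), (600.0, -0.12)]))
```

Each clock period would feed the clock definition for one exploration run. In practice the sweet spot also weighs area growth against frequency, which this simplified criterion ignores.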
With these experiments, we arrived at a sweet spot where the area increase and operating frequency were optimal, consistent with our expectations. Referencing the categorized timing reports for the higher-frequency run, we were able to understand the critical parts of the design which, once re-coded, would help us achieve a much higher frequency of operation.
Physical Design Exploration
Despite the presence of incomplete design data, DC Explorer generates a workable DDC or ASCII netlist (with or without anchor-cells for the unconnected ports/pins), which RTL designers can use to begin physical design exploration. In addition, DC Explorer provides direct access to IC Compiler design planning from inside DC Explorer so that RTL designers can conveniently create an initial floorplan and make modifications to it without leaving the RTL exploration environment.
We used this capability to quickly generate a floorplan based on the incomplete data, tweak it to fit our requirements, and take it back to DC Explorer to run RTL exploration with the updated constraints. The floorplan was worked on for several iterations in this manner until the desired refinement was achieved. These steps can be carried out by RTL designers even if the physical design flows are not yet stable enough to produce the final floorplan.
We also passed the incomplete data from the DC Explorer flow for initial design planning and pipe-cleaning through IC Compiler’s feasibility flow. Feasibility analysis in IC Compiler performs simultaneous placement and optimization (place_opt_feasibility) when the design data and constraints are not yet complete. Feasibility analysis can be timing and/or congestion driven and can be tailored to run to an intermediate stage such as initial placement, high-fanout-net synthesis, first-stage placement optimization, and so on. This process helps designers identify potential timing and congestion issues in the floorplan so they can take corrective action much earlier in the design flow, even before the RTL is deemed complete. We have observed that place_opt_feasibility runs 3x-7x faster than standard placement optimization, with tightly correlated timing.
Using the incomplete data from the DC Explorer flow, we performed feasibility analysis through first-stage optimization. The fast feasibility runtime let us iterate through the flow several times and converge quickly on an optimal floorplan by the time the RTL data was cleaned up. Only two iterations through the regular DC Ultra (Topographical)-to-IC Compiler implementation flow were then required to meet our QoR goals.
We have used DC Explorer to speed RTL and constraints development during the early phase of the design cycle. Because it is tolerant of incomplete design data, faster than RTL synthesis, and correlated with Design Compiler, it is an effective tool for enabling quick what-if analysis to assess the sweet spot for maximum operating frequency. In addition, physical capabilities supported by the tool, such as the link to IC Compiler design planning and the ability to create an early netlist for physical feasibility analysis, have helped us converge faster on an optimal floorplan that supports better quality-of-results.
About the Authors
Girish Prabhakara, member of technical staff in the CAD team at AMD, develops new methodologies and seamless end-to-end front-end flows for the advanced 28nm process node. He has nine years of experience in the semiconductor industry and holds a B-Tech in Electronics and Communication from BMSIT Bangalore, and a Masters in VLSI-CAD from MIT, Manipal.
Shivaramakrishna Uddanti, engineering contractor at AMD, facilitates design teams’ adoption of new flows and methodologies.
Ramakrishna R, senior applications consultant at Synopsys, supports the adoption and usage of the Design Compiler product family at the AMD Hyderabad site. He has nine years of experience in front-end flows and has been with Synopsys for four years. Ramakrishna completed his Masters in Engineering from Osmania University, Hyderabad.