SNUG Singapore Abstracts  

Friday, August 16, 2013
10:15 AM - 10:45 AM
Timing Sign-Off Methodology in 20nm Technology: Process and Design Perspective
Altera Corporation
In small process geometries, particularly 20nm and below, accurate process modeling is crucial to achieving optimal performance in IC design. In this paper, we present optimum Advanced On-Chip Variation (AOCV) application methodologies from process and design perspectives. This paper also proposes a design solution for the 20nm double patterning variation effect. Performance concerns associated with aging, together with the generation of its timing constraints and its usage model, are also discussed. The suggested methodology is applied in PrimeTime as the timing sign-off tool.

An Overview of Quad-Core A7 CPU Implementation
MediaTek Singapore
High performance processor implementation is a very complex and challenging task. This paper will discuss the challenges of implementing a 28nm quad-core ARM A7 CPU using Synopsys tools. We will discuss key techniques in DC and ICC, from synthesis through place and route, that enabled the high performance. Key indices such as performance, power and area targets will be discussed in detail. Floorplan, placement, CTS and route challenges in ICC, timing closure techniques, and DVFS STA signoff using PrimeTime will also be introduced. Lastly, the paper will attempt to explain the importance of dynamic IR drop on silicon speed.

A Realistic Approach To Account LDE Effect In Timing Library Characterization
eASIC (M) Sdn Bhd
With the advance to deep sub-micron processes and the higher design speeds driven by SOC requirements, it is important to ensure that the underlying cells contain realistic timing information. When the library’s timing is too optimistic, silicon and simulation mismatch; when it is overly pessimistic, timing closure cannot be achieved and many engineering hours are wasted. The objective of this paper is to drill down into how layout-dependent effects (LDE) affect timing and how we can account for them in a hierarchical custom design, using an automatic approach that pushes down from the top level floorplan via a custom layout tool. This paper also suggests a set of validation flows that use StarRC (to extract the underlying cell with proper parameters) before comparing timing data using HSPICE, to ensure that the most realistic value is captured and validated in the timing library.

Tutorial: Accelerate Functional ECO Implementation with Formality Ultra
Synopsys
Late stage functional ECOs are often a nightmare for the frontend designer. Identifying the logic that needs to change is a challenge, and it is an even more daunting task to modify the netlist correctly while staying functionally equivalent to the ECO RTL. In this session, you will learn about the innovative Formality Ultra technology, which accelerates designers’ productivity in performing late stage functional ECOs.


Friday, August 16, 2013
10:45 AM - 11:15 AM
Cone Extraction Technique for Incremental Static Timing Analysis

Sign-Off Accuracy Hierarchical Design Interface Timing Budgeting Flow
Altera Corporation
In physical design, especially for SOC designs, a hierarchical design integration methodology is often used instead of a fully flat one. Multiple partitions are embedded into a top level, and all the partitions, together with the fullchip, are designed in parallel. One of the complications of hierarchical design is dividing the work between the partition and fullchip levels. A hierarchical design timing budgeting flow was developed to overcome this complication. Without a good automated budgeting flow, the manual effort to derive the budget would be intensive and time consuming. The existing solution was to use ICC-DP to perform budgeting. However, as the design moves forward, re-budgeting with new data without going back to the planning stage is required. In this paper, the details of a sign-off accuracy timing budgeting flow are discussed in terms of flow stages, algorithms, limitations and their workarounds.

Automating SoC Based UVM Verification Environments
STMicroelectronics Asia Pacific
Functional verification is a straightforward problem to state but a difficult one to address. The increasing size and complexity of designs and the shorter time to market window mean that functional verification engineers must verify bigger and more complex designs in a shorter span of time than before. Hence there is an increasing need to minimize the manual effort required to complete verification by improving the verification flow. At the SOC level, using an advanced verification methodology like UVM provides an effective solution for bug free design, but it consumes a lot of man days. To achieve complete verification closure, an effective solution must build reusable verification environments through an automated flow for faster and more efficient development. This paper attempts to address the drawbacks in the current SOC-based UVM verification flow and proposes improvements to the existing flow. This work has been particularly challenging since very little work has been reported on UVM based automation at an SOC level.

Tutorial: AMS Designs in Advance Node
Synopsys
Analog and mixed-signal designers face tremendous challenges as process technologies move below 20nm. In this tutorial, design methodologies for advanced nodes are demonstrated for circuit design and mask layout design engineers. Full solutions are proposed to address challenges such as double patterning and FinFETs.


Friday, August 16, 2013
11:15 AM - 11:45 AM
Design and Implementation of Custom On-Chip Clock Controller
Altera Corporation
At-speed testing is no longer regarded as optional; it is a must for current deep sub-micron technology, in which marginal defects are becoming a major quality concern and need to be screened at the actual operating frequency, particularly in fast growing SOC designs. While there are explorations into supplying a high frequency test clock directly from the tester, these methods require expensive equipment as well as tedious board timing considerations, which are challenged by production noise and environment control. One of the most effective and low cost test methods is to utilize the on-chip PLL for at-speed testing. By utilizing the on-chip PLL, the test can also leverage the user path as much as possible to ensure the quality of the design. At-speed testing requires a test pattern with two parts: the first part launches a logic transition, while the second part captures the response at a specified time, which is determined by the on-chip PLL clocks. This paper presents a custom on-chip clock controller (OCC) that provides a deterministic number of pulses from the on-chip PLL to support at-speed testing. This custom OCC can be integrated as part of the RTL design to provide more flexibility and controllability. In addition, it provides test clock enable support for slow-speed capture timing closure and can be merged with functional clock gating, which helps to reduce the clock latency on the user clock path.
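
The launch/capture idea can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the controller described in the paper: the function names, the sampled-waveform representation, and the use of a falling scan enable as the trigger are all hypothetical choices made for illustration, and synchronization of scan enable into the PLL domain is not modeled. The gate enable is updated only while the PLL clock is low, so every released pulse is a full-width pulse.

```python
def occ_gate(pll_clk, scan_en, n_pulses=2):
    """Release exactly n_pulses PLL pulses after scan_en falls (behavioral sketch)."""
    gated = []
    budget = 0        # pulses still to be released in the current capture window
    enable = False    # latched gate enable, updated only while the clock is low
    prev_se, prev_clk = 1, 0
    for clk, se in zip(pll_clk, scan_en):
        if prev_se == 1 and se == 0:
            budget = n_pulses          # capture window opens on falling scan enable
        elif se == 1:
            budget = 0                 # shift mode: no at-speed pulses
        if clk == 0:
            enable = budget > 0        # latch is transparent while the clock is low
        elif prev_clk == 0 and enable:
            budget -= 1                # a rising edge consumes one pulse from the budget
        gated.append(clk & int(enable))
        prev_se, prev_clk = se, clk
    return gated


def count_rising_edges(wave):
    """Count 0 -> 1 transitions in a sampled waveform that starts from 0."""
    prev, edges = 0, 0
    for v in wave:
        if prev == 0 and v == 1:
            edges += 1
        prev = v
    return edges


pll = [0, 1] * 8                 # free-running PLL clock, two samples per period
se = [1] * 4 + [0] * 12          # scan enable drops after two PLL periods
out = occ_gate(pll, se, n_pulses=2)
print(out, count_rising_edges(out))
```

With `n_pulses=2` the first released pulse plays the role of the launch edge and the second the capture edge; the spacing between them is one PLL period.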

Assistive Split Load Utility for High Fanout Clock Maximum Transition Violations
Clock Maximum Transition (clock maxtran), one of the project design rule constraints (DRC), has become increasingly stringent due to high frequency clock design. Although the internal and external timing specifications of the design have converged, the design often has residual DRC violations that require several iterations of engineering change order (ECO) fixes to reach project sign-off quality. Fixing high fanout clock maximum transition violations has a long turnaround time because of the load splitting process. In the current IC Compiler environment, load splitting is not straightforward and involves manual command entry. In addition, mistakes in a manual fix can cause Formal Equivalence Verification (FEV) failures and result in longer debug time. With the motivation of reducing the turnaround time and smoothing execution, this paper presents a semi-automatic, user-friendly utility script for resolving bad clock transition times. A demonstration of results is also included.

Analog Mixed-Signal Simulation in VCS
Lite-On Singapore
This paper shares a simple method for running an analog mixed-signal simulation with a digital simulator like VCS, without requiring any analog mixed-signal simulation tools. Conventional mixed-signal simulation either uses a mixed-signal simulator that supports Verilog-A syntax or models simple analog blocks in pure Verilog code. Different tools support many methods and languages for mixed-signal simulation. This paper simplifies the issue by focusing on the ‘real’ data type in the Verilog language to model an analog block accurately without any mixed-signal solver. A simple integrating ADC, consisting of a reference voltage, capacitor, resistor and current modelled using the ‘real’ data type together with a digital block, is used to explain the details in this paper.
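
To illustrate what modelling an analog block with plain real numbers looks like, here is a minimal sketch written in Python floats rather than Verilog ‘real’ variables. It assumes a dual-slope integrating ADC, which the abstract does not specify; the function name, component values and clock period are hypothetical. The capacitor voltage is a floating-point state variable that is updated once per clock tick, which is the same event-driven style a digital simulator would use.

```python
def dual_slope_adc(vin, vref=2.0, r=100e3, c=1e-9, t_clk=1e-6, n_int=1000):
    """Return the conversion count of an idealized dual-slope integrating ADC.

    All analog quantities are ordinary floats; no analog solver is involved.
    Component values are illustrative assumptions, not taken from the paper.
    """
    v_cap = 0.0
    # Phase 1: charge the capacitor with the input current vin / r for n_int ticks.
    for _ in range(n_int):
        v_cap += (vin / r) * t_clk / c
    # Phase 2: discharge with the reference current vref / r and count ticks
    # until the capacitor voltage returns to zero.
    step = (vref / r) * t_clk / c
    count = 0
    while v_cap > 0.0:
        v_cap -= step
        count += 1
    return count


# The count is proportional to vin / vref, about n_int * vin / vref.
for v in (0.0, 0.5, 1.0):
    print(v, dual_slope_adc(v))
```

Because the count depends on the ratio of the two slopes, the resistor and capacitor values cancel out of the result, which is the usual attraction of this architecture.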

Tutorial: Quick FPGA Prototyping Platform Bring-up and Design Debug
Synopsys
This tutorial highlights how to effectively employ best practices and design automation tools to accelerate the bring-up and debug of IP and SoC subsystems with an FPGA-based prototype system.


Friday, August 16, 2013
11:45 AM - 12:15 PM
Design Explorer - The "Little" Design Compiler That Punches Above its Weight
Lantiq Asia Pacific
Traditionally, RTL released by the frontend team needs to be completely syntax clean before any meaningful synthesis activity can begin. The RTL often includes many third-party IP deliveries, which the frontend design team has to integrate into the whole design. Most of the time, the first few revisions of the RTL release are likely to contain syntax errors such as “Port Width Mismatch”, or an instantiated IP module may not match the pins of the physical model 100%. When such RTL is released to the backend team, the synthesis effort that is meant for early RTL exploration and floorplanning is likely to fail.

With DC Explorer introduced into the early development phase, a complete compilation all the way to the end is more likely to succeed. DC Explorer tolerates some data incompleteness, allowing early RTL exploration to take place. The tool generates reports that identify shortcomings in RTL quality and in turn help to improve the quality of subsequent revisions. The netlist generated by the compiler has an area correlation of 10% and serves as a very good starting point for early floor planning.

This paper shares the experience of the user while evaluating the tool for a new requisition. It highlights the advantages and shortcomings of the tool.

DFM Driven Scan Failure Analysis Using Synopsys Yield Explorer
GLOBALFOUNDRIES Singapore
Silicon yield debug at advanced nodes, 28nm and below, is becoming more challenging and more time consuming. The long product yield ramp cycle is hurting time to market. One method proposed and tested within GLOBALFOUNDRIES to shorten the yield debug cycle when narrowing down the root cause of systematic yield loss is to leverage the process/design marginality interaction from Design for Manufacturability (DFM) simulation results.

Hotspot markers are correlated/overlaid with Design for Test (DFT) failure candidate markers derived from Synopsys TetraMAX scan diagnosis. Synopsys Yield Explorer is a suitable platform for performing the correlation and yield impact assessment. This paper provides the details of accelerating yield debug through the DFM driven scan correlation flow using Yield Explorer. Subsequently, the output from Yield Explorer can be used to identify critical systematic Physical Failure Analysis (PFA) candidates.

Product and test engineers can apply Yield Explorer correlation results to extract design-critical features and their corresponding failure rate based on the failure nets or instances. The derived critical features can be a catalyst to drive design enhancement and process improvement to achieve the desired yield ramp.

The paper showcases case studies of applying this flow to our 28nm silicon debug data to identify key yield detractors.

Python Based Layout Automation Practice with PyCell Studio
GLOBALFOUNDRIES Singapore
Reuse and automation in layout design are always a catalyst in a project. Almost all IC layout tools on the market embed a programming language interpreter to support automation. Synopsys PyCell Studio has Python language support and a set of powerful API functions for layout creation. Python is a widely used programming language. Compared to TCL, it is still young and developing fast. There are many methods and libraries that we can borrow from Python on the software development side. This paper shares some layout automation practices that we have implemented in Python with PyCell Studio. At the end, a case study shows how to create technology independent test structures with Python and PyCell.


Friday, August 16, 2013
1:15 PM - 2:15 PM
Technology Keynote: Advanced Design, Regardless of Process Technology Node
Mr. Don Chan, Vice President, Research and Development - Synopsys Inc.
In his keynote address at the IDF 1997, Intel's Dr. Gordon Moore said “what we end up doing is really selling real estate. We've sold area on the silicon wafer for about a billion dollars an acre as long as I've been in the industry.” This is well known as the “corollary” of “Moore’s Law”: the number of transistors per unit of area doubles from one technology node to the next, but the cost of a unit of area stays the same. Unfortunately, this is less and less true, and according to an IBS report entitled “The Economic Impact of the Technology Nodes”, 100 square millimeters of silicon at 20 nanometers will cost 60% more than at 28 nanometers. As a consequence, only those who have a product that really needs the ultra-high performance and ultra-low power of 20 nanometers and below, and have the manufacturing volume to justify the investment will keep rushing to the “emerging” technology nodes. The rest of the industry will hold at more “established” ones.

Design, though, can be “bleeding edge” regardless of the manufacturing process, and will be more and more *the* differentiator, and EDA a critical ingredient of successful design. The final result of combining “more than Moore” and “more of Moore” can be surprisingly more advanced than what is allowed by the simple progression of the semiconductor roadmap through scaling.


Friday, August 16, 2013
2:15 PM - 2:45 PM
Better ICC to PrimeTime Correlation with Pseudo AOCV Table
MediaTek Singapore
This paper presents a new way to reduce the timing correlation gap between PrimeTime and ICC. Conventional ICC flows do not support path-based AOCV. This is understandable, as the compute resources required to handle path-based AOCV calculation would drastically increase the runtime and reduce the efficiency of ICC. In contrast, graph-based AOCV is supported in ICC. However, it is too pessimistic and will increase the utilization rate as ICC over-fixes the design.
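
The pessimism gap between graph-based and path-based AOCV can be shown with a toy calculation. This sketch is not the pseudo AOCV table proposed in the paper. It assumes a depth-only derate table with made-up values (real AOCV tables may also be indexed by distance) and assumes that graph-based analysis assigns each cell the smallest depth of any path through it, while path-based analysis uses the depth of the path actually being timed. All names and numbers are hypothetical.

```python
import bisect

# Hypothetical late derate table: deeper paths average out more variation,
# so the derate shrinks as depth grows.
DEPTHS = [1, 2, 4, 8, 16]
LATE_DERATE = [1.12, 1.10, 1.08, 1.06, 1.04]


def derate(depth):
    """Conservative lookup: use the entry for the largest table depth <= depth."""
    i = bisect.bisect_right(DEPTHS, depth) - 1
    return LATE_DERATE[max(i, 0)]


def path_delay_pba(cell_delays):
    """Path-based: every cell is derated using the depth of this specific path."""
    d = derate(len(cell_delays))
    return sum(delay * d for delay in cell_delays)


def path_delay_gba(cell_delays, worst_depths):
    """Graph-based: each cell is derated using its worst-case (smallest) depth."""
    return sum(delay * derate(depth)
               for delay, depth in zip(cell_delays, worst_depths))


# An 8-cell path of 100 ps cells whose first three cells also sit on a 2-deep path.
delays = [100.0] * 8
worst = [2, 2, 2, 8, 8, 8, 8, 8]
print(path_delay_pba(delays), path_delay_gba(delays, worst))
```

In this example the graph-based delay comes out larger than the path-based delay for the same path, and an optimizer working from the graph-based number would spend area fixing slack that path-based sign-off does not require.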

Parallel ECO Work Model
Time to market is one of the most crucial elements of success in SoC design. Some design blocks are extremely complex and may require multiple designers to work in parallel to increase throughput and shorten the time required for convergence. Traditionally, designers working in parallel are limited to providing netlist ECO (Engineering Change Order) changes to one primary designer, who then reconciles the changes. This method has flaws: it causes more loops in ECO mode and takes more time and effort to converge. The method presented in this paper, the “Parallel ECO Work Model”, resolves some of these inefficiencies in the designer’s work model.

Abstract PCell Schematic
eSilicon Corporation, Vietnam
Today, one of the challenges in IP design is to implement a design platform that is agnostic to foundries. Such a general design can be used with different technologies from various foundries. A Parameterized Cell (PCell) is a powerful way to create a design once and reuse it to reduce rework. Ciranova is a very useful tool for developing PyCells (Python PCells). This paper proposes an implementation of a general, foundry-agnostic design that consists of a PyCell schematic library compiled with the Ciranova API. This OpenAccess library can be integrated with Custom Designer as an abstract library whose instances are user-defined through a configuration file containing foundry, technology and device mapping information. The research also addresses solutions for exporting the netlist correctly, mapping data for SDL generation, and building a mechanism to make the abstract library compatible with multiple PDKs.


Friday, August 16, 2013
2:45 PM - 3:15 PM
A Novel Approach of FPGA-Based ATPG Scan Compression
Altera Corporation
The amount of device logic to be tested per pin increases exponentially for each technology node and this causes a significant impact on automatic test pattern generation (ATPG) test time. In standard ASIC design, the common approach to address this issue is to implement scan compression logic as part of the design-for-test flow with a portion of the silicon die used to form scan compression logic. This paper proposes an alternative to adopt scan compression for FPGA testing. Instead of pre-built hard logic, scan compression will be formed by an FPGA programmable resource, with zero die size impact. In addition, this approach can be applied on FPGA products which are already in the production pipeline. This approach has been successfully used on the Altera FPGA products. The details of the setup and results are further described in the paper.

Optimum Design Planning with DC-Graphical and ICC-DP
Altera Corporation
Design planning is critical not only to ensure good design Quality of Results (QoR) and runtimes, but it also impacts project planning. Both flat and hierarchical designs have their inherent benefits and challenges. This paper presents an optimum design planning methodology from physical design partitioning, hierarchical pin assignment, timing budgeting to design integration. Faster design partitioning turnaround time is achieved by leveraging DC-Graphical floorplan exploration capability. This methodology also recommends timing re-budgeting during 1st pass top level design integration to overcome timing closure challenge and obtain optimized QoR. The suggested flow is demonstrated with an Altera multi-channel transceiver block, and supported by test-case data.

Tutorial: UVM Best Practices
Synopsys
Functional verification tends to be the most resource-intensive phase of the design cycle. Adopting an industry-standard verification methodology like Universal Verification Methodology (UVM) can play a significant role in speeding up the verification process, paving the way for faster time to market. The steep learning curve and uncertainty about the best approach to implementing certain functionality can discourage the adoption of UVM. In this tutorial we will outline the best practices that will enable you to quickly put together a reusable UVM testbench. We will start off by describing the UVM template generator, then summarize key coding guidelines for reuse and faster compile turnaround time and finally delve into coverage plan integration.


Friday, August 16, 2013
3:30 PM - 5:00 PM
Tutorial: Galaxy-ECO: The Fastest and Most Effective Solution for DRC, Timing and Leakage Power Closure
Synopsys
This tutorial will cover the basic and advanced options available in PT-ECO fix_eco_timing, fix_eco_drc and fix_eco_leakage for ECO closure. Designers will also understand how PT-ECO works with ICC Minimal Physical Impact Flow to achieve fast ECO convergence.

Tutorial: Easing Floorplanning with Data Flow Analyzer
Synopsys

Tutorial: An Integrated Approach to Designing the Right SoC Architecture, Starting Software Development and SoC Validation Earlier Using Prototyping
Synopsys
For new SoC development, software represents half the cost and half the time to market. Software impacts SoC development from architecture design throughout the delivery to customers. To be successful, development teams must get the architecture right from the start, which means starting software development earlier and in parallel with hardware design and accelerating hardware/software integration and system validation. This tutorial will highlight, through use cases and examples, how prototyping delivers better, earlier, faster SoC development. We will cover the topic of prototyping for architecture design, software development, hardware/software integration and system validation. Multicore architecture optimization, virtual prototyping and FPGA-based prototyping technologies will be highlighted, as well as how these technologies can be connected with each other to provide companies with optimized solutions that deliver the right design and accelerate time to market.


Friday, August 16, 2013
4:15 PM - 5:00 PM
Tutorial: Pre-route Layer Optimization and Correlation To Post Route
Synopsys
Shrinking geometries in the latest process nodes result in metal stacks that are highly tapered. Traditional approaches to pre-route parasitic estimation no longer accurately predict what happens during routing. This may cause timing divergence after routing and result in a long loop back process to converge on a timing clean design. This tutorial will discuss several techniques and tool capabilities to address this issue, including preroute layer assignment, pattern based RC scaling and more.


Improved Fiducial Cells Placement For Optical Probing Efficiency

Parametric On-chip Variation Analysis
PMC

How We Saved Over a Half Million Dollars in Mask Costs Using the Power of IC Compiler’s Z-route
Synopsys, Plano Texas, Texas Instruments Inc.