SNUG India Abstracts   

Wednesday, June 13, 2012
10:30 AM - 12:15 PM
WA1 Tutorials & User Session
Achieving Rapid Verification Convergence with AMBA ACE VIP
Varghese Ray [Synopsys]
Over the last couple of years, new interface protocols have been added to AMBA4, including the AXI Coherency Extensions (ACE) for system level cache coherency across multi-core processors in SoC designs. Traditionally, cache coherency management has largely been performed in software, adding to the software complexity and development time. With ACE, system level coherency is performed in hardware, providing better performance and power efficiency of complex SoC designs. This additional capability increases the complexity in functional verification of such complex designs. In this tutorial, you will learn how the latest generation AXI Verification IP (VIP) from Synopsys can help accelerate and simplify the adoption of the ACE protocol.

Improved X-propagation semantics: CPU server learning
This paper emphasizes the need to model and simulate silicon-like behavior at the RTL level. It describes the limitations of regular 4-value Verilog/SystemVerilog RTL simulation and covers the specifications for enhanced simulator semantics that overcome those limitations. It also highlights design issues found in the authors' next-generation CPU server project using a simulator that implemented these enhanced semantics.
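
As a rough illustration of the kind of limitation the abstract refers to, the sketch below contrasts standard Verilog if/else behavior, where an unknown (X) condition silently takes the else branch, with one possible enhanced rule that evaluates both branches and keeps a value only when they agree. The merge rule and the function names are assumptions made for illustration; the abstract does not specify the enhanced semantics the simulator used.

```python
# Illustrative sketch only: 4-value logic reduced to 0, 1 and X.
X = "x"  # stands for an unknown value


def verilog_if(cond, then_val, else_val):
    """Standard Verilog if/else: any condition that is not true,
    including X, takes the else branch (often called X-optimism)."""
    return then_val if cond == 1 else else_val


def merged_if(cond, then_val, else_val):
    """Assumed enhanced rule (hypothetical): on an unknown condition,
    consider both branches and keep the value only if they agree."""
    if cond == X:
        return then_val if then_val == else_val else X
    return then_val if cond == 1 else else_val


if __name__ == "__main__":
    print(verilog_if(X, 1, 0))  # else branch is taken, the X is hidden
    print(merged_if(X, 1, 0))   # branches disagree, the X propagates
    print(merged_if(X, 1, 1))   # branches agree, the value is known
```

With the standard rule an uninitialized control signal can look healthy in RTL simulation, while the merged rule keeps the uncertainty visible downstream.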

Integration of Legacy Verilog BFMs and VMM VIP in UVM using Abstract Classes
Santosh Sarma [Wipro]
This paper presents an alternative approach where Legacy BFMs written in Verilog and not implemented using Classes are hooked up to higher level, Class based components to create a standard UVM VIP structure. The paper also discusses an approach where existing VMM Transactors that are tied to such Legacy BFMs can be reused inside the UVM VIP with the help of the Accellera/VCS provided UVM-VMM Interoperability Library.

WB1 Tutorials & User Session
An enhanced STA methodology considering simultaneous multiple input transition effect on complex gates
In deep-submicron technologies, the delay of a multi-input gate depends on transitions at its adjacent inputs. For multi-input standard cells, the input-to-output arc delay is strongly affected by adjacent input transitions. The delay impact could be as large as 50% at the 22nm process node. If a timing path contains multiple complex gates, this could result in setup/hold timing failures in silicon. This paper discusses in detail the effect of multi-input switching on delay and details a flow that accounts for this effect for more reliable STA sign-off.

Early Power Estimation Using PrimeTime-PX
Thenappan Meyyappan, Rajagopal K.A [Texas Instruments]
Power estimates are required at every stage of the design cycle to enable co-development of various design activities. RTL development, floorplan and powerplan development, and IO ring, package and system development happen in parallel and need reliable power estimates at various stages. Estimating power when all the inputs are available and reliable is a relatively easy problem. This paper addresses the problem of producing reliable power estimates from the available information while the inputs themselves are still being defined (RTL, netlist, activity vectors etc.), using a PrimeTime-PX based framework. While accuracy is a matter of concern, what is more important in such a setting is establishing upper and lower bounds for the estimate with the available information and narrowing the spread as the design matures.
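
The idea of bounding an estimate and narrowing the spread can be sketched with the common first-order switching-power relation P = alpha * C * V^2 * f, where alpha is taken here as the fraction of cycles with an energy-consuming transition. This is not the paper's PrimeTime-PX framework; the function name, parameters and all numbers below are hypothetical.

```python
def power_bounds(c_eff_farads, vdd_volts, freq_hz,
                 alpha_min, alpha_max, leakage_watts=0.0):
    """Return (lower, upper) total power in watts for an activity factor
    known only to lie in [alpha_min, alpha_max].

    Uses the first-order relation P_dyn = alpha * C * V^2 * f.
    All inputs are hypothetical illustration values.
    """
    if not 0.0 <= alpha_min <= alpha_max:
        raise ValueError("need 0 <= alpha_min <= alpha_max")
    dyn_per_unit_alpha = c_eff_farads * vdd_volts ** 2 * freq_hz
    lower = alpha_min * dyn_per_unit_alpha + leakage_watts
    upper = alpha_max * dyn_per_unit_alpha + leakage_watts
    return lower, upper


if __name__ == "__main__":
    # Early in the design: activity is poorly known, so the spread is wide.
    early = power_bounds(1e-9, 1.0, 1e9, alpha_min=0.05, alpha_max=0.50)
    # Later: better activity vectors tighten the range.
    late = power_bounds(1e-9, 1.0, 1e9, alpha_min=0.15, alpha_max=0.25)
    print("early spread (W):", early[1] - early[0])
    print("late spread (W):", late[1] - late[0])
```

As the activity range tightens with better vectors, the gap between the two bounds shrinks, which is the narrowing the abstract describes.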

WB1 User Session and Tutorial
Double-Patterning Aware Extraction and Timing Signoff at 20nm
Ananda Veerasangaiah

WC1 User Session
Reducing Scan ATPG Overhead: Generation of Internal Scan Enable Control and its Qualification with ATPG Tool
Rajesh Mittal, Puneet Sabbarwal, Rubin A. Parekhji [Texas Instruments], Charles Kurian [Mirafra]
This paper discusses reducing scan test time by generating scan enable internally, which reduces the dead cycles between shift and capture and eliminates the need for an extra pin at the top level.

Techniques to Improve Quality of Memory Interface Tests in SoCs Using Synopsys TetraMAX’s RAM Sequential ATPG
Sanjay Krishna H V, Srivaths Ravi [Texas Instruments]
This paper discusses generating high-quality memory interface tests using TetraMAX native support and DFT hooks that assist the tool in ATPG, and proposes a flow for generating such patterns.

Malthesh HG, Jay Shah [Open-Silicon], Hayk Chukhajyan, Gurgen Harutyunyan [Synopsys]
This paper explores the testability aspects of TCAM and lays out a strategy for on-chip BIST for the same. It also gives an overview of the Synopsys STAR Memory System and its usability.

A Unique DFT Clocking Scheme to Reduce the Peak Power consumption during scan shift
Jasvir Singh, Renuka Deshpande, Manoj Kumar Yadav [STMicroelectronics]
This paper discusses a unique method of reducing peak power during shift that also helps reduce tester time. The method handles clocks in a specific way to achieve that.

Wednesday, June 13, 2012
1:15 PM - 2:45 PM
WA2 User Session
Gaps and Challenges with Reset Logic Verification
Deepak Jindal [Freescale Semiconductors], Mayank Digvijay Bindal [Synopsys]
Power-on-Reset (POR) is a key functional sequence for every SoC design, and any bug undetected in this logic can lead to dead silicon. Complexities in reset logic pose increasing challenges for verification engineers trying to catch such design issues during RTL/GL simulations. This paper discusses these reset logic simulation challenges in detail, with references to real silicon issues, and shares the experience of evaluating a new VCS technology with the potential to catch most of these POR bugs/issues at the RTL stage itself.

Integrating SystemC OSCI TLM 2.0 Models to OVM based SystemVerilog Verification Environments
This paper explores the successful integration of SystemC TLM2.0 components in OVM-based verification environments and highlights how the TLI (Transaction Level Interface) adapters help TLM2.0 sockets in SystemC communicate with those in SV and vice versa. These SystemC models were created for early performance analysis and accelerated software development. In the OVM-based verification environment, the models are reused as the reference model for the scoreboard, which helped reduce overall environment bring-up time.

SOC Compilation and Runtime Optimization using VCS
Santhosh K R, Sreenath Mandagani [Mindspeed], Gaurav Chugh, Ritesh Sharma [Synopsys]
The complexity of SoCs has forced engineers to verify a large number of scenarios to uncover all the bugs, which in turn has drastically increased verification time, cycle, and the size of the regression test suites. This paper focuses on a methodology to improve total TAT (turnaround time) and reduce memory usage using v2k configurations, the Partition compile flow, parallel compilation, and various performance optimization switches of VCS.

Wednesday, June 13, 2012
1:30 PM - 3:00 PM
WB2 User Session
GCA methodology to improve the constraints quality and hence silicon failures
Venkat Vemulapally, Ashish Mehta [Ikanos Communications], Natarajan Sridharan [Synopsys]
Perfect timing constraints are mandatory for the expected silicon functionality, and writing them correctly the first time is always a challenge, more so for complex designs with multiple clock domains and complex inter-clock relationships. A bottom-up implementation methodology is generally followed for all the bigger chips to improve turnaround time, but the biggest risk is the consistency of the constraints between the blocks and the top.

Synopsys GCA (Galaxy Constraint Analyzer) helps catch constraint problems in the early stages. We evaluated this tool on one of our new designs, a complex one with 10M instances, 150 clocks, and 20 hierarchical blocks. This paper shares our experiences from this evaluation.

Vector based Reliability Signoff for High Speed Serial IO Interfaces
Reliability checks are an important part of silicon signoff, through which we address issues like electromigration, S-factor, and self-heat for power and signal routes, as well as the IR drop requirements for power nets. Blocks with complex high-speed interfaces typically run in the 2.5-3 GHz range and handle continuously streaming multimedia data packets, so the signoff criteria on these blocks can be very stringent. This paper discusses how to use vectors to achieve realistic toggle estimates for these blocks.

Dynamic IR-Drop impact on High Speed Designs
Arindam Dutta, Vishal Srivastava, Rajat Kukreja [ST Microelectronics]
In high-speed designs, it is critical to consider the impact of dynamic IR-drop on design frequency. Because of IR drop, the circuit may operate at a lower frequency than the one at which signoff was closed. This paper examines the impact of IR-drop on timing analysis using a test-case design (memory + BIST). It finds that IR-drop based timing analysis not only worsens setup problems but can also create hold violations. The paper also describes the flow used for this analysis, which primarily consists of IR-drop based static timing analysis in PrimeTime. Finally, we conclude with recommendations based on our findings.

WC2 User Session
HAPS Based FPGA Prototype of Industrial Communication Subsystem for IP validation and Firmware Development
Kanad Kanhere [Texas Instruments]
In order to support various industrial communication protocols, like EtherCAT, Profinet, on their Sitara™ devices, TI undertook the design of the Industrial Communications Subsystem IP. The ASIC implementation had to meet tight real time requirements demanded by these protocols. Real world testing, thus, was of prime importance, given that the IP was being designed from scratch. Interfacing with actual devices, at speed, mandated an FPGA based prototype. To meet these constraints, HAPS-64 was chosen as the prototyping platform. The paper discusses HAPS-64 based FPGA prototyping experience for the Industrial Communication Subsystem IP. It discusses a synplify_premier and Certify based partitioning and synthesis flow that was used for this exercise. Finally it highlights the success of the prototype in terms of enablement of exhaustive IP validation and extensive firmware development that resulted in very fast silicon bring up.

Effective Strategies for Bringing Up and Debugging an FPGA-based Prototype
Prasad Kadookar [Synopsys]
Today’s SoCs are complex, and prototyping them on FPGAs is a mandatory step in their development. Prototypes also help the software development team begin software development before the actual silicon is available. This tutorial provides strategies and techniques using Synopsys FPGA-based prototyping products to address the key challenges faced in bringing up an FPGA prototype, and also provides insights on debugging these FPGA-based prototypes.

Wednesday, June 13, 2012
3:00 PM - 4:00 PM
WA3 User Session
SoC Verification with Embedded Processor
Vikash Dwivedi [Maxim Integrated Products]
Traditional simulation methods for verifying SoCs with integrated processors, such as writing directed C test cases, fall significantly short of expectations in terms of debug efficiency, simulation speed and generation of complex stimuli. This paper describes an alternative approach in which a constrained-random stimulus-generating transactor replaces the embedded processor model. This methodology enables generation of complex stimuli with minimal overhead, enables reuse of block-level verification environments in the chip-level environment, and eliminates the need for an instruction-accurate processor model, which increases simulation speed.

Verification methodology for dual NIC SoC using Verification IPs
AV Anil Kumar, Mrinal Sarmah, Sunita Jain [Xilinx Technologies]
This paper exploits various features of nSys PCIe and Ethernet Bus Functional Models (BFMs) that enable verification of the DUT in various traffic scenarios, and explores how easily nSys BFM Application Programming Interfaces (APIs) can be integrated into the verification test cases to get better functional coverage in a very short duration. The implementation also shows how a dual NIC verification environment can effectively use Ethernet BFM APIs to test various Media Access Control (MAC) features.

WC3 User Session & Tutorials
Pre-silicon Verification and Validation using CHIPit
D Naresh Kumar, Naveen Prasad [Mindspeed], Rajkumar Methuku [Synopsys]
This paper discusses the successful implementation of a packet processing IP and an embedded processor based SoC on the HAPS600 platform and outlines the various steps involved in full SoC prototyping. Porting SoC RTL to FPGA requires a lot of customization, such as conversion of ASIC libraries to FPGA libraries and handling of gated clocks. This paper describes how to address these issues by leveraging the synthesis tool options, and shares various techniques adopted to fit bigger designs across multiple FPGAs. The prototyped design consisted of two Ethernet interfaces and one PCI interface. Off-the-shelf HAPS600 extension boards were used to implement these interfaces, which helped bring up the design quickly. This FPGA-based verification platform was used to run time-consuming application scenarios, which helped find bugs that would not have been possible to find using pure simulation test cases. The software team was able to develop and validate their code on this platform, which helped reduce silicon bring-up and validation time.

Pin reduction methodologies and novel approach for gated clock conversion while using multi FPGA platform
Saransh Mehrotra, Charul Jain, Ambrish Pal [STMicroelectronics]
With increasing gate counts in ASICs, prototyping for pre-silicon validation and software development now spans multiple FPGAs. The option to map a design over multiple FPGAs has helped prototyping designers mitigate the issue of area, but it has given rise to a pin count issue due to the limited number of inter-FPGA traces. There are also challenges related to clock gating conversion of such a big design on the individual FPGAs. We discuss different measures that can be followed to reduce the pin count and make the design compatible with a given multi-FPGA platform in terms of both area and pin count. Each measure introduces its own challenges and overhead, which are also highlighted in the paper. We also discuss the clock gating warnings that we came across during synthesis of these partitioned individual FPGA projects in Synplify Premier, their significance, their impact on the design, and how to work around each of them.

Managing challenges in large FPGA designs
Suresh Kumar [Synopsys]
The availability of very large FPGAs (2M FPGA gates) allows implementation of large complex designs with processors, communication IPs and DSPs running at 200+ MHz in FPGAs. FPGAs are increasingly used across many diverse applications in the automotive, communication, networking, consumer electronics, industrial, medical and defense domains. The combination of design and FPGA architecture complexity is driving the need for software tools with smart features to deal with different issues related to design implementation or debug. This tutorial discusses issues related to large designs, such as design setup, design preservation, high reliability, managing implementation runtimes, timing closure and debugging functional issues in the hardware.

Wednesday, June 13, 2012
3:15 PM - 5:00 PM
WB3 Tutorial & User Session
Performance and Productivity Improvements in PrimeTime 2011.12 & 2012.06 release
Natarajan Sridharan [Synopsys]
The performance and productivity improvements in latest PrimeTime releases enable users to address signoff challenges more efficiently. This tutorial will cover new capabilities, including best usage methodology to take advantage of improved runtime and ease-of-use in debugging timing issues. The topics presented are suitable for all users of PrimeTime.

Faster Turnaround using PT-ECO
Vikas Sharma, Atul Nauriyal, Dhiraj Jaiswal [ST Microelectronics]
Long turnaround time is the reality for design-STA closure, especially when migrating to new technologies (40nm->32nm, 32nm->28nm) or at the signoff stages of complex designs. "PnR-Signoff" loops are becoming more costly due to corner explosion (150 corners), more variability and more complex designs. PrimeTime’s latest technology, PT-ECO, comes in handy for such project scenarios, combining multi-mode multi-corner analysis and ECO capabilities in a single run. The paper discusses the parameters that determine PT-ECO usage in the design development flow and the advantages of using PT-ECO.

'Go Beyond' Standard Power Computation & IR drop Methodology for DDR PHY (IP): A way to ensure first pass Silicon
Keshav Chintamani, Dhanapathy Krishnamoorthy [Texas Instruments]
A typical DDR PHY (IP) performs writes or reads to memory on both edges of the clock in bursts, or sits idle for long durations. The read:write:idle:pwrdn ratio varies with every design’s use case. This interface is expected to hit data rates as high as 2133Mbps using the DDR3 protocol defined by JEDEC, which in turn poses a challenge for dynamic IR closure, as these PHY blocks will be instantiated multiple times at the SoC level. The interface has critical link budget paths on both the transmit and receive sides. Hence, the challenge lies in transforming the IR drop effects into link budget and ensuring positive margin even for weak silicon. This paper delves deeper into these aspects of power computation and IR drop closure, which go beyond standard practice.

Thursday, June 14, 2012
10:15 AM - 12:15 PM
TA1 Vision and User Session
Designing 100 Billion Transistor Chips
Brent Gregory [Synopsys]
At the current rate, we'll be designing 100 billion transistor chips within the decade. This talk will review the technologies that will get us there and how those technologies will alter every step of the RTL-to-GDS flow for all chips. We need new technologies for evolving chip physics, computer hardware, and design styles. Learn how this will drive future EDA software and the way designers work with the software to deliver these new chips.

Productivity Improvements and Faster DesignClosure Using DesignExplorer
Girish T P, Shivaramakrishna Uddanti [AMD], George Jacob, Ramakrishna R [Synopsys]
Early synthesis in any design cycle is an iterative process, with RTL, libraries and constraints still evolving. For faster turnaround time and design closure, the netlist needs to be generated quickly for early analysis and physical exploration. The netlist also needs to correlate well with the final implementation stages. The new “Design Explorer” tool from Synopsys eases the flow as these inputs change. This paper details the flow/methodology used to achieve seamless execution and results that correlate well at the final stages, and the value it brought to the project.

Improving TAT of DRC fixing by eliminating manual intervention using Auto DRC Repair flow
Amar Chand Pallam [AMD], Ananda Veerasangaiah [Synopsys]
DRC design closure on sub-nanometer nodes is always painful, with a lot of manual intervention for last-mile signoff DRC cleanup. The TAT depends heavily on individual skills, manual errors and multiple iterations. The ICC-ICV ADR flow is a signoff-driven DRC fixing solution for complex large designs that gives accurate results and better TAT. This paper details the flow or methodology used and its key benefits, drawing on real design experience.

TB1 User Session and Vision
Forward looking Quick-PNR flow for early power estimation
This paper describes a forward-looking Quick-PNR flow that gives the RTL/front-end designer a push-button way to take the RTL through implementation. The flow is not as involved as the POR flow but has enough detail for the design team to quantify a new architecture, and it has a very fast TAT. Its advantage is that front-end designers can run it themselves, get quick feedback on power savings, and thereby expedite RTL development/freeze before handing off the RTL.

Low Power Design: How Long Until We Hit the Wall?
Godwin Maben [Synopsys]
This session covers multiple green strategies for minimizing power consumption, from the architectural level to the transistor level, including those deployed in the past, those in use today, and those planned for the future.

TB2 User Session and Vision
UPF-aware Formality Flow for Large Complex 40nm SoC
Akhilesh Shukla, Avani Deshpande, Devendra Deshpande [LSI]
This paper discusses how a UPF-based Formality flow was used for low-power equivalence checking to verify the implementation of multiple power domains and power states of a high-performance, complex 40nm SoC.

TC1 Vision & User Session
Semiconductor trends and challenges for the coming years
Narendra Shenoy [Synopsys]
Semiconductors play a critical role in our lives today. We start out by identifying four key areas that offer a great opportunity for betterment of human society : mobile consumer electronics, healthcare, transportation and energy. The talk then explores the requirements on semiconductors from these market segments. Next we investigate the key design and process trends that affect these solutions. We finally conclude with a short summary of how the trends drive EDA requirements.

Single digital test bench for functional verification of multiple analog and digital blocks embedded deep inside mixed signal designs using XA-VCS cosim flow
Sean Sequeira [AMD]
This paper illustrates using PLL as a case study: (a) how best we could use a single test bench to verify the complete mixed-signal design with a focus on verification of embedded digital and analog structures that previously required dedicated test benches and abstraction tools, and (b) tips and tricks that would help set up the co-simulation quickly and address other co-simulation challenges. The belief is that this paper will help achieve faster time to market with high-quality mixed signal designs by shortening the chip design-verification cycle and making it more robust and thorough.

A solution for quick verification of net-matching constraints
This paper illustrates a solution for verifying net matching constraints based on layout verification, parasitic resistance extraction and simple DC circuit simulation. The solution is targeted to work with polygon-based layout data formats such as GDS and OASIS, and not just with object-based layout formats such as OpenAccess. The paper finally demonstrates how the measured resistance data can be interpreted to verify the net matching constraints to a first degree of accuracy.

TD1 Tutorial & User Session
Enabling Early Software Development for ARM-Based Designs: Developing Software for ARM big.LITTLE Based Designs Running Android
Asheesh Khare [Synopsys]
As devices get more and more complex, developing software for those devices becomes increasingly complex. While big.LITTLE processing offers a way to balance high performance through the use of the ARM Cortex-A15 MPCore processor with power efficiency by switching to the ARM Cortex-A7 to extend battery life, these processors challenge software developers to keep up and both utilize the available compute power while being power conscious. With the right set of models, virtual prototypes offer a unique platform view to the software developer to ease the software development for these multicore designs. Moreover, they provide unique capabilities to make sure that software developers correctly utilize the available Android control functions to deliver a smooth user experience while minimizing power consumption.

Recipe for flavored RTL & Packaging
This paper talks about automating IP packaging and Integration of IP deliverables coming from multiple IP providers using core tools. This method addresses the challenges that arise due to inconsistent IP collaterals & it helps in reducing integration cost of IPs from several suppliers.

Hardware/Software co-simulation using Virtualizer
Sameer Kumar Gupta, Anil Kamboj, Anuj Kumar, Hemant Nigam [HCL Technologies]
This paper explains a co-simulation approach to verify complex hardware and software components of an embedded system in a relatively short time. It explains how users can import HDL modules in Virtualizer and create a platform comprising HDL modules connected to SystemC modules.

Thursday, June 14, 2012
1:15 PM - 3:15 PM
TA2 User Session
Harnessing the flexibility of ICV for Methodology checking flow in advanced technology nodes
Anand Kumaraswamy, Pardeep Saini, Dharmendra Varma, Harshit Agnihotri [IBM]
This paper details a methodology checking flow developed on ICV for advanced technology nodes. The "Methodology checking" flow is a cell-level verification used to identify physical layout characteristics of StdCells, IOs and macros that could potentially create problems during chip-level floorplanning and physical design. The paper also discusses ICV's key benefits in flow development and its advantages for an easy methodology.

A Novel Implementation Approach for SoC Pin Timing Closure
Deepak Tottempudi, Saj Kapoor, Rugmini K, Shrinivas MV [Analog Devices]

Template based flow to resolve Congestion for High pin-Density cells using ICC
Rajesh Arimilli, Murali Seshadri [Qualcomm]
As technology shrinks and density increases, congestion issues become prominent. The increase in congestion due to pin density, and its alleviation, is the main focus of this paper. In lower technology nodes, since metal width is not reducing in proportion to cell area, more nets converge in smaller areas, resulting in increased pin density and fewer routing tracks. This paper covers non-traditional but powerful techniques for optimizing congestion in high-performance, high-density designs using IC Compiler.

Implementation of ARM Cortex-A9 Quad Core Processor with Synopsys Hierarchical Flow
Venu Gopal, Prem Kishor [Open-Silicon]
This paper details the challenges in implementing a quad-core Cortex-A9 with 512KB L2 cache memory and controller. It covers the floorplan requirements of blocks due to SoC IO constraints, top-level IO budgeting, and the multiple iterations needed to attain the best memory placement and performance. The flow or methodology of using DCT-SPG and ICC is discussed along with various key techniques.

TB1 User Session and Vision
Low Power design Implementation challenges & solutions with UPF
Nitin Raverkar, Pradeep Sreenivasa [LSI]
This paper describes the low power implementation flow and challenges for a design with multiple nested power islands and heavy data transfer at the interface. The paper discusses design complexity, challenges in writing a robust UPF, and how various Synopsys tools help in finding and fixing power bugs in the design at different stages of the flow.

TB2 User Session and Vision
Optimal Dynamic vs Leakage Power Trade-off using Statistical Intelligence
This paper describes the trends of dynamic and leakage power dissipation as we step towards smaller device geometries and how the contribution of leakage power surpasses dynamic power. It delves into how they individually contribute to the total power and proposes a method to achieve an optimum trade-off between the two so that we may reduce the overall power dissipation.

Handling Power Challenges for Orphan IPs
Mahendra Singh [ST-Ericsson]
The power optimization challenge is monumental for IPs that are effectively "orphans": devoid of designer support and not well understood. This paper discusses the power gains achieved and the methodology used to take such power optimization challenges head-on. It discusses how ESP-CV is used effectively to ascertain the partition boundaries, shift module boundaries and eventually prepare the gated RTL with associated control infrastructure, and how to define the UPF power intent file for the higher power saving operational mode.

Challenges in designing LP SoCs using hierarchical UPF
This paper discusses in detail the bottom-up UPF approach for power intent definition and the solutions to the challenges faced in stitching the power intents hierarchically to enable implementation and verification of the full chip at different stages of the design flow, from RTL to GDSII.

TC2 Tutorials & User Session
How to Get the Most from Your Circuit Simulation
Jayanthi Kasarala, Sateesh Chandramohan [Synopsys]
This tutorial provides useful tips and tricks to reduce simulation time without compromising accuracy. Starting with HSPICE, the tutorial will cover tuning for better performance using the Runlvl command, convergence, RC reduction and other good practices. Then, for CustomSim (XA), we’ll reveal performance and ease of use enhancements targeted for simulation of memory designs.

Fast spice simulator CustomSim XA for IO functional verification and reliability check
Srinivasa Rao Kenguva, Srihari Mallavarapu, Sreenivasulu Ramavath [LSI], Sateesh Chandramohan [Synopsys]
This paper highlights that adopting CustomSim reduces simulation times by ~200X on average compared to HSPICE without sacrificing accuracy (the reduction varies from 5X to 1000X across GPI/Os to DDR I/Os, based on the complexity of the I/O). This has made a big impact on the I/O functional verification effort. I/O verification turnaround time has dropped significantly, which gives designers more time and flexibility to analyze other aspects of the design.

CustomExplorer Waveform Comparison in a Fully Automated Silicon Device Models Validation Flow
Vani Priya, Branimir Ivetic [STMicroelectronics], Rakesh Shenoy [Synopsys]
This paper describes the work done to develop and automate an efficient and reliable flow for the validation of Smart Power silicon device models. It also illustrates the importance of the Synopsys CustomExplorer Waveforms Compare tool for the most time-consuming and critical flow step: the automatic comparison of hundreds of waveforms coming from different simulators.

TD2 Tutorial & User Sessions
Best Practices for Implementing Memories and Libraries to Deliver Superior PPA and Embedded Test & Repair
Chris (Chao Sheng) Wu [Synopsys]
Selection of memory compilers and logic libraries has significant impact on the power, performance and area of SoC designs. This tutorial presents best practices for implementing the optimal combination of memories, libraries and embedded test and repair to meet your design requirements. Also learn how the DesignWare Memory Compilers and Logic Libraries are used in conjunction with Synopsys tools, including IC Compiler and DC, to deliver a high-performance, low-power and differentiated SoC design. Benchmarks on CPU and GPU implementations will also be shared.
Target audience: Intermediate; Design engineers, system architects.

Generic MLM environment for SoC Performance Enhancements
Igal Mariasin, Jayaprakash Naradasi, Dayananda Yaraganalu [Sandisk]
This paper describes a novel approach that can be adopted in the very early stages of the design cycle to analyze the performance of an SoC with a SystemVerilog-based VMM environment. This approach helps in optimizing the datapaths to achieve the required bandwidth and in uncovering "performance bugs" that can’t be easily caught using conventional methods.

Achieving First-pass Silicon Success using Synopsys Platform Tools
Kiran S J, Manjunath Varadannanavar [AppsConnect Technologies]
This paper shares the challenges encountered in the design and development of a complex, hierarchical SoC that achieved first-pass silicon success. The SoC was developed using the full suite of Synopsys tools (Platform Architect to Hercules). The paper covers the architectural development to meet multi-function performance goals; IP integration including Ethernet, PCIe, DDR, USB and AMBA bus; core hardening to achieve 400+ MHz on a 65nm GF process; and handling I/O and packaging issues on a 928-pin, pin-limited package with CUP I/O.

Thursday, June 14, 2012
3:30 PM - 5:00 PM
TA3 User Session & Panel
Garuda: Implementation of ARM CPU development chip
Dipesh Bajaj [ARM]
Any new ARM CPU development is accompanied by a series of implementation exercises aimed at understanding the capabilities of the design in terms of clock speed, area on silicon, and power consumption at all the leading manufacturing technology nodes of the time. Along with this, there is also a need to prove the design in silicon to build up customer confidence and to provide a platform for rapid software development for the core. Project Garuda is one such project: it designed, validated and implemented a development chip for ARM's latest CPU offerings and proved ARM's big.LITTLE concept.

Ask the Experts Panel: Best Practices for High Performance Processor Core Implementation
Don Chan, Harissh Swaminathan, Lup Meng Lam [Synopsys], Vijaykishan Narayanan [ARM]
In this interactive session, Synopsys experts with hands-on experience implementing processor cores with high performance (and low power) will share their insights and best practices applicable to all stages of implementation – from synthesis and floorplanning to placement, CTS, routing and signoff STA closure. For example, have you wondered why it is important, and how you can use your design and EDA tool knowledge to manage cell density and congestion to achieve better QoR? Don’t miss this opportunity to hear answers to this and other challenges from Synopsys experts and your peers. Following a short recap of the new Synopsys high performance core (HPC) methodology targeted for all high performance processors we will quickly transition to an interactive “ask the experts” panel discussion. Audience participation is encouraged.

TC3 User Session
Advanced Regression & Verification of Mixed Signal Designs using CustomExplorer Ultra
George Kuruvilla [AMD], Manu Velayudhan Pillai [Synopsys]
This paper discusses how to overcome challenges in the existing mixed-signal verification environment using CustomExplorer Ultra, a GUI- and netlist-based verification platform that helps automate the regressions without manually creating different configuration files or scripts. This paper highlights CustomExplorer Ultra’s easy-to-use debugging features and advanced analysis environment, which includes a result analyzer and support for multiple configuration set-ups, multiple-corner set-up, waveform cross-probing, netlist debugging, building complex equations, and measurements using appropriate test cases. The paper intends to create a high-quality mixed-signal verification environment and at the same time maintain high levels of verification productivity by utilizing the advanced features and debugging capabilities of CustomExplorer Ultra.

Hybrid Full Chip SPICE Simulation Methodology for Complex Low-Power Mixed-Signal SoC
Kumar Abhishek, Sunny Gupta, Nari Reddy, Kushal Kamal [Freescale]
This paper discusses a method to fully verify the complex mixed-mode protocols and low-power mode entry/exit protocols on a mixed-signal SoC. The approach is to verify such critical SoC analog application features at the pre-silicon stage itself. This gives users high confidence that the same behaviour will be replicated at the post-silicon stage without encountering any design issues. This enables on-time delivery to the customer of a part that can showcase the basic functionality of the chip for their board bring-up.

Standard Cell Model Qualification Improvement using ESP-CV
Rohit Bapna, Sarika Jain, Srihari Mallavarapu [LSI], Rakesh Shenoy [Synopsys]
ESP-CV is a symbolic simulation based functional verification flow which checks consistency between SPICE and gate level HDL models. This paper will discuss the ESP-CV based functional verification methodology.

TD3 Tutorials & User Sessions
Early SoC Architecture Performance Analysis: SoC Architecture/Performance Modeling using SystemC/TLM 2.0, a Case Study using Synopsys Platform Architect
Asheesh Khare [Synopsys]
SystemC/TLM 2.0 has attracted a number of SoC design companies for architecture and software modeling in recent years. Developing architecture models at an early stage of the design cycle and re-using them for software and verification reference models can increase productivity and reduce product development time. One of the challenges in architecture analysis is modeling the application workload with reasonable accuracy during architecture exploration and design trade-off analysis. Synopsys Platform Architect allows modeling of hardware topologies and application workloads using a SystemC-based platform. This presentation discusses some of the basics of SoC modeling, and presents a case study for architecture analysis using Synopsys Platform Architect.

A Novel Approach Targeting Zero Timing ECO Cycle and Timing Sign Off within PnR Tool
Nagarajan Venkatachalam, Parimal Das, Bamane Rupesh [Qualcomm India]
Today's nanometer geometry designs put tremendous pressure on IC design houses to maintain advanced design flows in order to achieve a satisfactory level of predictability and productivity in the chip design process. Existing EDA design tools and methodologies are competing to address these issues effectively. To ensure silicon success, designers have to perform timing signoff across many scenarios. Implementation tools often use a small subset of the total signoff scenarios to trade off runtime and memory against accuracy. This poses a huge challenge for timing signoff across all the scenarios, leading to multiple ECO cycles. This paper discusses a way to reduce the number of ECO cycles needed to achieve design closure.

Power - Being savvy at every step of implementation!!!
Ishaan Biswas [Concept2Silicon Systems], Muniaswamy M [Cypress Semiconductor]
In SoCs that are meant for handheld and mobile applications, it becomes essential that the device is extremely ‘power savvy’. In an effort to minimize both active and standby/hibernate power in such low-power designs, there is a need for optimal power modes that enable effective trade-offs between power consumption and response time. This requirement leads to multiple power and voltage domains in the design, which in turn poses numerous challenges in implementing and verifying such intent. This paper addresses the unique issues faced in the various phases of design implementation, and the techniques used to address each issue across the entire implementation cycle using the Synopsys suite of tools.

UPF-based Implementation of a 40nm Large SoC with Disjoint Voltage Areas
Anurag Mishra, Devendra Deshpande, Pritesh Pawaskar [LSI (India) R&D Pvt. Ltd]
In this paper, the authors discuss how a UPF-based ICC flow was used to implement the power domains on a high-performance 40nm SoC with 40M gates. The chip features a 600MHz bus architecture with CPUs running at 1.2GHz and DDR3 clocked at about 1.8GHz. There are six core-level and four IO-level power domains supporting various power-saving modes. The chip floorplan was constrained such that the components belonging to each power domain were spread across the die. In addition, there were many always-on components that logically belonged to the switched domain hierarchy. This posed some unique implementation challenges that were addressed at various stages of the flow.

Hierarchical UPF flow: A case study
Dhruthi Uday, Debashish Sarkar [Cypress Semiconductor]
This paper provides an overview of a bottom-up flow using hierarchical UPF in low-power designs. It highlights the aspects to be considered while creating a hierarchical UPF flow, and the advantages and limitations of the methodology. The purpose of this study was to understand and prove the methodology of re-using the block-level UPF files (which are written to implement the blocks), and thus avoid creating a separate flat top-level UPF file with contents merged across all block-level UPF files. This flow ensures that no block-level power specification is lost and avoids re-work on the top-level UPF.

Behavioural Modeling of Split Capacitor SAR ADC for Predicting Non-Linearity accurately
Srinivasan Gopal [Intersil Corporation]
This work details the methodology for developing a behavioral model of the Split Capacitor SAR ADC. The model derives the effects of capacitor mismatch, parasitic capacitors and non-linear voltage coefficients for the Split Capacitor SAR architecture in order to accurately predict the ADC’s INL/DNL. The capacitor mismatch error calibration scheme is explained and incorporated into the model as per the design. The model is based on an analytical methodology to give a realistic estimate of the ADC’s INL/DNL with calibration. The reported results were useful in debugging silicon issues and provide an insight into the choice of the design parameters and architecture that can predict the linearity performance limitations.

Efficient, Reusable Methodology for developing automated Analog Mixed Signal Verification Environment
Sunny Gupta, Kushal Kamal [Freescale]
This paper discusses a reusable AMS verification IP consisting of generic, simple, parameterized Verilog-AMS components that enable even a novice user of the Verilog-AMS language to quickly develop AMS testcases by plug-and-play of these components. This will be a common pool of all VAMS-based components that have been created, validated and matured across projects. Similarly, common scripts used in the AMS verification flow will also be made part of this repository. This common VIP will mature through the collective efforts of AMS verification across NPIs over time. All AMS testbenches will reference this common repository, managed through the standard practices of configuration management and version control.

System Level validation of DWC 10/100 Ethernet IP core
Prasanth Rajagopal [Analog Devices Inc.]
This paper is about post-silicon validation of the DesignWare Cores Ethernet MAC peripheral. The validation process was part of the validation effort for a new processor from Analog Devices Inc. The paper covers the board-level test methodology for important features of IP version 3.61a. Some of these features include: "Address Filtering", "Checksum Offload Engine", "Control Frames", "Inter Frame Gap", "VLAN", "Packet sizes", "10/100 speeds", "Loopback", "Half Duplex", etc. It also covers the challenges and ideas for efficient verification of the module before bringing up the chip for the end user. This paper is relevant to users of the DWC EMAC IP who are interested in verifying their module in depth.

Addressing DFT Challenges in a Low Power Design
Poovaiah Palangappa, Praveen Sanjeev, Shrinivas M V [Analog Devices]
This paper describes various techniques used on our design to reduce power consumption during full-scan ATE testing. We were able to achieve an average power reduction of up to 60% using these techniques. Some of the techniques used were:
clock staggering, Q-gating, keeping analog hard macros in the power-down state, and TetraMAX low-power ATPG options.

Integrate UVM Based Verification Components in Non-UVM Testbench
Sidhesh Patel, Pratik Vasavda [LSI Corporation]
UVM provides a robust, extendable framework for building a testbench for functional verification of the design. Migrating a complex testbench into a UVM-compliant testbench requires careful planning. There can be several methods for migration, depending on the existing testbench structure and complexity, the available migration time and resources, and the availability of UVM-based verification IP. One method is to migrate the testbench in phases, for example, by replacing the existing verification IP (VIP) with UVM-compliant, UVM-based verification components (UVC) in the first phase. The challenge in the phase-wise migration method is to integrate the UVM-based components in a non-UVM testbench. This paper presents a method to integrate a UVC in a non-UVM testbench.

A “Fool-Proof” Method to Negotiate Parasitic RC Delay in Memory Validation
Harsh Rawat, Atul Bhargava, Sachin Gulyani, Rakesh Shenoy [STMicroelectronics]
Until 65nm, memory IP developers signed off timing using only parasitic capacitance, and resistors were considered only in the power nets for IR drop analysis. With newer technologies, however, the impact of parasitic resistance on the signal delays of the circuit is substantial and cannot be ignored. To provide sign-off quality timing numbers to customers, apart from delays due to signal-net RC, it is also important to quantify the impact of IR drop on the signal delays. For example, in older technologies the impact of resistive drop on signal-net routing delay was negligible (less than 2%), but in shrink technologies this number is rapidly increasing (more than 10%). In this paper, we discuss our flow to qualify memories with accurate simulation and reasonable runtimes.

Hierarchical Static Timing Analysis (HSTA) on Complex Memory Physical Layer Designs
Gundlapalli Shanmukha Srinivas, Norman Chan, Mahabaleshwara, Jammalamadugu Srinivasa Rao [Rambus Chip Technologies India (Pvt) Ltd.]
This paper focuses on creating and implementing one such hierarchical full-chip methodology in Static Timing Analysis for verifying timing in some Memory Physical Layers (PHYs), along with the usage of modelling techniques in NanoTime and QTM modelling in PrimeTime. It concentrates on creating models at the lowest cell levels and taking them up to the higher levels of hierarchy. To achieve this, many modelling techniques for complex digital/analog cells need to be exploited. In this approach, the timing of all paths can be verified, from the PLL and memory controller to the memory channel, in the most accurate manner. The paper also reviews hierarchical parasitic extraction and its back-annotation during timing analysis. Some modelling techniques for complex digital/analog cells are described with examples.

Simple yet effective techniques for better FPGA prototyping
Umesh Kumar Bhaskar, Arun Jain, Rahul Jain [Nvidia Graphics Pvt. Ltd.]
The silicon industry faces an ever-growing need to make bigger, faster, and more complex ASICs. FPGA prototyping offers a way to comprehensively validate such complex designs by interfacing with real-world devices. To keep up with silicon requirements, FPGAs also face the challenge of running prototypes at speeds closer to silicon speed, which is especially important for interfacing with real-world devices. Current FPGA prototyping tools do a reasonable job of meeting timing at high speeds in a bit-stream, but if a design is large and complex, has multiple clock domains, has limited margin in IO ports, or is spread over multiple FPGAs, the quality of the result is sub-optimal. A few simple yet effective techniques help in improving the Quality of Results.

UVM: an experience of methodology from VIP development to Coverage Closure
Rohit Srivastava, Gaurav Gupta, Nandini Mudgil, Sarvesh Patankar [Freescale Semiconductor]
The paper discusses the capabilities and usage of UVM for IP-level verification. It further lists the issues faced in building the verification environment and verifying the DUT. The Accellera UVM register modeling methodology, which is critical for a reusable and efficient methodology, is also explored. The major focus throughout the implementation process was to identify various verification bottlenecks and to address them according to the methodology. The tool used for complete design, simulation, integration and code coverage is Synopsys VCS, and the paper emphasizes how efficiently VCS supports the methodology.