SNUG India Abstracts   

Wednesday, June 25, 2014
9:00 AM - 10:15 AM
Keynote Address
Designing Change into Semiconductor Techonomics
Dr. Aart de Geus, Chairman and co-CEO, Synopsys
In our current semiconductor design community, the word "change" is, to say the least, an understatement. From a dizzying array of emerging "smart" niche end-products to major market trend shifts to ecosystem reconfigurations at every level, the world around us is morphing at an unprecedented pace. The challenge for designers to keep pace has never been greater.

The good news is that we're up for it! In his presentation, Aart will give an overview of many years of investment and innovation efforts, resulting in an exciting sweep of major new design productivity advancements in our design and verification platforms.

Wednesday, June 25, 2014
10:30 AM - 12:30 PM
WA1: IC Verification - Synopsys User Sessions
A Novel Approach to Estimate Simulation Acceleration Performance Gain in an Emulator
This paper provides a new methodology to estimate the simulation acceleration performance gain more accurately. In this method, test-specific, cycle-accurate behavioral code is developed to mimic the DUT behavior for each test. Because the mimicked DUT is minimal RTL code, it is considered equivalent to the accelerated DUT from a performance standpoint. The simulation is rerun with this mimicked DUT, and the time spent in the testbench (TB) under DUT acceleration can be inferred. This gives a more accurate estimate of the gain from simulation acceleration on emulators without actually synthesizing the DUT. The paper describes how this was done for a DUT of 10M in size.
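
As a rough illustration of the estimate described in this abstract, the Python sketch below assumes that the run with the mimicked DUT costs about as much as the testbench would cost alongside an accelerated DUT. The function and parameter names are hypothetical and are not taken from the paper.

```python
def estimated_acceleration_gain(full_sim_seconds, mimic_sim_seconds):
    """Estimate the speed-up expected from accelerating the DUT.

    Assumption (not from the paper): the run with the lightweight
    mimicked DUT approximates the accelerated run, so the ratio of the
    two wall-clock times approximates the achievable gain.
    """
    if full_sim_seconds <= 0 or mimic_sim_seconds <= 0:
        raise ValueError("run times must be positive")
    return full_sim_seconds / mimic_sim_seconds


def tb_time_fraction(full_sim_seconds, mimic_sim_seconds):
    """Share of the original run that is not removed by acceleration."""
    if full_sim_seconds <= 0 or mimic_sim_seconds <= 0:
        raise ValueError("run times must be positive")
    return mimic_sim_seconds / full_sim_seconds
```

With illustrative run times of 3600 s for the full simulation and 450 s for the mimicked-DUT run, the estimated gain is 8x, and the testbench accounts for 12.5% of the original run time.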

WA1.2 User: Transactor-Based Verification of Baseband SoCs
Santosh Kalakonda, Jayaprakash Madhiraju, Raju Manjunath, Sriram Rajagopalan - Broadcom
Multimedia performance has been a key distinguishing factor in the competitive smartphone market. Multimedia validation using transactor-based verification methodologies provides faster validation cycles, deeper debug capabilities, and a complete platform for software development validation. This has greatly reduced the execution time of silicon bring-up, which in turn reduces the time to market (TTM) of the SoC. Software development on RTOS and Android is now done in parallel, before silicon arrival, and this has raised bring-up preparedness to a level not reached before. This paper discusses the new techniques implemented and used to validate complex dual-core and quad-core baseband SoCs using ZEBU Server.

WA1.3 User: An Alternate Approach to Address Emulation of Complex Clocking Systems in FPGA Platforms
Venkatesh Natarajan - Texas Instruments; Ashwani Sharma - Synopsys
In today’s large SoC architectures, which are optimized either for performance or for power, complex clock tree structures are almost a given. An SoC typically has a fair number of unique clock sources and a very large number of clocks derived from them. While ASIC tools have evolved to handle these from RTL all the way through the physical design cycle, these clock structures pose a huge problem when the ASIC is taken to an FPGA or an FPGA emulation platform (e.g. ZEBU). FPGAs have a limited number of skew-balanced clock lines, nowhere near enough for the large and complex clock tree networks that exist in today’s SoCs. This paper discusses the details of how to create the design transformation that addresses this, the flow hooks created in ZEBU to enable it, and some of the challenges and limitations.

WA1.4 User: eXtinguishing the 'X' Fire from RTL Testbench and Design to Prevent Gate Level Testbench Bring-Up Heartburns
Harsh Garg, Nitin Jaiswal - Freescale Semiconductor
With the ever-increasing size and complexity of modern SoCs, the bring-up time of the gate-level testbench is increasing by leaps and bounds. Most of the gate-level simulation (GLS) testbench setup time is consumed in finding and initializing the non-resettable flops of the design and tackling X-propagation issues that escaped at the RTL stage due to the X-optimism of traditional simulators. Some of these issues can be potential bugs lurking in the design that lead to a false pass at the RTL stage and cause trouble at a later stage, where fixing them as ECOs becomes costly. With the increasing emphasis on "First Pass Silicon" by design companies, some of these escaped bugs can render the whole or a part of the silicon "dead," leading to multiple revisions. The paper talks about enhancing the RTL testbench, finding the hidden X-optimism bugs, and reducing the GLS setup time by countering GLS issues at the RTL stage itself.
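
To make the term X-optimism concrete, the Python sketch below models a two-way select on an unknown control value in two ways. It is a general illustration only, not the method used in the paper or the algorithm of any particular simulator; the function names are hypothetical.

```python
X = "x"  # unknown logic value, used alongside the known values 0 and 1


def mux_optimistic(sel, a, b):
    """Model of an RTL if/else: an unknown select silently takes the
    else branch, so the unknown never reaches the output."""
    return a if sel == 1 else b


def mux_pessimistic(sel, a, b):
    """Model of X-propagating semantics: an unknown select yields X
    unless both branches agree on the same known value."""
    if sel == X:
        return a if (a == b and a != X) else X
    return a if sel == 1 else b
```

With an unknown select and differing inputs, the optimistic model returns a clean value from the else branch, while the pessimistic model reports the unknown. This is the kind of mismatch that can pass at RTL and surface only in gate-level simulation.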

WB1: IC Design: Signoff - Synopsys User and Tutorial Sessions
WB1.1 Tutorial: PrimeTime Best Practices and Feature Updates
Natarajan Sridharan - Synopsys
PrimeTime performance has been improving consistently over the years, and the reference methodology flows provide best practices to ensure that runtimes are optimal in user flows. This tutorial will show you some of the best known methods that we recommend for getting optimal performance from PrimeTime and explain how they help users extract the most from the tool. We will also cover the latest technology updates in PrimeTime, focusing on technologies like Parametric OCV and Physical Aware ECO, and on major updates from the soon-to-be-released 2014.06 version of PrimeTime.

WB1.2 User: PrimeTime Based Efficient Approach for CDC and MTBF Checks in a Complex SoC
Saksham Pant - NVIDIA
Traditional methods like simulation and unit-level CDC checks alone are not sufficient to verify that data is transferred consistently and reliably across clock domains. The unit-level checks on RTL cover only intra-module violations, because the unit owners carrying out the checks are familiar only with their respective modules. The tools used for these checks rely on user-specified use cases, are error-prone, and tend to miss a few cases. In this paper we discuss how PrimeTime is used to perform the clock domain crossing checks. It offers a more efficient, faster, and more dependable workflow that is also easier to debug. Finally, we look at how this algorithm and flow can be extended to MTBF checks.
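
For readers unfamiliar with MTBF checks on synchronizers, the Python sketch below evaluates the widely used textbook metastability model, MTBF = exp(t_resolve / tau) / (t_window * f_clk * f_data). This is an assumption about the kind of calculation involved, not the flow or formula described in the paper, and the parameter names are hypothetical.

```python
import math


def synchronizer_mtbf(t_resolve, tau, t_window, f_clk, f_data):
    """Mean time between failures, in seconds, for one synchronizer.

    t_resolve: time available for metastability to resolve (s)
    tau:       resolution time constant of the flop (s)
    t_window:  metastability capture window of the flop (s)
    f_clk:     receiving clock frequency (Hz)
    f_data:    toggle rate of the asynchronous data (Hz)
    """
    if min(tau, t_window, f_clk, f_data) <= 0:
        raise ValueError("tau, t_window, f_clk and f_data must be positive")
    return math.exp(t_resolve / tau) / (t_window * f_clk * f_data)
```

Because the resolution time sits in the exponent, adding a synchronizer stage or relaxing the receiving clock period raises the MTBF far faster than reducing the data toggle rate does.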

WB1.3 User: Next Generation STA Techniques for 140+ Million Gates Multi-Voltage Design
Sachin Gupta, Parth Lakhiya - LSI R&D India Pvt. Ltd., Ramanuj Mishra - Synopsys
As designs continue to grow in size and the number of modes and corners increases exponentially for multi-voltage designs, design cycle time increases substantially. This paper explains how HyperScale and SMVA can close this gap on large and complex designs. It shows how HyperScale enables faster top-level and block-level timing convergence with an automated context update mechanism. The paper also elaborates on automated handling of boundary conditions on blocks instantiated multiple times in the design. The significant runtime and memory improvement from HyperScale helped reduce ECO TAT, with accuracy close to flat signoff STA. This paper also briefly discusses SMVA, a technique available in PrimeTime that helps reduce the number of scenarios needed to analyze multi-voltage designs. Statistics from a 28nm, 140+ million gate design are included to demonstrate these techniques.

WC1: Custom Design and AMS Verification - Synopsys User and Tutorial Sessions
WC1.1 Vision: Mixed-Signal Design Solution - At the Confluence of Analog and Digital
Ravi Tembhekar - Synopsys
The talk will explore the challenges of implementing and verifying today's mixed-signal designs against a backdrop of traditional tools. We will provide an overview of the practical solutions Synopsys brings in mixed-signal implementation, layout productivity, and mixed-signal simulation.

WC1.2 User: A Novel Approach Towards Power Characterization of Compiled Memory IPs
The number of memory instances per chip is increasing rapidly. To support the full range of process, voltage, and temperature corners (PVTs) and the sensitivity to process variation, the number of data points per characterization run is growing exponentially. This paper discusses techniques in dynamic and leakage power characterization that have provided 2X runtime improvements over our existing solution with 2-3% accuracy tradeoffs, and capability to simulate bigger memory blocks such as TCAM.

DC initialization in memories is another challenging aspect. We will cover DC issues which resulted in loss of accuracy and how we resolved them. To improve runtime we have tested the back-annotation flow, the active-net based flow, and multi-CPU capabilities. This paper will also cover the different flow and tool options we explored, the results of our experiments, and enhancement requests.

WC1.3 User: Runtime Reduction in High Q Circuits Using Harmonic Balance (HB) Algorithm in HSPICE
Shiv Harit Mathur, Anand Sharma - SanDisk, Suresh Vaiyapuri - Synopsys
As a general trend, crystal oscillator circuits (COCs) are designed and characterized using time-domain (transient) analysis. The setup generally has a long simulation window in order to conclude stable and sustainable oscillation buildup. A single simulation has a runtime of around 1-2 weeks on high-end, memory-intensive machines. A few workarounds are available that can reduce the simulation time to 3-4 days, e.g. jump-starting by injecting large startup noise or setting initial conditions. To add to this complexity, the verification should also cover cross-corner PVTs. With the above approach, COCs therefore need a long cycle time. In this paper, we present the efficient usage of the Harmonic Balance (HB) algorithm, which can be a boon to harmonic oscillator circuit designers. The paper also covers the challenges, options, and limitations of the HB approach encountered in the practical implementation of a crystal oscillator design in a 28nm process.

WD1: IC Design: Test - Synopsys User and Tutorial Sessions
WD1.1 Tutorial: Low DPPM and Low Cost Testing for All Process Nodes and FinFETs
Jyotirmoy Saikia - Synopsys
Process variations at small geometries give rise to physical defects that require additional tests for achieving low defective parts per million (DPPM). At the same time, higher test compression is needed to reduce the test data volume and test execution time. This tutorial will highlight how leading edge capabilities in Synopsys' TetraMAX ATPG, such as slack-based transition delay testing and cell-aware testing, are being used to enable low DPPM testing and diagnostics for both advanced and established process nodes and FinFETs. We will also discuss DFTMAX Ultra, a new add-on to DFTMAX, that enables higher scan compression and faster test frequencies using fewer test pins. Once parts have been fabricated, TetraMAX can be used to perform physical diagnostics for fast and accurate fault isolation, and Yield Explorer can be used to perform design-centric yield analysis to accelerate yield ramp.

WD1.2 User: Serializer Mechanism in Asymmetric Scan Configuration
Mudasir Kawoosa, Rajesh Mittal - Texas Instruments
Offering multiple package choices is the preferred way to reduce the cost of a device based on the required feature support. Different package configurations imply a different number of available IOs for each package, which in turn demands different scan configurations. Each of these scan configurations should result in maximum QoR. An asymmetric scan configuration might give maximum QoR for certain stump configurations: say 8:4 instead of 6:6. An equivalent serializer configuration for asymmetric scan is not supported by the current tool. In this paper, an equivalent serializer configuration for asymmetric scan is explored. The half-serializer approach is explored as well, where either the serializer or the deserializer exists on either side, before the decompressor or after the compressor.

WD1.3 User: BIST Area Optimization Using SMS 4.x
Abhinand K, Bhavi Panchal, Nachiket Soman - Open Silicon
For current memory-dominant systems-on-chip (SoCs), especially networking SoCs, the test/repair area overhead of embedded memories is a big concern. The test overhead for small embedded memories is found to be relatively large because of their small sizes and numerous instances. We will discuss one of our 65nm SoCs, which has 7500 memory instances distributed among multiple clock domains and 52 switchable power domains.

This paper presents various approaches to tackle the problem using SMS 4.x from a practical point of view. These approaches helped us meet test quality as well as the target die size. By using a multi-instance wrapper, aggressive memory grouping, and compromising programmability in the processor component, test area overhead was significantly reduced. For example, for one of the subchips, BIST standard cell count was reduced from 44% to 18%. Area recovery due to each technique and its effects on testability, diagnosis capability, and test time are also discussed.

WD1.4 User: Power Reduction Methods in Multicore SoC Scan Architecture
Thirukumaran Natrayan, Shyam Sundar, Satishchandra Rao - Analog Devices Inc
Controlling power in a complex multicore SoC design is a challenging task, particularly in scan mode. This paper describes the architecture and the several methods used to control switching power in scan mode, mainly the peak power that results in dynamic IR drop issues.

Wednesday, June 25, 2014
1:30 PM - 3:00 PM
TA2: IC Design: Implementation - Synopsys User Session
TA2.1 User: Innovative Techniques to Achieve Optimal QoR with Faster Design Closure Cycle on a Multimillion SoC
Sitharam Ayathu, Pranjal Tiwari, Sailesh Vanama - LSI R&D India Pvt. Ltd., Gaurav Ganeriwal - Synopsys
Today’s SoC designs are becoming bigger and more feature-rich, which in turn increases design complexity. The increasing complexity brings unique challenges with each design, where standard scripts and conventional techniques are not sufficient to close the design with the best QoR while meeting a fast design closure cycle. This calls for innovative techniques to solve various issues faced in the implementation phase, and also to predict issues that arise downstream, to avoid design-cycle resets. The objective is to share the good design practices and techniques we followed to meet the overall design parameters such as performance, power, routability, and faster design closure on a multi-million gate SoC, which has highly intricate net connectivity and channels resulting in huge congestion. This paper also discusses the Synopsys IC Compiler based TIO (Transparent Interface Optimization) flow for faster closure of block-interface timing without over-fixing the blocks or the top level.

WA2: IC Verification - Synopsys User and Tutorial Sessions
WA2.1 Vision: Next Order of Productivity and Performance Through Technology and Integration with Verification Compiler
Dr. Arturo Salz - Synopsys Scientist
SoCs are transforming the electronics industry by integrating a staggering amount of functionality into high-performance, low-power, single-chip implementations with embedded software. Inevitably, as SoCs become larger and more complex, driven by the convergence of functionalities, they strain the performance and productivity of verification methodologies and tools.

Verification tools have served fairly well, but are increasingly limited by the complexity of simultaneously validating different abstraction levels. The combination of abstraction levels and verification flows introduces new classes of failure modes that are impossible to verify by any single method. This has led to a new paradigm of verification by massively integrated techniques. Recently introduced in Verification Compiler, this concept integrates many novel technologies that lay the foundation for future technology breakthroughs. In this presentation, Dr. Arturo Salz, Synopsys Scientist, offers his view on verification technology trends and looks into the crystal ball at how verification breakthroughs embody a rethink of SoC verification.

WB2: IC Design: Signoff - Synopsys User Session
WB2.1 User: SMVA-Based Efficient Approach for Timing Closure of a Complex Multi-Voltage DVFS Design
Amitesh Khongal, Anshuman Seth, Shilpa Thakur - NVIDIA
The proliferation of embedded systems and mobile devices has created an increasing demand for high-performance but low-energy hardware. DVFS is a well-known power management technique which dynamically adjusts power and performance to the time-varying needs of running programs. To this end, the internal logic of the chip is partitioned into multiple voltage regions, each with its own supply or voltage rail. Timing closure of such multi-voltage designs has always been challenging, as the number of use cases and voltage combinations is very large. An ideal solution would have the tool associate each instance with its rail or supply voltage with minimal user input and simultaneously analyse various scenarios across different voltage rails. PrimeTime SMVA provides such a solution. In this paper, we discuss how SMVA eased timing analysis on a complex mobile chip with 4 different voltage rails.

WB2.2 User: Faster Timing Closure in Complex SoCs Using Mode Merging
Deepshikha Moudgil, Mohit Verma - STMicroelectronics; Vikas Choudhary - Synopsys
Timing closure in complex SoCs at lower technology nodes has become increasingly challenging as the number of modes and corners grows with the features that today's SoCs have to offer. With multimillion gates, SoCs now cater to more applications, thereby increasing the number of timing modes. With decreasing technology sizes and decreasing operating voltages for the low-power market, temperature inversion causes more timing corners to be checked for each mode.

This paper suggests an automated way of merging modes in PrimeTime GCA. With mode merging we can considerably reduce the time spent on timing closure and ECOs. PrimeTime mode merging is a faster, risk-free solution and takes fewer resources than manual mode merging. In this paper, we also compare timing results for non-merged and merged modes, which show that considerable timing accuracy is achieved along with scenario reduction and much faster timing closure.

WB2.3 Tutorial: Addressing ECO Bottlenecks in Parasitic Extraction Using Advanced Flows in StarRC
Sandeep Parswanath - Qualcomm, Ananda Veerasangaiah - Synopsys
With shrinking technology and increased design complexity, there is demand for efficient ECO turnaround time in all stages of the design cycle. Complex designs and smaller technology nodes bring a lot of variation to the parasitic extraction domain. Fast and accurate parasitic extraction is a prerequisite for fast timing closure of designs.

To circumvent timing ECO bottlenecks, the design community needs a better and more robust parasitic extraction flow. In this paper, we will discuss multiple flows enabled in StarRC (Synopsys' parasitic extraction tool) to address ECO bottlenecks in parasitic extraction.

The new flows that can be used to gain control over long ECO bottlenecks are:
  • Metal FILL re-use flow
  • Fast ECO extraction flow
  • Simultaneous multi corner flow

WC2: Custom Design and AMS Verification - Synopsys User Session
WC2.1 User: Solving Challenges in Timing Model Development for Custom Memories Using FineSim
Shrinidhi Bhat, Ravi Kumar D S, Prabhu Mohan - Microsemi
The challenges faced when generating timing models for custom memory blocks are many. They include custom designs that cannot be represented easily in a functional truth table, identifying critical paths for bisection, coming up with bisection criteria when the probing nodes are in different blocks, and ensuring the accuracy of the timing model.

A hybrid approach utilizing the best practices of Synopsys FineSim along with the native flow was used. A robust methodology was defined by locating the critical path through a complex address decoding scheme in the custom memory architecture. Using these critical nodes, the FineSim bisection algorithm generated accurate timing models, and the native flow used these values in a real-time emulation setup. This hybrid approach helped us overcome the challenges mentioned above and successfully write to and read from the memory.

WC2.2 User: DDR4 Functional Verification With XA-VCS
Mukta Goyal, Madhukar Nakka, Siva Charan Nimmagadda - Xilinx, Rohini Nandi - Synopsys
With ever-faster design cycles, it is becoming increasingly difficult to run SPICE simulations at the chip or IP level. Co-simulation (CoSim), which runs analog-sensitive blocks together with functional Verilog modules, is currently the only choice to address this issue. Of the existing methodologies, XA-VCS is seen to be the most prevalent across the industry because of its high speed and fine control over accuracy. We came across VPI interface issues when trying to run CoSim with tools from different vendors. In this paper, we focus on the setup followed for PHY verification and a few simulation settings that we found useful.

WC2.3 User: Block Level Electromigration for More Effective Reliability Check In Full Custom IPs
Atul Bhargava, Radhika Gupta, Monika Rawat - STMicroelectronics, Mridul Sengupta - Synopsys
With technologies like FDSOI and FinFETs, the high-performance transistor devices deliver very good "Ion", but the metallization is not equipped to handle it reliably for different mission profile needs. Current density is not scaling down proportionally, resulting in more stress on interconnects at these advanced nodes.

In this paper we present a methodology of fixing EM violations at the block level. This methodology is not restricted to memories and can be applied to any custom IP that is hierarchical and is developed top-down. This greatly reduces the effort needed to clean up any remaining violations at the top level. Our results show a good correlation between full cut and block level approach. Running this analysis at block-level reduces any limit on the design size. This methodology saves us ~8X on the run time and 14X on the total memory utilization and cuts down our product validation cycle by 20%.

WD2: FPGA - Synopsys User & Tutorial Session
WD2.1 Tutorial: Putting IP and Subsystem Prototyping on the Fast Track
Suresh Kumar, Didier Leclercq - Synopsys
The number of IPs used in complex SoCs is increasing very rapidly, and hence IP subsystems are becoming an important part of the SoC development and validation cycle. In this tutorial we will introduce the new HAPS Developer eXpress (HAPS-DX) solution for complex IP and subsystem prototyping, and the new DesignWare IP Prototyping kits, which accelerate IP and subsystem bring-up and streamline IP-to-SoC integration. The DWC IP Prototyping kits accelerate SoC integration by providing a comprehensive reference design with an embedded processor, a configured HAPS-DX system, an FPGA-based prototyping design environment for DesignWare Controller + PHY IP, and software reference drivers with example applications working "out-of-the-box." We will introduce HAPS-DX and showcase how the DWC IP Prototyping kits can be used for:
  • Embedded software development as a stand-alone platform
  • DWC IP Subsystem Reference design with Controller and PHY IPs
  • Prototyping a large SoC with multiple DWC IP Prototyping kits
  • DWC Controller IP and PHY IP configuration exploration

WD2: User Paper
WD2.2 User: Debugging Complex Run Time Issues Using ProtoLink
Rajesh Udenia, Vipin Verma - Freescale Semiconductor
The standard practice when debugging an issue is to localize the problem to an IP or an area of the SoC as a first step. The problem is that we may not always be lucky enough to limit the duration in which the issue can be reproduced. Alternate pre-silicon environments based on FPGAs or emulation systems help overcome or reduce simulation runtime constraints, but the challenge then shifts to design visibility and scoping for debug. A few debug tools and methodologies are available in the industry, each with its own strengths and weaknesses that make it suitable for different conditions. In this paper we discuss ProtoLink and how it has proved effective in identifying and root-causing such unpredictable issues. The paper covers a case study and touches upon the advantages gained using this solution over other methods.

Wednesday, June 25, 2014
3:15 PM - 5:00 PM
WA3: IC Verification - Synopsys Tutorial Session
WA3.1 Tutorial: Increase Low-Power Verification Productivity
Amol Herlekar - Synopsys
Low-power design has changed the way verification is being done. Complex power management techniques like retention, DVS, and DVFS add new complexities to logic simulation. With ever-increasing pressure on the total time available for design verification, an efficient low-power verification methodology should involve a combination of conventional functional verification techniques like planning, assertions, and coverage, along with advanced simulation technologies that help engineers more easily find verification holes and design bugs. This tutorial will provide an overview of how various technologies available in the Synopsys verification tool suite can be leveraged to increase the thoroughness and productivity of low-power verification of your design. It will show how technologies like MVSIM-NLP with power aware verification environment (PAVE) capabilities, together with XPROP and Verdi-PA can be combined to provide a powerful low-power verification and debug platform that helps users find low-power design bugs in a quick and efficient manner.

WB3: IC Design: Signoff - Synopsys User and Tutorial Session
WB3.1 Tutorial: PrimeTime Advanced Waveform Propagation
Alireza Kasnavi - Synopsys
In this tutorial we will look into the future of FinFET technology nodes and the impact these new structures have on static timing analysis. These three-dimensional structures offer performance, area, and power benefits but also introduce more complex parasitic characteristics than planar structures. This increased parasitic complexity results in waveform distortions that can have a significant impact on timing.

We will describe how PrimeTime's waveform propagation technology accounts for the sub-20nm effects that are strengthening phenomena such as Miller Effect and long tail waveform distortions. We will also describe how we have made the PrimeTime sub-20nm flow more robust by identifying and handling bad library data.

WB3.2 User: Ensuring Robust Rail Analysis
Venkatesh Bharati Krishnamurthy, Kevin Vaz - ARM
As sub-micron process technology scales down, signal integrity issues due to IR drop become more pronounced. This effect makes power rail design more challenging, so it is important that library vendors provide good quality views for accurate rail analysis. To ensure this, ARM correlates IR drop values from power analysis tools with values from a Fast-SPICE simulator on complex processor designs. This presentation discusses a study to identify and resolve tool issues, not just in characterization but also in the IR drop analysis tool. The study looks at the setup methodology to ensure that inputs for both tools are consistent and come from the same source. The approaches used to reduce runtime, namely reducing the instance count to the top hundred instances and VCD profiling and partitioning, are also discussed. Most importantly, this presentation discusses the debugging mechanism, including battery current matching and pad current matching.

WB3.3 User: Design Closure with PrimeTime Physical-Aware ECO Fixing and MPI
Amar Pallam, Roopesh Paruchuri, Raghu Pattipati, Rajit Seahra - AMD
Every design project is on the lookout for an improved ECO flow: a better method of implementing signoff-quality fixes for various design violations, including max_transition, max_capacitance, setup, and hold, especially in the final stages of design closure.

In spite of technological advancements in implementation, handling the sheer number of violations to be fixed remains a challenge. As signoff tools evolved to provide guidance on ECOs, the issue was addressed to some extent, but highly congested designs still pose challenges in correlation with implementation tools. This paper discusses the new Physically Aware ECO flow from PrimeTime that addresses this problem and helps converge faster on congested designs.

WC3: Custom Design and AMS Verification - Synopsys User and Tutorial Session
WC3.1 User: A Correct by Construction Layout Placement Flow for Hierarchical Mixed Signal Designs
One of the key design enablers in improving time-to-market is automated and correct-by-construction analog layout placement. Earlier attempts in this field required users to input extremely granular constraints for various devices and specify their relationships. For example, we can use different aspect ratios for various transistors and at the same time vary the configuration of transistor fingers. We will use commercial EDA tools to quickly generate a number of optimal layouts in the face of the extremely complex design rules of the 14nm process node. In this paper, we will also explain the various challenges associated with hierarchical analog layout placement and how constraints can be applied at any level of hierarchy. With this approach, the circuit and mask designers were able to generate multiple layouts and create a high-quality final layout for any schematic design hierarchy. We will also show how we reduced the overall analog custom layout time by about 50% using this approach.

WC3.2 Tutorial: Transistor Level Static and Dynamic Circuit Analysis, an ERC Solution for Deep Sub-Micron Low-Power Custom Digital, Memory and Analog IP Designs
Vivek Sharma - Synopsys
This tutorial is for circuit designers looking for methodical ways to detect design errors in low-power circuit designs. It describes how to use CircuitCheck to apply transistor-level static analysis for efficient design error detection without running traditional circuit simulations. Key topics include essential ERC (electrical rule check) assertions, detecting power-induced leakage paths in memory designs, and detecting over-bias problems in critical areas of analog circuits. In addition, designers will learn how to manage these design errors using a graphical debugging environment with CustomExplorer-Ultra.

WD3: FPGA - User and Tutorial Session
WD3.1 Tutorial: Better, Faster, Sooner: Tips and Tricks to Efficiently Achieve Timing Performance Goals
Madhav Chikodikar - Synopsys
Quality of Results is a primary objective for many FPGA designers. In this tutorial you will learn about the latest Synplify Premier techniques and methodologies for maximizing the timing performance of your design, while keeping design iterations to a minimum. Topics covered include: utilities and techniques that allow you to assess and optimize your constraints, how to get improved timing correlation, coding styles, project and optimization settings that improve QoR, and how to achieve faster overall timing closure.

WD3.2 User: Composed SoC Validation
Shreepad Hardas - Open-Silicon
FPGA prototyping has become an essential phase gate in the process of SoC development. In the pursuit of shorter time to market, chip bring-up and validation readiness take centre stage soon after realization of an SoC design on an FPGA platform. The diversity of on-chip peripherals and their unique interfaces pose equally unique challenges for validation engineers. Being able to selectively connect, exercise, and control any given set of peripherals while still maintaining a small footprint is the key to successful bring-up and validation. This paper proposes a unique FPGA-based setup that facilitates automation, selective connection, exercise, and control of peripherals. By segregating the SoC Device Under Test (DUT) and the Master subsystem, yet selectively coupling the necessary interfaces, it allows excitation control over both sides. The solution is based around the HAPS platform, a custom-built yet reusable daughter card, and a software state-machine and protocol framework.

WD3.3 User: IP Design - Efficient and Fast Prototyping and Porting to ASIC Using Synopsys Tools
Prasanth R I, Thomas Varghese - Mindtree Ltd.
This paper summarizes our IP design lifecycle and the design strategies, optimizations, and techniques we have used to keep power consumption in check and thereby achieve industry-best current consumption numbers. It also discusses how Synopsys tools were efficiently used in different phases of the IP development cycle.

Wednesday, June 25, 2014
5:15 PM - 7:30 PM
Designer Community Expo

Thursday, June 26, 2014
9:00 AM - 10:15 AM
Keynote Address
System Design Challenges in the Connected World
S. Balajee, Vice President, DS India Labs, Samsung Research Institute Bangalore
The Internet of Everything, Wearable Semiconductors and Cloud Computing are changing our day-to-day lives. Heterogeneous devices accessing data in real time over various connectivity technologies pose a big challenge in designing a solution that fits the requirements of diverse applications.

Diversity, Cost, Power, Performance and TTM are the key mantras in this new world order. This keynote talk will discuss some of the applications and the system design challenges in this new ecosystem.

Thursday, June 26, 2014
10:30 AM - 12:30 PM
TA1: IC Design: Implementation - Synopsys User and Tutorial Sessions
TA1.1 Tutorial: IC Compiler II and the Power of 10x: A Product Walk-Through
Sanjay Bali, Neeraj Kaul - Synopsys
At SNUG Silicon Valley 2014, Synopsys unveiled IC Compiler II, the game-changing successor to the industry-leading IC Compiler physical implementation system. This session will introduce IC Compiler II and showcase many of the capabilities that will help you meet the growing challenges of IC design. Come learn how the 10x faster throughput in the new IC Compiler II can enable a world of new opportunities.

TA1.2 User: 28nm vs 20nm: A CAD Methodology Perspective
Suresh Raman - Xilinx, Babitha Kunta - Synopsys
The migration to leading and bleeding-edge process nodes brings in new physical effects and challenges, and the lessons learned from physical design flow development and block execution enrich the overall engineering experience. This paper will explore methodology development as well as block implementation at the 28nm and 20nm process nodes. Further, specific experiences at these two nodes will be compared, citing various challenges encountered in closing physical design as well as meeting timing and power targets. Some of the comparison points across the two process nodes delve into pronounced variation effects due to double patterning (DPT) and margining techniques to achieve timing closure across the various MCMM signoff scenarios. The paper will discuss, at length, some of the new effects observed at 20nm along with their implications for physical design as well as the techniques employed to address these challenges.

TA1.3 Tutorial: Advanced Custom Routing Using Galaxy Custom Router
Rajagopal Sundararaman - Synopsys, Girish Prabhu - NVIDIA
This session provides an overview of Synopsys' new solution for advanced custom routing, Galaxy Custom Router, and how it can be used in conjunction with ICC not only to improve the quality of routing in various critical applications in today’s complex SoCs but also to reduce the time taken for these custom edits. A real-life example of an application at NVIDIA where Galaxy Custom Router (GCR) improved the quality of the routing, and thus the overall performance of a circuit, is also presented.

TB1: IC Verification - Synopsys User Sessions
TB1.1 User: Interrogating Formal Environment: Uncovering Hidden Details and Weaknesses
Venkata Ramanamurthy Barala, Arun Prakash C.S, Jebin Vijai - Qualcomm, Ankit Garg - Synopsys
This paper covers the implementation of Certitude on a formal test suite, detection and analysis of faults, result analysis examples, and issues faced during the implementation. The Certitude implementation provided the required coverage metrics and also uncovered holes in test cases. Better coverage was achieved based on Certitude findings by adding some new tests and modifying some of the existing test cases.

TB1.2 User: Unified Verification Flow for Complex Low-Power Designs
Abhijeet Chandratre, Caesar Deka, Anand Shanmugam Sundararajan, Ramachandiran V - NVIDIA
In a decade where smart phones have become a basic necessity of life, common desktops are more powerful than last-generation supercomputers, and tablets are much sought after by every individual, addressing all of these market segments is the dream of every semiconductor company. With designs that are diverse and highly focused on improving power efficiency, having a centralized power flow that can seamlessly address all projects, across the board, is quite a challenge.

This paper illustrates the challenges that were faced by low power engineers while executing this diverse array of projects and how these were subsequently overcome using VCS-NLP. These challenges include reduction in the earlier effort to maintain two independent flows (one for VCS and one for MVSIM), improvements in coverage with VCS-NLP, and performance improvements of VCS-NLP over MVSIM, etc.

TB1.3 User: A Novel Approach to Significant Reduction in Time to First Test Using Configurable SoC Testbenches and VIP Reuse
In this paper we introduce a verification methodology that provides a configurable verification environment based on multiple VIPs and enables connectivity to SystemC drivers so that SystemC unit tests can be reused with a UVM SoC environment. This methodology focuses on encapsulating major components with simple wrappers and significantly reducing the dependencies of tests and data checkers/monitors/coverage on VIP implementation.

TB1.4 User: Accelerated Verification of a MIPI CSI2 System Using CSI2 Verification IP
Akhileshwar Dhiman, Gaurav Gupta, Shreya Singh - Freescale Semiconductor, Dipesh Handa - Synopsys
This paper presents the challenges met in the verification of a MIPI CSI2 receiver-subsystem. It outlines challenges faced while architecting the UVM-based verification environment until the final execution of the verification tasks that include expediting the design and development by using a standard verification IP and addressing the 're-usability' aspects for future devices. It also highlights some of the enhancements done in the Verification IP to cater to our verification requirements, which are applicable in general as well.

All aspects of advanced verification enabled by this VIP will also be covered in detail, including MIPI CSI2 error injection capability at packet level and physical lane level, error checks, scoreboarding, functional coverage and verification plans, protocol checks, reusable and configurable sequences, callbacks, and protocol analyser support.

TC1: Systems and IP - Synopsys Tutorial Sessions
TC1.1 Tutorial: Physical IP Development on FinFET - There's Nothing Planar About It!
Amit Khanuja - Synopsys
To fully realize the advantages of FinFET devices, physical IP must follow the same trajectory that has benefited digital design. That includes scaling, lower power consumption and higher speeds. To achieve this, analog/mixed-signal development techniques and design styles have to be re-created and implemented with very close foundry cooperation. This session discusses the FinFET characteristics of physical IP design and how they differ from planar devices. It will describe the impact FinFETs have on existing circuit designs and layout topologies for widely used IP such as DDR, USB, PCI Express, embedded memories and logic libraries. In addition, this presentation will highlight the methodologies that incorporate advanced process qualification vehicles.

TC1.2 Tutorial: Performance Analysis for the Synopsys DesignWare Universal DDR Memory Controller Using Synopsys Platform Architect MCO
Asheesh Khare - Synopsys
The Synopsys DesignWare Universal DDR Memory Controller (uMCTL2) provides sophisticated features like bank interleaving and transaction reordering to optimize memory throughput for many parallel transaction streams. Specific latency and bandwidth requirements from different SoC subsystems including main CPU, video, and audio can be managed via advanced Quality of Service capabilities. A sub-optimal configuration can significantly reduce the overall memory efficiency as well as impact the performance of individual sub-systems. In this session you will learn how to find the best configuration for your application by using Synopsys' Platform Architect MCO, a performance analysis environment with a comprehensive SystemC transaction-level model library for workload modeling, traffic generation, and modeling of interconnect and uMCTL2 memory subsystems.

TD1: IC Design: Low Power - Synopsys User and Tutorial Sessions
TD1.1 User: Modeling and Implementation of Low-Power Intent for Complex SoC Using UPF2.0
Shashank Bhonge, Vaishali Huilgol, Vinod Reddy - Xilinx
Low-power design without UPF is a daunting task and unthinkable in today’s complex SoC designs. To achieve better power savings, different low-power techniques such as multi-voltage, power gating, DVFS, etc., are deployed. Unified Power Format (UPF) is the standardized format that can be used through the entire design flow to ensure that the power architecture specification is intact. The UPF standard has evolved to accommodate the latest features and close gaps seen in the earlier version.

This paper describes the learnings in building an SoC hierarchical UPF for a complex low-power architecture. The design complexity increases if nested physical hierarchies are part of different power domains. The paper focuses on UPF coding for design-specific requirements such as nested power islands, logic macro handling, different SCMR rails at one level of hierarchy, and power connectivity of analog macros. It also highlights the challenges in handling UPF implementation to meet the design requirements.

TD1.2 User: Library-Level Low-Power Verification Techniques for ARM Artisan Physical IP
Venkatesh Bharati Krishnamurthy, Divyeshkumar Vora - ARM
To enable customer implementation of complex low-power designs, ARM physical IP must support the required low-power features and ensure that EDA tools are able to utilize these features. To accomplish this, ARM collaborated with different EDA vendors to identify the low-power features needed for implementation and performed evaluations to check the readiness of EDA tools in supporting new features and cell architectures. However, we found that checking all the cells with all the possible low-power features supported is time consuming and next to impossible with system-level validation.

To streamline the process, we developed library-level validation methods to ensure thorough testing of ARM physical IP. This paper discusses how low-power library validation checks such as LP simulation, power-aware LEC, and static checks help identify major system-level issues during product development. It also covers full feature testing for all cells with cell-level test cases to mimic system behavior and how library-level validation helps with negative testing to ensure tools report unexpected cell issues.

TD1.3 Tutorial: Low-Power Design Implementation
Renu Mehra - Synopsys
This tutorial will review low-power technologies available in the Design Compiler family of synthesis products, IC Compiler place and route, and PrimeTime SI. Along with an overview of core optimization technology, attendees will also learn about key 2013.12 multi-voltage (MV) features such as support for a new "Golden UPF" methodology. For advanced IC Compiler users, enhanced support for controlling the routing topology and buffering (physical feedthroughs) for MV design implementation will also be covered in this session.

Thursday, June 26, 2014
1:30 PM - 3:00 PM
TA2: IC Design: Implementation - Synopsys User Session
TA2.2 User: Advanced ECO Methodology for ARM Core Subsystem
Nitin Kaushik - STMicroelectronics, Vikas Garg - Synopsys
Time to market is a big challenge in the industry. This is especially true for ARM core designs, where regular errata from ARM require regular ECOs to avoid going back through the implementation loop. High-performance targets along with tight area and power requirements are typical of all applications requesting the core.

To achieve these targets, synthesis is done with various aggressive optimizations such as inversion push, register merging, and data path optimization, which make the netlist more difficult to work with from a traceability point of view. Further, regular errata from ARM make it more challenging to produce the result within a given time frame. To meet all of these requirements, we need a way to cope with regular errata without disturbing the timeline. This paper discusses in detail the challenges described above and the steps we took to resolve them, so that ARM errata can be accepted on a regular basis without affecting design frequency or project schedule.

TA2.3 User: Meeting Clock Requirements for High Frequency Design
Jake Tomy, Manu Mammen Varghese - Broadcom, Gaurav Ganeriwal, Rajesh Patchala - Synopsys
Multipoint CTS is a technique of building and balancing the clock from multiple source points. The primary advantage is reduction of insertion delay at the chip level, as the block-level latency reduces drastically based on the number of source points. This is made possible because the chip-level routes have much less delay, since they are usually routed on the highest metal layers (minimum RC).

The primary target of our experiments is to obtain the minimal possible latency and power, while simultaneously meeting stringent slew and skew requirements of 20ps. The level of gating optimization is also considered. The resultant effect on routability and timing is also evaluated. By using multipoint CTS to distribute the primary clock of the chip (1GHz in 28nm, 3GHz in 16nm), we obtain minimal latency and a skew of 20ps at chip level, thereby achieving better specifications for the chip.

TB2: IC Verification - Synopsys Tutorial Session
TB2.1 Tutorial: Taking Debug Productivity to the Next Level with Your Own Verdi Apps
Rich Chang - Synopsys
This tutorial talks about increasing your debug productivity using Verdi Interoperable Apps (VIA). VIA opens up Verdi data models and interfaces for users to create their own applications for design comprehension and validation, tool integration, FSDB investigation, and design manipulation. This tutorial will introduce you to the concept of VIA and how to start programming with VIA. The tutorial will also share some real examples from our customers who have used VIA to improve their debug productivity.

TC2: Systems and IP - Synopsys Tutorial & User Session
TC2.1 Tutorial: Integrating USB 3.1 in Your Next SoC Design
Didier Leclercq - Synopsys
The presentation explores the evolutionary and revolutionary changes between USB 3.0 and USB 3.1 and how they affect host controllers, hubs, and PHY IP. The presentation will also describe the challenges of designing a USB 3.1 consumer SoC, based on lessons learned from real USB 3.0 implementations. Furthermore, the session discusses how applications such as mass storage and communication can benefit from the high throughput of USB 3.1. The presentation will conclude with examples of multi-purpose SoC implementations that incorporate USB 3.1 as well as a range of connectivity protocol interfaces like USB 3.0, SSIC, LLI, UFS, M-PCIe, and PCI Express.

TC2.2 User: Mechatronics System Modeling: Saber
Manish Bansal, Saurabh Srivastava, Rahul Kumar, Akhilesh Chandra Mishra - STMicroelectronics
The electronic content in Mechatronics systems is increasing day by day, and the interaction of different domains cannot be ignored. These systems depend on the integration of electrical, mechanical, and software technologies to control or replace electro-mechanical (multi-domain) operations. Although Mechatronics systems give significant improvement in system performance and reliability, combining several technologies into one system is quite challenging. To make sense of these system complexities, designers turn to virtual prototyping systems to design, simulate, analyze, and verify system interactions across all disciplines. In this paper we investigate an electro-mechanical system and its impact on noise and different functionality. We propose a methodology for electro-mechanical simulation and mitigation based on converting our design into MAST models to do the simulation through SABER. Experimental results (direct power injection, coupling analysis, and emission test) are obtained on an electro-mechanical gasoline system in the automobile market to demonstrate the effectiveness of the proposed approach.

TC2.3 User: On the Fly Donut Formation in Compiled HD Memory to Enable Analysis of Biggest Instance
Darvinder Singh, Isha Garg, Vineet Sachan - AMD
Timing data collection through memory compiler characterization is an integral part of memory compiler development. Simulations must be run on an exhaustive list of instances to cover the whole compiler range. Full characterization taxes resources immensely, both in terms of time and disk space. This paper focuses on an on-the-fly donut creation methodology for the target memory compiler instance using the Synopsys Memory Characterization tool Embed-IT. In the Donut Creation Flow, timing non-critical bitcells are removed from the bitcell array, while the critical ones are kept. All the corner bitcells are kept intact with their surroundings.

TD2: IC Design: Low Power - Synopsys User & Tutorial Session
TD2.1 User: Case Study of Using Verdi Signoff LP for Low Power Checks
Hitesh Nijhawan - STMicroelectronics, Vikas Garg - Synopsys
This case study talks about an evaluation done on the Synopsys Verdi-Low Power Signoff tool for the PALMA2 SoC design project. This was a tightly scheduled, business-critical design project of six million instances. An evaluation license was provided for just 4 weeks. The challenges were low-power design checking using Verdi-LP with migration from CPF to a new low-power standard (UPF v1.0). Design challenges include verifying low-power intent, low-power logical and physical implementation for the full flat SoC, and architectural checks for functional low-power verification.

In this paper, we would like to discuss the challenges faced during the migration and evaluation of Verdi-Signoff LP. We would like to share our experience on how the tool helped us identify and debug the LP implementation. We will highlight the advantages, usability, and comprehensive checks supported by the tool.

TD2.2 User: A Recipe to Implement and Verify Low-Power Architecture
Alpana Bastimane, Varaprasad Mailapalli - LSI R&D India Pvt. Ltd.
In this paper we discuss the low-power implementation and verification challenges we faced and the way we overcame them during the execution of a recent chip. We will first cover the design changes needed to implement the low-power architecture, then the implementation of power intent during synthesis (Design Compiler), DFT (DFT Compiler), and Physical Design (IC Compiler); finally we will cover the verification of power intent implemented by the Synopsys tools using power-aware formal verification (Formality with UPF) and the MVRC Rule checker. While implementing the low-power architecture to meet the stringent power requirements, we also made sure that there was no impact on performance.

TD2.3 User: Efficient Static and Formal Verification Closure of Low-Power Designs
Satyanarayana A, Rakesh Madala, Shilpi Varshney - AMD
This paper discusses some of the techniques and methodologies used for quick closure of static verification (VSI) and formal verification (Formality), namely:
1. Complete power intent static verification at RTL and implementation stages
2. SoC-level methodology for complete chip static verification
3. Use of static check reports for quick closure of formal verification
4. Speedy analysis and debugging using GUI features of VSI

In effect, we propose a methodology that we have used successfully to verify power intent and sign off the design quickly and efficiently.

Thursday, June 26, 2014
3:15 PM - 5:00 PM
TA3: Synopsys User & Tutorial Sessions
TA3.1 User: A Practical Approach to Achieve Tighter Correlation and QoR From Synthesis Through P&R for a 28nm Design
Tapan Bhandari, Vishal Jayantilal Katba, Aravind Ramanujam - Qualcomm, Umesh Ravjibhai Gajera - Synopsys
The traditional synthesis approach of using wire load models along with over-constraining the design can often lead to sub-optimal QoR in the backend for ultra-deep submicron designs. Hence it is important to have accurate estimates of interconnect resistance and capacitance from the beginning of synthesis. This paper showcases the methodology built around the Synopsys physical guidance (SPG) flow using Design Compiler Graphical and IC Compiler for achieving tighter correlation and a boost in overall design QoR. This methodology was developed and validated on blocks of a 28nm SoC. Besides reducing RTL-GDSII iterations and making closure more predictable, this methodology yielded smaller stdcell area, better timing, and reduced leakage power and routing congestion.

TA3.2 Tutorial: Emerging Node Design with IC Compiler
Gaurav Ganeriwal - Synopsys
This tutorial is for IC designers planning to use IC Compiler for emerging node designs with emphasis on support for manufacturing compliance (placement constraints, double patterning, DPT aware routing, extraction, etc.). We will highlight the design challenges posed by FinFET design and how IC Compiler can resolve these issues with its convergent flows for timing, leakage, and area. Finally, we will demonstrate the significant productivity benefits of utilizing the in-design capabilities of IC Compiler/IC Validator, coupled with the close signoff correlation and convergence throughout the Galaxy platform.

TB3: IC Verification - Synopsys Tutorial Session
TB3.1 Tutorial: Verification Closure Flow
Amit Sharma - Synopsys
This tutorial will walk through the standard verification closure flow and talk about what is needed at each step. The flow includes planning, metrics gathering, verification execution, analysis, and ultimately shipping a product. From this tutorial, you will discover that Synopsys has unique capabilities which can be applied to your verification project right now to deliver a higher quality product. As the flow is discussed, technologies will be highlighted such as Verification Planner, Discovery VIP, RALGen, Certitude, VCS, Verdi Signoff, and more. Come see how Synopsys enables a complete verification closure solution to create first pass success with your designs.

TB3.2 Tutorial: Advanced Verification Techniques Applied to ARM AMBA 4 / AMBA 5 Protocol-Based SoCs
Satyapriya Acharya - Synopsys
Market windows continue to shrink at the same time as SoC complexities increase, with multiple CPU clusters connected by cache-coherent interconnects and an increasing number of interface protocols. Verification IP (VIP) is at the heart of SoC verification and must deliver simulation performance, ease of use, and productivity features that enable verification teams to hit their schedules. The current generation of wrapped VIP has run out of steam. This session will show how next-generation SystemVerilog VIP for ARM® AMBA® 5 CHI, ARM® AMBA® 4 ACE and other protocols delivers superior ease of use, performance, and debug, and validates complex cache state transitions for coherent protocols to accelerate schedules and improve product quality. Topics covered include verification planning, scenarios, tests, coverage, error injection, protocol-aware debug, and protocol checking and performance analysis. You will learn how you can take a step up in productivity and bug-finding by using the next generation of verification IP.

TC3: Systems & IP - Synopsys User and Tutorial Sessions
TC3.1 Tutorial: Ultra-Low Power Processors and Subsystems for IoT
Hemal Mehta - Synopsys
IoT means many things to different people and companies. Wearable devices and intelligent appliances are just some examples of IoT applications. One common thread for many devices in this space is battery-operated portability. That translates into requirements for ultra-low power and small die footprint IP. Synopsys' ARC EM processors provide the embedded processing for IoT devices with industry-leading power/performance efficiency. The Sensor IP Subsystem builds DSP acceleration and tightly coupled peripheral interfaces into an ARC EM core to deliver the optimal solution for processing sensor data from the expanding set of sensors typically used in IoT applications.

TC3.2 Tutorial: Best-In-Class Foundation IP for Different Types of Processor Cores
Amit Khanuja - Synopsys
This tutorial will present how Synopsys Foundation IP (Memory and Logic Libraries) combine to deliver superior implementations of different types of processor cores. Custom memory instances along with specialized logic library cells work together to deliver best-in-class PPA. Results on CPU, GPU and DSP core implementations at 40nm and 28nm will be shared.

Publish Only
Beyond CODEC Scan DFT Architecture for Pin-Limited Large SoCs
Tushar Khadtare, Dr. Pradip Thaker - GEO Semiconductor

VCS Optimization Techniques for Multi-Chip Simulations
Debashis Biswas - Cisco

New Methodology for ESP-CV Functional Equivalence Checking for FinFET Based Memories
Himanshu Garg, Pratik Satasia, Abhishek CR - ARM; Dave Hedges - Synopsys

Implementation of VA Feedthrough
Tejkumar Korat, Bhushan Kapadnis - LSI India Research & Development Pvt Ltd.

Unified Environment and Infrastructure for Simulation, Emulation and Silicon Validation
Gopesh Goyal, Rajesh LG, Siva Kumar R, Rajesh GSVR - Cisco

Performance Matters: Why DDR3 Standalone Verification is Not Sufficient
Navajeevan Biswaprakash, Yogesh Mittal - Freescale Semiconductor

Bottom Up Hierarchical Floorplanning V/S Top Down Floorplanning
Vikram Mouneshwar - LSI India Research & Development Pvt Ltd.

TA Analyzer Cockpit
Hitesh Nijhawan - STMicroelectronics

Achieving Extreme Compression for Next Gen SoC Designs
Rajendra Kumar Reddy - NVIDIA

Customized Characterization of Complex Circuits Using SiliconSmart ACE
Anand Sharma, Ramakrishnan Subramanian - SanDisk

Quick, Re-Usable and Cost Effective Approach to Create Accurate Models Using Synopsys Platform Architect Framework for Early System Level Performance Analysis
Saurin Patel, Baljinder Sood - Freescale Semiconductor

Automated Custom Placement and Routing for Advanced Nodes