SNUG Israel Abstracts 

Tuesday, June 19, 2012
11:15 AM - 12:45 PM
A1 User Papers and Tutorial: xProp, Native Low Power, Soft Constraints
Improved X-Propagation using the xProp Technology
The simulation semantics of ‘X values in SystemVerilog have several pitfalls that may cause a simulated ‘X to propagate improperly, which in turn may lead to initialization and power-related failures in silicon. This paper describes xProp, a new “real to life” X-propagation semantics for SystemVerilog simulations. We present the motivation for having xProp semantics and describe it briefly. We also describe how xProp was validated on and deployed to a major CPU design, its merits and limitations, and the effort required to make xProp standard for RTL simulation runs.
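
The kind of pitfall described in this abstract can be illustrated with a minimal Python sketch of a two-way `if`/`else` whose condition is unknown. Standard Verilog/SystemVerilog treats an unknown condition as false and takes the `else` branch. The merging rule shown in `merged_if` is an assumption made for illustration (the abstract does not spell out the xProp rules), and both function names are hypothetical.

```python
X = "X"  # stand-in for the unknown logic value

def classic_if(cond, then_val, else_val):
    # Standard semantics: an unknown condition is treated as false,
    # so the else branch is taken and the unknown is silently lost.
    return then_val if cond == 1 else else_val

def merged_if(cond, then_val, else_val):
    # Assumed X-propagating rule (illustration only): on an unknown
    # condition, return the common value if both branches agree,
    # otherwise propagate the unknown.
    if cond == X:
        return then_val if then_val == else_val else X
    return then_val if cond == 1 else else_val
```

With an unknown condition, `classic_if(X, 1, 0)` yields a clean `0`, while `merged_if(X, 1, 0)` yields `X`, so an uninitialized control signal stays visible downstream.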

Power Aware Verification for CPU Designs: Challenges and Solutions
Power-aware verification is critical in a complex design with multiple power supplies. Previous projects addressed this as a side activity or last-minute blitz. In this paper we describe a robust methodology that enables integrating power-aware verification into mainstream verification flows. In particular, we show how we combine the simulator-specific, power-aware capabilities with our build and test methodology. We adopted the "always alive" power-aware model approach, which includes building dual simulation executables, regularly running some of the tests on the power-aware model and performing static checks for each model change. This methodology has allowed us to keep the power-aware simulation and test environments active throughout the project lifetime with minimal cost. We also compare the efficacy of our deployed solution with that of previous solutions.

Introduction to Soft Constraints in SystemVerilog
Alex Shot, Jason Chen [Synopsys, Inc.]
SystemVerilog provides a rich set of constructs to enable constrained random verification. It not only defines mechanisms to override and add constraints, but also provides a way to control random variables and constraint blocks making them active or inactive during the simulation run-time.

However, in order to cover the scenarios specified in the test plan, the user must have deep knowledge of the verification environment and all its components. This requirement becomes even harder to meet if the verification environment contains tens of such components. The P1800 working group approved the proposal to add soft constraints (Mantis 2987 [3]) to the 1800-2012 release of the SystemVerilog standard. A soft constraint instructs the constraint solver to satisfy the constraint if possible and to disregard it if not, making constraint reuse and test creation easier.

This presentation provides an overview of soft constraint syntax and semantics and suggests possible usage of soft constraints in your verification environments.
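
The "satisfy if possible, disregard if not" behavior described above can be sketched in Python with a brute-force solver over small domains. This is an illustration of the idea only, not SystemVerilog syntax and not the algorithm of any real solver; the function name `solve`, the priority ordering of the soft list and the deterministic choice of the first solution are all assumptions.

```python
from itertools import product

def solve(domains, hard, soft):
    """Return (assignment, kept_soft_labels), or None if the hard
    constraints cannot be satisfied.

    domains: dict mapping variable name -> iterable of candidate values
    hard:    list of predicates over an assignment dict (must all hold)
    soft:    list of (label, predicate), highest priority first (assumed)
    """
    names = list(domains)
    candidates = [dict(zip(names, vals))
                  for vals in product(*(domains[n] for n in names))]
    candidates = [a for a in candidates if all(h(a) for h in hard)]
    if not candidates:
        return None
    kept = []
    for label, pred in soft:
        narrowed = [a for a in candidates if pred(a)]
        if narrowed:
            # The soft constraint can be honored: keep it.
            candidates = narrowed
            kept.append(label)
        # Otherwise it conflicts and is disregarded without error.
    # A real solver would pick randomly; the first solution is returned
    # here to keep the sketch deterministic.
    return candidates[0], kept
```

For example, an environment default such as "packets are short" can be written as a soft constraint; a test that adds a hard constraint requiring long packets then overrides it without having to locate and disable the default.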

A2 Tutorial: Considerations in Using FPGAs as System Elements
Considerations in Using FPGAs as System Elements
Andrew Dauman [Synopsys, Inc.]
FPGAs are simultaneously SoCs and elements in large systems. These systems have become increasingly complex both in hardware function and software content. Like software, the FPGA is a programmable element, creating a hybrid bridge from static hardware to pure "c" software potentially adding unique flexibility in the design and debug process. Successful design today requires an awareness of the system operation beyond the chip level. This presentation will consider design methodologies and techniques for the development of increasingly complex FPGA-based products and FPGA-based prototypes.

Target audience: Designers of FPGA-based systems who face challenges due to the increasing system complexity and software content in the commercial deployment of FPGA products and/or the development of FPGA-based prototypes of SoCs.

A3 Tutorial and User Paper: Design Compiler 2012.06 Updates and Multibit FF Inferring
Galaxy RTL: Design Compiler Family Update
Eyal Odiz, Gal Hasson [Synopsys, Inc.]
This tutorial presents the latest advancements, including the highlights of the 2012.06 release, in the Design Compiler family of products: DC Explorer, Design Compiler Graphical and Formality, to help you achieve best-in-class quality-of-results in the shortest possible time. See how you can speed up the development of high-quality RTL and constraints with DC Explorer for a faster design implementation, and generate an early netlist to start physical exploration in IC Compiler even when your design data is incomplete. Learn methodologies to achieve superior design results while streamlining the flow for a faster, more predictable design implementation using physical guidance technology (SPG) in Design Compiler Graphical. Hear how you can complete verification quickly with Formality equivalence checking without sacrificing quality-of-results.

Inferring Multi (dual) bit FFs in Synopsys RTL2GDSII flow
Oren Unger, Raz Dagan, Omer Niv [CSR - Zoran Microelectronics]
In our article we describe the work we’ve done to integrate dual bit FFs into our RTL2GDSII flow. We focus on the motivation of using dual bit FFs to reduce clock tree power and compare the dual bit solution with other physical based solutions (like the ICC command: set_optimize_pre_cts_power_options -low_power_placement). We describe the way to define multibit FFs in Liberty (including SCAN definitions), and the way to infer multibit FFs in Design Compiler. We also show the way to replace single bit FFs with multibit FFs as an incremental synthesis in DC. We refer to issues related to usage of multi-bit designs, such as equivalence checking, timing constraints and ECO flow. We discuss the limitations that currently exist in DC (Version F-2011.09-SP2) that drove us to create our own post synthesis gate-level based flow. Last, we show examples of dual bit FFs replacement and clock tree power saving on our designs.

A4 Tutorial & Vision: ICC 2012.06 Updates and 2.5D IC Vision
ICC 2012.06 Updates
Ashwini Mulgaonkar [Synopsys, Inc.]
IC Compiler’s latest release delivers significant improvements in the area of faster design closure, high-performance design, and advanced process node support. Faster design closure is enabled through new technologies such as a dataflow analyzer for optimal macro placement, enhancements in top-level closure and fragmented floorplan support, faster in-design metal fill and streamlined ECO closure flows. For high-performance design, this release introduces multisource CTS, a hybrid technology that uses less power than clock mesh but achieves better OCV robustness than CTS, along with improvements in hold and DRC closure. Advanced node support has been enhanced through the introduction of layer-aware buffering, datapath AOCV for pessimism reduction, sensitivity-based optimization for more robust circuits, and 20nm DPT support throughout the physical implementation flow. Learn about these advancements in IC Compiler and how you can use them to implement your most challenging designs.

Vision Session: Advanced Design Integration – 2.5DIC and 3DIC
A Silicon Interposer-Based 2.5D-IC Design Flow - Going 3D by Evolution Rather Than by Revolution
Marco Casale-Rossi [Synopsys, Inc.]
While the entire industry is working hard to sort out the manufacturability, cost, and heat/power issues of full 3D-IC integration, there is an intermediate step well within our reach. In this talk we’ll present a silicon interposer-based 2.5D-IC design flow that Synopsys is working on. The proposed solution, developed in close collaboration with Synopsys semiconductor partners, represents the first, evolutionary step towards 3D-IC integration. It addresses the implementation and verification aspects of the 2.5D-IC integration flow and proposes what the next steps might be.

A5 FastSPICE: Overview and User Experience
FastSPICE Solutions Overview
Dr. Isaac Zafrany [Synopsys, Inc.]
A short overview of Synopsys’ Fast SPICE solutions will be presented at the beginning of this session.

Methods for Running FastSPICE (XA) for SPICE-level Accuracy on Advanced Analog Circuits
Analog circuits in advanced node technologies are characterized by an increase in analog-digital interfaces, closed-loop interaction, and an increase in analog content. All these drive the need to speed up analog simulation, while in many cases accuracy cannot be compromised due to the sensitive nature of the circuits. In this paper, we will describe the methods that we used to gain most of the speed of the XA fast modes while preserving SPICE-level accuracy.

Utilizing SystemVerilog for Mixed-Signal Validation
The aim of this article is to propose a new approach to storage and post-processing of signals during mixed-signal simulations. The term mixed-signal refers to circuits with both analog and digital designs, known also as custom and non-custom designs. Mixed-signal simulation is becoming more common in the system on chip (SoC) VLSI design world. We will present an overview of several catalysts for this phenomenon and point to a trend of leveraging digital design methods to the analog and mixed-signal design domain.

Many checks in a simulation test bench can be implemented by relating to the value of a signal within a narrow window of time (e.g. checking for an invalid state of a state machine). This is sufficient when checking digital blocks of the design. However, when the intention is to analyze analog signals, the measurements are done on vectors that represent signal values along the entire simulation. This type of analysis cannot be implemented within the testbench and is done after the end of simulation, by storing the signal values in a file (e.g. VCD, FSDB or VPD) and then reading the signals into a commercial waveform browser or by running a post-processing script that reads the signal value from a file.

The method proposed here takes advantage of SystemVerilog capabilities that enable defining a hash (associative) array of unlimited size. During the simulation, vectors are created for the required signals, allowing them to be analyzed within the testbench during or at the end of the simulation, without the need to save these signals to a file. This method reduces the run time and simplifies the mixed-signal validation process.
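
The idea of collecting signal vectors in an associative array and analyzing them inside the testbench can be sketched as follows. Python stands in for SystemVerilog here, with a dictionary playing the role of the associative array; the class name, method names and the two measurements are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

class SignalRecorder:
    """Keeps sampled signal values in memory, keyed by signal name."""

    def __init__(self):
        # signal name -> list of (time, value) samples, in sampling order
        self.samples = defaultdict(list)

    def sample(self, name, time, value):
        self.samples[name].append((time, value))

    def peak_to_peak(self, name):
        # Measurement over the whole recorded vector, no dump file needed.
        values = [v for _, v in self.samples[name]]
        return max(values) - min(values)

    def first_crossing(self, name, threshold):
        """Time of the first sample at or above threshold, or None."""
        for t, v in self.samples[name]:
            if v >= threshold:
                return t
        return None
```

A testbench would call `sample` at each sampling point and run checks such as `peak_to_peak` or `first_crossing` during or at the end of the run, instead of post-processing a waveform file.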

A6 Tutorial: Best Practices for Implementing Memories and Libraries to Deliver Superior PPA and Embedded Test & Repair
Best Practices for Implementing Memories and Libraries to Deliver Superior PPA and Embedded Test & Repair
Zaka Bhatti [Synopsys, Inc.]
Selection of memory compilers and logic libraries has significant impact on the power, performance and area of SoC designs. This tutorial presents best practices for implementing the optimal combination of memories, libraries and embedded test and repair to meet your design requirements. Also, learn how the DesignWare Memory Compilers and Logic Libraries are used in conjunction with Synopsys tools, including ICC and DC, to deliver a high-performance, low-power and differentiated SoC design. Benchmarks on CPU and GPU implementations will also be shared.
Target audience: Intermediate; Design engineers, system architects.

Tuesday, June 19, 2012
1:30 PM - 3:00 PM
B1 Tutorial: Leveraging Synopsys’ Next Generation SystemVerilog VIP
Leveraging Synopsys’ Next Generation SystemVerilog VIP to Accelerate SoC Verification for the ARM AMBA 4 ACE Protocol
Chris Thompson [Synopsys, Inc.]
As the complexity and number of processor cores in SoC designs increase, so do the verification challenges. One such challenge is verifying hardware-based cache coherency protocols used by these multi-core SoCs. The first part of this tutorial provides an overview of Synopsys’ Discovery VIP and how its 100% SystemVerilog-based architecture provides a foundation for improved performance, support for multiple methodologies (UVM, VMM, OVM) and advanced constraint, coverage and debug technologies.

The second part of this tutorial describes how a reference verification platform built with the Discovery VIP for the ARM® AMBA® 4 AXI™ and ACE™ (AXI Coherency Extensions) protocols can be utilized to accelerate the verification of multi-core SoCs through the use of protocol specific constrained-random sequences, checks and coverage plans. Also highlighted are Synopsys verification technologies like Discovery Visualization Environment (DVE) and Protocol Analyzer.
Target audience: Design and verification engineers and managers.

B2 Tutorial: FPGA Best Practices and Hybrid Prototyping
FPGA-Based Prototyping with Certify & Identify
Yair Dahan [Synopsys, Inc.]
An FPGA-based prototype is a hardware platform which enables pre-silicon software development and hardware/software validation of complete systems or sub-systems at near-real-time run-rates using at-speed real-world interfaces. Today’s designs are increasingly complex, and getting a system that includes FPGAs to work with the ASIC code is not a trivial task. Synopsys delivers a complete FPGA-based prototyping solution including HAPS boards, software for board management, multi-chip partitioning, synthesis, and debug, all from a single vendor. The solution provides a high-performance prototyping system for hardware and software engineers enabling integration, development and validation utilizing real-world testing with real-world interfaces. Additional capabilities include pre-tested DesignWare IP, co-simulation for faster development and greater debug visibility as well as Hybrid Prototyping which links FPGA-based and virtual prototyping. This session will introduce Synopsys’ full solution for FPGA/prototyping, from RTL down to binary files. We will also present exciting features of the latest versions of Certify and Identify and demonstrate some of them.

Hybrid Prototyping - Connecting Virtual and FPGA Prototypes
Ohad Amrami, Yair Dahan [Synopsys, Inc.]
Hybrid prototyping expands the usability of both FPGA-based prototyping and virtual prototyping, thereby eliminating the weaknesses and risks of prototyping in isolation and enabling earlier software development, HW/SW integration and verification.

This tutorial will present Synopsys’ hybrid prototyping solution based on HAPS and Virtualizer. This solution allows the flexibility to mix and match model abstractions to leverage legacy RTL with SystemC/TLM models that are faster to implement and available sooner in a project lifecycle.

B3 Tutorial & Panel: Optimized Implementation for High-Performance Cores
Techniques for High Performance Cores Using Synopsys Galaxy Platform - ARM® Cortex™-A15 Case Study
Erik Olson [Synopsys, Inc.]
Learn how to predictably achieve high performance while minimizing power. We will present an optimized implementation methodology for an ARM Cortex™-A15 processor core based on Synopsys’ Galaxy™ Implementation platform. This session will highlight the latest technologies/techniques in Design Compiler and IC Compiler used to achieve challenging performance/power targets. These include physical guidance, delay performance vs. area tradeoffs, leakage optimization, innovative methods to reduce slack across register stages during final timing closure, and more. We will examine benefit/cost tradeoffs of each technique; performance/ease of convergence and impact on schedule/turnaround time. We will also share results obtained using this combination of optimized methodology, tools and physical IP.

Ask the Experts Panel: Best Practices for High-Performance Processor Core Implementation
In this interactive session, experts with hands-on experience implementing processor cores with high performance (and low power) will share their insights and best practices applicable to all stages of implementation – from synthesis and floorplanning to placement, CTS, routing and signoff STA closure.

B4 User Papers: Load Density Driven Power Grid; Verification of Layout integration flow; OCC Controller
Load Density Driven Power Grid Design
Farah Jubran [Mellanox Technologies]
ICC does not optimize for power density uniformity; thus, a wide spread of current densities is observed on the M1 power stripes of the standard cell placement area. Such a large spread imposes an extreme requirement on the power grid design, which must satisfy the worst-case power/EM/IR drop needs caused by the highest “power density” areas.

We have found that the calculated “load density” correlates with the “power density”, and thus we use it as our optimization variable. The few design cases investigated show that the number of “high load density” areas is small and that they are mostly surrounded by “low load density” areas. A load density trimming algorithm, which spreads part of the load of “high load density” areas to their neighboring “under-loaded” areas, successfully relaxed power grid design constraints in addition to improving IR drop performance.
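
The spreading step described above can be sketched on a grid of tiles. This is a simplified illustration and not the authors' algorithm: the tile grid, the single per-tile `limit`, the fixed order in which the four neighbors are tried and the function name are all assumptions.

```python
def trim_load_density(grid, limit):
    """Move excess load from tiles above `limit` into neighboring tiles
    that still have headroom.

    grid:  list of rows, each a list of non-negative loads (hypothetical units)
    limit: load targeted as the maximum per tile
    Returns a new grid; the total load is preserved.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # work on a copy
    for r in range(rows):
        for c in range(cols):
            excess = out[r][c] - limit
            if excess <= 0:
                continue
            # Try the four direct neighbors: up, down, left, right.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                headroom = limit - out[nr][nc]
                if headroom <= 0:
                    continue
                moved = min(excess, headroom)
                out[r][c] -= moved
                out[nr][nc] += moved
                excess -= moved
                if excess == 0:
                    break
    return out
```

A tile stays above the limit only if its neighbors cannot absorb the excess, which matches the observation that hot spots are few and mostly surrounded by under-loaded areas.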

Verification of Layout Integration Flow
A full chip design contains components from different groups across the organization (analog, digital…). One of the key tasks that should be handled with extra care is the integration and verification of these components. Until now, layout integration was done by an additional tool in a different environment, which left the P&R tool with neither full visibility nor editing capabilities. Today, layout blocks are merged within the ICV tool using gdsin/out commands and verified by the icv_lvl tool. This article describes the validation flow of the “merged” layout DB as well as additional tool features that have been identified.

On-Chip-Clock controller (OCC): An Alternative Approach
Jalal Abu Teir, Joram Peer [Nuvoton]
In this ARM-based SoC design, we faced the challenge of implementing three different scan modes and testing two of them with stuck-at/transition models. The transition (at-speed) testing could not be performed by an external clock for frequencies above the 120 MHz range, since the IOs did not support these capture frequencies. We used an On-Chip Clock Controller (OCC) to create these capture frequencies from one of the local PLLs. Common industry practice is to implement one OCC per high-frequency clock; however, in our chip a single OCC was implemented for the whole chip.

B5 Tutorial: How to Get the Most from Your Circuit Simulation
How to Get the Most from Your Circuit Simulation
Dr. Isaac Zafrany [Synopsys, Inc.]
Do you ever
  • Wonder what the latest news is for Synopsys Custom and AMS solutions?
  • Wish you had insight into the latest advances in Synopsys’ simulation solutions?
  • Want to know how to get the best performance out of your circuit simulator?
  • Care to learn how to characterize your library or memory faster?

This tutorial provides useful tips and tricks to reduce simulation time without compromising accuracy. The tutorial will cover tuning for better performance, convergence, RC reduction and other good practices.

B6 User Paper and Tutorial: Connectivity IPs
Third Party Connectivity IP - Expectations and Experience
Roman Mostinski [Freescale]
Once upon a time, designers used to design 100% of a new IC from scratch and 100% in-house, but the race has sped up, moving us to the era of multifunctional Systems on Chip (SoC). Now, a single piece of silicon comprises a variety of processing elements, memories and connectivity interfaces. Even “big players” can neither maintain expertise in all areas required to design all building blocks of modern Application Processors nor have enough resources to sustain the pace of change. This presentation briefly describes the main issues and challenges related to the design and reuse of connectivity IPs, and summarizes our experience of working with Synopsys on connectivity IPs - from “definition to silicon.”

Tutorial: Designing to the New PCI Express 3.0 Equalization Requirements
Rita Horner [Synopsys, Inc.]
PCI Express® 3.0 has changed the type of equalization it uses over previous generations. This tutorial will discuss why equalization is required, the decision feedback and continuous-time linear equalization types used by PCI Express, equalization circuits within the PHY, feedback methods across the PIPE interface and across the Link, convergence algorithms and issues with the current PIPE specification and debug features implemented.

Tuesday, June 19, 2012
3:20 PM - 4:50 PM
C1 User Papers: Verification Abstraction, Reusable Testbench, Design Patterns
The End of Verification?
Kobi Pines [Marvell]
During the decade starting in the mid 90s, the verification gap was one of the obstacles threatening the future of VLSI. Extrapolation of the verification effort predicted that it would explode as chip complexity rises and that the number of tape-outs required for achieving production-worthy silicon would make the ROI questionable.

However, during the last few years, this concern was relieved and verification of logic design became a commodity available to everyone. Methodology cookbooks became widespread and libraries were standardized. Modern verification engines like constraint-based random test generation, functional coverage and ABV were integrated into the standard RTL simulators. The growing market of IPs, VIPs and design and verification services in the global village enabled small companies to design large and complex chips. One paradigm was the key factor for this entire advance and its major enabler. The SoC paradigm enabled the integration of modular units verified on a modular verification unit-level simulation environment using the modular verification methodology, assuming that the system equals exactly (or is less than) the sum of its components. Now that the verification challenge is solved, what is the next challenge? (Answer: Early system-level simulation on virtual platforms validating systems that are beyond the sum of their components)

Truly Reusable Testbench-to-RTL Connection for SystemVerilog
Arik Shmayovitsh [Sigma Design]
One of the challenges in reusing a block-level verification environment in a cluster/system-level environment is connectivity between the testbench and RTL.

We've developed a novel methodology, purely based on SystemVerilog, which allows seamless porting from level to level (vertical reuse) based on the 'bind' construct while maintaining the following principles:
  • Each level should only be aware of its direct sub blocks in connectivity
  • Once connectivity of a specific level is defined, it is automatically reused at all upper levels
  • Multiple instantiation of blocks is gracefully handled
  • Support for hybrid reuse of an environment (parts that remain active when moving up)

Design Patterns in Verification
Guy Levenbroun [Qualcomm]
Many verification engineers today use concepts from the programming world for their testbench. Verification methodologies such as RVM, VMM and UVM are based on object-oriented programming. UVM also introduces some design patterns, such as singleton and factory. However, these design patterns aren’t widely used outside the scope of the methodology. A design pattern is a reusable solution to a commonly occurring problem within a given context. The benefit of using design patterns is clear: they give designers a common language when approaching a problem, and a set of widely used tools to solve issues as they arise. In this paper we will explore several common problems that might arise during the development of a testbench and how we can use design patterns to solve these problems. We will cover the following design patterns: factory, composite, visitor, template method, and strategy.

C2 Tutorial: Software Development for ARM big.LITTLE and Processor Design
Tutorial: Developing Software for ARM big.LITTLE Based Designs Running Android
Robert Kaye [ARM]; Achim Nohl [Synopsys, Inc.]
As devices get more and more complex, developing software for those devices becomes increasingly complex. While big.LITTLE processing offers a way to balance high performance through the use of the ARM Cortex-A15 MPCore processor with power efficiency by switching to the ARM Cortex-A7 to extend battery life, these processors challenge software developers to keep up and to utilize the available compute power while remaining power conscious. With the right set of models, virtual prototypes offer a unique platform view to the software developer to ease the software development for these multicore designs. Moreover, they provide unique capabilities to make sure that software developers correctly utilize the available Android control functions to deliver a smooth user experience while minimizing power consumption.
Target audience: Software developers (intermediate and advanced).

Application-Specific Processor Design
Achim Nohl [Synopsys, Inc.]
This tutorial will address the driving factors as to why an increasing number of SoC designs see the deployment of application specific (or custom) processors. We will address the challenges related to the need for a software tool infrastructure (linker, assembler, simulator), and how a model-based approach can overcome these. Finally, we will analyze the impact of model-based approaches on the traditional hardware design flow, including synthesis, FPGA prototyping, and verification.

C3 Tutorial: IC Compiler Custom Co-Design
IC Compiler Custom Co-Design
Dr. Isaac Zafrany, Mattan Tsachi [Synopsys, Inc.]
In attending this tutorial you will learn how the IC Compiler Custom Co-Design solution enables design teams to easily move between digital and custom implementation flows, while maintaining design data integrity. The unified solution accelerates the design development cycle by enabling quick and reliable custom edits to IC Compiler designs at any stage of development, including the time-critical tape-out phase. See how Galaxy Custom Designer, with tight integration to IC Compiler, enables higher productivity through advanced features such as DRC/LVS correct interactive mixed-signal auto-routing and DRC-aware custom editing.

C4 User & Tutorial Session: Design Planning; Top level closure (TIO); Flip-Chip
Hippo Lake: A Case Study of Automated Design Planning in High-Speed Designs
Automated design planning solutions bring efficiency to full chip floorplanning, assembly, and integration. High-speed designs present a unique set of challenges: stringent timing, power and quality specs require many iterations for fine-grained optimization, including hand-crafted optimal port placements, routing topologies, and buffering solutions. Can automated design planning tools achieve a similar level of TAT here as in ASIC design without compromising on frequency, power and quality? We examine how the latest advances in EDA capabilities, such as x-boundary timing optimization and relative placement for block/top level co-design, can be used to implement a production, high-speed CPU design.

Faster Top-Level Closure with Transparent Interface Optimization (TIO)
Sharon Avital [Synopsys, Inc.]
Transparent Interface Optimization (TIO) in IC Compiler is a new capability that addresses the challenges of gigascale design and enables faster top-level closure. This tutorial will provide designers technical information on TIO, its usage, current capabilities and roadmap.
Target audience: Design and CAD engineers and managers responsible for physical implementation and verification.

Flip-chip Package Support Solution Based on ICC 2012.06 Release
Moshe Ashkenazi [Synopsys, Inc.]
Flip chip packages are used more and more in today’s chips. This session will provide an overview of Synopsys’ flip chip solution based on the recent ICC release 2012.06.

C5 Tutorials: IT for EDA
Leveraging Adaptive Resource Optimization with Lynx
Glenn Newell [Synopsys, Inc.]
Adaptive Resource Optimization (ARO) is a closed loop system which collects the required “Predictors” of an element being processed by the DRM. ARO applies adaptive predictive techniques to the Predictors to algorithmically determine trending of the job resources and the future processing requirements for memory and/or CPU runtime within a controlled acceptable level of reliability. ARO as implemented with Lynx is a general solution on a project or multi-project basis. ARO enables the designers to do their work without having to build/modify/create scripts and utilities while benefitting from a completely automated and transparent system for maximum productivity and job throughput.

Management of High-Performance Compute Resources - Understanding the Impact of NFS Overhead
Glenn Newell [Synopsys, Inc.]
Newer Linux kernels now have counters that allow users to see per-mount NFS latency. We will discuss how to access the counters and demonstrate scripts that are useful in identifying network/storage bottlenecks and that help engineers and CAD managers see the impact of storage load during EDA tool runs in a more meaningful way.
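
A script of the kind mentioned above might look like the following Python sketch. It is not one of the scripts demonstrated in the session. It assumes that the counters are read from `/proc/self/mountstats` and that each per-operation line carries, in order, the operation count, transmissions, timeouts, bytes sent, bytes received, and cumulative queue, round-trip and execute times in milliseconds; verify this layout against your kernel before relying on it. The function name is hypothetical.

```python
def nfs_latency(mountstats_text, ops=("READ", "WRITE")):
    """Return {mount_point: {op: (avg_rtt_ms, avg_execute_ms)}} for NFS mounts.

    mountstats_text: contents of /proc/self/mountstats (assumed layout).
    """
    result = {}
    mount = None
    for line in mountstats_text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "device":
            # Assumed form: device <dev> mounted on <dir> with fstype <type> ...
            is_nfs = len(words) >= 8 and words[7].startswith("nfs")
            mount = words[4] if is_nfs else None
        elif mount and words[0].endswith(":") and words[0][:-1] in ops:
            if len(words) < 9:
                continue
            count = int(words[1])
            if count:  # skip operations that never ran
                rtt_ms, exe_ms = int(words[7]), int(words[8])
                result.setdefault(mount, {})[words[0][:-1]] = (
                    rtt_ms / count, exe_ms / count)
    return result
```

Typical use would be `nfs_latency(open("/proc/self/mountstats").read())`, sampled before and after an EDA tool run so that the difference reflects the load generated by that run.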

C6 Tutorials: Enhancing DesignWare ARC Processor Performance; Complete Audio IP Subsystem for Your SoC
Tutorial: Enhancing DesignWare ARC Processor Performance using Custom Extension Instructions
Steve Tateosian [Synopsys, Inc.]
The DesignWare ARC processors offer a broad range of capabilities from very high performance to very small size, and are highly configurable and extensible. The ARC processors can be quickly tailored for a specific embedded or host application, making them ideal for today’s heterogeneous multi-core SoCs. This tutorial addresses the user extensibility aspects of the ARC processors focusing on floating point support, graphics acceleration and operating system performance optimization. Starting from execution profiling information, we will identify heavily used code sequences, which then will be optimized by extending the CPU through the introduction of new application specific instructions. The tutorial will also address aspects such as modeling, profiling, software tool support and hardware design flow.

Tutorial: Create a Complete Audio IP Subsystem for Your SoC
Shlomi Dan [Synopsys, Inc.]
Audio requirements for products continue to grow: devices become internet-connected, multi-channel content is everywhere, and consumers want features like virtual surround sound. The industry continues to demand shorter time-to-market, lower risk and lower cost. This can be accomplished by using pre-integrated, pre-verified IP subsystems that take away the traditional hardware and software efforts and provide a seamless plug-in to the application (software) on the host processor. This session discusses how the DesignWare Home Audio IP Subsystem enables designers to create a complete audio solution for their SoCs in a matter of minutes.
Target audience: Intermediate design engineers, software engineers, chip architects, engineering managers.