SNUG Israel Abstracts 

Tuesday, June 18, 2013
9:15 AM - 10:45 AM
Accelerating Innovation in the Era of Exponentials
John Chilton, Sr. Vice President, Marketing & Strategic Development - Synopsys
Technical innovation is increasingly impacting everyone, everything, everywhere as today's consumers want it all: 24/7 connectivity, unlimited bandwidth, data, entertainment, security, portability and more. As this exponential trend continues, engineers designing the chips and systems inside of these electronics must adopt new technologies and strategies in order to deliver ever faster, lighter, smarter and cheaper products in record time. John's presentation will provide an overview of how these new technologies and strategies are helping designers continue to accelerate their innovation in the era of exponentials.

60GHz - a Journey Through the Present and Future Wireless Giga-bps Connectivity
Yaron Elboim - Vice President of Engineering - Wilocity
Since the devices are mobile, Wi-Fi is expected to handle the majority of the increase in data traffic. Usage trends are forcing Wi-Fi to improve in both capacity and architecture, with better throughput and capacity in all three bands: 2.4, 5 and 60GHz. Building chips that meet the multi-Giga-bps requirement while still fitting mobile power and cost targets is a complex task.

In this presentation, Yaron will give an overview of Wi-Fi usage trends and the challenges of making chips for 802.11ad, which will help support the growing demand for Wi-Fi bandwidth.

Tuesday, June 18, 2013
11:15 AM - 12:40 PM
A1 - Vision Session - Verification
Verification Continuum
Arturo Salz - Synopsys
SoCs are transforming the electronics industry by integrating a staggering amount of functionality into extremely cost-effective, high-performance, low-power, single-chip implementations. In their ongoing quest to maximize functionality while optimizing power and performance, SoCs have begun to blur the distinction between traditionally different design domains such as mobile and desktop, or graphics and communications. Designs now commonly include all of these in a variety of configurations capable of adapting to the desired application. Inevitably, as SoCs become larger and more complex, they strain the capacity and performance of traditional verification methodologies and tools.

With the convergence of all these formerly distinct design domains and styles, most modern designs resemble an SoC capable of switching between high-performance and low power. Understandably, the largest and most complex SoCs tend to grab the headlines and drive the verification agenda. But the efficiencies and capabilities of verification tools needed for SoC designs are applicable to a wide spectrum of design styles, including those typically found in Israeli design centers. In this presentation, Arturo Salz, Scientist at Synopsys, looks into the crystal ball to explore the emerging verification technologies and how unparalleled R&D investment at Synopsys, in partnership with leading practitioners, is driving unique technology breakthroughs that are transforming verification and overcoming the challenges of the future.

Target audience:
Design and verification engineers and managers.

A3 - From FinFETs to ECOs
Design with FinFET & Double-Patterning, a Brief History of the Future
Marco Casale-Rossi - Synopsys
While planar transistors have worked well from 100 microns down to 20 nanometers, non-planar transistors such as FinFETs offer superior attributes and demonstrate much better performance and static and dynamic power at 14 nanometers and beyond. Although FinFET is considered the most promising device for emerging technology nodes, it introduces new design complexity for IP development, RC extraction, electrical simulation, and physical verification. Even interconnect is undergoing a revolution. The last pitch manufacturable using immersion lithography with single exposure is 80 nanometers. Starting at 20 nanometers, double-patterning is required while we wait for EUV availability, and it might be replaced by triple- and even multi-patterning at 14 nanometers and beyond. Like FinFET, double-patterning introduces new design complexity for IP development and physical implementation and verification. Please join this session to hear what Synopsys is doing to hide unnecessary details, whilst enabling designers to get the most out of these emerging technology nodes.
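
As a rough illustration of why double-patterning adds design complexity, mask assignment can be viewed as two-coloring a conflict graph: any two shapes spaced closer than the single-exposure limit must land on different masks. The Python sketch below is not a Synopsys flow. The wire names, the simplified 1-D wire positions, and the function name assign_masks are assumptions made for illustration; only the 80 nanometer single-exposure pitch comes from the abstract.

```python
from collections import deque

SINGLE_EXPOSURE_PITCH_NM = 80  # single-exposure limit quoted in the abstract


def assign_masks(positions_nm, min_pitch_nm=SINGLE_EXPOSURE_PITCH_NM):
    """Two-color parallel wires given their 1-D centre positions (hypothetical model).

    Returns {wire_name: 0 or 1}, or None when two masks are not enough.
    """
    names = list(positions_nm)
    # Conflict edge: two wires closer than the single-exposure pitch.
    conflicts = {n: [] for n in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(positions_nm[a] - positions_nm[b]) < min_pitch_nm:
                conflicts[a].append(b)
                conflicts[b].append(a)

    mask = {}
    for start in names:
        if start in mask:
            continue
        mask[start] = 0
        queue = deque([start])
        # Breadth-first search, alternating masks across every conflict edge.
        while queue:
            cur = queue.popleft()
            for other in conflicts[cur]:
                if other not in mask:
                    mask[other] = 1 - mask[cur]
                    queue.append(other)
                elif mask[other] == mask[cur]:
                    return None  # odd conflict cycle: needs a third mask
    return mask


if __name__ == "__main__":
    print(assign_masks({"w0": 0, "w1": 64, "w2": 128, "w3": 192}))
    print(assign_masks({"a": 0, "b": 32, "c": 64}))
```

In this toy model, wires on a 64 nanometer pitch alternate cleanly between two masks, while three wires on a 32 nanometer pitch all conflict with one another and cannot be split across two masks, which hints at why triple-patterning enters the picture at tighter pitches.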

New Technologies for Faster Implementation of Functional ECOs
Mitch Mlinar - Synopsys
Implementing and verifying functional ECOs is time-consuming and introduces schedule uncertainty late in the design process. This session will discuss new and exciting equivalency checking technologies that significantly speed up the implementation of functional ECOs in your designs and the required verification step to ensure the ECO’s functional correctness.

A5 - Custom Design – Overview and User Experience
Laker 3 Custom Layout System - "An Advanced Process Node Custom Layout Tutorial"
Uri Golan - Synopsys
In this technology session you will learn about the new layout challenges introduced by 20nm and below process technology. You'll also see the advanced features in the Laker Custom Layout Solution that help with these issues. Laker is extremely fast and has unique automation features that are ideal solutions for those seeking to improve layout productivity. Technologies that will be covered in this tutorial include Laker's rule-based layout, schematic-driven layout, and pattern-based multi-device layout features — which have all been fully updated for process nodes at 20 nanometers and below.

A6 - ARC Tutorial and User Experience:
Efficient High-Performance Processing
Yankin Tanurhan - Synopsys
This tutorial will present Synopsys' new high-performance ARC processor scalable technology for the embedded market.

Tuesday, June 18, 2013
1:30 PM - 3:00 PM
B1 - Tutorial and User Experience: Designing and Debugging with Verdi3
Verdi3 Transaction-based Debugging for SoC Designs
Arnold Sher - Synopsys
Verdi3 Automated Debug System is an advanced open platform for debugging digital designs which helps you comprehend complex and unfamiliar design behavior, automate difficult and tedious debug processes and unify diverse and complicated design environments.

In this session we'll present SystemVerilog Interactive Debug, HW/SW debug, Verdi3’s Transaction Based Debugging technology and Verdi Interoperability Applications (VIA) program.

VIA allows Verdi users to build custom applications by using Verdi3's C and TCL-based programming interfaces and to personalize Verdi3 to maximize their productivity.

A New Automated Gate-Level Waveform from RTL Waveform Generation Methodology - Reduces Power Estimation Time from Weeks to Hours
Ophir Turbovitch - CSR
This paper describes a new methodology that automatically generates a chip design’s gate-level waveform from the RTL design environment without the need to bring up the gate-level environment. The new waveform generation methodology reduces the effort to perform gate-level power estimation from weeks to hours, using established EDA technology from Synopsys and Cambridge Silicon Radio’s established power estimation flow and tools. This major reduction in effort and increase in designer productivity enables CSR to analyze power characteristics much earlier in the design flow than is practically possible using traditional, high-effort gate-level analysis. Moreover, the new methodology produces waveforms identical (or nearly identical) to those generated by gate-level simulation. Consequently, the design can be analyzed and optimized iteratively throughout the post-synthesis design flow, enabling much earlier detection and easier resolution of power issues. The paper discusses power analysis challenges; a new, automated gate-level waveform methodology; the Siloti™ Visibility Automation System; and analysis results.

Target audience:
Design and verification engineers and managers.

B2 - Tutorial and User Experience: Emulation and Congruency
Transaction-level Verification with ZeBu-Server - What, When, How
Jacob David - Synopsys
In this tutorial you will learn what transaction-level verification means in general and specifically how it applies to the ZeBu-Server emulation platform. Transactors offer a unique combination of performance, accessibility, flexibility and scalability while providing a realistic system-level test environment for the DUT. Transactors allow you to quickly build a high-speed system-level Virtual Platform by surrounding your emulated DUT with Virtual Components that interact with its various interfaces.

The tutorial will describe the inner workings of a transactor with emphasis on the advantages and tradeoffs compared to alternative approaches including the traditional in-circuit emulation (ICE) approach. Step-by-step instructions for creating a transactor will be provided, including an introduction to a high-level SystemVerilog behavioral language/compiler called ZEMI3, conceived for automating the generation of a transactor. Closing the session, a practical application in the wireless space will be described in detail including steps to compile a design into the ZeBu-Server.

Hardware Congruency - Introducing Hardware Semantics for RTL Simulations
Hardware Emulation is rapidly becoming an essential technology for enabling pre-silicon logic validation. For the day-to-day work of design and validation, RTL simulation remains the primary tool for simulating the design. Ideally the transition from RTL simulation to hardware emulation would be a matter of hardware emulation compile time. Unfortunately differences in semantics between RTL emulation and RTL simulation result in long delays between the release of an RTL simulation model and the release of its corresponding hardware emulation model.

This paper describes the Congruency Concept: new semantics for simulating synchronous designs that are coded in synthesizable SystemVerilog, and the modeling techniques that are aimed at closing the semantic and behavioral gaps between RTL simulation and RTL emulation. The process of how Congruency was validated and deployed to a major CPU design is also described, along with its merits and limitations, and the effort required in making Congruency standard for RTL simulation runs.

Target audience:
Design, prototyping and verification engineers and managers.

B3 - Tutorial: Static Timing Analysis - Signoff
PrimeTime - New, Faster Timing Closure Technologies
This session will cover new PrimeTime capabilities for faster timing closure including technology integration from recent acquisitions, GoldTime and Tekton.
Topics that will be covered include:
  • Performance Improvements
    • Faster multicore scalability with new improved algorithms from GoldTime and Tekton
  • Scenario Reduction Technologies
    • Mode Merging to Reduce Scenarios Required for Timing Closure and Signoff: Learn how PrimeTime mode merging reduces scenarios for timing analysis and signoff driven ECO to reduce turn-around-time and compute resource requirements
    • Simultaneous Multi-Voltage Analysis for Faster Timing Signoff: Learn how PrimeTime's multi-voltage aware analysis technology reduces risk and speeds signoff for designs with multiple voltage domains.
  • Signoff-driven ECO Guidance
    • Overview of PrimeTime ECO guidance capabilities
    • Latest addition of signoff-driven leakage recovery
    • New in 2013.06, physically-aware ECO guidance, working with IC Compiler minimum physical impact (MPI) technology. Uses routing and placement information to improve ECO timing closure and avoid large cell displacements and unnecessary rip-up and re-route thereby reducing ECO iterations.

Target audience:
PrimeTime users and managers responsible for design, implementation, and signoff.

B4 - User & Tutorial Session: Hierarchical Design, Multi-source Clock, ICC-2013.03 Update
Feed Through Insertion at Hierarchical Design Flow
Gilad Konsker, Avi Zukerman - CSR
The EDA tools used today achieve reasonable QoR for a design that has ~1.5 million placement elements (cells). In order to implement larger designs, a hierarchical flow must be used. Hierarchical flows add complexities to the design flow. One of the main challenges is the need to feed top-level signals through hierarchical blocks. Synopsys offers a flow for feedthrough insertion within the recommended methodology for the design planning phase. However, the need for a more controlled and reusable flow has driven us to implement a home-grown solution for the feedthroughs inserted into the design. In this paper we will discuss the reasons which brought us to implement our own solution and present the principles of this solution.

High-Speed Clock Tree Implementation Using “Multi Source CTS” Capabilities
One of the biggest challenges in chip and block design is implementing the clock tree. Clock tree implementation has a vast impact on design power and size, and above all on timing closure of high-frequency designs. In recent years, as designs have grown larger (above 1M flat gates) and relatively high-speed, a standard CTS solution can no longer meet latency and skew requirements. This paper will demonstrate two techniques of clock tree implementation and will present comparative analysis data on various metrics resulting from each one. The first is traditional clock tree synthesis (CTS), which is very common in ASIC flows; the other is a unique technique based on the new “Multi source CTS” capability of IC Compiler.

ICC 2013.03 Release Update
IC Compiler's latest release delivers significant improvements in faster design closure, high-performance/low-power design and advanced process node support. The 2013.03 release includes minimum physical impact ECO implementation, useful skew for clock and data optimization, new technologies for reduced power, and expanded double patterning support for a color-ready place and route solution.

B5 - Tutorial: Advancing AMS Verification
Advanced AMS Verification and Techniques using Synopsys FastSPICE and Mixed-Signal Solutions
Learn about the latest technology advances in CustomSim, FineSim and Discovery-AMS to ensure successful AMS design verification for advanced process nodes. This tutorial will address a wide range of challenges including low power, reliability, and mixed signal. You will learn how to further advance your current AMS methodologies for performance and accuracy using new features available in the latest release of HSPICE, FineSim and CustomSim.

B6 - Tutorial: Implementing 10G Backplane Systems
Achieving Predictable and Highly-Reliable 10G Backplane Designs
The session explores the challenges of implementing 10 Gbps backplane systems. These systems can have greater than 30" PCB traces with multiple connectors. It is also desirable to have bit-error-rates (BER) better than 10^-12 for high-reliability applications, going beyond the base specification for real-world channels. A system model is described and representative channels are presented. The presentation explores the architectural and circuit techniques required to meet the stringent requirements, including the trade-offs associated with PLL implementation and receiver equalization to enable high-reliability system design.
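
To put the BER target in perspective, the short Python sketch below works out what a bit-error rate of 10^-12 means on a 10 Gbps link. It is generic arithmetic rather than anything taken from the session: the function names are illustrative, and the 95% confidence level and the zero-observed-errors test model are assumptions.

```python
import math


def mean_seconds_between_errors(bit_rate_bps, ber):
    """Average time between bit errors on a link running at full rate."""
    return 1.0 / (bit_rate_bps * ber)


def error_free_bits_needed(ber, confidence=0.95):
    """Bits that must pass with zero errors to claim the BER at this confidence.

    Uses P(no errors in n bits) ~= exp(-n * ber) and solves for n.
    """
    return -math.log(1.0 - confidence) / ber


if __name__ == "__main__":
    rate_bps = 10e9      # 10 Gbps, from the abstract
    target_ber = 1e-12   # BER target, from the abstract
    gap_s = mean_seconds_between_errors(rate_bps, target_ber)
    bits = error_free_bits_needed(target_ber)
    print(f"mean time between errors: {gap_s:.0f} s")
    print(f"error-free bits for 95% confidence: {bits:.3e}")
    print(f"test time at line rate: {bits / rate_bps:.0f} s")
```

Under these assumptions, a link running exactly at the target sees one bit error roughly every 100 seconds, and demonstrating the target takes about 3 x 10^12 error-free bits, or roughly five minutes at line rate.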

Synopsys IP Used: DesignWare 10G SerDes

Target audience:
Design Engineers and Managers and System Architects who are interested in implementing 10G SerDes IP into their SoCs.

Tuesday, June 18, 2013
3:20 PM - 4:50 PM
C1 - Tips for Using System Verilog (SV) Interface for Design
Tuning VCS Compilation with Pre-compile IP Flow
Arik Rachevsky - Synopsys
Modern SoC designs are the result of integrating many IPs, several cores and top-level logic. One of the major bottlenecks in the verification process is the compile time of the full design. Furthermore, in a compile-per-test methodology the compile time becomes a major obstacle. To overcome this growth in design size, we can use the new precompile IP flow available with the latest VCS. We will describe how we can precompile the majority of the SoC model, so each test can skip the compilation of the DUT and reuse the precompiled blocks. With the precompile IP flow, the elaboration time of the SoC for each test can be reduced by 3X, allowing a "compile per test" methodology even at full-chip level.

Reducing Gate-Level Cycle Time Using VCS Advanced Features
Ron M. Bar, Eran Glickman, Benny Michalovich, Erez Parnes - Freescale
Gate-level simulations have a distinctive benefit but also a distinctive drawback. The complexity of the model combined with timing information imposes a heavy penalty both in compilation and simulation. The compilation overhead is purely technical. Re-compilation resulting from a change to the stimulus or an error in the timing checks may cost days. By using VCS's advanced features it is possible to overcome the recompilation issue and create an efficient flow that eliminates the need for recompilation due to stimulus or timing changes. The flow allows the creation of a simple text file stimulus that can manipulate signals and registers using VCS acc capabilities and a method to control timing checks using UCLI and other testbench functionality without compilation. This flow not only saves considerable time and resources using features of VCS but is also useful beyond the scope of gate-level simulations.

System Verilog (SV) Interfaces for RTL-Design
Guy Nakibly - Annapurna Labs
SystemVerilog interfaces provide the designer with a powerful method of integrating large designs with fewer bugs while reducing the number of lines of code. However, there is currently no formalized standard methodology for using SV interfaces for RTL design. Every tool uses it slightly differently. This paper presents guidelines for using SV interfaces, especially when using:
  • Parameterized SV interfaces
  • An array of SV interfaces
  • Top-level SV interfaces for synthesis
Annapurna Labs uses the SV interface extensively for RTL design and has already completed all IC design stages (frontend and backend), as well as FPGA design, with SV interfaces.

Target audience:
Design and verification engineers and managers.

C2 - Prototyping Tutorial
Synthesis Methods for FPGA-based Prototyping and HAPS-70 Family Overview
Yair Dahan - Synopsys
This tutorial is for design and verification engineers who are synthesizing FPGAs that will be used for prototyping an ASIC or SoC device. It focuses on how to quickly bring RTL code written for an ASIC or SoC device into one or more FPGAs. Techniques for getting the design into the prototype quickly, such as DesignWare synthesis, automated gated and generated clock conversion, HDL code checking with continue-on-compile-error capability, and fast synthesis mode with multi-processing, will be discussed in detail. A section will be dedicated to the HAPS-70 family.

Target audience:
Design, prototyping and verification engineers and managers.

C3 - Low-Power Experience and DC Updates
Implementing UPF Flow for SoC Design
This paper describes some key lessons learned from the gradual implementation of a full UPF flow for SoC design. This process spanned several SoC design projects and was very painful. We had to overcome many obstacles such as understanding the UPF spec semantics, understanding the design tools, spec interpretation and limitations of the implementation, dealing with designers’ skepticism concerning the flow/tools maturity, defining design methodologies and more. This paper discusses some of the issues encountered. It may be useful for anyone who plans to use this flow in the future. It may also interest people involved in UPF standardization activity.

Introduction of Multi-Bit Banking Solution
Sharon Avital - Synopsys
Optimization for power is one of the most important objectives in nanometer IC design. Reducing power consumption in chips enables better, cheaper products to be designed and power-related chip failures to be minimized. Clock trees are one of the biggest contributors to power consumption. By keeping the actual length of the clock tree short, we can immediately reduce the overall power consumption. This session will describe how IC Compiler was used to reduce the clock tree length by grouping registers together in banks of registers (the so-called multi-bit banks). By ensuring that several registers are inside one macro, the length of the clock net is reduced, resulting in power savings.
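
The grouping idea can be sketched in a few lines of Python. This is only a toy greedy clustering under stated assumptions and not the IC Compiler banking algorithm: the register names, coordinates, bank size, distance limit, and the function name bank_registers are all hypothetical.

```python
def bank_registers(regs, bank_size=4, max_dist=10.0):
    """Greedily group single-bit registers into multi-bit banks.

    regs maps a register name to its (x, y) placement. Each bank holds the
    seed register plus its nearest neighbours (Manhattan distance) that lie
    within max_dist, up to bank_size registers in total.
    """
    remaining = dict(regs)
    banks = []
    while remaining:
        # Take the first unbanked register as the seed of a new bank.
        seed, (sx, sy) = next(iter(remaining.items()))
        del remaining[seed]
        near = sorted(
            (abs(x - sx) + abs(y - sy), name)
            for name, (x, y) in remaining.items()
            if abs(x - sx) + abs(y - sy) <= max_dist
        )
        members = [seed] + [name for _, name in near[: bank_size - 1]]
        for name in members[1:]:
            del remaining[name]
        banks.append(members)
    return banks


if __name__ == "__main__":
    placement = {
        "r0": (0, 0), "r1": (1, 0), "r2": (0, 2), "r3": (3, 1),
        "r4": (50, 50), "r5": (51, 50),
    }
    banks = bank_registers(placement)
    print(banks)
    print(f"clock sinks: {len(placement)} -> {len(banks)}")
```

In this example six single-bit registers collapse into two banks, so the clock tree has two sinks to reach instead of six, which is the effect on clock net length that the abstract describes.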

DC 2013.03 Release Update
Gal Hason, Eyal Odiz - Synopsys
This tutorial presents the latest advances and methodologies of the Design Compiler family of products, including DC Explorer, Design Compiler Graphical, Power Compiler and Formality. The session will describe technologies in the 2013.03 synthesis release that deliver better circuit quality, improved quality of results, and enhanced ease of use. Topics include new layer-aware buffering technology to accurately model delays at smaller process nodes, enhanced support for Multi-Bit cell mapping and improvements to power optimization. You will also learn how DC Explorer enables connectivity analysis and floor planning at an early design stage.

C4 - Tutorial: Implementation Flows for ARM Cores
Engineering Trade-Offs in the Implementation of a High-Performance Dual Core ARM® Cortex™-A15 Processor
Joe Waltson, Moshe Ashkenazi, Erik Olson - Synopsys
Learn about the engineering trade-offs and flow development process to balance gigahertz+ performance and low power on a dual-core Cortex-A15 MPCore™ processor implementation. This tutorial will highlight best practices and technologies from the Galaxy Implementation Platform to meet challenging performance targets, while minimizing leakage power. Synopsys' high-performance core (HPC) methodology will be demonstrated through a reference implementation of a dual-core ARM Cortex-A15 processor with ARM POP™ technology for core-hardening acceleration on TSMC 28HPM process. Technologies featured include physical guidance for a predictable implementation flow, transparent interface optimization for faster top-level closure, and final-stage leakage recovery for reduced leakage power. A special section will present the multi-source clock and useful skew technologies used in high-performance cores.

C5 - Tutorial and User Experience: Design for Test
Use of Synopsys Inserted Scan Wrappers in SoC ATPG
Eli Borowitz - Broadcom
The increasing size of VLSI designs makes it challenging to run ATPG and Verilog simulations on full chips. Divide and conquer type approaches such as a scan wrapper insertion flow for core isolation are now part of the synthesis tools. Scan wrappers are also used to generate patterns at IP level and use them at chip level. This is especially useful for multiple instances of the same IP in one chip.

Insertion of scan wrappers is an essential part of the process and other tools are needed to complete it. In this paper we discuss the options available in the scan wrapper insertion flow and their effect on the solution. We also discuss the other pieces of the puzzle required in order to complete the flow - scan shell generation and pattern mapping. Actual results from a large scale design will be presented and compared to conventional full chip flow.

Meeting Test Quality Goals in Hierarchical Designs
Adam Cron - Synopsys
This tutorial will highlight leading-edge capabilities in the Synopsys synthesis-based test solution for maximizing productivity, increasing test quality, and lowering test cost. We will discuss how standards-based DFT has evolved within DFTMAX compression to save time and effort implementing test for extremely complex designs. Next, we will examine several advanced detection mechanisms in TetraMAX ATPG for improving defect coverage. We will then show new features in the tools that lower the cost of testing ARM processor-based designs and other multicore SoCs.

C6 - Tutorials: Hardening DSP Cores and PCI Express IP
Hardening DSP Cores for Performance with DesignWare Logic Libraries and Embedded Memories
Ran Snir, Ceva - Synopsys
In this session we’ll share design experiences with hardening a high-performance core using DesignWare Logic Libraries and Memory Compilers on a 28nm process, along with Synopsys implementation and signoff tools. We will also show how choosing the correct IP and methodology helps achieve optimal results as well as discuss best practices to fine tune the results to reduce leakage power. In addition, best practices will be presented for implementing memories and libraries to deliver superior performance, power and area.

In the Cloud With PCI Express
Michael Chen - Synopsys
With PCI Express continuing to be the de facto interconnect for Cloud computing systems, there is a growing need for functionality to address the increased storage requirements and the greater path loss and equalization complexity at 8 GT/s. This tutorial discusses how the PCI Express interconnect is addressing storage requirements in server-based SoCs with standards such as SATA Express and NVM Express. The session will also examine the need for active repeaters to help compensate for the significant path loss at 8 GT/s, along with other developments in the specification that support the continued development of Cloud-based computing. Since the PCI Express protocol doesn’t stop at servers, this session also examines how the latest PCI Express features help designers address low-power requirements in mobile applications, including Optimized Buffer Flush/Fill (OBFF), latency tolerance reporting (LTR), L1 sub-states and the new M-PHY over PCI Express standard.