**Issue 4, 2011**

**Technology Update**

High-Performance Analog Simulation with Multicore CPUs

*Analog and mixed-signal design teams are more dependent on verification than ever before. Hany Elhak, a product marketing manager at Synopsys, explains how they can extract more performance out of their HSPICE® circuit simulation environments with Synopsys’ HSPICE Precision Parallel technology, even when performing transient noise analysis.*

Today, analog design teams need to run longer and more frequent simulations while maintaining silicon-accurate results.

Why? Several technology and market pressures are combining to create a perfect storm of challenges in the mixed-signal world. For example, analog and mixed-signal design teams have to cope with designing for high-speed communications standards, the demands of next-generation process technologies, increasingly limited power budgets and demanding project timescales.

While the use of digitally-controlled calibration circuits can help designers compensate for some of the scaling issues as technology nodes shrink, using these techniques puts more pressure on the verification flow. As a result, design teams have to verify more process, voltage and temperature (PVT) corners, and spend more time simulating at each PVT corner.

Speeding Up SPICE

One way to accelerate mixed-signal verification is to enable SPICE algorithms to take advantage of the latest multicore processor hardware. The key to accelerating SPICE on a multicore CPU is to parallelize as much of each individual task as possible without sacrificing accuracy.

A typical SPICE simulation analyzes the circuit at a large number of time steps. Each time step consists of multiple iterations of two major tasks:

- Evaluating the devices in the circuit and loading them into a matrix, and
- Solving the matrix to calculate voltage and current at each node.

The iterations continue until the circuit converges. The simulator then moves to the next time step and repeats the process. How much time the simulator spends evaluating devices and solving matrices depends on the circuit type.
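The loop described above can be sketched for a toy circuit. This is a minimal illustration, not HSPICE's implementation: the two-node circuit, the component values, the backward-Euler integration and the function name `simulate_transient` are all assumptions chosen for brevity, and the 2x2 "matrix" is solved with Cramer's rule rather than a sparse solver.

```python
import math

def simulate_transient(vs=1.0, r1=1e3, r2=1e3, c=1e-9, i_sat=1e-14,
                       v_t=0.02585, h=1e-7, n_steps=200,
                       tol=1e-9, max_iter=50):
    """Toy transient analysis; all component values are hypothetical.

    Circuit: vs --R1-- node1 --R2-- node2, with a diode and a capacitor
    from node2 to ground. Returns (node2 waveform, total Newton iterations).
    """
    g1, g2 = 1.0 / r1, 1.0 / r2
    gc = c / h                               # backward-Euler capacitor conductance
    v1 = v2 = 0.0
    waveform, total_iters = [], 0
    for _ in range(n_steps):                 # outer loop: time steps
        v2_prev = v2
        for _ in range(max_iter):            # inner loop: Newton iterations
            total_iters += 1
            # Task 1: evaluate devices, then load residual f and Jacobian J.
            e = math.exp(v2 / v_t)
            i_d = i_sat * (e - 1.0)          # diode current
            g_d = i_sat * e / v_t            # diode small-signal conductance
            f1 = (v1 - vs) * g1 + (v1 - v2) * g2
            f2 = (v2 - v1) * g2 + i_d + gc * (v2 - v2_prev)
            j11, j12 = g1 + g2, -g2
            j21, j22 = -g2, g2 + g_d + gc
            # Task 2: solve J * dv = -f for the node-voltage update.
            det = j11 * j22 - j12 * j21
            dv1 = (-f1 * j22 + f2 * j12) / det
            dv2 = (-f2 * j11 + f1 * j21) / det
            v1 += dv1
            v2 += dv2
            if max(abs(dv1), abs(dv2)) < tol:
                break                        # converged: move to next time step
        else:
            raise RuntimeError("Newton iteration did not converge")
        waveform.append(v2)
    return waveform, total_iters
```

Calling `simulate_transient()` returns the node-2 waveform and the total number of Newton iterations. In a real simulator both tasks operate on matrices with thousands to millions of rows, which is why their relative cost dominates the discussion that follows.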

For small pre-layout circuits, evaluating devices takes up to 75% of the simulation time. This activity scales linearly with circuit size. When traditional SPICE simulators distribute this task across multiple CPUs, the level of parallelization is modest: an 8-core machine will yield just a 2-3X speed-up.

For large post-layout circuits, device evaluation represents less than half the simulation time, and solving the matrix alone can consume more than 50% of it, so scalability is poor. Even with 90% parallel efficiency, Amdahl’s law predicts a theoretical speed-up of only about 5X on 8 cores (see Figure 1). Scaling is further limited by the speed at which data moves between processors and memory, so the actual speed-up can be less than 3X on 8 cores.

To obtain highly-scalable computations, the parallel efficiency of the underlying code must be very close to 100% (see Figure 1).

*Figure 1: Theoretical limits of parallel SPICE simulation (Amdahl’s Law)*
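The limits quoted above follow directly from Amdahl's law. The short sketch below reproduces them; the function names are illustrative, and the model is idealized in that it ignores the memory and cache effects the article discusses.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Theoretical speed-up when `parallel_fraction` of the run time scales perfectly."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

def implied_parallel_fraction(speedup, cores):
    """Invert Amdahl's law: the parallel fraction needed to reach `speedup`."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)

# Tabulate the theoretical speed-up for a few parallel fractions.
for p in (0.75, 0.90, 0.99):
    print(f"p = {p:.2f}: " + ", ".join(
        f"{n} cores -> {amdahl_speedup(p, n):.1f}X" for n in (4, 8, 16)))
```

With 75% of the work parallelized, 8 cores give about 2.9X; with 90%, about 4.7X. Under the same idealized model, reaching 7X on 8 cores requires roughly 98% of the run time to be parallel, as `implied_parallel_fraction(7, 8)` shows.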

Precision Parallel Technology

HSPICE is the “gold standard” of SPICE simulators. All major foundries certify and publish device models on HSPICE first, and there are over 12,500 active licenses in use at over 650 companies worldwide.

HSPICE Precision Parallel (HPP) technology enables analog and mixed-signal design teams to get the best out of multicore computers for their SPICE simulations by enabling a greater percentage of the simulation to be parallelized. In fact, HPP technology delivers up to 7X simulation speed-up on 8 cores and 10X on 16 cores for analog and mixed-signal designs (see Figure 2). Using HPP technology, design teams can accelerate verification of their analog circuits across process variation corners, meet their project timelines and reduce the risk of silicon respins.

*Figure 2: HSPICE Precision Parallel technology speed-up on 4, 8 and 16 cores*

HPP technology is a new multicore transient simulation extension to HSPICE for both pre- and post-layout simulation of complex analog circuits such as PLLs, ADCs, DACs, SerDes, and other full mixed-signal circuits. HPP technology does not compromise HSPICE accuracy. Because HPP technology manages memory efficiently, it has the capacity to simulate post-layout circuits consisting of more than 10 million elements.

Boosting Single-Core Speed

As well as accelerating HSPICE on multicore platforms, HPP technology gives HSPICE simulation a big boost even on single cores.

Today’s analog circuits incorporate components that operate at different time constants. For example, a PLL consists of a voltage-controlled oscillator and divider operating at a high frequency, while other circuit components, such as the phase detector, the filter and the digital control circuitry, operate at much lower speed. The adaptive sub-matrix algorithm in HPP technology manipulates the matrix in such a way that slower parts of the circuit can be solved in fewer iterations than the faster ones, significantly improving the overall simulation speed.

Figure 3 shows the average HSPICE single-core speed improvements over the past five years, alongside other improvements in multicore scalability, capacity, analog analysis features, convergence performance and distributed processing performance. The improvements have resulted in a significant speed-up over the past five years, equivalent to 50X for a 16-core platform.

*Figure 3: Improvements in HSPICE over the past 5 years*

Multicore Scaling

In addition to dividing the matrix-solving stage into smaller tasks that can be efficiently performed on multiple CPU cores, HPP technology parallelizes other small tasks, such as output and time-step control, to achieve parallelization efficiency close to 100%.

Even with almost 100% parallelized code, the actual scaling achieved is limited by cache misses and the time it takes to move data between cache and main memory, which in turn depends on cache size, memory bus speed and code efficiency. HSPICE uses very efficient localization of data in blocks that are comparable in size to the highest-level cache. Generally, larger caches and faster memory buses give better performance and scaling.

Memory Efficiency Benefits Simulator Capacity

HPP technology is capable of simulating post-layout circuits in excess of 10 million elements and 9 million nodes. It delivers a 25% capacity improvement on average over the previous release.

Transient Noise Analysis

Design teams cannot ignore the electronic noise contributed by devices within today’s nanometer technologies. AMS circuits can have substantial nonlinearity and switching behavior that preclude using traditional linear noise analysis approaches (such as .NOISE). HSPICE transient noise analysis allows designers to run time-domain simulations that include random noise contributions from all of the circuit elements. The analysis takes all types of device noise into account, including flicker noise, channel noise and thermal noise, and represents each noise source uniquely. The resulting fully-nonlinear noise analysis allows design teams to simulate complex interactions of signals and noise in the time domain.

HSPICE transient noise analysis is fully multithreaded, and takes advantage of the performance improvements and scalability of HPP technology. Analysis controls allow designers to build ensembles of output waveforms, and to adjust noise source bandwidth to make critical noise predictions. There are no special modeling requirements since HSPICE device models already include noise. The post-processing features in HSPICE and Custom WaveView™ allow designers to measure random jitter and noise energy so they can easily measure noise across a variety of applications.
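The idea of a time-domain noise simulation with an ensemble of runs can be illustrated on a circuit whose answer is known in closed form. The sketch below is a generic Monte Carlo illustration, not HSPICE's algorithm or syntax: it injects the thermal noise current of a resistor into a parallel RC at every backward-Euler time step and compares the resulting RMS voltage with the textbook value sqrt(kT/C). The function names, component values, seeds and step size are all assumptions.

```python
import math
import random

K_BOLTZMANN = 1.380649e-23  # J/K

def noisy_rc_run(r=1e3, c=1e-12, temp=300.0, x=0.05, n_steps=20000, seed=0):
    """One time-domain run of a parallel RC driven by the resistor's thermal noise.

    x is the time step as a fraction of R*C. Returns the output voltage samples.
    """
    rng = random.Random(seed)
    h = x * r * c
    # White current noise of density 4kT/R, band-limited to 1/(2h) by sampling.
    sigma_i = math.sqrt(4.0 * K_BOLTZMANN * temp / r / (2.0 * h))
    a = 1.0 / (1.0 + x)                      # backward-Euler update factor
    v, samples = 0.0, []
    for _ in range(n_steps):
        i_n = rng.gauss(0.0, sigma_i)        # fresh random noise sample each step
        v = a * (v + h * i_n / c)            # C*dv/dt + v/R = i_n, backward Euler
        samples.append(v)
    return samples

def ensemble_rms(n_runs=8, skip=500, **kwargs):
    """RMS output noise over an ensemble of runs with different seeds."""
    total, count = 0.0, 0
    for seed in range(n_runs):
        samples = noisy_rc_run(seed=seed, **kwargs)[skip:]  # drop start-up transient
        total += sum(s * s for s in samples)
        count += len(samples)
    return math.sqrt(total / count)

expected = math.sqrt(K_BOLTZMANN * 300.0 / 1e-12)  # sqrt(kT/C) for the defaults
print(f"simulated {ensemble_rms():.3e} V rms, kT/C predicts {expected:.3e} V rms")
```

A linear RC could equally be handled by linear noise analysis; it is used here only because the result can be checked. The same time-domain approach carries over to switching and strongly nonlinear circuits, where no closed-form answer exists.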

More design teams are using HSPICE for signal integrity (SI) analysis due to applications in high-speed circuits such as DDR4, DDR5 and SerDes. Synopsys’ HSPICE provides comprehensive support for S-parameter, W-element and statistical eye (StatEye) diagram analysis. It also integrates with the largest ecosystem of third-party electromagnetic solvers and SI flows for board-level analysis.

Summary

HPP technology achieves higher performance on multicore machines by removing a bottleneck that had previously slowed down multi-threaded simulations. It delivers the best performance on machines with larger caches and faster memory buses.

Efficient memory management allows simulation of post-layout circuits larger than 10 million elements. In addition to the new HPP technology, the latest HSPICE update includes enhanced convergence algorithms, advanced analysis features and foundry-qualified support for process design kits (PDKs). These improvements extend HSPICE gold-standard accuracy to the verification of complex analog and mixed-signal circuits. Design teams can accelerate verification, including advanced noise analysis, of their analog circuits across process variation corners, and minimize the risk of missing project timelines and silicon respins.

About the Author

**Hany Elhak**

Hany Elhak is a product marketing manager at Synopsys with over 15 years of EDA and semiconductor experience spanning both technical and marketing responsibilities. Prior to EDA, Hany worked as an RF designer, designing integrated circuits for cellular and wireless networking standards. He has authored six IEEE papers on RFIC design. Hany has a BSEE and an MSEE from Ain Shams University, Cairo and an MBA with honors from UC Berkeley, Haas School of Business.