Using Foundation IP for Low Power in IoT SoCs

By Ken Brock, Product Marketing Manager, Logic Libraries

Internet of Things (IoT) edge devices, from smart watches and fitness trackers to parking meters, are now being designed in record numbers. Designers of these products share a relentless focus on energy efficiency. To meet consumer demand for longer times between recharging, batteries for phones and tablets are growing in both size and energy per cubic centimeter. However, IoT applications have much tighter cost and space constraints, so designers of IoT systems-on-chips (SoCs) cannot rely on larger or more expensive batteries alone to increase device use time. This article describes how designers can use foundation IP (logic libraries and embedded memories) on low-power process technologies to reduce the power consumption of IoT designs.

IP & Process Effects on Power Consumption

Many IoT devices spend up to 99.9 percent of their time off. They typically need to wake up quickly, run for a short time, and then shut down or return to a standby sleep mode. Figure 1 shows the energy consumption impact of using standard IP versus IoT-optimized IP through an IoT device’s start-sense-activity-sleep cycle. IoT-optimized IP completes its algorithms more quickly in active mode and uses less energy per unit of time in both active and standby modes.

Figure 1: IoT applications benefit from cutting power and reducing operating time with IoT-optimized IP 
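To make the duty-cycle argument concrete, the following is a minimal sketch of a time-weighted average power calculation. The power figures are hypothetical placeholders chosen for illustration, not measurements from any library or datasheet; only the 99.9 percent off-time comes from the discussion above.

```python
def average_power_uw(active_uw, sleep_uw, active_fraction):
    """Time-weighted average power of a duty-cycled device, in microwatts."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return active_uw * active_fraction + sleep_uw * (1.0 - active_fraction)


# Hypothetical figures for illustration only.
ACTIVE_UW = 5000.0        # assumed 5 mW while running
SLEEP_UW = 2.0            # assumed 2 uW in standby
ACTIVE_FRACTION = 0.001   # asleep 99.9 percent of the time

baseline = average_power_uw(ACTIVE_UW, SLEEP_UW, ACTIVE_FRACTION)
half_sleep = average_power_uw(ACTIVE_UW, SLEEP_UW / 2, ACTIVE_FRACTION)
half_active_time = average_power_uw(ACTIVE_UW, SLEEP_UW, ACTIVE_FRACTION / 2)

print(f"baseline:              {baseline:.3f} uW")
print(f"standby power halved:  {half_sleep:.3f} uW")
print(f"active time halved:    {half_active_time:.3f} uW")
```

Even with the device asleep almost all of the time, both terms matter in this toy model: finishing the active phase sooner and lowering standby power each reduce the average, which is the combined effect Figure 1 describes.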

In addition to choosing efficient IP, designers can use the native capabilities of the process technology and take advantage of the different threshold voltage (VT) process options. A given process technology will have its standard VT, a low VT (with more leakage and higher performance), and some number of higher VTs (with lower performance but also less leakage). Figure 2 shows how the performance and energy curves change as VT and operating voltage (VDD) increase. To minimize the required energy for a logical operation, designers can use the lowest possible VDD while ensuring that they meet their performance/computing speed requirements. Lowering the voltage can result in tremendous energy savings with only moderate performance tradeoffs.

Figure 2: Minimizing Power Through Voltage/VTs (Source: Dr. Jan M. Rabaey, Low Power Essentials)
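The energy benefit of lowering VDD can be illustrated with the common first-order approximation that dynamic switching energy scales with the square of the supply voltage (E ≈ C·VDD²). The sketch below assumes that relationship only; it ignores leakage and the slowdown at low voltage, both of which shape the real curves in Figure 2, and the function name is hypothetical.

```python
def relative_dynamic_energy(vdd_fraction):
    """Switching energy relative to nominal, assuming E is proportional to VDD^2.

    vdd_fraction is the operating voltage divided by the nominal VDD.
    Leakage energy and performance loss are deliberately not modeled.
    """
    if vdd_fraction <= 0:
        raise ValueError("vdd_fraction must be positive")
    return vdd_fraction ** 2


# Sweep the supply from nominal down to 60 percent of nominal.
for fraction in (1.0, 0.9, 0.8, 0.7, 0.6):
    energy = relative_dynamic_energy(fraction)
    print(f"VDD = {fraction:.0%} of nominal -> {energy:.0%} of nominal switching energy")
```

Under this approximation, a modest voltage reduction yields a disproportionately large cut in switching energy, which is why designers look for the lowest VDD that still meets timing.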

Logic Libraries for TSMC 40ULP

Designs requiring low leakage and low voltage will benefit from the low-power DesignWare Logic Libraries for TSMC 40ULP. These libraries support both the standard and the eFlash process options. They are built for low-voltage operation and can run down to 60 percent of the nominal VDD, letting designers find the point of minimal energy use at various performance and activity levels.

The libraries are available in 7-track ultra-high density and 9-track high density architectures and offer both standard and custom low-voltage characterization based on project requirements. In addition, the Power Optimization Kits help designers minimize power while sustaining optimal performance for tasks including shutdown, dynamic voltage and frequency scaling, and establishing multiple voltage domains. Special low-power cells (multibit flip-flops, metastable-optimized flops, and special low-power flops) also help designers minimize power.

Figure 3: Relative performance of 7-track vs 9-track libraries (32-bit CPU on TSMC 40ULP)

Results depend on the depth of the data path of the CPU being synthesized as well as its operating voltage. In any case, the 7-track library is much more efficient at very low frequencies, but as the frequency increases, the 9-track library can be the better choice.

Power Optimization Kits

Each logic library architecture comes with Power Optimization Kits, collections of cells that designers can use to manage power in IoT applications. The kits include cells such as header switches, isolation cells, always-on cells, live latches, and level shifters.

  • Header switches are cells that contain pull-up transistors from an always-on VDD rail to a switched rail, enabling the cells in a power domain to be switched on and off with an external signal in a controlled manner.
  • Isolation cells are used to clamp signals coming from a block that is powered down to a known state to avoid propagation of unknown (“X”) signals.
  • Always-on cells are buffers that connect to an always-on power source so that they maintain power after the primary power to the block is shut off with header switches. They are useful when a long wire runs through the middle of a shut-down block and needs a repeater, or when specific inputs must be provided to a retention register.
  • Retention registers help maintain state when power is dropped. In some systems, the designer will write the entire state of the design to Flash memory for an extended power down. However, if the power down is shorter, if there is no Flash memory, or if only a small number of registers need to maintain state during power down, retention may be the answer.
  • A balloon latch is a retention register that has a primary, a secondary, and a save/restore section, resulting in an approximately 40 percent increase in cell area over a standard register or flip-flop.
  • A live latch is a retention register that has a primary and a secondary, where the secondary doubles as the save/restore section, eliminating the area penalty of a balloon latch.
  • Level shifters are used to translate signals from one voltage domain to another, either stepping up in voltage, stepping down, or doing both.
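How these cells cooperate during a shutdown can be sketched as a small behavioral model. The ordering used here (isolate, save, switch off; then switch on, restore, release isolation) is a commonly used power-gating sequence and is an assumption, not something the kits themselves prescribe. The class, method, and register names are hypothetical.

```python
class PowerDomain:
    """Toy behavioral model of a switchable power domain (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.powered = True     # header switches closed
        self.isolated = False   # isolation cells transparent
        self.registers = {}     # live register contents
        self._retained = {}     # contents held by retention registers
        self.log = []           # ordered record of control steps

    def power_down(self):
        # 1. Clamp outputs so downstream logic never sees unknown values.
        self.isolated = True
        self.log.append("isolate outputs")
        # 2. Save state into the retention registers.
        self._retained = dict(self.registers)
        self.log.append("save state")
        # 3. Open the header switches; non-retained state is lost.
        self.powered = False
        self.registers = {}
        self.log.append("open header switches")

    def power_up(self):
        # Reverse order: restore power, restore state, then release isolation.
        self.powered = True
        self.log.append("close header switches")
        self.registers = dict(self._retained)
        self.log.append("restore state")
        self.isolated = False
        self.log.append("release isolation")

    def read_output(self, reg, clamp_value=0):
        # While isolated, outputs are clamped to a known value.
        if self.isolated:
            return clamp_value
        return self.registers[reg]


domain = PowerDomain("sensor_block")
domain.registers["sample_count"] = 7
domain.power_down()
print(domain.read_output("sample_count"))   # clamped value while powered down
domain.power_up()
print(domain.read_output("sample_count"))   # restored value after wake-up
```

In a real design flow this intent would normally be captured in a power-intent description and implemented with the library's cells; the model above only shows why each cell type is needed and in what order the controls are typically asserted.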

PVT Options for 40ULP

Foundry-sponsored logic library and memory compiler IP for TSMC 40ULP is available in two clusters of process, voltage, and temperature (PVT) corners, as shown in Table 1. The 1.1V cluster can run down to 0.9V in 40ULP, giving designers the option to manage multiple voltage domains. The level shifters in the Power Optimization Kit enable blocks in the design to signal between the 0.9V and 1.1V domains. TSMC accepts sign-off at the FFG and SSG global corners, giving designers the ability to increase performance over conventional total corners.

Table 1: TSMC 40ULP PVTs

Specialty Flip-Flops Improve Performance & Power at Low Voltages

Designers can stretch their IoT design’s power budget by using specialty flip-flops and operating at lower voltages. Multibit flip-flops and high-performance flops can be combined to cut power in IoT devices.

  • Delay-optimized flops are for launching signals into a piece of critical path logic as quickly as possible.
  • Setup-optimized flops offer very low setup time (or even negative setup time) to capture critical signals, enabling designers to stretch the clocks out and use features within EDA tools (e.g. useful skew) for optimal performance at the lowest voltage and the lowest energy.
  • Multibit flip-flops combine several flops in one cell. Figure 4 shows a 2-bit multibit flip-flop, in which the primary and secondary latches are driven by the internally generated clock and clock bar. Using the two flops together removes two inverters, cutting ~10 percent of the area and ~10 percent of the leakage. However, the true benefit is the reduced load on the clock tree, which can cut dynamic power by 25 percent.

Figure 4: Multibit flip-flops
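The clock-tree benefit of multibit flip-flops comes from having fewer clock pins to drive. The sketch below counts clock sinks before and after merging; the flop counts are hypothetical, and the function is an illustration of the pin-count arithmetic only. It does not model power, so it should not be read as a derivation of the savings figures quoted above.

```python
def clock_sinks(total_flops, mergeable_flops, bits_per_cell=2):
    """Number of clock pins the clock tree must drive after multibit merging.

    mergeable_flops is how many single-bit flops can be packed into
    multibit cells; leftovers that do not fill a cell stay single-bit.
    """
    if not 0 <= mergeable_flops <= total_flops:
        raise ValueError("mergeable_flops must be between 0 and total_flops")
    if bits_per_cell < 2:
        raise ValueError("bits_per_cell must be at least 2")
    multibit_cells = mergeable_flops // bits_per_cell
    merged_flops = multibit_cells * bits_per_cell
    single_bit_flops = total_flops - merged_flops
    return single_bit_flops + multibit_cells


# Hypothetical design: 10,000 flops, of which 8,000 can be paired.
before = clock_sinks(10_000, 0)
after = clock_sinks(10_000, 8_000)
print(f"clock sinks before merging: {before}")
print(f"clock sinks after merging:  {after}")
```

How much of the design can actually be merged depends on placement and timing, so the mergeable count is best treated as an input from the implementation tools rather than a fixed ratio.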

Ultra-Low Leakage Library for Always-on Circuits

The DesignWare Ultra-Low Leakage Library for always-on wakeup circuits offers the lowest possible leakage without the use of voltage regulators. By using the thick IO oxide, it supports voltages from 0.9V to 3.6V, giving designers flexibility in battery options from AA cells to lithium cells.

This library set includes basic combinational cells for Boolean functions, sequential cells (flip-flops), clock cells and a Power Optimization Kit with level shifters (for switching from high-voltage domains to the regular and low-voltage domains) and isolation cells. The library can connect directly to the battery, helping designers manage power more effectively by controlling on-chip and off-chip regulators (Figure 5).

Figure 5: Ultra Low Leakage Library controlling on-chip and off-chip voltage regulators

Memory Compilers for IoT

Every IoT device needs memory, and one commonly used memory is high-density single-port SRAM. Typically, code is read out of the Flash memory and loaded into the SRAM. The processor then runs out of the SRAM, which maintains state and provides a large amount of storage.

IoT designs also require high-density single-port register files. These memories are typically smaller than high-density single-port SRAM. Ultra-low leakage viaROM can be used in small IoT edge devices to store boot code when most of the program comes out of Flash. However, most of the major subroutines and drivers can also be embedded into the ROM, which is much more efficient for storage than Flash or even SRAM. Using an ultra-low leakage viaROM and source biasing in the SRAM can reduce leakage by up to 70 percent while maintaining data.

Power management for memories in IoT applications is critical. Designers can use deep sleep modes and long channel devices to reduce leakage. In addition, ultra-low voltage operation using read and write assist circuitry in 40nm memory dramatically reduces dynamic power dissipation.

Ultra-low-power ROMs provide a shutdown mode for maximum leakage reduction by turning off power to the array when it is not being read. This cuts leakage to practically zero. Memories tend to be a large portion of the SoC, so the ability to zero out the leakage associated with the memory can extend battery life considerably.

Testing IoT Memories

Memories need to be tested during production and occasionally at power-up. Synopsys offers an embedded and external memory test, repair, and diagnostic system, STAR Memory System, which supports eFlash for IoT applications (Figure 6). It cuts area use via shared STAR Memory System processors, helping designers reduce the overall size of their ICs and systems.

STAR Memory System on-chip eFlash capabilities eliminate the need to test the flash memory externally using the traditional, more expensive tester approach. With test algorithms specific to eFlash, STAR Memory System can address failures unique to embedded flash by leveraging any existing IEEE 1149.1 standard tester interface. The result is reduced production test costs and faster silicon debug and bring-up.

Figure 6: eFlash support is critical for IoT memory test solutions 

Summary

As designers make critical decisions for their IoT SoCs, they are keenly aware that the IoT space is crowded. The designer who achieves the lowest power for the given functionality is going to win. For a competitive edge, designers need to analyze the logic libraries, embedded memories, and associated memory test solutions available for the low-power process of their choice to reduce their design’s overall power consumption.