In-depth technical articles, white papers, videos, webinars, product announcements and more.
The UCIe standard has been evolving to streamline die-to-die connectivity, provide high bandwidth to meet scalability and complex performance requirements, and enable reuse of existing dies (also called chiplets) for accelerated time-to-market. UCIe 3.0 continues to promote innovation in multi-die designs, delivering significantly increased bandwidth, enhanced energy efficiency, and an expanded ecosystem.
The UCIe Consortium’s announcement highlights the following key features for UCIe 3.0:
Support for 48 GT/s and 64 GT/s data rates, with 64 GT/s doubling the bandwidth of UCIe 2.0 (32 GT/s) to meet high-performance chiplet demands
Extended sideband channel reaching up to 100mm supports more flexible SiP topologies
Support for continuous transmission protocols through mappings, enabling uninterrupted data flow in Raw Mode for new applications such as connectivity between SoC and DSP chiplets
Early firmware download standardization using Management Transport Protocol (MTP) for streamlined initialization
Priority sideband packets allow deterministic, low-latency signaling for time-sensitive system events
Fast throttle and emergency shutdown mechanisms provide immediate system-wide notifications
Open-drain pins enable low-latency, bidirectional event signaling between chiplets, such as an urgent shutdown request or a sudden change in lane speed
Power saving via runtime recalibration and L2 optimization enables power-efficient link tuning during operation
Fully backward compatible with all previous UCIe specifications for seamless integration and adoption
The new specification introduces a number of enhancements, most notably support for higher data rates of 48Gbps and 64Gbps, which addresses key requirements of high-performance computing (HPC) and AI applications. This article traces the evolution toward 64Gbps UCIe and outlines the main challenges and design considerations in die-to-die connectivity for HPC and AI applications.
Supporting higher speeds such as 64Gbps in the UCIe PHY requires the analog input/output (I/O) blocks to operate at higher frequencies, such as 16GHz, when using a quadrature clocking architecture for data sampling. In addition, 64Gbps UCIe uses more power than 32Gbps UCIe, mainly due to its transmitter and receiver I/O circuits. Minimal power utilization is essential for most interface IPs in today’s data centers, due to the high compute demands of machine learning and large language models.
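The relationship between per-lane data rate and I/O clock frequency can be sketched as below. This is a minimal illustration, not part of the UCIe specification: the function name and the table of clocking architectures are hypothetical, and it assumes that a quadrature architecture samples four bits per clock period, which is what gives 16GHz at 64Gbps.

```python
# Hypothetical helper: names and the architecture table are illustrative,
# not taken from the UCIe specification.
BITS_PER_CLOCK_PERIOD = {
    "full_rate": 1,   # one bit sampled per clock period
    "half_rate": 2,   # both clock edges sample data
    "quadrature": 4,  # four clock phases (0/90/180/270 degrees) sample data
}


def io_clock_ghz(data_rate_gbps: float, architecture: str = "quadrature") -> float:
    """Return the I/O clock frequency in GHz for a per-lane data rate in Gbps."""
    return data_rate_gbps / BITS_PER_CLOCK_PERIOD[architecture]


for rate in (32, 48, 64):
    print(f"{rate} Gbps -> {io_clock_ghz(rate):g} GHz quadrature clock")
```

Under the same assumption, moving from 32Gbps to 64Gbps doubles the clock frequency the analog I/O blocks must handle, which is one reason the transmitter and receiver dominate the power increase.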
Several approaches can reduce power, including optimizing the die-to-die link for short channels: less than 3mm in advanced packages and less than 5mm in organic substrate packages. Targeting short channels lets the analog driver reduce power consumption while preserving sufficient performance margins.
Channel design plays a vital role in PHY performance, so it is beneficial to define the PHY architecture with the various package types in mind. Improvements in channel characteristics, such as minimizing insertion loss and crosstalk, can also improve power efficiency. Better package routing can both improve system performance and reduce packaging cost. Figure 1 shows signal layers connecting two dies on an advanced silicon interposer-based package.
Figure 1: Advanced package cross section using a silicon interposer with 5 layers of a UCIe-A module
Advanced packages are challenging because they offer only a limited number of signal routing layers, as dictated by the package vendor. For example, Chip-on-Wafer-on-Substrate – Silicon Interposer (CoWoS-S) allows up to 8 layers for signal routing on the silicon interposer.
In the UCIe-A bump map, receiver bumps are positioned behind transmitter bumps, so the 64Gbps receiver signals must route beneath the transmitter bumps to reach the die edge, as shown in Figure 2. This configuration leads to more complex packaging and a longer die-to-die channel. The routing can increase crosstalk and intersymbol interference, especially at higher speeds such as 48Gbps and 64Gbps. To address these challenges, it may be necessary to strengthen PHY capabilities through enhanced equalization, improve signal isolation within the package layers, or redesign the bump map to simplify signal routing.
Figure 2: UCIe-Ax64 10 column PHY bump map showing signal escape (Source: UCIe Consortium)
Relaxing the PHY jitter requirements, for both random jitter (Rj) and deterministic jitter (Dj), can help lower power consumption but may result in a higher bit error rate (BER). The increased BER can be compensated for by implementing Forward Error Correction (FEC) in the system. A lightweight FEC using a Double Error Correction, Triple Error Detection (DEC-TED) code, which can correct up to two errors and detect up to three, may be an optimal solution for die-to-die channels because it minimizes latency and overhead.
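To see why a code that corrects two errors can absorb a relaxed raw BER, it helps to estimate how often a codeword contains more errors than the code can fix. The sketch below is a simplified model, not the FEC defined by the specification: it assumes independent bit errors, and the 128-bit codeword length and the example BER values are placeholders chosen only for illustration.

```python
from math import comb


def uncorrectable_probability(raw_ber: float, codeword_bits: int,
                              correctable: int = 2) -> float:
    """Probability that a codeword holds more bit errors than can be corrected.

    Assumes independent bit errors at rate raw_ber (a simplification).
    The binomial tail is summed directly rather than computed as a
    complement, so very small probabilities are not lost to rounding.
    """
    p, n = raw_ber, codeword_bits
    return sum(
        comb(n, k) * p**k * (1.0 - p) ** (n - k)
        for k in range(correctable + 1, n + 1)
    )


# Placeholder values: 128-bit codeword, a range of raw BERs.
for ber in (1e-12, 1e-9, 1e-6):
    no_fec = uncorrectable_probability(ber, 128, correctable=0)
    dec = uncorrectable_probability(ber, 128, correctable=2)
    print(f"raw BER {ber:.0e}: any error {no_fec:.3e}, more than 2 errors {dec:.3e}")
```

In this model the failure probability with double-error correction scales roughly with the cube of the raw BER, so a modest relaxation of the jitter budget can still leave a very low residual error rate. Real channels with burst errors would need a more careful analysis.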
Another challenge is timing closure and routing of the digital signals on the chiplet. A UCIe-A PHY has 64 lanes running at 64Gbps, for a bandwidth of roughly 4Tbps. This requires the UCIe controller or protocol layer link to run a 256-byte (2,048-bit) interface at 2GHz. Meeting timing with 4,096 signal wires running at 2GHz can be challenging. In addition to the 4,096 data signals, control signals such as ready/valid and clocks also need to be routed within the narrow 388 µm width of a UCIe-A PHY. With multiple PHY and controller links on a single die, the routing congestion can quickly become unmanageable. This creates a need to carefully optimize digital routing between the PHY I/Os and the rest of the multi-die design by fine-tuning the system floorplan, placement, and routing. Designers can also implement innovative solutions, such as increasing the PHY width with a custom bump map, if the resulting reduction in bandwidth efficiency is an acceptable trade-off for the system.
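The arithmetic behind these figures can be worked through in a few lines. The function and field names below are hypothetical, and the doubling of the datapath width to reach 4,096 wires assumes that the count covers separate transmit and receive datapaths, which the text does not state explicitly.

```python
def controller_interface(lanes: int, lane_rate_gbps: int, clock_ghz: int) -> dict:
    """Derive the controller datapath width needed to sustain the PHY bandwidth."""
    bandwidth_gbps = lanes * lane_rate_gbps      # raw bandwidth in one direction
    datapath_bits = bandwidth_gbps // clock_ghz  # bits moved per controller clock
    return {
        "bandwidth_gbps": bandwidth_gbps,
        "datapath_bits": datapath_bits,
        "datapath_bytes": datapath_bits // 8,
        # Assumption: transmit and receive each need their own datapath.
        "data_wires_both_directions": 2 * datapath_bits,
    }


budget = controller_interface(lanes=64, lane_rate_gbps=64, clock_ghz=2)
for name, value in budget.items():
    print(f"{name}: {value}")
```

The same function shows the trade-off designers face: halving the controller clock to ease timing closure doubles the datapath width, and with it the number of wires that must fit alongside the PHY.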
Operating analog circuitry at higher data rates raises concerns over electromigration (EM) and IR drop in the PHY. Proper management of these factors is essential for the reliability and performance of high-speed designs. To address this, the 64G PHY bump map defined in the UCIe 3.0 specification includes more power and ground bumps than the 32Gbps PHY, resulting in a greater depth.
Additional design challenges include the impact of parasitic capacitance from electrostatic discharge (ESD) protection components in high-frequency circuits, and the need for higher metal stack counts to support extreme routing density, power delivery, and signal integrity requirements.
Synopsys provides extensive experience in designing cutting-edge die-to-die IP technology that fits the needs of HPC and AI applications. Synopsys’ UCIe PHY and Controller IP deliver optimal PPA and support data rates up to 64Gbps.
For more information, visit synopsys.com/ucie