UCIe Use Cases for Memory Applications in AI and HPC Systems

Aparna Tarde

Apr 22, 2026 / 5 min read

Synopsys IP
Technical Bulletin

In-depth technical articles, white papers, videos, webinars, product announcements and more.

Artificial intelligence (AI) and high‑performance computing (HPC) workloads are placing unprecedented demands on memory bandwidth, capacity, and energy efficiency. Modern generative AI models often require hundreds of gigabytes of memory to load and execute efficiently, far exceeding what can be integrated on a single compute device. While compute capability continues to scale rapidly, memory bandwidth and proximity increasingly determine system‑level performance. This memory wall is driving fundamental changes in how memory subsystems are architected for multi-die designs.

High Bandwidth Memory (HBM) remains central to addressing these demands due to its ability to deliver extremely high bandwidth within a compact footprint. However, traditional HBM integration approaches are being re‑examined as systems scale to larger models, higher power densities, and more complex packaging. In parallel, new memory extension architectures are emerging that rely on high‑speed die‑to‑die connectivity, standardized through UCIe, to support a broader range of memory and storage technologies. Together, these trends are reshaping how memory is integrated into next‑generation AI and HPC systems.

This article explores UCIe use cases for memory applications, focusing on custom HBM and emerging memory extension architectures in AI and HPC systems.

Rethinking Traditional HBM Integration

In conventional HBM systems, a host system‑on‑chip (SoC) integrates a JEDEC‑compliant HBM PHY and controller that directly interfaces with an HBM DRAM stack through a DRAM base die. This architecture delivers very high bandwidth, but it also consumes significant SoC die area and shoreline, increases power consumption, and constrains architectural flexibility as systems scale.

Custom HBM architecture addresses these limitations by making two key changes. First, the HBM base die transitions from a DRAM‑optimized process to an advanced logic process node. Second, the memory interface between the host SoC and the base die shifts from a parallel JEDEC HBM interface to a high‑speed die‑to‑die interface: UCIe. In this model, the HBM controller and associated logic move to the logic base die, while the host SoC communicates with the base die using a die‑to‑die link.

This architectural shift enables several important benefits. Moving the base die to a logic process improves power efficiency and allows additional logic to be integrated beneath the memory stack. This logic can support extended memory functionality, protocol handling, or near-memory compute, depending on system requirements. Replacing the JEDEC HBM PHY on the host SoC with a UCIe interface also reduces die area and data-signal count, allowing more silicon area to be dedicated to compute.

Figure 1 shows this transition from traditional JEDEC‑based HBM integration to a custom HBM architecture using a logic base die and UCIe IP.

Figure 1: Custom HBM architecture using a logic base die and UCIe IP
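To get a feel for the interface-width savings the text describes, the back-of-envelope calculation below compares a JEDEC HBM3 stack's parallel data bus against UCIe advanced-package modules delivering comparable bandwidth. The data rates and module widths are illustrative assumptions drawn from public HBM3 and UCIe figures, not product specifications, and UCIe module bandwidth is counted per direction.

```python
# Back-of-envelope host-side interface comparison: JEDEC HBM3 stack
# versus UCIe advanced-package modules. Illustrative figures only.

HBM3_DATA_BITS = 1024          # 16 channels x 64-bit DQ per stack (assumed)
HBM3_RATE_GBPS = 6.4           # per-pin data rate, Gb/s (assumed)
hbm_bw_GBs = HBM3_DATA_BITS * HBM3_RATE_GBPS / 8   # GB/s per stack

UCIE_LANES_PER_MODULE = 64     # advanced-package module width (assumed)
UCIE_RATE_GBPS = 32            # per-lane data rate, Gb/s (assumed)
ucie_module_bw_GBs = UCIE_LANES_PER_MODULE * UCIE_RATE_GBPS / 8

modules_needed = -(-hbm_bw_GBs // ucie_module_bw_GBs)   # ceiling division
ucie_data_lanes = int(modules_needed * UCIE_LANES_PER_MODULE)

print(f"HBM3 stack bandwidth: {hbm_bw_GBs:.0f} GB/s over {HBM3_DATA_BITS} data pins")
print(f"UCIe modules needed:  {modules_needed:.0f} ({ucie_data_lanes} data lanes per direction)")
```

Under these assumptions, four 64-lane modules match the stack's bandwidth with roughly a quarter of the data signals per direction, which is the shoreline saving that frees host area for compute.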

Advantages of a UCIe-Based HBM Datapath

A UCIe-based HBM datapath provides meaningful system‑level advantages beyond area and power savings. By using a packetized interface, HBM command and data transactions are tunneled efficiently between the host SoC and the base die. This streamlined datapath reduces protocol overhead, improves latency, and increases bandwidth efficiency.
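As a rough illustration of what tunneling a memory command over a packetized link involves, the sketch below packs an HBM-style read command into a fixed binary header. The field layout, widths, and opcode values are hypothetical; real flit formats are defined by the UCIe specification.

```python
import struct

# Hypothetical 8-byte command header for tunneling an HBM-style request
# over a die-to-die link: opcode, channel, burst length, 32-bit address.
# The layout is an illustrative assumption, not a UCIe flit format.

OP_READ, OP_WRITE = 0x1, 0x2

def pack_cmd(opcode: int, channel: int, length: int, addr: int) -> bytes:
    """Serialize a command into a big-endian 8-byte header."""
    return struct.pack(">BBHI", opcode, channel, length, addr)

def unpack_cmd(pkt: bytes) -> tuple:
    """Recover (opcode, channel, length, addr) on the receiving die."""
    return struct.unpack(">BBHI", pkt)

pkt = pack_cmd(OP_READ, channel=3, length=32, addr=0x1000)
print(len(pkt), unpack_cmd(pkt))   # 8 (1, 3, 32, 4096)
```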

The UCIe datapath can be designed with dedicated flow control for command and data channels, improving robustness under sustained high-bandwidth operation. Quality-of-service mechanisms, such as prioritized read queues, help manage latency-sensitive AI workloads and ensure predictable performance under heavy memory traffic. On the host side, the interface may present one of the memory controller's supported protocols, such as AXI or HIF, or it can use a native system interface to eliminate unnecessary protocol conversions and streamline integration within existing SoC architectures.
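A minimal sketch of the kind of prioritized read queue mentioned above is shown below: latency-sensitive reads are dequeued ahead of bulk traffic, with FIFO order preserved within each priority class. The class and priority names are illustrative, not part of any UCIe or controller specification.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

# Sketch of a two-level prioritized read arbiter: latency-sensitive
# requests win arbitration; a sequence number keeps FIFO order per class.

@dataclass(order=True)
class MemRequest:
    priority: int                      # 0 = latency-sensitive, 1 = bulk
    seq: int                           # FIFO tie-breaker within a priority
    addr: int = field(compare=False)

class ReadArbiter:
    def __init__(self):
        self._heap, self._seq = [], count()

    def enqueue(self, addr: int, latency_sensitive: bool = False) -> None:
        prio = 0 if latency_sensitive else 1
        heapq.heappush(self._heap, MemRequest(prio, next(self._seq), addr))

    def dequeue(self):
        return heapq.heappop(self._heap).addr if self._heap else None

arb = ReadArbiter()
arb.enqueue(0x1000)                        # bulk
arb.enqueue(0x2000, latency_sensitive=True)
arb.enqueue(0x3000)                        # bulk
print([hex(arb.dequeue()) for _ in range(3)])   # latency-sensitive first
```

A production arbiter would also need anti-starvation aging for bulk traffic, which this sketch omits for brevity.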

Another important benefit is scalability. A well‑architected UCIe datapath can support multiple memory types using a common framework. While custom HBM is a primary driver, the same datapath architecture can be reused to connect to other memory or storage technologies, reducing design complexity and improving reuse across product generations.

Figure 2 illustrates an example of a UCIe-based HBM datapath with HIF or AXI interface between the host SoC and the logic base die.

Figure 2: Packet-based die-to-die HBM datapath between host SoC and base die

Memory Extension Beyond Custom HBM

Although custom HBM is a key application, UCIe‑based die‑to‑die architectures are not limited to HBM alone. System designers are increasingly exploring memory extension approaches in which a logic base die acts as a centralized memory hub for multiple memory types.

One such use case is extending access to off‑chip memory such as LPDDR or CXL devices. In these configurations, the base die connects to the host SoC via the UCIe interface while independently interfacing with external memory through standard PHYs and controllers. This approach allows the host SoC to access additional memory capacity without integrating multiple memory interfaces directly, reducing area and power overhead while preserving architectural flexibility.
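Conceptually, the base die in this hub role steers each host request to a back-end controller by address range. The sketch below shows one such routing scheme; the address map, capacities, and back-end names are illustrative assumptions, not a defined architecture.

```python
# Hypothetical address-range routing inside a base die acting as a
# memory hub: each host request is steered to the back-end controller
# (HBM, LPDDR, or CXL) that owns its address window.

GIB = 1 << 30
ADDRESS_MAP = [                       # (base, limit, backend), non-overlapping
    (0 * GIB,   64 * GIB,   "HBM"),   # illustrative capacities
    (64 * GIB,  320 * GIB,  "LPDDR"),
    (320 * GIB, 1024 * GIB, "CXL"),
]

def route(addr: int) -> str:
    """Return the back-end memory controller that owns this address."""
    for base, limit, backend in ADDRESS_MAP:
        if base <= addr < limit:
            return backend
    raise ValueError(f"unmapped address {addr:#x}")

print(route(1 * GIB))      # HBM
print(route(100 * GIB))    # LPDDR
print(route(512 * GIB))    # CXL
```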

Another emerging application is high‑bandwidth flash. Similar in structure to custom HBM, this architecture uses a logic base die connected to the host through the UCIe interface, with a stacked flash memory array on top. High‑bandwidth flash targets data‑intensive workloads that require faster access than traditional storage interfaces can provide, such as large‑scale AI inference, analytics, and data staging. While still evolving, these architectures demonstrate how UCIe can be extended beyond DRAM‑based memory systems. Figure 3 illustrates example memory extension configurations using a base die.

Figure 3: Different memory extension examples with a base die


UCIe Interface Requirements for Memory Applications

Custom HBM and related memory extension architectures impose stringent requirements on UCIe interfaces. These requirements differ between the base die and the host die due to their distinct physical and functional constraints.

Base Die Requirements

The base die sits directly beneath a stacked memory array and is highly sensitive to power density. Excessive heat can increase memory refresh rates, degrading both performance and energy efficiency. As a result, the UCIe PHY on the base die must achieve very high power efficiency, typically measured in picojoules per bit.

Bandwidth density is equally critical. The base die is limited by die edge availability, requiring the interface to deliver high bandwidth within a constrained footprint. At the same time, form‑factor compatibility is essential to ensure interoperability with multiple host devices, limiting how much the PHY width can vary. To manage thermal constraints, PHY depth on the base die is often increased to distribute power more evenly and avoid localized hotspots beneath the memory stack. Transmitters placed near the die edge further improve heat dissipation, while modest channel reach is sufficient for typical package configurations.
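The thermal argument above comes down to simple arithmetic: interface power scales linearly with bandwidth, so the PHY's energy per bit directly sets the heat dissipated beneath the memory stack. The bandwidth figure and pJ/bit efficiency points below are illustrative assumptions, not product data.

```python
# Why pJ/bit matters under a DRAM stack: power = bandwidth x energy/bit,
# so halving energy per bit halves the heat injected below the memory.
# All numbers are illustrative assumptions.

bandwidth_GBs = 819.2                  # sustained die-to-die bandwidth, GB/s
bits_per_s = bandwidth_GBs * 8e9       # convert GB/s to bits/s

for pj_per_bit in (0.5, 0.25):         # two hypothetical PHY efficiency points
    watts = bits_per_s * pj_per_bit * 1e-12
    print(f"{pj_per_bit} pJ/bit at {bandwidth_GBs} GB/s -> {watts:.2f} W")
    # -> 3.28 W and 1.64 W respectively
```

Even a few watts concentrated under the stack can raise DRAM temperature enough to trigger faster refresh, which is why the text pairs low pJ/bit with deeper PHY floorplans that spread this power over more area.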

Host Die Requirements

The host die, often a large reticle‑limited SoC, prioritizes compute density. The UCIe PHY must therefore occupy minimal area while still delivering high bandwidth and excellent power efficiency. Like the base die, the host requires high bandwidth density and low energy per bit to support AI and HPC workloads.

In contrast to the base die, PHY depth on the host is typically minimized to preserve area for compute. Edge occupancy must still align with the base die for form‑factor compatibility, but overall optimization focuses on compactness and efficiency. Balancing these requirements is a central challenge in UCIe interface design for memory‑centric systems.

Synopsys UCIe IP for Memory Applications

Synopsys provides a comprehensive UCIe IP solution optimized for memory-centric die-to-die connectivity. The solution supports high bandwidth density, low energy per bit, and scalable PHY and controller configurations suitable for custom HBM, memory extension, and emerging stacked memory architectures. By integrating UCIe PHYs, controllers, and IP subsystems, Synopsys enables customers to deploy interoperable, power-efficient die-to-die interfaces while reducing integration risk and accelerating time to silicon.
