CXL: How an Accelerator Link Caused a Memory Revolution

Synopsys Editorial Staff

Jul 16, 2023 / 6 min read

Table of Contents

CXL Background
CXL.mem Usage and Motivation
Adopting CXL.mem in SoC Design
Conclusion

In early 2019, a new interconnect called Compute Express Link (CXL) was announced to the world. Building on top of the PCI Express (PCIe) physical layer and running at the then-fastest PCIe link speed of 32GT/s, the new specification was touted as the ideal interconnect for computational coprocessors. Four years later, talk of such acceleration is nearly drowned out by seemingly endless waves of announcements about shared, pooled, disaggregated, and every other flavor of memory that is not directly attached to the CPU. This article gives a brief background and explanation of CXL and helps guide decisions on whether it’s the right technology for your next SoC design.

CXL Background

When the CXL 1.0 spec was released, it brought the concept of FLITs (FLow-control unITs) into the PCIe world. Using the Alternate Protocol Negotiation feature introduced with the PCIe 5.0 spec, two PCIe link partners proceed through an otherwise normal PCIe link negotiation sequence, but instead of ending up in the PCIe “L0” state, they enter CXL mode and begin exchanging CXL FLITs. Because CXL is built around a host-centric, asymmetric cache coherency protocol, latency is critical to its effectiveness, and the entire specification is designed to minimize it. A discussion of the details and tradeoffs made to achieve that goal is outside the scope of this article, but for the 64-byte transfers typical of many CPUs’ cachelines, CXL offers several times lower latency than PCIe equivalents. CXL defines three different protocols: CXL.io, CXL.cache, and CXL.mem. CXL.io transports PCIe for backwards compatibility, bulk data transfers, and system configuration purposes, with no gain in performance over a native PCIe link. CXL.cache and CXL.mem are the real substance of the CXL spec: they support only 64-byte transfers but offer the low latency that has become CXL’s hallmark.
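The division of labor between the three protocols can be sketched in a few lines of Python. This is only an illustrative model, not anything defined by the CXL specification: the traffic category names, `pick_protocol`, and `split_into_cachelines` are hypothetical. The only facts taken from the text are that CXL.io carries configuration and bulk traffic, and that CXL.cache and CXL.mem operate on 64-byte units. The CXL.cache role (a device coherently caching host memory) is general background rather than something stated above.

```python
from enum import Enum

CACHELINE_BYTES = 64  # transfer size of CXL.cache and CXL.mem, per the text


class Protocol(Enum):
    IO = "CXL.io"
    CACHE = "CXL.cache"
    MEM = "CXL.mem"


def split_into_cachelines(address, length):
    """Return the 64-byte-aligned line addresses covering [address, address + length)."""
    if length <= 0:
        return []
    first = address - (address % CACHELINE_BYTES)
    last = (address + length - 1) // CACHELINE_BYTES * CACHELINE_BYTES
    return list(range(first, last + CACHELINE_BYTES, CACHELINE_BYTES))


def pick_protocol(kind):
    """Map a hypothetical traffic category to the protocol that would carry it."""
    table = {
        "config": Protocol.IO,                       # system configuration
        "bulk_transfer": Protocol.IO,                # bulk data transfers
        "device_caches_host_memory": Protocol.CACHE,
        "host_accesses_device_memory": Protocol.MEM,
    }
    return table[kind]
```

Under this model, an unaligned 64-byte access starting at 0x1030 touches two lines, so `split_into_cachelines(0x1030, 0x40)` returns `[0x1000, 0x1040]`.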

Figure 1: CXL Protocol Usage in a CXL System

CXL.mem Usage and Motivation

Originally it seemed like the CXL.mem protocol was defined simply to allow a host CPU to access an accelerator’s local memory, but the spec was wisely written generically enough to enable that host to access arbitrary memory in the CXL fabric. Initially, there was interest in using CXL.mem as a way to add non-volatile memory to systems. Traditional storage interfaces are block-oriented and difficult to adapt to the random access patterns typical of CPUs. While it’s always been possible to map memory on interfaces such as PCI Express, the I/O focus of those interfaces combined with legacy software means that memory attached there cannot effectively be cached by the host CPU. CXL.mem’s ability to access memory from the CPU’s existing cache architecture enables designers to easily attach NAND flash and other non-volatile memories in a way that appears to the host CPU as if it is simply more system memory. CXL is really the first interface to offer widespread access to non-volatile system memory, and that opens up a new frontier for software architecture which blurs the lines between files and memory.
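To make the idea of blurring files and memory a little more concrete, the sketch below uses Python's standard `mmap` module to update a file through ordinary byte-indexed stores instead of block-oriented write calls. It is only an analogy: it runs against a normal temporary file and the operating system's page cache, involves no CXL hardware, and the helper name `roundtrip` is made up for this example.

```python
import mmap
import os
import tempfile


def roundtrip(offset, payload, size=4096):
    """Store bytes through a memory mapping, then read them back via file I/O."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, bytes(size))  # the file must be non-empty before mapping
        os.close(fd)

        # Memory-style access: the file contents are addressed like a byte array.
        with open(path, "r+b") as f:
            with mmap.mmap(f.fileno(), 0) as view:
                view[offset:offset + len(payload)] = payload
                view.flush()

        # File-style access: the same bytes are visible through read().
        with open(path, "rb") as f:
            f.seek(offset)
            return f.read(len(payload))
    finally:
        os.remove(path)
```

The same data is reachable both as memory and as a file, which is the programming model that memory-mapped non-volatile memory makes attractive.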

It didn’t take long for system architects to expand this same vision of CXL-attached memory to traditional volatile memory technologies. Moving system memory away from the CPU makes it much more flexible. Where traditional architectures can leave unused memory connected to one specific CPU inaccessible to other CPUs, memory residing on a CXL interface can be allocated to CPU A for some period of time, and to CPU B at some other time. As CXL fabrics extend to box-to-box environments, one can imagine a near future in which memory from multiple physical servers can be pooled to service a memory-intensive job. Many segments of the datacenter market found this flexibility in memory placement to be extremely attractive, and this market has been exploding over the last couple of years. CXL.mem devices now dominate discussion at industry events such as Flash Memory Summit, Open Compute Project Summits, and more.
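The CPU A / CPU B scenario can be modeled with a toy allocator. This is a minimal sketch under stated assumptions: `MemoryPool`, its methods, and the capacities used are hypothetical, and a real pooled-memory system would involve fabric management and address mapping that are not shown here.

```python
class MemoryPool:
    """Toy model of a pool of CXL-attached memory shared by several hosts."""

    def __init__(self, capacity_gib):
        self.capacity_gib = capacity_gib
        self.allocations = {}  # host name -> GiB currently held

    @property
    def free_gib(self):
        return self.capacity_gib - sum(self.allocations.values())

    def allocate(self, host, size_gib):
        """Grant size_gib to host if the pool has room; return True on success."""
        if size_gib <= 0 or size_gib > self.free_gib:
            return False
        self.allocations[host] = self.allocations.get(host, 0) + size_gib
        return True

    def release(self, host):
        """Return everything held by host to the pool; return the GiB freed."""
        return self.allocations.pop(host, 0)
```

With a 256 GiB pool, a 192 GiB allocation to `cpu_a` leaves too little for a 128 GiB request from `cpu_b`. Once `cpu_a` releases its share, the same request from `cpu_b` succeeds, which is the reallocation that memory fixed to a single CPU cannot offer.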

Figure 2: CXL.mem Enables Memory Everywhere

The availability of CXL-attached memory products also gives designers the flexibility to use CXL interfaces for off-chip memory. In the past, an SoC designer had little choice other than to provision a DDR interface in order to access large amounts of off-chip memory. This is quite limiting when the same SoC might be used in configurations with no off-chip memory, where the DDR interface would sit unused. By instead using a CXL interface, the SoC architect can repurpose that interface when it’s not needed for memory attachment.

Adopting CXL.mem in SoC Design

SoC designers looking to incorporate CXL.mem into their architectures have many options available. Naturally the first attribute to consider, regardless of planned usage, is latency. It may not be immediately obvious, but standardized interfaces typically used for on-chip connectivity are usually not well suited to CXL.mem. Bridging between CXL.mem and an existing interface adds latency, and may even require store-and-forward buffering to account for protocol differences. Because of that, architects and designers are well advised to connect their CXL interface as logically close to their actual buffers and data movers as possible.
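A back-of-the-envelope model shows why store-and-forward buffering is costly for 64-byte transfers. The sketch below assumes a deliberately simplified bridge: a store-and-forward bridge waits for every beat of the payload before forwarding, while a cut-through bridge forwards after the first beat. The function names, datapath width, and clock frequency are illustrative assumptions, not figures from any real design.

```python
import math


def beats(payload_bytes, width_bytes):
    """Number of datapath beats needed to move the payload."""
    return math.ceil(payload_bytes / width_bytes)


def bridge_added_latency_ns(payload_bytes, width_bytes, clock_ghz, store_and_forward):
    """Time the payload waits in the bridge before its first beat can leave.

    Simplified model: store-and-forward waits for all beats, cut-through for one.
    """
    cycles = beats(payload_bytes, width_bytes) if store_and_forward else 1
    return cycles / clock_ghz
```

For a 64-byte transfer over an assumed 16-byte datapath at 1 GHz, this model gives 4.0 ns for store-and-forward and 1.0 ns for cut-through. The absolute numbers are hypothetical, but the gap between the two grows with every narrow interface the transfer must cross.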

Also less obvious is the PHY’s contribution to overall CXL latency. CXL’s low overall latency means that small variations in PHY latency, which were of little or no concern in PCIe designs, can now have a measurable effect. For the same reason, most commercially available CXL solutions are going to utilize the PIPE SerDes Architecture (as opposed to the Original PIPE Architecture) because it allows the CXL controller to optimize the datapath from the PHY.

There are two ways in which an SoC can connect to a CXL link. If the SoC is providing memory to the system, it will connect as a CXL.mem Device. If the SoC is connecting to a CXL memory device, the SoC will connect as a CXL.mem Host. For maximum flexibility, designers will likely want their CXL interfaces to be able to operate in either mode, Host or Device, and so the SoC architecture needs to consider data flow and control for both. The CXL.mem protocol is straightforward, so CXL Host functionality can be largely software-driven in most implementations. CXL.mem Devices can range from simple to extremely complex. Designers do need to be aware that CXL differs from many other interface specifications by including detailed device-level controls as part of the interface specification. Various quality of service monitors and controls are specified, as are address decoding and partitioning, so CXL.mem Device designers need to pay close attention when developing the internal control and status interfaces for their designs.
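To give a feel for the kind of device-side logic involved, here is a minimal sketch of address decoding across partitions. It is a conceptual model only: `Partition`, `AddressDecoder`, the partition names, and all addresses are hypothetical, and the code does not reproduce the decoder registers or partitioning commands that the CXL specification actually defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    name: str         # hypothetical label, e.g. "volatile" or "persistent"
    host_base: int    # first host physical address mapped to this partition
    size: int         # partition size in bytes
    device_base: int  # corresponding offset in device-local memory

    def contains(self, host_addr):
        return self.host_base <= host_addr < self.host_base + self.size


class AddressDecoder:
    """Translate host physical addresses into (partition, device-local address)."""

    def __init__(self, partitions):
        self.partitions = sorted(partitions, key=lambda p: p.host_base)
        # Reject configurations in which two host ranges overlap.
        for prev, cur in zip(self.partitions, self.partitions[1:]):
            if prev.host_base + prev.size > cur.host_base:
                raise ValueError(f"{prev.name} overlaps {cur.name}")

    def decode(self, host_addr):
        for p in self.partitions:
            if p.contains(host_addr):
                return p.name, p.device_base + (host_addr - p.host_base)
        return None  # address is not claimed by this device
```

Even in this reduced form, the decoder needs configuration state, overlap checking, and a defined response to unclaimed addresses, which hints at why the control and status interfaces deserve early planning.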

Conclusion

CXL.mem offers both system and SoC designers significant flexibility in implementing low-latency memory solutions. SoC designers need to focus on reducing their overall latency and plan early for specification-required application logic when implementing CXL.mem Devices. Synopsys offers a wide range of CXL and PCIe controller and PHY IP, including Dual-Mode (both CXL Host and CXL Device run-time selectable in a single controller) and Switch port support. Synopsys also has the expertise to help guide SoC architecture and design choices to achieve the best possible latency and performance.
