Artificial Intelligence is transforming data center design. As models grow and compute clusters scale, networking inside the rack has become just as critical as the accelerators themselves. Traditional Ethernet was built for scale-out, connecting racks and buildings, but AI workloads demand something different: scale-up networking. This means ultra-fast, low-latency links between hundreds or even thousands of accelerators within a single pod.
AI pods have changed the rules of networking. Inside a rack, accelerators must share data with sub-microsecond latency and deterministic performance while keeping tail latency in check. The industry is aligning behind open, Ethernet-based approaches that deliver scale-up performance without vendor lock-in, most notably: ESUN (an OCP workstream), SUE (an OCP-published framework), and UALink (a consortium standard for memory semantics).
In this article, we go deeper into each Ethernet-based initiative: its scope, the problems it solves, how it complements the others, and how Synopsys 224G Ethernet PHY IP provides the common physical layer for all three.
Scale-out Ethernet connects racks and even entire data centers, handling traffic across large distances. In contrast, scale-up Ethernet operates inside a single pod, linking tens to thousands of accelerators over short, single-hop paths. This environment demands extremely low latency and high bandwidth because accelerators need to exchange data rapidly for AI workloads.
Figure 1. To process AI workloads effectively, the entire accelerator cluster must operate as one computer
To meet these requirements, industry groups are adapting Ethernet’s lower layers (L1 & L2) and transport behaviors for scale-up scenarios. The goal is to keep Ethernet’s operational familiarity and broad supply chain while optimizing for small headers, lossless transport, in-order delivery, and tight Quality of Service (QoS)—all critical for collective operations like all-reduce and shared-memory patterns in AI training.
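To see why small headers matter at pod scale, the short Python calculation below compares payload efficiency for a conventional Ethernet/IPv4/UDP encapsulation against a compact scale-up header. The 16-byte compact header is an illustrative assumption, not a size taken from any of the specifications discussed here:

```python
# Rough illustration of why header overhead matters for small AI-collective
# messages. Header sizes are illustrative assumptions, not spec values.

ETH_IP_UDP_OVERHEAD = 14 + 20 + 8   # classic Ethernet + IPv4 + UDP headers, bytes
COMPACT_OVERHEAD = 16               # assumed compact scale-up header, bytes

def efficiency(payload_bytes: int, overhead_bytes: int) -> float:
    """Fraction of each frame that carries actual payload."""
    return payload_bytes / (payload_bytes + overhead_bytes)

for payload in (64, 256, 1024):
    std = efficiency(payload, ETH_IP_UDP_OVERHEAD)
    opt = efficiency(payload, COMPACT_OVERHEAD)
    print(f"{payload:5d} B payload: standard {std:.0%} vs compact {opt:.0%} efficient")
```

At 64-byte payloads, the conventional stack spends roughly 40% of every frame on headers, and small payloads are exactly the regime that collective operations live in.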
ESUN (Ethernet for Scale-Up Networking) is an OCP workstream launched at the OCP Global Summit 2025. It's an open technical forum to advance Ethernet for the scale-up domain, focusing on how switches and NICs handle AI pod traffic: framing, lossless delivery, header optimization, and interoperability across vendors. ESUN explicitly coordinates with IEEE 802.3 and the Ultra Ethernet Consortium (UEC).
Figure 2. ESUN diagram, taken from the Open Compute Project blog post "Introducing ESUN: Advancing Ethernet for Scale-Up AI Infrastructure at OCP"
ESUN creates the standards-coordination layer many operators want: open, vendor-neutral L2/L3 behavior that they can implement on different switch silicon and NICs without abandoning Ethernet. This complements endpoint/transport efforts (e.g., SUE-T in OCP and UEC profiles) and makes Ethernet the connective tissue for scale-up as well as scale-out.
SUE (Scale-Up Ethernet) is a framework specification contributed to OCP (v1.0 released September 5, 2025) that spells out how an Ethernet-based pod should work. It introduces an AI Fabric Header (AFH) to minimize per-packet overhead and defines Link-Level Retry (LLR) and Credit-Based Flow Control (CBFC) for hop-by-hop reliability with deterministic tail latency.
Figure 3. Example mesh-deployment use case, taken from the OCP SUE specification
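To make CBFC concrete, here is a minimal Python sketch of the idea, assuming a simple one-credit-per-receive-buffer scheme. The class name, credit counts, and method names are invented for illustration and do not reflect the SUE wire protocol:

```python
# Conceptual sketch of hop-by-hop credit-based flow control (CBFC).
# A sender may only transmit while it holds credits; the receiver
# returns a credit for every buffer it frees, so the link never
# drops frames for lack of buffer space. Names are illustrative.

from collections import deque

class CreditedLink:
    def __init__(self, receiver_buffers: int = 8):
        self.credits = receiver_buffers      # one credit per receive buffer
        self.rx_queue = deque()

    def send(self, frame: bytes) -> bool:
        """Transmit only if a credit is available; otherwise back-pressure."""
        if self.credits == 0:
            return False                     # sender waits; frame is not dropped
        self.credits -= 1
        self.rx_queue.append(frame)
        return True

    def receiver_drain(self) -> None:
        """Receiver processes one frame and returns a credit to the sender."""
        if self.rx_queue:
            self.rx_queue.popleft()
            self.credits += 1

link = CreditedLink(receiver_buffers=2)
assert link.send(b"frame-0") and link.send(b"frame-1")
assert not link.send(b"frame-2")             # out of credits: lossless back-pressure
link.receiver_drain()                        # credit returned...
assert link.send(b"frame-2")                 # ...so the sender proceeds
```

The point of the credit loop is that loss is prevented before it happens, rather than detected and repaired afterward, which is what keeps tail latency deterministic.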
SUE's LLR/CBFC mechanisms align with UEC's Ethernet extensions, which document control ordered sets and preamble signaling for LLR/CBFC at the PHY/PCS boundary, evidence of cross-community convergence on scale-up reliability.
SUE gives operators and silicon vendors a concrete how-to for building Ethernet pods today. AFH reduces header overhead; LLR/CBFC deliver lossless, deterministic transport without heavyweight end-to-end retries; and the single-hop model fits the core AI collective traffic shape.
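Link-level retry can be pictured the same way. Below is a toy Python model of the replay-buffer idea, assuming per-frame sequence numbers and explicit ack/nack callbacks; the actual SUE encoding of these signals differs, so treat this as a conceptual sketch only:

```python
# Toy model of link-level retry (LLR): every frame carries a sequence
# number and stays in a replay buffer until the link partner acks it.
# On a NACK the sender replays from the failed sequence number,
# recovering within one hop instead of end to end.

class LlrSender:
    def __init__(self):
        self.next_seq = 0
        self.replay = {}                 # seq -> frame, kept until acked

    def transmit(self, frame: bytes) -> int:
        seq = self.next_seq
        self.replay[seq] = frame
        self.next_seq += 1
        return seq                       # frame goes on the wire tagged with seq

    def on_ack(self, seq: int) -> None:
        self.replay.pop(seq, None)       # partner got it; free the buffer

    def on_nack(self, seq: int) -> list[bytes]:
        # Replay everything from the corrupted frame onward, in order.
        return [self.replay[s] for s in sorted(self.replay) if s >= seq]

sender = LlrSender()
for payload in (b"a", b"b", b"c"):
    sender.transmit(payload)
sender.on_ack(0)
print(sender.on_nack(1))                 # -> [b'b', b'c'], replayed hop-locally
```

Because recovery stays within a single hop, the worst-case stall is one link round trip rather than a full end-to-end retransmission timeout.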
UALink is a consortium standard enabling load/store/atomic operations directly between accelerators over UALink switches, turning a pod into a shared-memory domain. The UALink 200G 1.0 specification (publicly released April 2025) supports 200G-per-lane signaling and x1/x2/x4 link configurations, and scales to 1,024 accelerators per pod with deterministic performance.
Figure 4. Scalable multi-node accelerator system with UALink high-speed interconnect. Taken from: UALink™ 200G 1.0 Specification Overview – UALink Consortium
OCP collaboration: OCP and UALink have announced joint work to integrate UALink into community-delivered AI clusters.
UALink provides memory semantics and transaction-layer (TL)/protocol behaviors; ESUN/SUE address L2/L3 framing, reliability, and switch behavior. Many deployments will use UALink inside the pod and ESUN/SUE-aligned Ethernet behaviors for interoperability and operations, with standard Ethernet for scale-out between pods.
AI workloads increasingly rely on shared-memory programming models, which traditional Ethernet cannot efficiently support. UALink addresses this by enabling load/store and atomic operations across accelerators, making a pod behave like a unified memory domain. This reduces software complexity, improves performance for memory-intensive models, and scales to 1,024 accelerators per pod—all while leveraging standard Ethernet physical components for cost efficiency and ecosystem compatibility.
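As a mental model for what memory semantics buy software, consider the hypothetical sketch below. A real UALink stack exposes load/store/atomics through the accelerator's memory system rather than a software API; the PodMemoryDomain class and its methods are invented purely for illustration:

```python
# Hypothetical sketch of load/store/atomic semantics across a pod as
# seen by software. All names here are invented for illustration and
# are not part of the UALink specification.

class PodMemoryDomain:
    """Models every accelerator's memory as one addressable domain."""

    def __init__(self, accelerators: int, words_per_accel: int):
        self.mem = [[0] * words_per_accel for _ in range(accelerators)]

    def load(self, accel: int, addr: int) -> int:
        return self.mem[accel][addr]         # remote read, no send/recv code

    def store(self, accel: int, addr: int, value: int) -> None:
        self.mem[accel][addr] = value        # remote write

    def atomic_add(self, accel: int, addr: int, delta: int) -> int:
        # Fabric-level atomics let many accelerators update one counter
        # (e.g., a gradient-ready flag) without explicit locking messages.
        old = self.mem[accel][addr]
        self.mem[accel][addr] = old + delta
        return old

pod = PodMemoryDomain(accelerators=4, words_per_accel=16)
pod.store(accel=2, addr=0, value=42)         # write into accelerator 2's memory
print(pod.load(accel=2, addr=0))             # any peer reads it back directly
pod.atomic_add(accel=2, addr=0, delta=1)
```

The contrast with message passing is the whole point: no explicit send/receive choreography appears in the code, which is what "reduces software complexity" means in practice.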
| Dimension | ESUN (OCP Workstream) | SUE (OCP Framework) | UALink (Consortium Spec) |
| --- | --- | --- | --- |
| Primary Scope | Open L2/L3 behaviors, headers, lossless delivery, interop across vendors | Pod-level encapsulation + reliability (AFH, LLR, CBFC) for single-hop fabrics | Memory semantics (load/store/atomic), accelerator-to-accelerator transactions |
| Standardization Host | OCP workstream coordinating with IEEE 802.3 and UEC | OCP-published spec (v1.0) | UALink Consortium |
| Key Benefits | Vendor-neutral interop; open headers and QoS for AI pods | Deterministic tail latency; efficient small-payload handling | Shared-memory programming model; deterministic bandwidth/latency |
Figure 5. Networking layers, showing common L1 leveraging Synopsys IP
No matter which approach you choose, ESUN, SUE, or UALink, the physical layer (L1) is critical. It must deliver high signal integrity, margin, and jitter tolerance, with zero post-FEC BER across real-world channels. Synopsys 224G Ethernet PHY IP meets these demands, enabling up to 1.6T Ethernet and supporting UALink 200G signaling, all while complying with evolving IEEE 802.3 and OIF CEI-224G electrical specifications.
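To put the zero post-FEC BER requirement in perspective, the back-of-the-envelope Python calculation below compares error arrival rates on a single 224 Gb/s lane at two bit error rates. The 1e-4 raw-BER figure is an illustrative assumption for a PAM4 link before FEC, not a number from the IEEE or OIF specifications:

```python
# Back-of-the-envelope: how often bit errors would appear on one
# 224 Gb/s lane at different bit error rates. Targets are illustrative.

LANE_RATE_BPS = 224e9                        # bits per second on one lane

for ber in (1e-4, 1e-15):
    errors_per_sec = LANE_RATE_BPS * ber
    if errors_per_sec >= 1:
        print(f"BER {ber:.0e}: ~{errors_per_sec:.1e} bit errors every second")
    else:
        print(f"BER {ber:.0e}: one bit error every {1/errors_per_sec:.1e} seconds")
```

At a raw BER of 1e-4, errors arrive tens of millions of times per second on a single lane, which is why strong FEC plus a PHY with ample electrical margin is non-negotiable at these rates.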
Open Ethernet is emerging as the foundation for scale-up AI networking. Together, ESUN, SUE, and UALink form a complementary stack: ESUN coordinates open Ethernet behaviors, SUE provides a pod-level blueprint with features like AFH and LLR/CBFC to reduce overhead and latency, and UALink introduces memory semantics for shared-memory workloads—scaling up to 1,024 accelerators per pod using Ethernet physical components. These efforts deliver lossless, deterministic performance without proprietary lock-in.
At the physical layer, high-speed Ethernet PHY technology enables all three approaches. Synopsys' silicon-proven 224G Ethernet PHY IP delivers the performance, interoperability, and robustness needed to accelerate deployment of open, multi-vendor scale-up fabrics, ensuring readiness for next-generation AI clusters.