Converging Cloud Computing Infrastructures with IP

Ron DiGiuseppe, Sr. Product Marketing Manager, Synopsys

Data center workloads and compute applications continue to migrate from traditional data centers to hyperscale data centers. According to Cisco’s Global Cloud Index, “By 2021, 94% of all workloads and compute instances will be processed in cloud data centers” (Figure 1). While many applications are hosted by hyperscale public cloud operators, many mission-critical workloads and compute instances are hosted by private hyperscale data centers. Private hyperscale data centers are expected to grow at “11% CAGR from 2016 to 2021.”

Although mega cloud providers develop customized rack-scale systems, private cloud providers typically adopt converged infrastructure (CI) or hyperconverged infrastructure (HCI) systems to improve efficiency and reduce management costs. CI and HCI systems enable private cloud providers to rapidly deploy new systems at scale with automated system configuration and control, and to virtualize compute, storage, and networking operations. The transition to CI and HCI systems is influencing semiconductor system-on-chip (SoC) suppliers to optimize server processors, low-latency storage solid state drives (SSDs), and networking switch designs. The demand for CI and HCI systems is driving a new class of SoC architectures that require the latest IP implementing industry-standard functions such as PCI Express (PCIe), DDR5, cache coherency, NVM Express (NVMe) SSD storage, and the highest-bandwidth Ethernet networking.

Figure 1: Distribution of workloads and compute instances between traditional and cloud data centers, per Cisco¹

CI systems combine compute, storage, network, and management into a single solution rather than treating them as separate data center functions. The comprehensive CI and HCI systems automate overall management and allow IT to focus on managing applications, not the infrastructure. The pre-integrated, rack-level systems reduce overall complexity as well as integration and operation costs. CI and HCI systems enable faster system deployment, easier interoperability, and consistent management while also reducing training and support overhead. To meet efficiency and performance requirements, the SoC elements used to build CI and HCI systems, such as design IP, are being optimized for processing, memory performance, and connectivity.

Improving Performance with NVMe SSDs and PCIe-Based Accelerators

Server-based SSDs can use the NVMe protocol running over the PCIe interface to connect directly to the server CPU and function as a cache accelerator, allowing frequently accessed data, or “hot” data, to be cached extremely quickly. High-performance NVMe SSDs running over PCIe, with extremely efficient input/output operation and low read latency, improve server efficiency and avoid having to access the data through an external storage device. NVMe SSDs running over PCIe for server acceleration are ideal for high-transaction private cloud applications such as database queries.
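The hot-data caching idea can be illustrated with a minimal sketch. It assumes a least-recently-used (LRU) eviction policy, which the article does not specify, and uses a plain dictionary to stand in for the external storage device. The class name HotDataCache and its methods are hypothetical and do not correspond to any NVMe driver or vendor API.

```python
from collections import OrderedDict


class HotDataCache:
    """Toy LRU cache standing in for a local NVMe SSD cache tier (hypothetical)."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity            # number of blocks the fast tier can hold
        self.backing_store = backing_store  # dict standing in for external storage
        self.blocks = OrderedDict()         # block_id -> data, least recently used first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_id)   # mark as most recently used
            return self.blocks[block_id]
        self.misses += 1
        data = self.backing_store[block_id]     # slow path: external storage
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict the least recently used block
        return data


external = {i: f"row-{i}" for i in range(100)}
cache = HotDataCache(capacity=4, backing_store=external)
for block in [1, 2, 1, 1, 3, 2, 1, 50, 1, 2]:
    cache.read(block)
print(f"hits={cache.hits} misses={cache.misses}")
```

In this access pattern the repeatedly requested blocks are served from the fast tier after their first read, which is the effect a server-side NVMe cache aims for with database queries.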

In addition to database acceleration using PCIe-based NVMe SSDs, CI and HCI systems use PCIe switch architectures to accelerate host processors for artificial intelligence (AI) applications. AI servers require processor acceleration to meet deep learning performance needs. Connecting host processors to GPUs and hardware-based accelerators optimizes deep learning algorithms because of the low latency offered by PCIe-based switch architectures, as shown in Figure 2. For applications that require cache coherency, the Cache Coherent Interconnect for Accelerators (CCIX) protocol, which is built on top of the PCI Express protocol stack, provides a high-speed connection between host processors and hardware accelerators. Running at a 25 Gbps data rate today, and soon at 32 Gbps, CCIX ensures that the system sees a single memory space by defining commands that update all components across the system when memory is updated, reducing the need to copy data. CCIX supports switch topologies, direct connect, and meshed connections.

Figure 2: Multi-host AI server based on PCIe switch architecture

Optimizing Applications with High Memory Performance

Converged compute, storage, and networking systems require the highest performance DRAM solutions to run virtual applications on host processors. The industry is moving from DDR4 DRAM to next-generation DDR5 and HBM2 DRAMs. DDR5 solutions can operate at data rates of up to 4800 Mbps and interface with multiple Dual In-Line Memory Modules (DIMMs) per channel, up to 80 bits wide, accelerating workloads for functions such as deep learning. DDR5 has additional Reliability, Availability, and Serviceability (RAS) features, including inline or sideband error correcting code (ECC), parity, and data cyclic redundancy checks (CRC), to reduce system downtime. HBM2 is an efficient solution with very high bandwidth and the lowest power per bit of data accessed when compared to DDR5 or DDR4 DRAM. SoC architects choose HBM2 memories when targeting high-bandwidth applications, DDR5 for high capacity, or a combination of both memory types for applications such as AI acceleration, which need both high-bandwidth HBM2 and large-capacity DDR DRAM.

Simplifying the Data Center Network

Traditional enterprise data centers use a tree-based network topology consisting of switched Ethernet with VLAN tagging. This topology defines only one path to the network, which has traditionally handled north-south data traffic between servers. CI and HCI systems used in private cloud data centers use a flat, two-tier leaf-spine architecture with 25G, 50G, 100G, or 200G Ethernet links that enable virtualized servers to distribute workflows among many virtual machines. The latest 400G Octal Small Form-factor Pluggable (OSFP) multi-mode transceivers, which use eight channels of 56G PAM-4 PHY IP, can support data center network topologies targeting up to 400G Ethernet by providing multiple 56G leaf-spine links. The industry is planning the transition to 112G PAM-4 Ethernet links for 400G Ethernet systems and to enable the move to 800G Ethernet applications.
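The lane arithmetic behind these link speeds can be sketched as follows. The sketch assumes that a "56G" PAM-4 lane carries a nominal 50 Gbps of Ethernet payload and a "112G" lane carries a nominal 100 Gbps, with the remainder of the line rate taken up by encoding overhead. The table and function names are hypothetical.

```python
import math

# Nominal Ethernet payload per electrical lane, in Gbps. The "56G" and "112G"
# labels include encoding overhead; 50 and 100 are assumed payload rates.
PAYLOAD_GBPS = {"56G PAM-4": 50, "112G PAM-4": 100}


def lanes_needed(port_gbps, serdes):
    """Number of SerDes lanes required to carry one Ethernet port."""
    return math.ceil(port_gbps / PAYLOAD_GBPS[serdes])


for port in (400, 800):
    for serdes in PAYLOAD_GBPS:
        print(f"{port}G Ethernet over {serdes}: {lanes_needed(port, serdes)} lanes")
```

With these assumptions, a 400G port needs eight 56G lanes, matching the eight-channel OSFP module described above, while 112G lanes halve the lane count for 400G and make 800G possible with eight lanes.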

To further simplify the data center network, CI and HCI systems can use a software-defined network (SDN), which is easier to manage because control is decoupled from the data path. A common software stack such as OpenFlow provides a consistent industry-wide software environment to control CI and HCI systems. Instead of depending on a proprietary software stack, SoC designers can rely on OpenFlow-managed data movement throughout the private cloud data center, which allows users to provision networks virtually, without requiring physical access to the network’s hardware devices.
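The separation of control from the data path can be illustrated with a toy model. This is not the OpenFlow protocol or any real controller API: the Switch and Controller classes, their methods, and the addresses are all hypothetical, and the flow table is reduced to a simple destination-to-port mapping.

```python
class Switch:
    """Data path: forwards packets using only its local flow table."""

    def __init__(self, name):
        self.name = name
        self.flow_table = {}   # destination address -> output port

    def forward(self, dst):
        # Unknown destinations are reported as a table miss.
        return self.flow_table.get(dst, "miss")


class Controller:
    """Control path: holds network-wide policy and programs the switches."""

    def __init__(self, switches):
        self.switches = {s.name: s for s in switches}

    def install_rule(self, switch_name, dst, out_port):
        # Provisioning happens remotely; no physical access to the switch.
        self.switches[switch_name].flow_table[dst] = out_port


leaf1, leaf2 = Switch("leaf1"), Switch("leaf2")
controller = Controller([leaf1, leaf2])

print(leaf1.forward("10.0.2.5"))                        # no rule installed yet
controller.install_rule("leaf1", "10.0.2.5", out_port=3)
print(leaf1.forward("10.0.2.5"))                        # rule now applies
```

The switches never decide policy themselves; they only look up what the controller has installed, which is what allows the network to be provisioned centrally in software.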

Summary

CI and HCI systems bring the three core aspects of a hyperscale data center -- compute, storage, and networking -- into a single solution. They replace a mix of diverse and often disconnected systems and management tools. As enterprise data centers continue the transition to private clouds, server and data center consolidation makes use of virtualization to allow many more workloads to operate on far fewer physical servers. System convergence makes use of the newest industry IP architectures and interface protocols to optimize applications such as low-latency database queries and deep learning. Hardware integration for CI and HCI systems uses a new class of optimized processors, advanced memory technology IP, IP interfaces, NVMe SSDs, and cache coherent accelerators.

To integrate processor IP, advanced memory IP, connectivity IP, NVMe storage, and cache coherent accelerators for CI and HCI systems, SoC designers need to consider technology tradeoffs, such as cost, power, performance, and development schedule. Figure 3 shows an advanced AI server SoC encompassing a host processor, security algorithm, system memory, connectivity, and accelerators.

Figure 3: AI acceleration/server SoC

Synopsys provides a comprehensive portfolio of high-quality, silicon-proven IP that enables designers to develop SoCs for cloud computing applications supporting CI and HCI systems. Synopsys’ DesignWare® Interface IP, Processor IP, and Foundation IP are optimized for high performance, low latency, and low power, while supporting advanced process technologies from 16-nm to 7-nm FinFET.

 

¹ Cisco Global Cloud Index, February 2018