Increased Storage Capacity
HBM3 supports die densities of up to 32 Gb and stacks of up to 16 DRAM dies, resulting in a maximum of 64 GB of storage per stack. This is almost a 3x increase over HBM2E, providing more memory for advanced applications.
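The headline numbers follow directly from the per-die density and stack height. The short sketch below checks the arithmetic; the HBM2E comparison point (16 Gb dies in a 12-high stack) is an assumption based on commonly cited configurations, not something stated above.

```python
# Back-of-the-envelope HBM3 stack capacity (maximum configuration)
die_density_gbit = 32          # Gb per DRAM die (HBM3 maximum)
stack_height = 16              # dies per stack (HBM3 maximum)

stack_capacity_gbyte = die_density_gbit * stack_height / 8    # 512 Gb -> 64 GB

# HBM2E maximum for comparison (assumed: 16 Gb dies, 12-high stack)
hbm2e_capacity_gbyte = 16 * 12 / 8                            # 24 GB

print(stack_capacity_gbyte)                         # 64.0
print(stack_capacity_gbyte / hbm2e_capacity_gbyte)  # ~2.67, i.e. "almost 3x"
```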
Faster Data Transfer Rates
With a top speed of 6.4 Gbps, HBM3 is almost double the speed of HBM2E (3.6 Gbps). The market may also see a second generation of HBM3 devices in the not-too-distant future. One need only look at the speed history of HBM2/HBM2E, DDR5 (6400 Mbps, later extended to 8400 Mbps), and LPDDR5, which maxed out at 6400 Mbps and quickly gave way to LPDDR5X operating at 8533 Mbps, to see that HBM3 above 6.4 Gbps is within reason.
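The per-pin data rate translates directly into per-stack bandwidth across HBM3's 1024-bit interface (described further below). The sketch below shows the arithmetic; the helper function name is just illustrative.

```python
# Peak per-stack bandwidth from the per-pin data rate (illustrative helper)
def peak_bandwidth_gbytes(pin_rate_gbps: float, bus_width_bits: int = 1024) -> float:
    """Peak bandwidth in GB/s for one HBM stack."""
    return pin_rate_gbps * bus_width_bits / 8

print(peak_bandwidth_gbytes(3.6))   # HBM2E: 460.8 GB/s
print(peak_bandwidth_gbytes(6.4))   # HBM3:  819.2 GB/s
```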
Improved Power Efficiency
HBM2E already offers the lowest energy per bit transferred, largely because it is an unterminated interface, but HBM3 improves on it substantially. HBM3 decreases the core voltage to 1.1V from HBM2E’s 1.2V. In addition to the 100mV core supply drop, HBM3 reduces the I/O signaling voltage to 400mV, down from HBM2E’s 1.2V.
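As a rough, first-order illustration of what those voltage reductions mean, the sketch below applies a simple CV²f scaling model to the core supply and compares the signaling swings. Actual energy per bit depends on capacitance, driver design, and the signaling scheme, none of which are specified here, so treat these numbers as directional only.

```python
# First-order scaling of the HBM3 supply/swing reductions (illustrative only)
core_v_hbm2e, core_v_hbm3 = 1.2, 1.1
io_swing_hbm2e, io_swing_hbm3 = 1.2, 0.4

# Assuming dynamic core power scales roughly with V^2 (CV^2f model)
core_power_ratio = (core_v_hbm3 / core_v_hbm2e) ** 2
print(f"core switching power: ~{core_power_ratio:.0%} of HBM2E")   # ~84%

# The I/O signaling swing drops to one third of its HBM2E value
print(f"I/O swing ratio: {io_swing_hbm3 / io_swing_hbm2e:.2f}")    # 0.33
```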
Enhanced Channel Architecture
HBM3 keeps the overall interface width the same for the HBM DRAMs – 1024 bits of data. However, this 1024-bit interface is now divided into 16 64-bit channels or, more importantly, 32 32-bit pseudo-channels. Since the width of a pseudo-channel has been reduced to 4 bytes, the burst length of memory accesses has increased to 8 beats – maintaining a 32-byte packet size for memory accesses. Doubling the number of pseudo-channels is itself a performance improvement over HBM2E; combined with the increase in data rate, HBM3 can provide a substantial increase in performance over HBM2E.
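The channel arithmetic works out as shown below; the HBM2E comparison point (64-bit pseudo-channels with a burst length of 4) is an assumption drawn from that generation's commonly described configuration rather than from the text above.

```python
# HBM3: how the 1024-bit interface divides into channels and pseudo-channels
bus_width_bits = 1024
channels = 16
pseudo_channels_per_channel = 2

channel_width_bits = bus_width_bits // channels                       # 64
pc_width_bits = channel_width_bits // pseudo_channels_per_channel     # 32 (4 bytes)
burst_length = 8                                                      # beats per access

access_size_bytes = (pc_width_bits // 8) * burst_length               # 4 B x 8 = 32 B
print(channel_width_bits, pc_width_bits, access_size_bytes)           # 64 32 32

# HBM2E (assumed): 64-bit pseudo-channels, burst length 4 -> same 32 B granularity
print((64 // 8) * 4)                                                  # 32
```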
Advanced RAS Features
HBM3 introduces several improvements to its reliability, availability, and serviceability (RAS) capabilities, including on-die error correcting code (ECC). HBM3 DRAM devices also support error check and scrub (ECS) when the device is in self-refresh mode or when the host issues a refresh-all-bank command. The results of the ECS operation can be obtained by reading the ECC transparency registers via the IEEE standard 1500 Test Access Port (TAP). Additionally, the HBM3 standard's new RAS features include support for refresh management (RFM) or adaptive refresh management (ARFM).
New Clocking Architecture
HBM3 changes the clocking architecture by decoupling the traditional clock signal from the host to the device (CK) from the data strobe signals (WDQS and RDQS). Decoupling the clock from the strobes allows the clock to run significantly slower than the data strobes. While the new maximum rate of WDQS and RDQS in HBM3 is 3.2 GHz, enabling a data transfer rate of up to 6.4 Gbps, the fastest rate CK will run from the host to the device is only 1.6 GHz, even when the data channels are operating at 6.4 Gbps. Because the command and address (CA) clock is capped at a maximum rate of 1.6 GHz, the maximum transfer rate on the CA bus is now 3.2 Gbps. By comparison, HBM2E requires a CA transfer rate of 3.6 Gbps.
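Assuming double-data-rate transfers relative to both the strobes and the CA clock, which is consistent with the rates quoted above, the clock-to-transfer-rate relationships reduce to the following sketch.

```python
# HBM3 clock vs. transfer-rate relationships (sketch; assumes DDR transfers)
ck_max_ghz = 1.6        # host-to-device clock (CK), also caps the CA clock
strobe_max_ghz = 3.2    # WDQS / RDQS

data_rate_gbps = strobe_max_ghz * 2   # two transfers per strobe cycle -> 6.4 Gbps/pin
ca_rate_gbps = ck_max_ghz * 2         # two transfers per CA clock cycle -> 3.2 Gbps

print(data_rate_gbps, ca_rate_gbps)   # 6.4 3.2
```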
High-speed Internal Clocking
The new HBM3 clocking architecture enables designers to stay focused on a low-latency, high-performance solution when migrating from HBM2E to HBM3. The highest defined frequency for the CA bus in HBM3 is 1.6 GHz, while the data strobes operate at 3.2 GHz. This enables users to implement a DFI 1:1:2 frequency ratio for an HBM3 controller and PHY. In this case, the controller, DFI, PHY, and memory clocks all run at 1.6 GHz while the strobe frequency is 3.2 GHz. This gives designers a DFI 1:1 frequency ratio for the CA interface and a DFI 1:2 frequency ratio for the data, all of which minimizes latency.
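A hypothetical clock plan for a controller/PHY pair built around that DFI 1:1:2 ratio might look like the sketch below; the names are illustrative and not tied to any particular controller IP.

```python
# Hypothetical DFI 1:1:2 clock plan for an HBM3 controller + PHY (names are illustrative)
clock_plan_mhz = {
    "controller_clk": 1600,   # memory controller core clock
    "dfi_clk": 1600,          # DFI interface clock (1:1 with the controller)
    "memory_ck": 1600,        # CK / CA clock to the HBM3 device
    "dqs": 3200,              # WDQS/RDQS strobe frequency (1:2 vs. the DFI clock)
}

assert clock_plan_mhz["memory_ck"] == clock_plan_mhz["dfi_clk"]   # 1:1 on the CA path
assert clock_plan_mhz["dqs"] == 2 * clock_plan_mhz["dfi_clk"]     # 1:2 on the data path
```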