GDDR and HBM are two distinct memory architectures targeting high-throughput applications such as graphics cards and AI.
GDDR DRAMs are specifically designed for graphics processing units (GPUs) and accelerators. They are commonly used in data-intensive systems such as graphics cards, game consoles, and high-performance computing applications including automotive, AI, and deep learning. The GDDR standards – GDDR6/5/5X – are architected as point-to-point (P2P) interfaces, capable of supporting data rates of up to 16Gbps. GDDR5 DRAMs, always used as discrete DRAM solutions and supporting up to 8Gbps, can be configured to operate in either ×32 mode or ×16 (clamshell) mode, which is detected during device initialization. GDDR5X targets a transfer rate of 10 to 14Gbps per pin, almost twice that of GDDR5. The key difference between GDDR5X and GDDR5 DRAMs is that GDDR5X uses a 16N prefetch instead of 8N. GDDR5X also uses 190 pins per chip, compared to 170 pins per chip in GDDR5, so the two standards require different PCBs. GDDR6, the latest GDDR standard, supports a higher data rate of up to 16Gbps at a lower operating voltage of 1.35V, compared to 1.5V for GDDR5.
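To put these per-pin rates in perspective, a rough peak-bandwidth calculation per device is shown below. This is only a sketch: the ×32 interface width comes from the standards above, while the 8-device, 256-bit board configuration is an illustrative assumption, not something mandated by the standards.

```python
# Rough peak-bandwidth estimate for GDDR devices (illustrative sketch).
# Assumes a x32 interface per device; per-pin rates taken from the text above.

def gddr_peak_bandwidth_gbytes(data_rate_gbps_per_pin: float,
                               bus_width_bits: int = 32) -> float:
    """Peak bandwidth in GB/s = per-pin rate (Gb/s) * interface width (bits) / 8 bits per byte."""
    return data_rate_gbps_per_pin * bus_width_bits / 8

for name, rate in [("GDDR5", 8), ("GDDR5X", 14), ("GDDR6", 16)]:
    per_device = gddr_peak_bandwidth_gbytes(rate)
    # Ganging 8 x32 devices into a 256-bit GPU bus is an assumed example configuration.
    print(f"{name}: {per_device:.0f} GB/s per x32 device, "
          f"{per_device * 8:.0f} GB/s across a 256-bit bus")
```

At 16Gbps per pin, a single ×32 GDDR6 device peaks at 64GB/s, and an assumed 256-bit bus of eight such devices reaches 512GB/s.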
HBM is an alternative to GDDR memory for GPUs and accelerators. GDDR achieves the required throughput with higher data rates over narrower channels, whereas HBM solves the same problem with eight independent channels, a much wider data path per channel (128 bits), and lower per-pin speeds of around 2Gbps. As a result, HBM provides high throughput at lower power and in a substantially smaller area than GDDR. HBM2 is the most popular standard in this category today, supporting data rates of up to 2.4Gbps.
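The same back-of-the-envelope arithmetic shows why the wide-but-slow HBM approach reaches comparable throughput. The sketch below assumes the 8-channel, 128-bit-per-channel stack organization described above:

```python
# Peak bandwidth of one HBM2 stack (illustrative sketch).
# Assumes 8 independent channels x 128 bits per channel, as described in the text.

def hbm_stack_bandwidth_gbytes(data_rate_gbps_per_pin: float,
                               channels: int = 8,
                               bits_per_channel: int = 128) -> float:
    """Peak bandwidth in GB/s for one stack."""
    return data_rate_gbps_per_pin * channels * bits_per_channel / 8

print(hbm_stack_bandwidth_gbytes(2.0))   # 256.0 GB/s at 2.0 Gbps per pin
print(hbm_stack_bandwidth_gbytes(2.4))   # 307.2 GB/s at 2.4 Gbps per pin
```

Even at roughly one-eighth the per-pin speed of GDDR6, a single HBM2 stack delivers on the order of 256 to 307GB/s thanks to its 1024 data pins.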
HBM2 DRAMs stack up to eight DRAM dies, plus an optional base die, offering a small silicon footprint. The dies are interconnected with through-silicon vias (TSVs) and micro-bumps. Commonly available densities are 4GB or 8GB per HBM2 package.
Besides supporting a higher number of channels, HBM2 also introduces several architectural changes to boost performance and reduce bus congestion. For example, HBM2 has a ‘pseudo channel’ mode, which splits every 128-bit channel into two semi-independent 64-bit sub-channels that share the channel’s row and column command buses while executing commands individually. Increasing the number of channels also increases the overall effective bandwidth, because restrictive timing parameters such as tFAW apply per channel, allowing more banks to be activated per unit of time. Other features supported in the standard include optional ECC, providing 16 error-detection bits per 128 bits of data.
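The benefit of per-channel timing constraints can be seen with a short activate-rate sketch. The tFAW value used here is an illustrative assumption, not a figure from the standard; the point is only that the device-level activate rate scales with the number of independent channels:

```python
# Illustrative sketch: how channel count scales the activate rate under tFAW.
# tFAW limits each channel to at most 4 ACTIVATE commands per rolling tFAW window,
# so the total activate rate grows linearly with the number of independent channels.

T_FAW_NS = 30.0          # assumed tFAW window in nanoseconds (illustrative value)
ACTIVATES_PER_WINDOW = 4  # four-activate-window limit per channel

def activates_per_microsecond(channels: int) -> float:
    return channels * ACTIVATES_PER_WINDOW / T_FAW_NS * 1000

print(activates_per_microsecond(1))   # a single channel
print(activates_per_microsecond(8))   # eight independent HBM2 channels
```

With eight channels, eight times as many banks can be activated per unit of time than with a single channel subject to the same tFAW window.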
HBM3 is expected to hit the market in a few years and provide higher density, greater bandwidth (512GB/s), lower voltage, and lower cost.
Table 1 shows a high-level comparison of GDDR6 and HBM2 DRAMs: