Reliability, Availability and Serviceability (RAS) for Memory Interfaces

Smaller process geometries and higher Double Data Rate (DDR) Dynamic Random Access Memory (DRAM) interface speeds are driving demand for new, more robust techniques for preventing, detecting, and repairing memory errors. Some of these techniques are enabled by features in the latest DDR4 and DDR3 RDIMM standards; others can be applied to any DRAM type. Collectively, these techniques improve the Reliability, Availability, and Serviceability (RAS) of the computing system that adopts them. This white paper reviews some of the ways that errors can occur in the DDR DRAM memory subsystem and discusses current and future methods of improving RAS in the presence of these errors.

  1. Introduction to DRAM Errors
    1. Soft error introduction through subatomic particles and cosmic rays
    2. Hard errors: Stuck-at faults and transition faults
    3. Retention faults
    4. Coupling faults
    5. Signal integrity errors affecting a single data bit, a group of data bits, or commands
    6. Dead DRAM
    7. On-chip failures
  2. Beginning RAS for DRAM: Parity, Hamming ECC and BIST
    1. Parity
    2. Hamming ECC
    3. Built-In Self-Test
    4. On-chip parity and on-chip ECC
  3. Advanced RAS for DRAM
    1. Advanced ECC
    2. Memory sparing
    3. Command retry
    4. Row hammering mitigation
    5. Thermal timing parameter adjustment
    6. On-chip parity and on-chip ECC
    7. Post-package repair
