DesignWare Technical Bulletin

Configurable & Extensible 32-bit RISC Processors for Next-Generation SSDs

By Mike Thompson, Sr. Product Marketing Manager and Martyn Bronziet, CAE, Synopsys

There is a transition occurring in mass storage used for computers, tablets, games and servers. Traditionally the mass storage used in these devices has been implemented with hard disk drives (HDDs), but this is changing due to the advantages that solid state drives (SSDs) offer. The functionality needed for SSDs is different from that of HDDs, and this requires a different approach to the control functions built into the storage device. Typically HDDs were built with dedicated hardware, general purpose processors (GPPs), or digital signal processors (DSPs) as the control element due to the nature of the tasks required. This is changing with SSDs, which call for a configurable RISC processor that supports signal processing and can be customized for the desired control functions to maximize performance while minimizing power consumption.

Processing Requirements of SSDs

Although HDDs have been used for mass storage for more than 30 years, SSDs are gaining traction in consumer and enterprise applications due to several significant advantages. Flash memory-based SSDs are faster, more durable, silent, use less power and take up less space than HDDs.

However, SSDs do have some disadvantages compared to HDDs. For example, SSDs are more expensive per GB of storage capacity, although this is changing as higher volumes reduce the price point. In addition, the flash memory used to implement SSDs has a limited lifetime due to its limited number of program and erase cycles, while HDDs have no such limitation. Therefore, SSDs must implement wear leveling to distribute writes and erases as equally as possible across all storage cells on the flash device. As each block in a NAND flash can be subjected only to a limited number of program/erase cycles, new data is often written to a new segment and the index is updated, instead of rewriting over a previously used segment. Data structures such as red-black trees are commonly used to implement this index. A second limitation of SSDs is that flash memories require more sophisticated error checking and correction (ECC) than HDDs. ECC algorithms are essential for maximizing SSD endurance because SSDs become more susceptible to bit errors as geometries shrink and more bits are stored per cell. Both the wear leveling and ECC requirements have significant implications for control processing and must be implemented to ensure that the SSD will be usable over the intended lifetime of the product.
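The write-to-a-new-segment-and-update-the-index behavior described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name WearLevelingFTL, the one-page-per-block simplification, and the use of a Python dict as the index in place of the red-black tree a real controller might use. Garbage collection and bad-block handling are omitted.

```python
class WearLevelingFTL:
    """Toy flash translation layer (hypothetical): one logical page per block."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks  # program/erase cycles per block
        self.data = [None] * num_blocks       # contents of each physical block
        self.free = set(range(num_blocks))    # blocks currently available
        self.index = {}                       # logical address -> physical block

    def write(self, logical_addr, payload):
        # Choose the free block with the fewest erase cycles
        # (ties are broken by the lowest block number).
        target = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(target)
        self.data[target] = payload

        # Update the index instead of rewriting the old block in place.
        old = self.index.get(logical_addr)
        self.index[logical_addr] = target

        if old is not None:
            # The previous copy is stale: erase it and return it to the pool.
            self.data[old] = None
            self.erase_counts[old] += 1
            self.free.add(old)

    def read(self, logical_addr):
        return self.data[self.index[logical_addr]]


ftl = WearLevelingFTL(num_blocks=4)
for i in range(8):
    ftl.write(0, f"version {i}")  # rewrite the same logical address repeatedly
```

Although the same logical address is rewritten every time, the erases are spread over all four physical blocks rather than concentrated on one, which is the effect wear leveling is meant to achieve.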

The Flexibility / Performance Tradeoff

Wear leveling and ECC requirements for an SSD controller can be met with RISC processors that offer flexibility in implementation as well as application-specific customization options to maximize performance while minimizing power. The GPPs used with HDDs are programmable, and in that sense flexible, but they do not offer customization: they lack the specialized data paths needed to process some of the key SSD algorithms efficiently, and will likely fall short when high-demand mathematics come into play.

The hardwired solutions used with some HDDs are impractical for SSDs because they lack flexibility and are not easily adaptable to the range of platforms on which SSDs are used. Hardwired solutions also make it difficult to offer single chips that support a range of products and respond to rapidly changing market requirements.

The DSP solutions used with HDDs could potentially be used for SSDs, but the signal processing requirements are different with SSDs. DSPs offer mathematical computation flexibility but are not as efficient for embedded control.

To address these limitations, a flexible RISC processor and a DSP could be used together in the design, but this increases power consumption and cost, while negatively impacting design time. Two different processors require two different development tool flows, and tasks between the two processors must be synchronized in user software, adding a level of complexity. For balanced power consumption, cost, and time-to-market, SSDs can use a single flexible, customizable RISC processor that offers high performance with some signal processing capability.

Processor Configurability

A processor that can be configured by the user provides a great deal of flexibility. Configurable processors allow designers to make fine-grained tradeoffs to balance performance against silicon area and power. RISC processor IP vendors provide variable levels of configurability. Some processors allow the instruction set to be adjusted so that, for example, DSP features, floating point acceleration, and dividers may be added around a base processor. The processors can also be configured to add and alter architectural components such as context switch acceleration, interrupt capabilities, register file size, internal memory arrangement, and bus interface characteristics, allowing the designer to fine-tune the processor implementation. The greater the degree of configurability, the more finely tuned the processor can be for the specific application, which is an advantage in SSDs where it is desirable to maximize performance while minimizing power consumption.

User-Defined Custom Extensions

For even greater levels of flexibility, SSD designers can use a processor that allows them to add their proprietary hardware to the processor through custom instructions. The capability to extend the processor pipeline with user-defined functionality opens up a wealth of opportunities for performance improvement and power reduction. This ability to add user-defined instructions is especially useful in applications in which wide data paths are required, where a single instruction can operate on multiple data elements simultaneously to gain significant performance benefits and where the program flow needs to change rapidly based on external stimuli.
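As a rough illustration of a single operation working on multiple data elements, the sketch below emulates in software what a hypothetical packed-add custom instruction could do in hardware: it adds four 8-bit lanes held in one 32-bit word, with no carry crossing from one lane into the next. The function name add_bytes and the choice of 8-bit lanes are assumptions made for this example and do not describe any particular processor's instruction set.

```python
HIGH_BITS = 0x80808080  # bit 7 of each 8-bit lane
LOW_BITS = 0x7F7F7F7F   # bits 0-6 of each 8-bit lane


def add_bytes(a, b):
    """Add four packed 8-bit lanes of two 32-bit words, modulo 256 per lane."""
    # Adding only the low 7 bits of each lane cannot overflow into the
    # neighboring lane; bit 7 then holds the carry out of the low bits.
    low_sum = (a & LOW_BITS) + (b & LOW_BITS)
    # Fold in the top bit of each lane with XOR, discarding the lane's
    # carry-out so that it never reaches the next lane.
    return low_sum ^ ((a ^ b) & HIGH_BITS)


# Lanes (hex): 01+01, FF+01, 7F+01, 10+10
result = add_bytes(0x01FF7F10, 0x01010110)
print(hex(result))  # 0x2008020, i.e. lanes 02, 00, 80, 20
```

In software this takes several masking and arithmetic steps per word; the appeal of a user-defined instruction is that the same lane-wise result could be produced by dedicated logic in the processor pipeline.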

Figure 1: Extending the ALU on a customizable processor

User-extensible processors can optimize the algorithms typically found in SSD controllers. For example, red-black tree manipulation makes extensive use of comparisons and index manipulation. Custom extensions using dedicated instructions, condition flags and registers have been shown to reduce cycle counts on tree operations, such as searches, by 50% for a modest increase in silicon area. That reduction can be taken either as a significant increase in performance or as a reduction in power consumption.
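To show where those comparisons and index manipulations occur, the sketch below walks a small search tree that maps logical block numbers to physical block numbers. A lookup in a red-black tree follows the same path as in any binary search tree, since the coloring only affects insertion and deletion, so the balancing logic is left out. The Node class, the search function, and the sample mapping are all hypothetical names and values chosen for illustration.

```python
class Node:
    """One tree entry: logical block number -> physical block number."""

    def __init__(self, key, value, left=None, right=None):
        self.key = key
        self.value = value
        self.left = left
        self.right = right


def search(root, key):
    """Return (value, nodes_visited); value is None if the key is absent."""
    node = root
    visited = 0
    while node is not None:
        visited += 1
        # Each visited node costs a key comparison, a branch on the result,
        # and a child-pointer load. These are the steps a fused custom
        # instruction with dedicated flags and registers would target.
        if key == node.key:
            return node.value, visited
        node = node.left if key < node.key else node.right
    return None, visited


# Hypothetical mapping: logical blocks 20..70 -> physical blocks.
root = Node(50, 7,
            left=Node(30, 2, left=Node(20, 9), right=Node(40, 4)),
            right=Node(70, 1))

print(search(root, 40))  # (4, 3): found after visiting three nodes
print(search(root, 60))  # (None, 2): not present
```

Because the loop body is dominated by compare, branch and load operations repeated at every level of the tree, shortening that sequence has a direct effect on the cycle count of each lookup.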

DesignWare ARC Configurable and Extensible Processors

Storage solutions require flexible processing capabilities that can be tailored for specific performance, power and area requirements. While there are multiple approaches to implement SSD processing capabilities, most approaches have drawbacks that undermine the desired results. Designers of storage devices need solutions that enable them to efficiently achieve their performance goals while consuming minimal power. Because they are configurable and extensible, Synopsys’ DesignWare® ARC® processors offer the high levels of performance efficiency that are required for SSD applications. The DesignWare ARC processors enable engineers to create state-of-the-art SSD designs that are adaptable, scalable, and can be customized to deliver optimal performance across the full range of SSD application requirements.