Any lag in performance can lead to subpar results across a wide range of applications, and bottlenecks arise at multiple levels within a chip. Consider neural network processing. Neural networks power deep-learning algorithms that recognize patterns and correlations in raw data, clustering and classifying that data and learning from it for continuous improvement. These algorithms benefit from large numbers of parallel processors: the more processors that can be placed on a piece of silicon, the faster the chip can run these massive workloads. But chip designers must address multiple bottlenecks to achieve the PPA needed for SoCs supporting these types of applications:
- At the transistor level, there’s a set of bottlenecks around the interconnects that tie the transistors together.
- At the processor level, there’s a tradeoff between the complexity and number of processors and the amount of interconnect required to connect them, along with the need to move data swiftly between processing elements and system memory.
- At the memory level, there’s a growing gap because on-chip memory isn’t scaling as quickly as standard-cell logic. As a result, one can only extract so much benefit from increasingly smaller logic if the memory footprint can’t shrink along with it (a rough illustration follows this list).
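A rough way to see this last effect: if logic area shrinks faster than SRAM area at each node, memory consumes a growing share of the die. The scaling factors in this sketch are illustrative assumptions, not figures for any particular process:

```python
# Illustrative sketch of the logic/memory scaling gap. The per-node
# scaling factors below are assumptions chosen for illustration only.

logic_area = 1.0  # normalized logic area at the starting node
sram_area = 1.0   # normalized on-chip SRAM area at the starting node

LOGIC_SHRINK = 0.55  # assumed logic area scaling per node
SRAM_SHRINK = 0.80   # assumed SRAM area scaling per node (slower)

for node in range(1, 4):
    logic_area *= LOGIC_SHRINK
    sram_area *= SRAM_SHRINK
    sram_share = sram_area / (logic_area + sram_area)
    print(f"node {node}: SRAM now {sram_share:.0%} of the combined area")
```

Starting from an even 50/50 split, memory climbs past 75% of the combined area within three nodes under these assumptions, which is why logic shrinks alone deliver diminishing returns.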
At some point, it might seem easier to build bigger processors that are easier to program and can do more things. However, this approach adds complexity to designing and manufacturing these larger devices efficiently, while simultaneously reducing the amount of achievable parallelism and increasing power consumption for simple tasks.
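Amdahl’s law captures why this tradeoff matters: the speedup available from parallel hardware is capped by whatever fraction of the workload remains serial. The serial fractions below are illustrative assumptions:

```python
# Amdahl's-law sketch of the big-core vs. many-small-cores tradeoff.
# The serial fractions are illustrative assumptions, not measurements.

def amdahl_speedup(serial_fraction: float, n_cores: int) -> float:
    """Ideal speedup of a workload on n_cores, given its serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cores)

# A highly parallel neural-network workload rewards many small cores...
print(amdahl_speedup(serial_fraction=0.01, n_cores=1024))  # ~91x
# ...while a half-serial task sees almost no benefit beyond a few cores.
print(amdahl_speedup(serial_fraction=0.50, n_cores=1024))  # ~2x
```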
Angstrom-scale processes are being developed through a massive research and development effort spanning a large number of technologies across the entire design chain, from core process definition to chip-design building blocks to the suite of design automation tools and flows that enable chip design. This is made possible by:
- Augmenting traditional lithography-enabled dimensional scaling with new transistor structures
- Technologies to build digital twins of candidate transistor structures and process definitions, in order to evaluate and select the most promising ones
- New logic library and memory architectures that are building blocks of chip designs
- New algorithms in electronic design automation (EDA) tools that enable designers to implement and verify chips with exponentially larger transistor counts built from these blocks
Advanced lithography tools, such as high-numerical-aperture (High-NA) extreme ultraviolet (EUV) lithography systems currently under development and expected to be delivered to fabs in 2025, will enable the printing of smaller structures. Meanwhile, gate-all-around (GAA) transistor structures allow multiple channels to be stacked on top of one another to increase chip density.
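The resolution benefit of a higher numerical aperture follows from the Rayleigh criterion, CD = k1 · λ / NA. A quick sketch using the published NA values for current and High-NA EUV systems, with an assumed, typical k1 process factor:

```python
# Rayleigh-criterion estimate of minimum printable feature size,
# CD = k1 * wavelength / NA. The k1 value is an assumed, typical
# process factor; real resolution depends on the full process.

EUV_WAVELENGTH_NM = 13.5  # EUV source wavelength
K1 = 0.3                  # assumed process factor

for label, na in [("standard EUV (NA 0.33)", 0.33),
                  ("High-NA EUV (NA 0.55)", 0.55)]:
    cd = K1 * EUV_WAVELENGTH_NM / na
    print(f"{label}: ~{cd:.1f} nm minimum feature")
```

Under these assumptions, raising NA from 0.33 to 0.55 cuts the minimum printable feature from roughly 12 nm to roughly 7 nm.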
Moving power distribution in angstrom-scale architectures from above the transistors to beneath them is called backside power distribution, and it will enable GAA structures to achieve their full density potential. Placing power delivery on the backside lets designers shrink the logic cell height because the cells no longer need wide wires, called power rails, at their top and bottom to carry power. It also frees up significant wiring resources on the layers above the cells, reserving the front side of the chip for signal routing and preventing the interconnects from becoming a bottleneck. GAA may also enable the memory scaling that is no longer possible with FinFET structures, while reducing leakage current and increasing drive current for better overall chip performance. A more complex evolution of GAA, the complementary FET (CFET) consists of vertically stacked transistors that deliver significant area and performance benefits, especially for memories. Targeted at designs at 2.5nm and beyond, CFETs are anticipated to play an integral role in the angstrom era.
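To put a rough number on the cell-height savings described above: standard-cell height is often expressed as a count of metal routing tracks times the metal pitch, and moving the power rails to the backside frees tracks. The track counts and pitch here are illustrative assumptions, not figures from any real cell library:

```python
# Back-of-the-envelope look at how removing frontside power rails can
# shrink standard-cell height. Track counts and metal pitch are
# illustrative assumptions, not figures for any real library.

METAL_PITCH_NM = 30          # assumed minimum metal pitch
TRACKS_WITH_RAILS = 6        # signal tracks plus top/bottom power rails
TRACKS_BACKSIDE_POWER = 5    # power moved to the backside frees a track

h_front = TRACKS_WITH_RAILS * METAL_PITCH_NM
h_back = TRACKS_BACKSIDE_POWER * METAL_PITCH_NM
print(f"cell height: {h_front} nm -> {h_back} nm "
      f"({1 - h_back / h_front:.0%} shorter)")
```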
Another innovation that could go hand-in-hand with angstrom-scale dies is the multi-die system: multiple dies, often referred to as chiplets, stacked on top of one another and/or connected through an interposer and integrated in a single package. This architecture can be created through disaggregation, the partitioning of a large die into smaller dies for better system yield and cost, or by assembling dies from different process technologies for optimal system functionality and performance. Compared to a large, monolithic SoC, a multi-die system enables accelerated scaling of system functionality, along with benefits such as reduced risk and time to market, lower system power, and the ability to rapidly create new product variants. Angstrom-scale dies could play a central role in a multi-die system, supplying the processing prowess needed for bandwidth-intensive applications, while dies built on older nodes handle less taxing chip functions.
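The yield argument for disaggregation can be sketched with a simple Poisson defect model; the defect density and die areas below are illustrative assumptions:

```python
# Why disaggregation can improve yield and cost: a simple Poisson defect
# model, yield = exp(-area * defect_density). The defect density and die
# areas are illustrative assumptions, not data for any real process.

import math

D0 = 0.002  # assumed random defect density, defects per mm^2

def die_yield(area_mm2: float) -> float:
    """Fraction of dies of the given area that are defect-free."""
    return math.exp(-area_mm2 * D0)

print(f"800 mm^2 monolithic SoC yield: {die_yield(800):.0%}")  # ~20%
print(f"200 mm^2 chiplet yield:        {die_yield(200):.0%}")  # ~67%

# Because chiplets are tested before assembly, a defect scraps one small
# die rather than an entire large SoC, so far less good silicon is lost.
```

Under these assumptions, four out of five large monolithic dies would be scrapped, while two out of three small chiplets survive, and known-good-die testing means only the defective chiplets are discarded.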