How Systems of Chips Take Us From Smart to Smarter

Shankar Krishnamoorthy

Jan 19, 2023 / 4 min read

Advanced robotics that can manufacture autonomous vehicles. Humanitarian mapping that addresses the impacts of environmental injustice, human rights violations, and global pandemics. Digital imagery for diabetic retinopathy screening. As these examples demonstrate, we’re already plenty smart. But how can we get even smarter, further enhancing our efficiency and quality of life?

Artificial intelligence (AI)-based technologies are evolving rapidly, pushing machine learning (ML) down into the tiniest of devices, with many of the advances unimaginable even a couple of decades ago. But progress doesn’t stand still; device makers, data center owners, and the raft of creators trying to make tomorrow happen are demanding leaps in computational performance. The question is: are today’s chips up to the job?

For the answer, one only needs to look at what’s happening inside the electronic design automation (EDA) industry. It’s a resounding “Yes!” Still, while engineering ingenuity is bringing to life an incredible array of transformational possibilities, EDA experts are working furiously in the background to overcome substantial technological challenges. In this blog post, I’ll outline what needs to be done for the semiconductor and system design industry to continue driving innovation over the next decade.

AI Drives Demand for a New Chip Architecture

Back in 2012, a high-end, off-the-shelf desktop graphics card might boast 1.6 TeraFLOPS of computing power to accelerate the convolutional neural networks (CNNs) that were making their way into the industry’s consciousness. Now, we’re heading into ExaFLOPS territory with ML accelerators and super-powerful AI processors with hundreds of thousands of AI-optimized cores tackling large language models (LLMs). These very large transformer neural networks, covering hundreds of billions of parameters, can be trained to write copy, answer questions, translate languages, and more. They’re also sparking the demand for more domain-specific architectures and highlighting how the co-optimization of software and hardware is critical for delivering the future of scalable AI systems.

Indeed, given the rapid progress of ML models, nothing short of a dramatic improvement in the underlying hardware will do. From generation to generation, Moore’s law has reliably contributed to substantial performance gains and power reductions. But in the AI generation, where performance must double every six months to keep pace, Moore’s law has fallen behind, especially when it comes to handling LLMs.
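To put that mismatch in rough numbers, here is a small back-of-the-envelope sketch. The six-month doubling period comes from the paragraph above; the roughly two-year Moore’s-law cadence is an assumption for illustration, not a figure from this post:

```python
# Compare compounded growth of AI compute demand (doubling every 6 months,
# per the article) against a transistor-scaling cadence (assumed here to
# double roughly every 24 months) over the same five-year horizon.

def growth_factor(months: float, doubling_period_months: float) -> float:
    """Compound growth: 2 raised to (elapsed time / doubling period)."""
    return 2 ** (months / doubling_period_months)

horizon = 60  # five years, in months
ai_demand = growth_factor(horizon, 6)    # 2**10 = 1024x
scaling = growth_factor(horizon, 24)     # 2**2.5, a bit under 6x

gap = ai_demand / scaling
print(f"Demand grows {ai_demand:.0f}x, scaling delivers ~{scaling:.1f}x; "
      f"gap of ~{gap:.0f}x must come from elsewhere")
```

Under these assumptions, demand compounds to roughly a thousandfold in five years while scaling delivers well under tenfold, and that two-orders-of-magnitude gap is what new architectures have to close.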

As engineers strive to extract more benefits from Moore’s law, the chip design industry is hitting multiple walls:

  • The processing wall, which hampers the scaling of training FLOPS
  • The memory wall, as parameter count far outpaces local memory scaling
  • The bandwidth wall, as compute throughput far outpaces memory and interconnect bandwidth

Today’s trends are leading us to tomorrow’s challenges—and opportunities. As we approach the reticle limits of manufacturing, density scaling is projected to slow down as costs increase. Moving to larger die sizes isn’t the answer from a yield and cost standpoint. I/O limitations are creating another stumbling block, with only modest improvements in the die-to-die interconnect pitch over recent years. However, high-density integration and packaging advances, including 3D-stacking technologies, are helping to transcend the technical barriers and paving the way for new silicon-to-system design architectures to take the electronics industry through the next decade of innovation.

Why Systems of Chips Are the Answer

The system on chip (SoC) needs to evolve to systems of chips—namely, highly heterogeneous multi-die systems. Nothing less than high-density systems consisting of trillions of angstrom-scale transistors will do in an age of AI workloads. By 2030, a typical system for compute-intensive applications will co-locate multiple dies (some stacked on top of each other), compute, and memory within the same package. As yielded costs rise with advanced nodes, this strategy allows design teams to determine, subsystem by subsystem, which process technologies to leverage for each function to meet their overall system performance and cost targets.
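The subsystem-by-subsystem selection described above can be sketched as a small optimization: for each function, pick the cheapest process node that still meets its performance requirement. This is an illustrative sketch only; the node names, relative costs, and performance figures below are invented for the example, not data from this post:

```python
# Toy model: choose a process node per subsystem to meet performance
# targets at minimum cost. All numbers are hypothetical.

NODES = {
    "advanced":   {"cost": 3.0, "perf": 1.8},  # leading-edge node
    "mainstream": {"cost": 2.0, "perf": 1.4},
    "mature":     {"cost": 1.0, "perf": 1.0},  # cheapest, lowest perf
}

def pick_node(required_perf: float) -> str:
    """Return the cheapest node whose relative performance meets the target."""
    feasible = [(spec["cost"], name) for name, spec in NODES.items()
                if spec["perf"] >= required_perf]
    return min(feasible)[1]  # lowest cost among feasible nodes

# Hypothetical subsystems and their relative performance requirements.
subsystems = {"ai_accelerator": 1.7, "io_die": 1.0, "analog": 1.0}
plan = {name: pick_node(req) for name, req in subsystems.items()}
```

In this sketch only the AI accelerator is forced onto the expensive leading-edge node, while I/O and analog stay on the cheap mature node—the same economic logic that motivates disaggregating a monolithic SoC into a multi-die system.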

What does it take to build angstrom-level, trillion-transistor designs? At the angstrom level, we’re talking about the complexity of the process technology, while the trillion-scale reference pertains to the scale of functionality. To address both, you first need to rethink the entire design methodology to build such systems while delivering optimal power, performance, and area (PPA) in a more cost-effective and efficient manner. And to enable that, you need strong, AI-driven, hyperconverged technologies at the individual die level and at the full multi-die-system design level.

While this evolution in chip design is being driven by AI-based applications, along with the hyperscale data center and networking sectors, it’s clear that the use of AI itself will be integral in shaping the revamped design methodology for these multi-die systems. The integration of advanced intelligence into design and verification flows is fast becoming the way forward. The success of hyperconverged designs demands a convergent flow that fuses all the steps needed from RTL to GDSII, enhanced by intelligent search-space optimization and ML-driven big data design analytics.

A Holistic Approach to Tackle System Complexity

Examining the global trajectories in the semiconductor landscape, it’s clear that multi-die system design starts will grow substantially over the next few years. Yet, the process of designing multi-die systems is currently very disjointed. To enable a new era of system design, Synopsys is making significant investments in multi-die technologies. Our full-stack EDA approach with integrated, scalable, and flexible solutions spans architectural exploration through design, analysis, and signoff, enabling multi-die/package co-design. Our multi-die solutions for test, verification, and silicon lifecycle management feature intelligence to drive faster design closure at scale, enabling reliable, secure operation. And our broad IP portfolio includes elements that enable the high-bandwidth, low-latency connectivity that ties all the important pieces together.

The semiconductor landscape that has been dominated by monolithic SoCs is making way for trillion-transistor-scale designs. Building these multi-die systems requires comprehensive exploration, along with the capacity and scale to support all design styles. It’s a tall order, for sure, but one that companies like Synopsys are poised to fulfill as we continue helping designers define and deliver unique, market-shaping products.
