AI Hardware Accelerates Innovation

Stelios Diamantidis

Jan 06, 2021 / 4 min read

Table of Contents
AI Accelerators Support Data Centers and the Edge
Comprehensive AI Design Tools

While software has long been a driver of innovation in a variety of applications, hardware is fast becoming a core enabler in the artificial intelligence (AI) world. Facial recognition, self-driving cars, virtual assistants, and many other applications rely on AI hardware, a market projected to reach $65 billion by 2025.

Why has hardware become such a dominant force in the AI space? It comes down to the need for parallelized computing systems, like neural networks, that can process massive amounts of data and train themselves iteratively. This is the design-by-optimization paradigm, and traditional architectures that execute software were not designed for this.

In this environment marked by voluminous amounts of data, hardware systems such as AI accelerators take center stage. As a high-performance parallel computational machine, an AI accelerator is built to efficiently process AI workloads, such as neural networks. AI accelerators contribute a number of benefits:

Significantly better energy efficiency compared to general-purpose compute machines
Low latency of computation to enable real-time applications
Scalability, with performance gains that can even scale linearly with the number of cores used
Heterogeneous architecture, which allows a given system to accommodate multiple specialized processors for specific tasks
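
Whether performance really scales linearly with core count depends on how much of a workload can run in parallel. The sketch below uses Amdahl's law, a standard textbook model that this article does not itself invoke, to show the difference between a fully parallel workload and one with a small serial portion. The function name `speedup` and the 95% parallel fraction are illustrative assumptions, not measurements of any accelerator.

```python
def speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: ideal speedup on `cores` cores when only
    `parallel_fraction` of the work can be parallelized."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for cores in (1, 8, 64, 512):
        fully_parallel = speedup(1.0, cores)    # linear scaling
        mostly_parallel = speedup(0.95, cores)  # 5% serial work (assumed)
        print(f"{cores:4d} cores: {fully_parallel:7.1f}x vs {mostly_parallel:5.1f}x")
```

Under this model, linear scaling only appears when essentially all of the work is parallel, which is why workloads such as neural networks are a good match for many-core accelerators.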

AI Accelerators Support Data Centers and the Edge

AI accelerators operate in two key realms: data centers and the edge. Today’s data centers, particularly hyperscale data centers that may support thousands of physical servers and millions of virtual machines, demand massively scalable compute architectures. This has prompted some in the chip industry to go big in the name of accelerating AI workloads. For example, Cerebras has created the Wafer-Scale Engine (WSE) for its Cerebras CS-1 deep-learning system. At 46,225 mm² with 1.2 trillion transistors and 400,000 AI-optimized cores, the WSE is the biggest chip built so far. By providing more compute, memory, and communication bandwidth, the WSE can support AI research at speeds and scale that were previously impossible.

At the other end of the spectrum is the edge, where real estate for hardware is limited and energy efficiency is essential. Here, edge SoCs with integrated AI accelerator IP can quickly deliver the intelligence needed to support applications such as interactive programs that run on smartphones or robotics in automated factories. Given the variety of applications that place intelligence at the edge, the AI accelerators that support them must be optimized for characteristics such as real-time computational latency, ultra-high energy efficiency, fail-safe operation, and high reliability.
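
For a sense of scale, the WSE figures quoted in this section (die area, transistor count, core count) can be turned into rough averages. This is a back-of-the-envelope sketch only: the variable names are ours, and the per-core figure simply divides all transistors by the core count, so it also counts on-chip memory and interconnect rather than core logic alone.

```python
# Figures as quoted in the article for the Cerebras WSE.
WSE_AREA_MM2 = 46_225
WSE_TRANSISTORS = 1_200_000_000_000  # 1.2 trillion
WSE_CORES = 400_000

# Average transistor density across the whole wafer-scale die.
transistors_per_mm2 = WSE_TRANSISTORS / WSE_AREA_MM2

# Naive average: includes memory and interconnect, not just core logic.
transistors_per_core = WSE_TRANSISTORS / WSE_CORES

# Average silicon area available per core, same caveat as above.
area_per_core_mm2 = WSE_AREA_MM2 / WSE_CORES

if __name__ == "__main__":
    print(f"~{transistors_per_mm2 / 1e6:.1f} million transistors per mm^2")
    print(f"~{transistors_per_core / 1e6:.1f} million transistors per core")
    print(f"~{area_per_core_mm2:.3f} mm^2 per core")
```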

Not every AI application needs a chip as large as the WSE. Other types of hardware AI accelerators include:

Graphics processing units (GPUs) with temporal neural network processing
Spatial accelerators like Google’s Tensor Processing Unit (TPU)
Coarse-grain reconfigurable architecture (CGRA) systems like SambaNova’s DataScale
Massively multicore scalar processors with vector processing extensions

Each of these types of chips can be combined by the tens or the hundreds to form larger systems that can process large neural networks. For example, Google’s TPUs can be combined in pod configurations that deliver more than 100 petaFLOPS of processing power for training neural network models. Megatron, from the Applied Deep Learning Research team at NVIDIA, delivers an 8.3 billion parameter transformer language model for natural language processing, trained with 8-way model parallelism and 64-way data parallelism. Models at this scale drive demand for GPUs such as the NVIDIA A100, which delivers 312 teraFLOPS of FP16 compute power. Another emerging hardware type is the CGRA, which offers attractive tradeoffs between performance and energy efficiency on one hand and flexibility for programming different networks on the other.
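
The parallelism figures above imply simple arithmetic that is worth making explicit. The sketch below multiplies the model-parallel and data-parallel degrees to get a device count, and estimates how many devices of a given peak rating would be needed to reach an aggregate target. The helper names are hypothetical, the numbers are peak ratings quoted in this article rather than sustained throughput, and pairing the A100 rating with the 100 petaFLOPS pod figure is purely illustrative.

```python
import math

TERA = 10**12
PETA = 10**15


def total_devices(model_parallel: int, data_parallel: int) -> int:
    """Each data-parallel replica is itself split across
    `model_parallel` devices, so the degrees multiply."""
    return model_parallel * data_parallel


def devices_for_target(target_flops: int, per_device_flops: int) -> int:
    """Minimum device count whose summed peak rating meets the target
    (ignores communication overhead and utilization losses)."""
    return math.ceil(target_flops / per_device_flops)


if __name__ == "__main__":
    # 8-way model parallelism x 64-way data parallelism, as quoted above.
    print(total_devices(8, 64), "devices for the quoted Megatron setup")

    # Devices rated at 312 teraFLOPS needed to reach 100 petaFLOPS peak.
    print(devices_for_target(100 * PETA, 312 * TERA), "devices for 100 PFLOPS")
```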

In this discussion of AI hardware, one cannot neglect the software stack that enables system-level performance and ensures that the AI hardware is fully utilized. Open-source software platforms like TensorFlow provide tools, libraries, and other resources for developers to easily build and deploy machine learning applications. Machine learning compilers, such as Facebook Glow, are emerging to bridge the high-level software frameworks and the different AI accelerators.

Comprehensive AI Design Tools

While hardware has become a critical component in AI applications, designing these components continues to be uniquely challenging, especially as the cloud and the edge push the power, performance, and area (PPA) limits of current silicon technologies. For data centers, hardware designs are marked by multiple levels of physical hierarchy, locally synchronous and globally asynchronous architectures, massive dimensions, and fragmented floorplans. At the edge, AI designs must be able to handle hundreds of design corners, ultra-low power requirements, heterogeneous integration, and extreme variability.

By offering the industry’s most comprehensive AI design portfolio, Synopsys can help AI hardware designers overcome some of these challenges. Our products run the gamut from IP for edge devices, to the ZeBu® Server 4 emulation system for fast bring-up of complex workloads, to the Fusion Design Platform for full-flow, AI-enhanced quality-of-results (QoR) and time-to-results (TTR) for IC design. Synopsys has also introduced DSO.ai™ (Design Space Optimization AI), the industry’s first autonomous AI application for chip design. DSO.ai searches for optimization targets in very large solution spaces of chip design. It automates less consequential decisions in design workflows and can thus substantially accelerate the design of specialized AI accelerators.
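
To make the idea of searching a design space concrete, here is a deliberately tiny sketch. It does not reflect how DSO.ai works internally, which this article does not describe: the knob names, their values, the cost function, and the use of plain random search are all hypothetical stand-ins chosen for illustration. A real flow would evaluate each candidate by running design tools, and a real design space is vastly larger.

```python
import random

# Hypothetical design knobs; names and values are made up for illustration.
DESIGN_SPACE = {
    "target_clock_mhz": [800, 1000, 1200, 1400],
    "placement_effort": ["low", "medium", "high"],
    "vt_mix": ["low_vt", "mixed", "high_vt"],
    "utilization_pct": [60, 70, 80],
}


def toy_cost(cfg):
    """Made-up PPA score (lower is better); stands in for a real tool run."""
    power = cfg["target_clock_mhz"] / 1000 * {
        "low_vt": 1.4, "mixed": 1.0, "high_vt": 0.8}[cfg["vt_mix"]]
    delay = 1000 / cfg["target_clock_mhz"] * {
        "low": 1.2, "medium": 1.0, "high": 0.9}[cfg["placement_effort"]]
    area = 100 / cfg["utilization_pct"]
    return power + delay + area


def space_size(space):
    """Number of distinct configurations (product of option counts)."""
    size = 1
    for options in space.values():
        size *= len(options)
    return size


def random_search(space, cost, trials, seed=0):
    """Sample `trials` random configurations and keep the cheapest one."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    best_cfg, best_cost = None, float("inf")
    for _ in range(trials):
        cfg = {name: rng.choice(options) for name, options in space.items()}
        candidate_cost = cost(cfg)
        if candidate_cost < best_cost:
            best_cfg, best_cost = cfg, candidate_cost
    return best_cfg, best_cost


if __name__ == "__main__":
    print("configurations in toy space:", space_size(DESIGN_SPACE))
    cfg, score = random_search(DESIGN_SPACE, toy_cost, trials=50)
    print("best found:", cfg, "score:", round(score, 3))
```

Even this four-knob toy space has over a hundred configurations; each added knob multiplies the count, which is why automated search becomes attractive as design spaces grow.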

As AI applications become more deeply integrated in our lives, hardware such as AI accelerators will continue to be critical to enable the real-time responses that make intelligent devices and systems valuable.