Creating an AI Accelerator Chip in 18 Months with Neuchips

Kevin Wei, Kinny Chen, Rich Collins

Aug 31, 2022 / 4 min read

Table of Contents

Direct-to-ASIC Approach for Data Center Inference
Reducing Chip Development Time by More than One Year

AI-powered recommendation applications are opening up new avenues to enhance the customer experience. With this technology, online stores can highlight other items to add to digital shopping carts, digital music services can suggest songs based on tunes already in the rotation, and social media channels can offer up content that might fit the user’s interests. When these systems work seamlessly and deliver accurate suggestions, they can also bring more dollars to the bottom line. However, a significant amount of challenging engineering work goes on behind the scenes to produce accurate recommendations.

AI accelerators are a critical part of the technology stack for recommendation systems. Their speed and energy efficiency, as measured in inferences per Joule of energy, are key to their prediction accuracy. In 2019, Meta (then Facebook) called on the industry to work on hardware acceleration for recommendation systems, based on its open-source deep learning recommendation model (DLRM). That call to action inspired the engineering team at Neuchips Inc. to rally around the problem of providing increased recommender model capacity that scales in an Open Compute Project (OCP) form factor. In the race to meet Meta’s request, the young company announced this summer that it has taped out its first DLRM accelerator, the Neuchips RecAccel™-N3000, in Taiwan.

Designed for data center recommendation models, the RecAccel™-N3000 has achieved one million DLRM inferences per Joule of energy (which translates into 20 million inferences per second per 20-Watt chip). The AI accelerator, developed with support and EDA tools from Synopsys and other semiconductor industry leaders, will be manufactured on TSMC’s 7nm process, with samples scheduled to be ready at the end of 2022.
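The conversion between the two figures quoted above follows from the definition of the watt as one joule per second. Here is a minimal sketch of that arithmetic. The function name is illustrative and does not come from Neuchips or Synopsys.

```python
def inferences_per_second(inferences_per_joule: float, power_watts: float) -> float:
    """Throughput implied by an energy efficiency at a given power draw.

    1 W = 1 J/s, so (inferences / J) * (J / s) = inferences / s.
    """
    return inferences_per_joule * power_watts


# Figures quoted in the article: one million inferences per joule, 20-watt chip.
throughput = inferences_per_second(1_000_000, 20)
print(f"{throughput:,.0f} inferences per second")
```

The same relation can be read in reverse: dividing a measured throughput by the power draw gives the efficiency in inferences per joule.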

In this blog post, we’ll provide more details about how Neuchips, with a team of about 30 engineers, was able to tape out its 400mm² AI chip in just 18 months, a process that would typically require more than 100 engineers over the course of 3 to 4 years. Another opportunity to learn more about Neuchips comes during its presentation, “Design of a High-Efficiency Accelerator for Full-Scale Deep Learning Recommendation Models (DLRM) in the Datacenter,” at the upcoming ARC® Processor Summit 2022 on Thursday, September 8, at the Santa Clara Marriott. The company’s session is scheduled from 2:35 p.m. to 3:20 p.m. PDT.

Direct-to-ASIC Approach for Data Center Inference

AI recommendation systems, especially DLRMs, are the dominant machine learning application when it comes to cloud resource usage. Novel adaptations of DLRMs are generating more useful predictions, while requiring more compute capacity within fixed energy and space constraints. Neuchips is pioneering a unique “direct-to-ASIC” engineering approach that accelerates software with a purpose-built, domain-specific AI accelerator plus co-designed compiler and runtime software. In the company’s asynchronous, heterogeneous dataflow architecture, each type of IP and processor is carefully tailored to optimize a component of the DLRM logical architecture. The configurable Synopsys ARC® processors, with their low power consumption and high performance, play an integral role in the groundbreaking performance of the RecAccel™-N3000.

Other features of the RecAccel™-N3000 include:

160MB on-die SRAM
4×64 LPDDR5 with inline error correction code (ECC)
Up to 128GB on-card DRAM
Up to 16 lanes of PCI Express® (PCIe®) 3.0, 4.0, and 5.0
Embedded secure hardware root-of-trust module

Striving to get to market first, Neuchips sought support, design and verification tools, and IP that could help the company accelerate its design cycle. It found what it needed through the AI Chip Design Lab, a joint effort between Synopsys and the Industrial Technology Research Institute (ITRI) in Taiwan. Many on the team were already familiar with Synopsys technologies, which made it an easy decision to collaborate with Synopsys on the ambitious project.

The AI Chip Design Lab is located at ITRI headquarters in Hsinchu, Taiwan. It receives support from the Technology Development Programs of the Department of Industrial Technology (DoIT) and the Ministry of Economic Affairs (MOEA) in Taiwan. The lab aims to help the country’s semiconductor industry advance through access to the latest design tools and design and verification services. One of the key offerings of the AI Chip Design Lab is a Synopsys system-level solution based on the ARC AI Reference Design Platform, spanning architecture design to virtual prototyping and system verification. The design platform is intended to help lower the barrier to entry into AI and to shorten design cycles.

Reducing Chip Development Time by More than One Year

Because of their unique characteristics, DLRMs can be difficult to accelerate with general-purpose AI accelerators. Neuchips developed its RecAccel™-N3000 with tailored hardware IP that accelerates embedding tables, matrix multiplication, and feature interaction. Working with Synopsys to implement early hardware/software co-development enabled by the ARC AI Reference Design Platform, Neuchips was able to save more than one year in chip development time. With the design platform, the team was able to develop and verify both the PCIe 5.0 subsystem and the LPDDR5 subsystem of the RecAccel™-N3000 early and then integrate them into the whole chip. The Synopsys ZeBu® Server 4 emulation system in the cloud was used to verify the subsystems as well as the entire RecAccel™-N3000.
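To make the three accelerated stages concrete, here is a minimal pure-Python sketch of a DLRM-style forward pass, loosely following the publicly described structure of the open-source model: embedding lookups for sparse features, a dense layer (matrix multiplication) for continuous features, and pairwise dot-product feature interaction. It is not Neuchips’ implementation. All function names, sizes, and random weights are assumptions chosen for illustration.

```python
import math
import random


def dense(x, weights, bias):
    """One fully connected layer: a matrix-vector multiplication plus bias."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def embedding_lookup(table, indices):
    """Sum-pool the rows of one embedding table selected by sparse indices."""
    dim = len(table[0])
    return [sum(table[i][d] for i in indices) for d in range(dim)]


def feature_interaction(vectors):
    """Dot product of every distinct pair of feature vectors."""
    return [sum(a * b for a, b in zip(vectors[i], vectors[j]))
            for i in range(len(vectors)) for j in range(i + 1, len(vectors))]


def dlrm_forward(dense_x, sparse_idx, bottom, tables, top):
    dense_vec = relu(dense(dense_x, *bottom))                 # matrix multiplication
    emb = [embedding_lookup(t, idx) for t, idx in zip(tables, sparse_idx)]
    interactions = feature_interaction([dense_vec] + emb)     # feature interaction
    logit = dense(dense_vec + interactions, *top)[0]
    return 1.0 / (1.0 + math.exp(-logit))                     # sigmoid -> probability


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
DIM = 4                                                  # hypothetical embedding width
bottom = (rand_matrix(DIM, 3), [0.0] * DIM)              # 3 dense features -> DIM
tables = [rand_matrix(10, DIM), rand_matrix(20, DIM)]    # two small embedding tables
top = (rand_matrix(1, DIM + 3), [0.0])                   # 3 pairs among 3 vectors

p = dlrm_forward([0.2, 1.5, -0.3], [[1, 7], [4]], bottom, tables, top)
print(f"predicted click probability: {p:.3f}")
```

In production models the embedding tables are far larger than the dense layers, so the lookup stage tends to be bound by memory capacity and bandwidth while the dense layers are bound by compute. That mix is one reason a single general-purpose engine is a poor fit for both.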

The RecAccel™-N3000 leverages an array of Synopsys IP blocks, including:

Synopsys ARC HS48 Processor
Synopsys ARC EV72 Processor
Synopsys Interface IP for AMBA (Advanced High-Performance Bus, Advanced Peripheral Bus), LPDDR5, and PCIe
Synopsys Memory Compilers with advanced power management features
Synopsys Hardware Secure Modules with Root of Trust to secure IT equipment in the data center by ensuring boot code integrity and device authentication

Using silicon-proven Synopsys IP helped the Neuchips team reduce integration risks and contributed to a shorter design cycle. Synopsys application engineers also supported Neuchips in optimizing the code for its cloud-based chip design, configuring the IP, and running simulation and verification on the FPGA-based ZeBu Server 4 system, which cut full ASIC RTL simulations from two weeks to about 20 minutes.
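As a rough back-of-the-envelope check of what that reduction means, the sketch below converts both durations to minutes. It assumes that “two weeks” refers to continuous wall-clock time, which the article does not state. Both inputs are approximate, so the result is only an order-of-magnitude estimate.

```python
MINUTES_PER_WEEK = 7 * 24 * 60

before_minutes = 2 * MINUTES_PER_WEEK   # "two weeks", assumed continuous wall-clock time
after_minutes = 20                      # "about 20 minutes"

speedup = before_minutes / after_minutes
print(f"roughly {speedup:,.0f}x faster")
```

Under that assumption the turnaround improves by about three orders of magnitude.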

Other design and verification tools that played a part in the development of the RecAccel™-N3000 include Synopsys Design Compiler RTL synthesis solution, Synopsys VCS® functional verification solution, Synopsys SpyGlass® static and formal verification platform, Synopsys Verdi® automated debug system, Synopsys Formality® equivalence checking, Synopsys PrimeTime® static timing analysis tool, Synopsys PrimePower RTL to signoff power analysis tool, and Synopsys IC Compiler™ II place-and-route solution.

Summary

With recommendation systems becoming both more prevalent and more insightful in our digital world, Neuchips’ RecAccel™-N3000 comes at a good time. By accelerating recommendation inference for data centers, the high-performance, energy-efficient, and scalable AI platform is poised to help a variety of industries personalize the customer experience online. Working closely with Synopsys, ITRI, and others in the Taiwan semiconductor ecosystem, Neuchips Inc. has achieved the fast time-to-market needed to get a head start in the race to deliver impactful AI solutions.
