AI-powered recommendation applications are opening up new avenues to enhance the customer experience. With this technology, online stores can highlight other items to add to digital shopping carts, digital music services can suggest songs based on tunes already in the rotation, and social media channels can offer up content that might fit the user’s interests. When these systems work seamlessly and deliver accurate suggestions, they can also bring more dollars to the bottom line. However, a significant amount of challenging engineering work goes on behind the scenes to produce accurate recommendations.
AI accelerators are a critical part of the technology stack for recommendation systems. Their speed and energy efficiency, as measured in inferences per Joule of energy, are key to their prediction accuracy. In 2019, Meta (then Facebook) called on the industry to work on hardware acceleration for recommendation system, based on its open-source deep learning recommendation model (DLRM). That call to action inspired the engineering team at Neuchips Inc. to rally around this problem of providing increased recommender model capacity that scales in an Open Compute Project (OCP) form factor. In the race to meet Meta’s request, the young company announced this summer that it has taped out its first DLRM accelerator, the Neuchips RecAccel™-N3000, in Taiwan.
Designed for data center recommendation models, the RecAccel™-N3000 has achieved one million DLRM inferences per Joule of energy (which translates into 20 million inferences per second per 20-Watt chip). The AI accelerator, developed with support and EDA tools from Synopsys and other semiconductor industry leaders, will be manufactured on TSMC’s 7nm process, with the sample plan scheduled to be ready at the end of 2022.
In this blog post, we’ll provide more details about how Neuchips, with a team of about 30 engineers, was able to tape out its 400mm2 AI chip in just 18 months, a process that would typically require more than 100 engineers over the course of 3 to 4 years. Another opportunity to learn more about Neuchips comes during its presentation, “Design of a High-Efficiency Accelerator for Full-Scale Deep Learning Recommendation Models (DLRM) in the Datacenter,” at the upcoming ARC® Processor Summit 2022 on Thursday, September 8, at the Santa Clara Marriott. The company’s session is scheduled from 2:35p.m. to 3:20p.m. PDT.