AI algorithms are growing more complex and more specific to the application at hand. So many variables affect how well a processor performs for a given application that an apples-to-apples comparison is difficult. A processor that handles a relatively simple algorithm well may not measure up when applied to a more complex one, leaving you with power and performance benchmark data that doesn’t accurately predict results in silicon.
Benchmarking an AI-enabled processor to run a convolutional neural network (CNN) involves many considerations. At a simple level, if you have a common neural network with the same data and coefficients, you can run it through your architecture to generate a performance result, typically an accuracy measurement. However, in a real-time, embedded system, you’ll also need to factor parameters such as power, area, latency, and bandwidth into your benchmarking for a more realistic picture. Understanding overall SoC performance also involves considering the chip’s process node, its clock speed, and any optimizations applied to the network (like compression and quantization). Since the purpose of benchmarking is either to compare two or more architectures or to verify that a given architecture will meet your application’s requirements, it’s very important to be clear about your system and its limitations.
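To make the point about power, area, and clock speed concrete, here is a minimal Python sketch of how raw throughput might be normalized before two architectures are compared. The `BenchmarkResult` fields, the `normalized_metrics` helper, and every number are hypothetical and chosen only for illustration; none of them come from a real processor or from any benchmark suite.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """One measured run of a network on one processor (all values hypothetical)."""
    name: str
    inferences_per_sec: float  # raw throughput
    power_watts: float         # average power during the run
    area_mm2: float            # silicon area of the accelerator
    clock_mhz: float           # clock frequency used for the run
    latency_ms: float          # time for a single inference


def normalized_metrics(r: BenchmarkResult) -> dict:
    """Divide throughput by power, area, and clock to expose efficiency."""
    return {
        "inf_per_sec_per_watt": r.inferences_per_sec / r.power_watts,
        "inf_per_sec_per_mm2": r.inferences_per_sec / r.area_mm2,
        "inf_per_sec_per_mhz": r.inferences_per_sec / r.clock_mhz,
        "latency_ms": r.latency_ms,
    }


# Made-up numbers, for illustration only.
chip_a = BenchmarkResult("A", 1000.0, 2.0, 4.0, 1000.0, 1.2)
chip_b = BenchmarkResult("B", 1500.0, 5.0, 10.0, 1500.0, 0.9)

for chip in (chip_a, chip_b):
    print(chip.name, normalized_metrics(chip))
```

In this made-up case, chip B wins on raw throughput and latency, while chip A delivers more inferences per watt and per square millimeter. The sketch does not normalize for process node, so a comparison like this is only fair when both chips are on the same node or when the difference is stated alongside the results.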
There’s currently no industry-standard neural network when it comes to AI hardware benchmarking, but the MLPerf benchmark suite comes close. Developed by MLCommons, an open engineering consortium, the MLPerf benchmarks are de facto industry-standard metrics that measure machine-learning performance and now encompass data sets and best practices. On the inference side, the consortium’s benchmark suites cover the data center, edge, mobile, and tiny categories.
One of the more commonly used neural networks in the MLPerf benchmarking suite is ResNet-50, a CNN that’s 50 layers deep and performs image classification. It can be used as a building block to create more advanced benchmarking neural networks. The MLPerf neural networks are a good starting point for evaluating the architectural efficiency of a given processor. Of course, every processor vendor is incentivized to fully optimize its neural network accelerator for MLPerf, which means that you won’t necessarily get a measurement of how good the vendor’s tools are if you go by MLPerf results alone. Tool quality is essential, since the tools must be able to map a neural network accurately in order to optimize it for a specific processor. If you use MLPerf as a starting point for benchmarking, choose some non-standard neural networks as well, and give your vendors a short turnaround to optimize them. The results will give you a better sense of how well their processors will perform.
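One way to put a standard and a non-standard network on the same footing is to look at how much of the accelerator’s peak multiply-accumulate (MAC) rate each one actually achieves. The Python sketch below assumes you know the MACs per inference for each network and the accelerator’s peak MAC rate. The `mac_utilization` helper, the network names, and all figures are hypothetical and do not describe any real processor or MLPerf result.

```python
def mac_utilization(macs_per_inference: float,
                    inferences_per_sec: float,
                    peak_macs_per_sec: float) -> float:
    """Fraction of the accelerator's peak MAC rate actually achieved."""
    achieved_macs_per_sec = macs_per_inference * inferences_per_sec
    return achieved_macs_per_sec / peak_macs_per_sec


# Hypothetical accelerator with a peak rate of 4 tera-MACs per second.
PEAK_MACS_PER_SEC = 4.0e12

# Hypothetical measurements: (MACs per inference, measured inferences per second).
runs = {
    "standard_network": (4.0e9, 800.0),  # heavily tuned by the vendor
    "custom_network": (2.0e9, 600.0),    # mapped by the tools on short notice
}

utilization = {
    name: mac_utilization(macs, ips, PEAK_MACS_PER_SEC)
    for name, (macs, ips) in runs.items()
}

for name, value in utilization.items():
    print(f"{name}: {value:.0%} of peak")
```

A large gap between the two figures, as in this made-up example, would suggest that the headline result owes a lot to network-specific tuning and that the tools may not map unfamiliar networks as efficiently.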