The Criticality of Performance per Watt Optimization for AI Chip Development
Chip developers are seeing an urgent rise in demand for compute capability, driven by AI workloads. That increase in compute requirements drives a corresponding increase in power consumption: a ChatGPT query, for example, consumes nearly 10 times as much power, on average, as a Google search. Power has traditionally been treated as a secondary constraint, with performance taking precedence during development, but it is no longer feasible to defer power optimization to the end of the design cycle. Performance per watt is now a critical metric for AI chips and chiplets and must be addressed throughout the development process; hyperscalers increasingly express it as tokens per watt.
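To make the metric concrete, performance per watt is simply a throughput-to-power ratio. The short sketch below uses hypothetical numbers (a sustained throughput of 12,000 tokens/s at 350 W average draw) purely to illustrate the arithmetic; these figures do not come from any particular chip.

```python
# Minimal sketch: "tokens per watt" as a throughput-to-power ratio.
# The inputs are assumed measurements, not vendor data.

def tokens_per_watt(throughput_tokens_per_s: float, avg_power_w: float) -> float:
    """Dividing tokens/s by watts (J/s) yields tokens per joule,
    the quantity usually quoted colloquially as tokens/watt."""
    return throughput_tokens_per_s / avg_power_w

# Hypothetical accelerator: 12,000 tokens/s at 350 W average draw.
print(f"{tokens_per_watt(12_000, 350):.1f} tokens per joule")
```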