Bringing Machine Learning from the Cloud to the Embedded Edge

Kavitha Prasad, VP of Business Development and Systems Applications

With zettabytes of data generated and pushed to the cloud, the costs of data management, connectivity, and storage are massive and growing. In addition, system latency in the cloud is difficult to address when distance is a factor. To overcome these cloud-based challenges, system designers are pushing more artificial intelligence (AI) and compute functions to the embedded edge.

Processing and analyzing data closer to the applications it serves is growing in popularity and opening new markets for robotics, autonomous systems, and other smart vision applications in the industrial, consumer, automotive, and aerospace sectors. While pushing machine learning (ML) processing to the edge reduces latency, it also comes with development and deployment challenges of its own.

System Development Challenges for Machine Learning at the Edge

To help edge AI system architects bring their visions to life, we offer a heterogeneous compute platform that balances power and performance with ease of use and time to market. In the ever-changing world of AI inference, architects also need the flexibility to add machine learning to legacy applications and to meet future demands by upgrading machine learning capabilities as the technology and application needs evolve. Our new Machine Learning SoC (MLSoC™) platform supports ML and traditional compute with high performance, low power, and a software-first philosophy to accelerate design velocity. Based on feedback from dozens of customers, we are building a software-centric, easy-to-use architecture that consistently performs 30x better than alternatives, as measured by frames per second per watt (FPS/W). We are working closely with customers to understand their applications and map them to our hardware.
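The FPS/W metric mentioned above can be made concrete with a small calculation. The sketch below is only an illustration of how such a figure is typically derived from a benchmark run: the function names are hypothetical, and the sample numbers are made-up placeholders, not measurements of the MLSoC or of any competing device.

```python
def fps_per_watt(frames: int, seconds: float, avg_power_watts: float) -> float:
    """Return frames per second per watt (equivalently, frames per joule)."""
    if seconds <= 0 or avg_power_watts <= 0:
        raise ValueError("seconds and avg_power_watts must be positive")
    fps = frames / seconds
    return fps / avg_power_watts


def efficiency_ratio(candidate_fps_w: float, baseline_fps_w: float) -> float:
    """Return how many times more efficient the candidate is than the baseline."""
    if baseline_fps_w <= 0:
        raise ValueError("baseline_fps_w must be positive")
    return candidate_fps_w / baseline_fps_w


if __name__ == "__main__":
    # Placeholder benchmark runs (hypothetical values, for illustration only).
    device_a = fps_per_watt(frames=1200, seconds=10.0, avg_power_watts=4.0)
    device_b = fps_per_watt(frames=600, seconds=10.0, avg_power_watts=20.0)
    print(f"Device A: {device_a} FPS/W")
    print(f"Device B: {device_b} FPS/W")
    print(f"A is {efficiency_ratio(device_a, device_b)}x more efficient than B")
```

Because FPS/W divides throughput by average power, a device can win on this metric either by processing more frames per second or by drawing less power for the same frame rate.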

By providing early access to our SDK, we are helping customers accelerate their time to market and deliver machine-learning-enabled products ahead of their competition. Supporting industry-standard machine learning frameworks like Tensor Virtual Machine (TVM), as well as OpenVX, OpenCV, and OpenCL, allows customers to focus on their applications and not on the hardware or the interfaces. In addition, a standard open-source approach makes the platform easier for designers to use and, in conjunction with the selected hardware, helps future-proof designs for next-generation demands.

Getting software into customers’ hands early is critical, though we also know that the selection of IP for the hardware is paramount to the success of our product. After gathering significant customer input to narrow down the list of IP to integrate onto our platform, we determined that we needed an IP vendor who could provide a robust and complete IP portfolio as well as verification and validation tools. We selected Synopsys as they offer the complete processor, interface, and security IP portfolio we need to address our customers’ challenges, along with tools such as the Fusion Design Platform to enable optimized implementation.

For example, the MLSoC platform offers up to 50 tera-operations per second (TOPS) of total performance at 10 TOPS per watt, enabling ML workloads at the edge that would traditionally require cloud-level performance. We selected the DesignWare® ARC® Embedded Vision Processor because its power/performance profile meets our requirements for computer vision processing (Figure 1). The DesignWare ARC EV7x Vision Processors’ heterogeneous multicore architecture includes up to four high-performance VPUs. Each EV7x VPU includes a 32-bit scalar unit and a 512-bit-wide vector DSP and can be configured for 8-, 16-, or 32-bit operations to perform simultaneous multiply-accumulates on different streams of data.
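The figures quoted above imply some simple arithmetic, sketched here under stated assumptions. The first helper assumes the 10 TOPS/W efficiency applies at the 50 TOPS peak, which the text does not state explicitly. The second counts how many elements of a given width fit in one 512-bit vector; it says nothing about how many multiply-accumulates the EV7x actually issues per cycle. Both function names are hypothetical.

```python
def implied_power_watts(peak_tops: float, tops_per_watt: float) -> float:
    """Power draw implied by a throughput figure and an efficiency figure."""
    if tops_per_watt <= 0:
        raise ValueError("tops_per_watt must be positive")
    return peak_tops / tops_per_watt


def vector_lanes(vector_bits: int, element_bits: int) -> int:
    """Number of elements of element_bits width that fit in one vector."""
    if element_bits <= 0 or vector_bits % element_bits != 0:
        raise ValueError("element_bits must evenly divide vector_bits")
    return vector_bits // element_bits


if __name__ == "__main__":
    # 50 TOPS at 10 TOPS/W (assumes the efficiency holds at peak throughput).
    print(f"Implied power: {implied_power_watts(50, 10)} W")

    # Elements per 512-bit vector for each supported operand width.
    for width in (8, 16, 32):
        print(f"{width}-bit operands: {vector_lanes(512, width)} per vector")
```

Under these assumptions the platform would draw about 5 W at peak, and narrower operands pack more elements into each vector (64, 32, and 16 for 8-, 16-, and 32-bit data), which is one reason lower-precision inference tends to raise throughput per watt.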

In addition to the processing function, we enhanced both power and performance with the selected DesignWare Security and Interface IP. Synopsys’ Security IP helps protect the data and algorithms on the system-on-chip (SoC), while MIPI CSI-2, Ethernet, PCI Express, and LPDDR IP provide high-speed camera, host processor, and memory connectivity at the lowest power.

Figure 1: MLSoC Architecture

"Our purpose-built, highly integrated MLSoC supports legacy compute along with industry-leading machine learning to deliver more than 30x better compute-power efficiency, compared to industry alternatives. We are delighted to collaborate with Synopsys towards our common goal to bring high-performance machine learning to the embedded edge."

Krishna Rangasayee, founder and CEO

Meeting Security and Safety Standards for the Future

Protecting the SoC’s data is one thing – protecting the lives of passengers riding in autonomous vehicles is a leap ahead. We are gearing up the MLSoC for applications requiring ISO 26262 certification by following the processes needed for compliance, including integrating the right type of IP. We are integrating the safety mechanisms needed for certification at the right time, as well as the security features required for trusted devices.

A Look Ahead

Our MLSoC is initially optimized for computer vision applications, which are central to endpoint AI use cases in robotics, security, and autonomous machines, though we have plans to expand to broader machine learning applications. Tapeout is planned for the middle of this year, and we are actively engaged with customers at all stages of their development cycle. Because our SDK is available in advance, customers will be able to quickly compile and deploy their systems once the silicon becomes available.

About the Author

Kavitha Prasad is the Vice President of Business Development and Systems Applications. Through close collaboration with customers and partners, Kavitha is responsible for defining the system-level architecture for the company's machine learning SoC. She is a technology leader with over 22 years of experience delivering multiple successful products in ASICs, SoCs, FPGAs, and servers across multiple process nodes. Prior to her current role, Kavitha was responsible for system solutions across embedded market segments for the Intel Platform Solutions Group, among other positions at Intel. Prior to Intel, she held multiple technology roles at Xilinx and Philips. Kavitha holds a Master of Science in Electrical and Electronics Engineering from San Jose State University.