ARC Processor Summit 2018 Proceedings

Your Embedded Edge Starts Here

Synopsys' ARC® Processor Summit 2018 offered 21 sessions focusing on the latest technologies and trends in embedded processor IP, software, programming tools and applications. Browse the presentations, or select a specific track: Automotive, IoT/Digital Home, or Mobile/Storage.

Keynote Address

Life on the "Edge"

Satyen Yadav, General Manager, IoT Ecosystem Development, Amazon Web Services 

Today, objects are being connected to the internet at an unprecedented rate. As a result, customers are collecting vast amounts of valuable IoT data that was previously unavailable. We are already seeing the benefits of applying machine learning models to process data at the source where it is generated: farmers predict crop yield, power companies predict energy demand, vehicles identify distracted drivers, and doctors deliver improved care with real-time insight from medical devices. The possibilities for applying intelligence at the edge are countless. However, processing and analyzing this vast amount of IoT data is not possible with traditional business intelligence tools. In this session, we will showcase how customers can use AWS’s IoT, artificial intelligence, and machine learning services to gain predictive insights and take intelligent, real-time actions on their IoT data, from the cloud to the edge.

Automotive Track

The Marriage of AI and Safety in Automotive SoCs 
Fergus Casey, Director, R&D, ARC Processors, Synopsys 

As the automotive industry looks beyond Level 2 (Driver Assist) designs, the race is on to deliver high-performance safety-critical autonomous vehicle components powered by the latest AI technology. AI techniques can provide increased accuracy for object and pedestrian detection, but these designs must still meet the ISO 26262 standard’s most stringent level of functional safety and fault coverage. In this presentation, we analyze an autonomous driving use-case emphasizing the need for the inseparable union of AI and safety. We will discuss how Synopsys achieves this marriage without significant impact on performance, power, or area compared to non-ASIL Ready processors. From an understanding of this use-case and requirements, we present an embedded SoC solution providing the highest level of safety without compromising AI performance.

From architecture through to tape-out, we provide an overview of the design, verification, and safety methodologies required for SoC safety certification.

Reducing the Total Cost of Ownership with Classic and Adaptive AUTOSAR 
Dheeraj Sharma, Product Expert, Elektrobit 

In this talk, we cover the key features of Classic and Adaptive AUTOSAR and how these features reduce the total cost of ownership. Participants should expect to walk away with a high-level overview of AUTOSAR and an understanding of how it provides value for their current and future automotive projects. We focus on the design, configuration, security, and safety aspects that provide the hallmark flexibility of the AUTOSAR platform. Finally, we discuss migration and reuse strategies that match the changing E/E architecture seen within the vehicles.

Personalize the In-Cabin Experience with Face Recognition and Inference of Driver Emotional States 
Vinay M K, Co-founder and Vice President, PathPartner 

Cars are more than just transportation – they provide entertainment, communication, convenience, and more. The ideal in-cabin experience is personalized to each occupant in the car to enhance aesthetics and entertainment as well as safety and automotive performance. Offering personalized experiences gives car manufacturers a way to differentiate their products from other options in the market. In this presentation, we will discuss how designers and system architects can enhance in-cabin personalization with face recognition, micro radar, iris scanning, and other sensor fusion technologies to implement, for example, driver drowsiness alerts. We will explain how driver and passenger recognition can be the key enabler for personalization targets, and how to provide this functionality with limited impact on power and area. Finally, we will explore the types of CNNs available for recognizing actions and expressions, and how recognition can provide semantic context.

Taming AI Using Convolutional Neural Networks with Compression and Pruning 
Bo Wu, Applications Engineer, Synopsys 

To implement AI applications at the edge, you need to move from training to inference on your embedded target. This transition comes with a new set of considerations to take into account. For the target system, you must consider factors such as performance, memory size, throughput, and bandwidth, as well as maintaining the accuracy of your graph. In this session, we will show how you can profile your graph and then achieve your design targets with techniques such as feature-map compression, graph acceleration and coefficient pruning.
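
As a rough illustration of the coefficient pruning mentioned in this abstract, the sketch below zeroes the smallest-magnitude weights of a layer until a target fraction is zero. Magnitude-based pruning is an assumption chosen for illustration; the abstract does not say which pruning criterion the session uses. The names `prune_by_magnitude` and `sparsity_of` are hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction set to zero.

    weights:  flat list of floats (one layer's coefficients, flattened)
    sparsity: fraction of coefficients to zero, between 0.0 and 1.0
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by ascending magnitude; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def sparsity_of(weights):
    """Fraction of coefficients that are exactly zero."""
    if not weights:
        return 0.0
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Illustrative coefficients only; not taken from any real network.
example = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.0, -0.9]
pruned = prune_by_magnitude(example, 0.5)
```

In practice, pruning is usually followed by re-checking accuracy on a validation set, since the abstract stresses maintaining the accuracy of the graph.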

Optimizing Deep Learning Perception Software for ADAS and Autonomous Driving 
Chungbin Heo, Optimization Engineer, StradVision, Inc. 

StradVision has developed SVNet, a deep-learning-based object detector for six object classes (Pedestrian, Car, Bus, Truck, Box Truck, Two-wheeler). SVNet is robust to bad weather and lighting conditions (e.g., rain, snow, fog, night), small object sizes (e.g., 32-pixel-high pedestrians, 20-pixel-high vehicles), and occlusion (up to 75%). In this presentation, I will describe the requirements of automotive OEMs and Tier-1 suppliers (for frontal camera, rear camera, and around view monitoring), the technical challenges, and how our solution addresses them with DesignWare EV6x Vision Processors and their dedicated CNN engine.

Functional Safety Certification - Your Advantage 
Gudrun Neumann, Software Manager & Team Lead for Functional Safety Software Team, SGS-TÜV Saar 

The ISO 26262 standard does not require certification of processes or products. However, certifications can ease your development cycle. In this presentation, we will describe the advantages of process, product and personal certifications using Synopsys processes and products as examples. We will also show how providing certificates to customers offers them valuable additional benefits.

Extending Control and DSP Performance for Automotive RADAR Applications 
Graham Wilson, Sr. Product Marketing Manager, ARC Processors, Synopsys 

This session will cover object detection and classification techniques using the 77GHz RADAR signal source for automotive advanced driver-assistance system (ADAS) applications. We will give an overview of the RADAR technology used in ADAS applications, including a review of the relevant signal processing, typical requirements, and design parameters, and discuss some of the design considerations and trade-offs of implementing the RADAR system. We will close with an example of an efficient RADAR signal processing chain (3D FFT, CFAR, clustering, and tracking) implemented on the ARC HS47D processor, where the ARC DSP core forms a data cube (3D FFT) to find the range, relative velocity, and direction of arrival (DoA) of objects. In this example, front-end FFT processing is implemented on dedicated hardware blocks, performing 3-stage FFT computation.
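
To make the CFAR stage named in this abstract more concrete, here is a minimal sketch of a one-dimensional cell-averaging CFAR detector. It is the textbook variant, chosen as an assumption; the abstract does not say which CFAR variant runs on the ARC HS47D. The function name `ca_cfar` and all parameter values are hypothetical.

```python
def ca_cfar(power, num_train, num_guard, scale):
    """Cell-averaging CFAR over a 1-D list of power values.

    For each cell under test, the noise level is estimated as the mean of
    `num_train` training cells on each side, skipping `num_guard` guard cells
    next to the cell under test. A detection is declared when the cell's
    power exceeds `scale` times that estimate.
    Returns the list of detected cell indices.
    """
    if num_train < 1 or num_guard < 0:
        raise ValueError("num_train must be >= 1 and num_guard >= 0")
    half = num_train + num_guard
    detections = []
    # Cells closer than `half` to either edge lack a full window and are skipped.
    for i in range(half, len(power) - half):
        left = power[i - half : i - num_guard]
        right = power[i + num_guard + 1 : i + half + 1]
        noise = (sum(left) + sum(right)) / (len(left) + len(right))
        if power[i] > scale * noise:
            detections.append(i)
    return detections


# Synthetic example: flat noise floor with one strong target at cell 10.
power = [1.0] * 20
power[10] = 30.0
detections = ca_cfar(power, num_train=3, num_guard=1, scale=5.0)
```

With this synthetic input, only cell 10 is reported. In a full chain, the same idea would be applied to cells of the range-Doppler data cube produced by the FFT stages, before clustering and tracking.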

IoT/Digital Home Track

Addressing the Challenges of Always-on IoT with Efficient Processors for Machine Learning 
Pieter van der Wolf, Principal Product Architect, Synopsys 

The ARC EM DSP processors are extremely well-suited for performing machine learning inference in always-on IoT devices. Smart IoT devices increasingly offer always-on features to allow advanced user control. For example, they can be always-listening to allow control through voice commands, or always-watching to support system wake-up by means of a face trigger. Such always-on functions often employ machine learning techniques for recognizing voice commands, faces, etc. A key requirement for implementing such functions is very low power consumption, as battery lifetime is key for the always-on capability. In this presentation, we discuss the key features of the ARC EM DSP processors that enable efficient machine learning inference. We then show how excellent results, in terms of low MHz requirements, small code size and low power consumption, are achieved for voice trigger and face trigger applications.

Implementing Artificial Intelligence in Embedded Vision, IoT, and Smart Home Applications 
Bruno Lavigueur, ASIC Digital Design Engineer, Synopsys 

As neural network techniques are applied to consumer applications, designers must figure out how to introduce these computationally demanding algorithms while minimizing power consumption. This presentation will discuss how to balance the tradeoffs between performance, power, area, and bandwidth in AI applications. It will cover the evolution of CNN graphs, and describe the attributes of popular graphs such as Masknet, ICNET, and RetinaNet for IoT and smart home applications. Finally, it will touch on how an embedded vision processor architecture can maximize computational efficiency without sacrificing accuracy, using facial recognition for the IoT as an example.

Fast and Ultra-Low Power Graphics Development for Mobile & Embedded Systems 
Iakovos Stamoulis, CTO and Co-Founder, Think Silicon 

High-quality CGI (computer graphics) and high-resolution display support are proliferating dramatically in embedded, automotive, wearable and IoT devices.

NEMA®|t200 is the latest member of the NEMA|GPU-Family and is a perfect candidate, in combination with ARC Processors, for accelerating OpenGL|ES / OpenVG graphics content, providing very high graphics performance even in memory- and power-constrained applications. In addition to the GPU, Think Silicon provides a comprehensive SDK, including NEMA|Power-Profiler and NEMA|GUI-Builder, to help developers optimize their applications for performance and power and to deploy systems rapidly. In this presentation, Think Silicon will demonstrate the strength of a fully fledged development system, from RTL implementation to GUI (Graphical User Interface) creation.

Securing Mobile IoT from Chip to Cloud with Integrated SIM Solutions 
Michael Moorfield, Head of Technology Strategy and Innovation, Truphone 

The SIM card as we know it is disappearing. This session explores how integrating SIM functionality into SoC designs can greatly simplify how mobile IoT connectivity is enabled, secured, and managed. Truphone will demonstrate how its eSIM solution running on an ARC-based security platform, combined with its global connectivity and remote SIM provisioning services, enables secure, out-of-the-box connectivity from chip to cloud.

Streaming Low-Power Audio to “Hearable” Devices Using Bluetooth 5 
Ron Lowman, Strategic Marketing Manager, Synopsys 

Wireless audio solutions have become pervasive; however, battery life remains a key limitation. The implementation of a standardized audio solution over Bluetooth will revolutionize the adoption and use of hearables, not only because of the battery life improvement but also because of the new applications Bluetooth Low Energy audio will enable. We will explore the new Low Complexity Communication Codec (LC3), the required IP components, and how those IP components are being optimized for next-generation Bluetooth audio hardware for SoC integration.

Deploying NB-IoT Communication Solutions with Extensible Processors 
Anatoly Savchenkov, Software Engineering Manager, Synopsys 

NB-IoT is an emerging technology for narrow-band wireless communication standardized by the 3GPP. It was designed with a focus on minimizing the power and performance requirements of end-user equipment to enable the widespread deployment of NB-IoT compatible devices and ensure quick technology adoption. It created a market for licensable software-defined IP and increased the demand for efficient and flexible IP cores to execute this software. This presentation highlights the key challenges of NB-IoT modem design and proposes an optimization strategy demonstrating the efficiency of the Synopsys ARC EMxD family of cores as a platform for software-defined NB-IoT modems.

Reducing Dynamic Power and Time-to-Tapeout for High-Performance AI Processor SoCs 
Yudhan Rajoo, Sr. Technical Marketing Manager, Synopsys 

Developing new AI applications requires implementing cutting-edge technologies while meeting performance, power, area and time-to-market requirements. Edge applications require low power, while CNN engines can be especially challenging to a design’s power budget due to the density of the multiply-accumulates needed to run large neural network computations. In addition, designers can face tedious and time-consuming iterations for floor planning and routing in order to meet PPA targets. This presentation will describe how usage of specialized logic cells and memories can address specific RTL-to-GDSII implementation challenges for CNN engines while reducing time-to-tapeout with optimal PPA. We will show a case study describing how utilizing the HPC Design Kit of logic libraries and embedded memories optimized for the DesignWare EV61 Embedded Vision Processor resulted in lower power and faster design closure.

Mobile/Storage Track

Using Trace Visualization for Efficient Debugging of Embedded Systems 
Johan Kraft, CEO, Percepio 

Software issues related to timing or resource usage can be difficult to diagnose since they are not directly visible in the source code, but rather are an emergent property of the system. Such issues call for software tracing. Developers often associate tracing with instruction tracing and overwhelming amounts of low-level information, where you can’t see the forest for the trees. However, recent advances in trace visualization and software-generated tracing offer an alternative approach that is more suitable for finding anomalies in complex software behavior. This presentation introduces software tracing and demonstrates the potential of state-of-the-art trace visualization, using Percepio Tracealyzer as an example.

High-Performance Solutions for Next-Generation SSD Designs 
Michael Thompson, Sr. Product Marketing Manager, ARC Processors, Synopsys 

Storage is a critical component of the technology enabling online business, information access, streaming video, artificial intelligence and much more. Most of the electronics that we use today wouldn’t be possible without the ever-increasing capacity and performance that we are seeing from flash storage. This increasing capacity and performance will challenge current methods of maintaining and using data in storage media. This is leading to an interest in using artificial intelligence to enable software to dynamically balance and optimize data on SSDs to maximize capacity and throughput. This presentation will investigate how machine learning and other techniques will be used in future SSD designs, and the underlying software and hardware that will make it possible.

Easing Complex Application Development with Processor and System Trace Resources 
Michael Doan, ASIC Digital Design Engineer, Synopsys 

Complex application development is made easier by integrating trace and debug systems that provide observability and allow debugging early in the design phase, minimizing the impact of bug discovery and focusing development resources. This presentation demonstrates the flexibility and breadth of resources provided by ARC Trace Solutions for advanced processor and system development.

Enabling Ultra-High Performance, Low-Power 5G Modem Designs with Heterogeneous Multicore Systems 
Pieter van der Wolf, Principal Product Architect, Synopsys 

The 5G standard pushes wireless communications equipment toward data rates greater than 1 Gbps and lower system latency, allowing an expansion of 5G use cases to automotive and other timing-critical applications. SoC modem developers for 4G systems previously met performance requirements with heterogeneous systems, using multiple task-specific processor cores. 5G modem SoCs for user equipment (mobile devices) will need to take the heterogeneous implementation further to provide greater performance for higher data rates, larger MIMO configurations and lower latency, while maintaining power budgets similar to those of 4G modems. This session will go through the range of digital signal processors, controller cores and task-specific cores that will allow 5G modem SoC developers to implement the required amount of programmability and flexibility in their designs, while meeting the performance and low-power requirements.

Accelerating Group Theoretic Cryptography with ARC APEX Instructions 
Drake Smith, Vice President of Development, SecureRF 

This presentation will describe how a mathematically efficient cryptographic operation is significantly sped up using the ARC Processor EXtension (APEX) technology. We will give a brief overview of the underlying math operations and how they are implemented in software only. Then we will outline our approach to offloading the most compute-intensive operations onto hardware, together with the design process we followed using Synopsys ARChitect, MetaWare, and Intel Quartus Prime. Attendees will see a comparison of the resulting performance metrics with APEX versus an assembly language-only implementation.

The design example will use a fast, small-footprint, and low-energy digital signature algorithm that has immediate applicability for a wide range of IoT solutions and is ideally suited for applications where an ARC processor may need to securely communicate with an 8- or 16-bit device. Attendees will learn how to incorporate this and other security methods into their own ARC-based designs.

Building an Embedded Vision Application with a Caffe CNN Model and OpenVX 
Jamie Campbell, Software Applications Engineer, Staff, Synopsys 

You’ve picked your CNN graph and embedded vision processor – now what? In this tutorial, we will walk you through the tool flow required to take your concept to reality, using a Caffe CNN graph and OpenVX kernels with MetaWare EV Development Toolkit, the software development environment for DesignWare EV6x processors. We will explain how to use OpenVX to create an image processing graph in conjunction with the CNN engine, as well as covering best practices for optimizing your application for an embedded environment.

Accelerating AI/Neural Network Performance While Reducing Power in Android Devices 
Mischa Jonker, Software Engineer, Synopsys 

With Android 8.1, Google has added the Neural Network API to the Android mobile operating system. This API is designed to ease offloading neural network workloads to accelerators such as GPUs, DSPs, and other optimized processors, including the Synopsys EV6x embedded vision processor. In this presentation, we will show how the Android Neural Network API, TensorFlow Lite and the EV6x processor work together to increase neural network performance while reducing power consumption. We will compare power and performance when running on an application processor versus offloading computations to the EV6x processor.
