ARC Processor Summit 2015 Proceedings

Keynote

The Era of Machines That See: Opportunities and Challenges in Embedded Vision
Jeff Bier, Founder, Embedded Vision Alliance/President, BDTi

Increasingly powerful, inexpensive programmable processors and image sensors are making it possible to incorporate computer vision capabilities into a very wide range of electronic products, such as retail point-of-sale kiosks and signs, personal medical devices, automotive safety systems and smartphones. This presentation provides an update on the development and deployment of embedded vision technology in industry. It will highlight some of the most interesting and most promising products incorporating vision. Important developments in practical enabling technologies, including processors, sensors, development platforms, and standards, will also be discussed.

Hardware Track

IP That Will Drive Energy-Efficient IoT Designs
Ron Lowman, Strategic Marketing Manager, Synopsys

Reducing energy consumption and system costs is critical in today's IoT applications. New investments in IP and tools play a significant role in extending battery life, reducing energy costs and enabling added functionality for wearable and machine-to-machine devices. Developing SoCs for IoT applications requires IP that not only supports low-power process technologies but also implements advanced low-power techniques for energy-efficient sensing, processing, and communications.

Addressing IoT Application Needs with Power-Efficient Processors and Subsystems 
Rich Collins, Product Marketing Manager, Synopsys

Device manufacturers building components for wearable and battery-operated IoT devices face a performance-power paradox that is driving the need for new types of low-power solutions. General-purpose and fixed-configuration processors cannot deliver the required performance to manage specialized IoT tasks like processing sensor information, recognizing voices, filtering, and audio playback within the necessary power budget. Efficient processors and pre-verified, SoC-ready subsystems optimized for sensor and control applications simplify the development of chips for complex IoT applications.

Enabling a New Class of High Performance IoT Applications 
Masami Nakajima, Ph.D., Core Technology Business Division, Renesas Electronics Corporation

To save battery life, many of today's IoT devices offload most of the critical processing tasks to higher-performance processors, either on-chip or off-chip. In addition to power efficiency, IoT edge devices will increasingly require higher-bandwidth sensor analysis performed locally (i.e., within the chip or subsystem). Embedded flash has emerged as the MCU storage standard, but slower access times have historically limited this higher-performance analysis. Solutions combining power-efficient sensor processing with higher-performance embedded flash technology such as Renesas’ MONOS will provide the ideal power/performance trade-offs required for the next generation of IoT devices.

Advanced Vision Capabilities for Next-Generation SoC Designs 
Tom Michiels, R&D Manager, Synopsys

The availability of high-performance vision processors is making it possible to integrate vision capabilities into SoCs, giving them the ability to see and interpret their surroundings. Developing these vision-based chips requires a range of IP and tools that are compliant with the latest embedded vision programming standards and flexible enough to meet the changing needs of emerging vision applications. This presentation details the architecture of an embedded vision processor implementing convolutional neural networks (CNN) and the open source-based programming environment used to ensure efficient resource utilization of the heterogeneous multicore platform.

Accelerating Applications Using Custom Extensions to ARC Processors 
Martin Kite, Sr. Hardware Architect, Synopsys

This session will introduce the APEX (ARC Processor EXtension) technology of the ARC processor family, which can be used to extend the instructions and register set of the CPU as well as add tightly coupled peripherals and custom interfaces. Real-world examples of using APEX to improve system performance will be shown, as well as a live demonstration of how to get started adding your own instructions.

Reducing System Power, Area, and Latency Using Tightly Coupled Memories and Peripherals 
Ad Vaassen, ASIC Digital Design Engineer, Synopsys

The increasing demand for better filtering and processing capabilities of the processor within embedded systems is causing a shift from tightly coupled 8-bit microcontrollers to bus-based 32-bit processors. As a consequence, the power, performance and area (PPA) tradeoffs have also shifted in favor of performance, at the expense of power and area. This presentation will cover how closely coupled memories and processor extensions can be leveraged to improve the power and area of these embedded systems by making the bus infrastructure superfluous. Removing the bus infrastructure reduces area costs as well as the latency when accessing memories and peripheral registers. Reduced latency translates into further performance improvement and power reduction. The improvement in the PPA ratio of an embedded system from tightly coupling memories and peripherals is demonstrated by comparing a bus-based implementation of a sensor hub with a tightly coupled, bus-less implementation.

Software Track

Debugging with Real-time Trace
Tom Pennello, R&D Engineer

Synopsys ARC Real-time trace is an efficient way to capture the behavior of a program: not only instruction trace, but register and memory changes as well. Trace can be captured at high speed with an Ashling Ultra-XD pod and uploaded to the debugger at gigabit Ethernet speeds. Captured trace can be turned into a "replay" database whereby you can debug your program by executing it both forwards and backwards. In this presentation we demonstrate trace with the Ultra-XD and also how trace can be used without it. We explain trace filtering, program profiling from trace, and the additional features of replay, such as call stack history. We also demonstrate trace replay for multicore.

Accelerating IoT Application Development Using embARC 
Chuck Jordan, Software Engineer, Synopsys

The embARC Open Software Platform is an easily accessible, highly productive solution for developing software for ARC processor-based embedded systems and subsystems, especially those targeted for the IoT. The comprehensive suite of free and open-source software available from the embarc.org website, including drivers, operating systems and middleware, enables code development to start sooner and complete faster. In this session we will look at how to quickly bring up an IoT communication stack for a constrained device application, featuring Wi-Fi and 802.15.4-based RF interfaces, IPv6/RPL and CoAP protocols, using the embARC Open Software Platform.

Securing the System: Hardware/Software Tradeoffs in Security Implementations 
Derek Bouius, Technical Marketing Manager, Synopsys

Proper security in any system comes with design trade-offs where either performance or size is optimized. This presentation will highlight performance characteristics of various implementations of encryption and authentication cryptographic algorithms. These algorithm implementations cover size- and speed-optimized software, special instructions for hardware acceleration, and full hardware offload in an ARC processor-based system. This session will describe cryptographic software and discuss how plug-in-based architectures can leverage different modes of hardware acceleration.

Designing Speech Solutions for Ultra-Low Power Devices 
Dean Neumann, CEO, Malaspina Labs

Wearable devices, IoT devices, industrial and home automation devices and smart appliances represent a growing segment of multi-function "smart" devices. Many of these devices lack the touch screens, graphical displays, or keypads required to select and control the device's functions. Such devices are well suited to using speech as an alternate input method, but have constraints: very limited CPU and memory capacity in which to execute a speech interface, very limited battery capacity which limits their bandwidth or connectivity to access cloud-based speech solutions, and very constrained industrial designs limiting antenna design, microphone placement, interface ports or mounting configurations. Yet in order to achieve market acceptance, speech interfaces for these devices will still be expected to function robustly in a variety of challenging environmental and noise conditions. This presentation will address some of these challenges, discuss potential solutions, and identify market opportunities for speech interaction with low-power devices using Synopsys' ARC EM processor cores.

Optimized Linux for ARC HS38 Multicore Processors 
Vineet Gupta, Software Engineer, Synopsys

The ARC HS38 processor is based on the highly-efficient ARCv2 instruction set architecture (ISA) and pipeline that delivers the high degree of power-performance efficiency and code density required of embedded applications running Linux. The processor has a full-featured Memory Management Unit (MMU) supporting a 40-bit physical address space and page sizes up to 16 megabytes (MBs), giving designers the ability to directly address a terabyte of memory with faster data access and higher system performance. This session will provide insights on how the advanced hardware features of the HS Processor are exploited in the GNU toolchain and Linux kernel to maximize Linux application performance and efficiency in both single core and multi-core (SMP) configurations.

Effective DSP Programming for IoT with the MetaWare Development Toolkit 
Mark Schimmel, Software Engineer, Synopsys

This session will describe the different DSP programming models available with the ARC MetaWare Development Toolkit to develop applications for the power-efficient and feature-rich EM DSP processors used for IoT devices. The MetaWare compiler provides portable and flexible programming models to ease DSP development and maximize application performance. This session will go into the details of compiler support for ARCv2DSP ISA code generation, guided and auto-vectorization optimizations, the fixed-point math primitives API, and native fixed-point data types, along with code examples. It will also highlight the rich DSP Library, which allows algorithms to be constructed from standard DSP building blocks, and an ITU-T base-ops library for developing very efficient voice codecs.

Integrating Ultra-Low Power Voice Control into Your Next SoC 
William Teasley, VP Engineering, Sensory, Inc.

Sensory will discuss its award-winning TrulyHandsfree™ Voice Control feature set, including always-on, always-listening voice triggers, voice biometrics, and noise-robust command and control, which has been optimized to run on ARC processor cores. Tips for chip design for voice recognition, Sensory’s porting process, typical footprints, and a power-saving hardware low-power sound detector will be covered as well, along with demos of all the features.

Embedding Sensor Fusion Algorithms for IoT
Pramod Ramarao, Product Manager, Hillcrest Labs

Lower-cost semiconductors, increased internet bandwidth and Wi-Fi everywhere have pushed the concept of the Internet of Things to the center of attention as the next high-growth technology market. Wearables, as a subset, require ultra-low power consumption and the ability to sense the environment around them, including motion, positioning, light, temperature and altitude, to provide meaningful data to both the user and cloud services. This session will describe how to efficiently integrate sensor fusion algorithms into a low-power SoC with minimum area footprint.