Maximizing Application Performance on ARC DSP Processors with the MATLAB/Simulink Plugin

Hugh O’Keeffe, VP Global Engineering, Ashling

Introduction

Performance requirements for the Internet of Things (IoT) and connected vehicles have made digital signal processing (DSP) technology increasingly necessary in mathematically intensive embedded applications such as sensor fusion, voice detection, speech recognition, and audio processing. To meet these requirements, Synopsys offers a range of DesignWare® ARC® Processor IP optimized for DSP-intensive applications. To facilitate rapid software development for these processors, the DesignWare ARC MetaWare Development Toolkit offers full C/C++ programming support for the ARC processors’ DSP instructions and XY memory. The Toolkit includes a rich library of DSP functions highly optimized for ARC processors, which C/C++ developers can easily incorporate into their embedded application software.

MathWorks MATLAB® is a programming platform designed specifically for engineers and scientists. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in the MATLAB language, a matrix-based language designed for the natural expression of computational mathematics. Simulink® is an add-on product to MATLAB that provides an interactive, graphical environment for modeling, simulation, and analysis of dynamic systems. By using MATLAB and Simulink together, users can combine textual and graphical programming to design their system in a simulation environment.

This article describes the MetaWare MATLAB/Simulink plugin, which gives DesignWare ARC processor users the power to create highly optimized applications using the MathWorks MATLAB/Simulink platform.

MATLAB and Simulink

MathWorks MATLAB is a mathematical programming platform with a proprietary language and a rich library allowing the development of mathematical algorithms, data analysis, and the creation of models and applications. It enables rapid construction of virtual prototypes to explore design concepts at any level of detail with minimal effort. Simulink provides a graphical user interface (GUI) for building models as block diagrams and includes a comprehensive library of predefined blocks for constructing graphical models of systems. MATLAB and Simulink work together, and MATLAB models or applications can be easily integrated into the Simulink design, allowing simulation of the system as a whole. Both MATLAB and Simulink support generation of C code (using the MathWorks “Embedded Coder”), which allows deployment of simulated and verified system designs to PCs, servers, embedded systems, and so on.

ARC MetaWare Development Toolkit

Synopsys’ ARC MetaWare Development Toolkit is a complete C/C++ toolchain solution that contains all the components needed to support the development, debugging, and optimization of embedded applications for all DesignWare ARC processors.

The MetaWare MATLAB/Simulink plugin integrates the DesignWare ARC MetaWare Development Toolkit into MATLAB and Simulink, allowing C code generated by Embedded Coder to be compiled with the MetaWare Compiler. The compiled code can then be run on ARC processor hardware targets (for example, the DesignWare ARC EM Starter Kit or the DesignWare ARC AXS103 Software Development Platform) and on simulators (for example, the ARC nSIM Instruction Set Simulator). Figure 1 provides an overview of MetaWare integration with MATLAB/Simulink.

Figure 1. MATLAB/Simulink MetaWare Integration

The plugin is provided with the MetaWare Development Toolkit and includes a user manual. It can be installed from within MATLAB using the Install App option.

Code Replacement

MATLAB/Simulink code-generation software produces ANSI/ISO C code. For optimal performance, however, the generated code needs to be processor-specific. The MATLAB/Simulink code replacement feature addresses this by automatically replacing specified generic ANSI/ISO C function calls in the generated code with calls to functions optimized for the target processor, in this case the highly optimized functions included in the MetaWare DSP libraries.
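As a rough illustration of the idea, the C sketch below contrasts a generic, portable matrix multiply of the kind a code generator might emit with a call that has been redirected to a target-optimized routine. The names generic_matmul_f32, optimized_matmul_f32, and select_matmul are hypothetical placeholders, not actual Embedded Coder output or MetaWare DSP library functions. The "optimized" body is a plain C stand-in so that the example runs on any host, and the selection is made at run time here only for demonstration; real code replacement happens when the code is generated.

```c
#include <assert.h>
#include <stddef.h>

/* Generic ANSI/ISO C matrix multiply, similar in spirit to what a code
 * generator might emit: C (m x p) = A (m x n) * B (n x p), row-major. */
void generic_matmul_f32(const float *a, const float *b, float *c,
                        size_t m, size_t n, size_t p)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < p; j++) {
            float acc = 0.0f;
            for (size_t k = 0; k < n; k++) {
                acc += a[i * n + k] * b[k * p + j];
            }
            c[i * p + j] = acc;
        }
    }
}

/* HYPOTHETICAL stand-in for a target-optimized library routine. On real
 * hardware this would be a vendor function using DSP instructions; here
 * it simply forwards to the generic version so the sketch runs on a PC. */
void optimized_matmul_f32(const float *a, const float *b, float *c,
                          size_t m, size_t n, size_t p)
{
    generic_matmul_f32(a, b, c, m, n, p);
}

typedef void (*matmul_fn)(const float *, const float *, float *,
                          size_t, size_t, size_t);

/* Mimics "mapping enabled" versus "mapping disabled": the same call site
 * ends up invoking either the optimized or the generic implementation. */
matmul_fn select_matmul(int replacement_enabled)
{
    return replacement_enabled ? optimized_matmul_f32 : generic_matmul_f32;
}
```

Whichever implementation is selected, the numerical result is expected to be the same; only the execution time on the target should differ.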

The plugin allows MATLAB to be configured so that code replacement uses functions from the Synopsys DesignWare ARC DSP Library.

The plugin provides a mapping table that tells MATLAB how to perform the code replacement and ensures optimal use of the ARC MetaWare DSP library for:

  • Vector Functions
  • Complex Math
  • Scalar Functions
  • Real and Complex Matrix Functions
  • Simulink DSP Blocks including:
    • Discrete FIR
    • Convolution
    • Correlation
    • Biquad Filter DF1/Biquad Filter DF2T
    • FIR Decimation/Interpolation
    • LMS Filter
    • Real/Complex FFT
    • Real/Complex Inverse FFT

Full details of all supported mappings are provided in the user manual. For example, the following table shows the supported Matrix Function mappings to the DesignWare ARC MetaWare DSP library:

Table 1. Matrix Functions

The MathWorks CRTOOL can be used to view the mapping performed by the plugin. Alternatively, you can build your application once with mapping enabled and once with mapping disabled, and then compare the resulting generated code.

Processor-in-the-Loop

Simulink supports Processor-in-the-Loop (PIL) simulation, a way of developing and testing Simulink models on hardware. In a PIL simulation, the generated code runs on the target hardware. A particular block in a model is chosen to run on hardware, and a PIL block is created that allows Simulink to run this block in parallel with the regular block simulation. The results of the PIL simulation are transferred to Simulink, allowing verification of the numerical equivalence of the simulation and the code-generation results. This PIL verification process is a crucial part of the design cycle because it ensures that the behavior of the deployment code matches the design. Figure 2 shows a typical PIL setup.

The plugin supports PIL on the ARC EM Starter Kit and the ARC AXS103 Software Development Platform hardware targets, using an Ashling Opella-XD JTAG probe to interface to the host PC. PIL requires that the target is running the embARC Open Software Platform (with FreeRTOS), available from www.embarc.org.

Figure 2. Typical PIL Setup using an Ashling Opella-XD JTAG Probe

Communication between Simulink and the code running on the target hardware is managed through an API (rtIOStream), which can be implemented over a serial interface or a TCP/IP connection. The plugin supports both: a serial connection (UART over USB) and a TCP/IP interface.
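To make the role of this communication layer more concrete, here is a minimal host-side sketch of a channel exposing the four entry points of the rtIOStream interface (open, send, receive, close). The prototypes follow the interface as documented by MathWorks to the best of our understanding; consult the MathWorks documentation for the authoritative definitions. Everything else is an assumption made for illustration: the in-memory loopback buffer, its size, and the DEMO_ return-code macros. A real target implementation would move the bytes over the UART or TCP/IP link and is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_STREAM_OK     0    /* assumed success code for this sketch */
#define DEMO_STREAM_ERROR (-1)  /* assumed failure code for this sketch */
#define DEMO_BUF_SIZE     256   /* arbitrary loopback capacity */

/* Simple linear buffer (not a ring); it is reset whenever the stream opens. */
static unsigned char demo_buf[DEMO_BUF_SIZE];
static size_t demo_head = 0;   /* next byte to read */
static size_t demo_tail = 0;   /* next free slot to write */
static int demo_is_open = 0;

int rtIOStreamOpen(int argc, void *argv[])
{
    (void)argc;                /* connection options are ignored here */
    (void)argv;
    demo_head = 0;
    demo_tail = 0;
    demo_is_open = 1;
    return 0;                  /* the single stream ID this sketch supports */
}

int rtIOStreamSend(int streamID, const void *src, size_t size, size_t *sizeSent)
{
    if (!demo_is_open || streamID != 0 || src == NULL || sizeSent == NULL) {
        return DEMO_STREAM_ERROR;
    }
    size_t room = DEMO_BUF_SIZE - demo_tail;
    size_t n = (size < room) ? size : room;   /* partial sends are allowed */
    memcpy(&demo_buf[demo_tail], src, n);
    demo_tail += n;
    *sizeSent = n;
    return DEMO_STREAM_OK;
}

int rtIOStreamRecv(int streamID, void *dst, size_t size, size_t *sizeRecvd)
{
    if (!demo_is_open || streamID != 0 || dst == NULL || sizeRecvd == NULL) {
        return DEMO_STREAM_ERROR;
    }
    size_t avail = demo_tail - demo_head;
    size_t n = (size < avail) ? size : avail; /* returns only what is queued */
    memcpy(dst, &demo_buf[demo_head], n);
    demo_head += n;
    *sizeRecvd = n;
    return DEMO_STREAM_OK;
}

int rtIOStreamClose(int streamID)
{
    if (!demo_is_open || streamID != 0) {
        return DEMO_STREAM_ERROR;
    }
    demo_is_open = 0;
    return DEMO_STREAM_OK;
}
```

Because the send and receive calls report how many bytes were actually transferred, the caller can loop until a complete message has moved, which suits both byte-oriented serial links and TCP/IP sockets.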

Simulink allows the simulated block and the PIL block (running on the hardware) to run in parallel, so that the outputs of both can be compared and verified in Simulink. This confirms that the deployed code running on the hardware matches the Simulink design. Figure 3 shows the simulation and the hardware target running in parallel. Note how the outputs of both blocks are compared to verify correct operation.

Figure 3. Simulink design showing a block running in Simulink (Controller) and PIL (Controller1)
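The comparison step can be pictured as an element-wise check between the output logged from the simulated block and the output returned by the PIL block. The sketch below is only an assumption about how such a check might look in plain C. Simulink performs its own comparison, and the function names and the tolerance parameter here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Largest absolute difference between two output traces of length n. */
double max_abs_diff(const double *sim_out, const double *pil_out, size_t n)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = sim_out[i] - pil_out[i];
        if (d < 0.0) {
            d = -d;
        }
        if (d > worst) {
            worst = d;
        }
    }
    return worst;
}

/* Returns 1 if every sample pair agrees within tol, otherwise 0.
 * A tolerance of 0.0 demands bit-exact agreement of the values. */
int outputs_match(const double *sim_out, const double *pil_out,
                  size_t n, double tol)
{
    assert(tol >= 0.0);
    return max_abs_diff(sim_out, pil_out, n) <= tol;
}
```

A nonzero tolerance is a design choice for floating-point models, where the host simulation and the target code may round differently; fixed-point models can often be checked with a tolerance of zero.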

Benchmarking

The MATLAB plugin supports the native code execution profiling (benchmarking) available in Simulink. When profiling is enabled, the generated code is instrumented to record how long functions take to execute. After a simulation is run, Simulink provides code execution profiling reports for analysis. Note that ARC TIMER0 must be present and enabled in the target processor to perform benchmarking.
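Conceptually, instrumented code reads a timer before and after each profiled function and accumulates the difference. The following host-side sketch uses the standard C clock() function as a stand-in for the target's hardware timer. The profile_section structure and the helper names are hypothetical and do not reflect the instrumentation that Simulink actually generates.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Accumulated timing data for one profiled function. */
typedef struct {
    const char   *name;         /* label used when reporting */
    clock_t       total_ticks;  /* sum of elapsed ticks over all calls */
    unsigned long calls;        /* number of completed measurements */
    clock_t       started_at;   /* timer value at the last profile_begin */
} profile_section;

void profile_begin(profile_section *s)
{
    assert(s != NULL);
    s->started_at = clock();    /* on a target, read the hardware timer */
}

void profile_end(profile_section *s)
{
    clock_t now = clock();
    assert(s != NULL);
    s->total_ticks += now - s->started_at;
    s->calls++;
}

/* Example workload: sum of n samples. */
long sum_samples(const int *x, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += x[i];
    }
    return total;
}

/* The same workload wrapped with timing calls, as instrumented code would be. */
long profiled_sum(profile_section *s, const int *x, size_t n)
{
    profile_begin(s);
    long result = sum_samples(x, n);
    profile_end(s);
    return result;
}
```

Dividing total_ticks by calls gives an average cost per invocation, which is the kind of figure a profiling report would present for each function.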

Code replacement using the DesignWare ARC MetaWare DSP library can improve overall performance. The size of the improvement typically depends on:

  • the function/operation being optimized (using the ARC MetaWare DSP library)
  • the vector/matrix input size(s)

For example, a 5x5 matrix multiply using 32-bit single-precision floating-point data types was found to run 2.5 times faster when using the ARC MetaWare DSP library. A discrete FIR filter (q15 fixed-point; 64x1 input vector) ran 6.6 times faster.
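For readers unfamiliar with the q15 format used in the FIR benchmark, the sketch below shows a plain C reference FIR filter on 16-bit Q15 data (a raw value divided by 32768 gives a number in the range [-1, 1)). It is not the ARC MetaWare DSP library implementation, nor the code that was benchmarked. The function names, the zero initial state, and the rounding and saturation choices are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int16_t q15_t;  /* Q15 fixed point: value = raw / 32768 */

/* Clamp a wide intermediate result into the 16-bit Q15 range. */
static q15_t saturate_q15(int64_t v)
{
    if (v > INT16_MAX) {
        return INT16_MAX;
    }
    if (v < INT16_MIN) {
        return INT16_MIN;
    }
    return (q15_t)v;
}

/* Direct-form FIR with zero initial state: y[i] = sum over k of h[k] * x[i-k].
 * Products are Q30 and are accumulated in 64 bits so they cannot overflow. */
void fir_q15_ref(const q15_t *h, size_t taps,
                 const q15_t *x, q15_t *y, size_t n)
{
    assert(h != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++) {
        int64_t acc = 0;
        for (size_t k = 0; k < taps && k <= i; k++) {
            acc += (int32_t)h[k] * (int32_t)x[i - k];
        }
        /* Add half an LSB, then rescale Q30 to Q15 with floor division.
         * Division is used instead of >> because right-shifting a negative
         * value is implementation-defined in C. */
        int64_t rounded = acc + 16384;
        int64_t scaled = rounded / 32768;
        if (rounded % 32768 < 0) {
            scaled--;
        }
        y[i] = saturate_q15(scaled);
    }
}
```

An optimized library routine would typically compute the same result using multiply-accumulate instructions and dedicated data memory, which is where speedups such as the one reported above would come from.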

Conclusion

The MATLAB/Simulink plugin for the Synopsys DesignWare ARC MetaWare Development Toolkit gives ARC processor users the power to generate high-performance code, as shown by the benchmark examples above. By using MATLAB’s Embedded Coder Code Replacement Library, the plugin can generate highly optimized C code for the ARC architecture that uses the DesignWare ARC MetaWare DSP Library, which takes full advantage of the ARC architecture’s DSP extensions.