Synphony C Compiler for Imaging Applications  

 

From taking better pictures on a camera to improving medical diagnostics, imaging applications touch every aspect of daily life. The common thread among all these applications is the need to capture data from a sensor and process it at high speed, both to handle the large volume of data that sensors produce and to provide a superb user experience in devices such as cameras.

Synphony allows the capture and design of hardware implementations of imaging algorithms in C, the same untimed language commonly used during algorithm development and exploration. By taking care of the complex task of extracting parallelism and creating high-performance hardware, Synphony allows the designer to focus on the key differentiator: the algorithm and the overall architecture that achieve the final image look that will captivate customers.
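To make "untimed C" concrete, the following is a minimal sketch of the kind of algorithm-level description the paragraph above refers to: a fixed-point RGB to YCbCr color space conversion. It is plain C99 with no clocks, pipeline stages or tool directives. The function names, the interleaved pixel layout and the choice of full-range BT.601 (JFIF-style) coefficients are assumptions made for illustration; they are not taken from Synphony documentation, and no Synphony-specific coding rules are implied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp an intermediate result to the 8-bit pixel range. */
static uint8_t clamp_u8(int32_t v)
{
    if (v < 0) return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/*
 * Convert one pixel using coefficients scaled by 256 (Q8 fixed point).
 * Assumed coefficients: full-range BT.601, rounded so each row sums
 * to 256 (luma) or 0 (chroma).
 */
void rgb_to_ycbcr(uint8_t r, uint8_t g, uint8_t b,
                  uint8_t *y, uint8_t *cb, uint8_t *cr)
{
    const int32_t half   = 128;        /* rounding term, 0.5 in Q8 */
    const int32_t offset = 128 * 256;  /* chroma offset of 128 in Q8 */

    /* Adding the offset before shifting keeps every sum non-negative,
     * so the right shift is well defined. */
    *y  = clamp_u8(( 77 * r + 150 * g +  29 * b + half) >> 8);
    *cb = clamp_u8((-43 * r -  85 * g + 128 * b + offset + half) >> 8);
    *cr = clamp_u8((128 * r - 107 * g -  21 * b + offset + half) >> 8);
}

/* Convert a frame stored as interleaved R,G,B bytes (assumed layout). */
void rgb_to_ycbcr_frame(const uint8_t *rgb, uint8_t *ycbcr, size_t num_pixels)
{
    assert(rgb != NULL && ycbcr != NULL);
    for (size_t i = 0; i < num_pixels; i++) {
        rgb_to_ycbcr(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2],
                     &ycbcr[3 * i], &ycbcr[3 * i + 1], &ycbcr[3 * i + 2]);
    }
}
```

The loop over pixels has no dependency between iterations, which is the kind of parallelism a high-level synthesis tool can extract without the designer describing it explicitly.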

Typical Imaging Designs
The Synphony team has successfully worked with designers to complete a wide variety of complex imaging applications with excellent quality of results. Example designs that have been done or can be done using Synphony C Compiler include:
  • Image capture: Bayer to RGB conversion, color space conversion, sharpening and smoothing, smile and face detection and red-eye reduction
  • Medical imaging: Algorithms like back-projection and FFT/IFFT
  • Compression/Decompression: Commonly used algorithms in compression such as DCT/IDCT, wavelet coding, Huffman coding; JPEG, JPEG2000 and lossless JPEG pipelines
  • Image processing and enhancements: Scaling, sharpening, noise reduction, edge smoothing, background/foreground separation and film grain insertion
  • Image analysis for security, automotive, medical and industrial applications: Edge detection, face recognition, feature detection, lane/sign detection and object recognition
  • Printer pipelines: Color transform, halftoning, scaling, compression/decompression of halftoned images and resolution enhancement

These designs require some of the advanced capabilities provided by Synphony C Compiler such as support for multi-rate designs, multi-buffered memories for high performance, low-power implementations and multi-modal designs to handle different resolutions and compression/decompression standards.

Customer Case Study: JPEG2000
The following figure shows a JPEG2000 design for both the encoder and the decoder. Both parts of the algorithm are captured at the C level and implemented using Synphony.

This pipeline involves multiple blocks operating at different processing rates. The processing rate is governed by both the amount of computation per block and the shape of the incoming data. Data shape changes, different execution rates, and memory interactions make an algorithm such as JPEG2000 difficult to implement in hardware. The difficulty for this algorithm lies in the control structure, not in the computation datapath. Synphony C Compiler abstracts much of the control from the user, making it easier to create this type of design.
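As an illustration of how data shape changes between blocks, here is a minimal sketch of a one-dimensional reversible 5/3 lifting wavelet step, the integer transform used for lossless coding in JPEG2000. It consumes n samples and emits n/2 low-pass plus n/2 high-pass coefficients, so the downstream blocks see data at a different rate and shape than the input. The function names and the even-length restriction are assumptions made for this sketch; it is not the customer design described above.

```c
#include <assert.h>
#include <stddef.h>

/* Floor division for a positive divisor (C division truncates toward zero). */
static int floor_div(int a, int b)
{
    int q = a / b;
    if (a % b < 0) q--;
    return q;
}

/* Forward 5/3 lifting: n samples in, n/2 low + n/2 high out.
 * Assumes n is even and n >= 2; edges use symmetric extension. */
void dwt53_forward(const int *x, int *low, int *high, size_t n)
{
    size_t half = n / 2;
    assert(n >= 2 && n % 2 == 0);

    /* Predict step: odd samples become high-pass detail coefficients. */
    for (size_t i = 0; i < half; i++) {
        int right = (2 * i + 2 < n) ? x[2 * i + 2] : x[n - 2];
        high[i] = x[2 * i + 1] - floor_div(x[2 * i] + right, 2);
    }
    /* Update step: even samples become low-pass coefficients. */
    for (size_t i = 0; i < half; i++) {
        int prev = (i > 0) ? high[i - 1] : high[0];
        low[i] = x[2 * i] + floor_div(prev + high[i] + 2, 4);
    }
}

/* Inverse 5/3 lifting: undoes the update step, then the predict step. */
void dwt53_inverse(const int *low, const int *high, int *x, size_t n)
{
    size_t half = n / 2;
    assert(n >= 2 && n % 2 == 0);

    for (size_t i = 0; i < half; i++) {
        int prev = (i > 0) ? high[i - 1] : high[0];
        x[2 * i] = low[i] - floor_div(prev + high[i] + 2, 4);
    }
    for (size_t i = 0; i < half; i++) {
        int right = (2 * i + 2 < n) ? x[2 * i + 2] : x[n - 2];
        x[2 * i + 1] = high[i] + floor_div(x[2 * i] + right, 2);
    }
}
```

The arithmetic is a handful of adds and shifts per sample. What takes effort in hardware is the surrounding control: boundary handling, buffering between the predict and update passes, and feeding two half-rate streams to the next stage.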

Customer Case Study: Optical Inspection System
In this design, the incoming image from the production line is compared to a reference image for early detection of product defects. The Synphony C Compiler solution targets an FPGA implementation platform as a replacement for a multi-processor cluster, and it met all the criteria, including the high-performance requirement. Because of the computational complexity and throughput requirements, the optimized design takes up ~70% of the resources of the largest device (LX330) in the Xilinx Virtex5 family. The design was created by a single design engineer in 2 man-months, whereas a manual design would have required multiple man-years. For systems such as this, where performance and time to market are key, Synphony C Compiler presents the only viable alternative for developing systems on an FPGA at a speed competitive with a multi-processor cluster implementation.
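The comparison method used in this customer design is not described here. Purely as an illustration of the general idea of checking an incoming image against a reference, the sketch below counts pixels whose absolute difference from the reference exceeds a threshold and flags the part when too many pixels differ. All names, the 8-bit grayscale format and the thresholding rule are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count pixels that differ from the reference by more than `threshold`.
 * Hypothetical helper; assumes 8-bit grayscale images of equal size. */
size_t count_defect_pixels(const uint8_t *image, const uint8_t *reference,
                           size_t num_pixels, uint8_t threshold)
{
    size_t count = 0;
    assert(image != NULL && reference != NULL);
    for (size_t i = 0; i < num_pixels; i++) {
        int diff = (int)image[i] - (int)reference[i];
        if (diff < 0) diff = -diff;          /* absolute difference */
        if (diff > threshold) count++;
    }
    return count;
}

/* Flag a part as defective when more than `max_bad_pixels` pixels differ. */
int is_defective(const uint8_t *image, const uint8_t *reference,
                 size_t num_pixels, uint8_t threshold, size_t max_bad_pixels)
{
    return count_defect_pixels(image, reference, num_pixels, threshold)
           > max_bad_pixels;
}
```

A production inspection system would typically add alignment, filtering and region-specific tolerances, which is where the computational load mentioned above comes from.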

Customer Case Study: Discrete AdaBoost Algorithm for Object Recognition
Kyoto University used Synphony C Compiler to design an innovative hardware implementation of the discrete AdaBoost algorithm, which is the most computationally intensive part of a highly accurate object recognition scheme (Cascade Particle Filter). To apply the scheme to embedded applications such as surveillance, automotive and robotics, a specialized processor or hardware implementation is indispensable because of the low-power requirement.
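For readers unfamiliar with the algorithm, the evaluation side of discrete AdaBoost can be sketched as a weighted vote of weak classifiers that each output +1 or -1. The example below uses threshold "decision stumps" as weak classifiers and integer vote weights. The structure layout, names and the use of stumps are assumptions made for illustration; the Kyoto University design and its feature set are not described here.

```c
#include <assert.h>
#include <stddef.h>

/* One weak classifier: a threshold test on a single feature (hypothetical layout). */
typedef struct {
    size_t feature;    /* index into the feature vector */
    int    threshold;  /* decision threshold */
    int    polarity;   /* +1 or -1, selects the direction of the test */
    int    alpha;      /* vote weight from training, as a fixed-point integer */
} stump_t;

/* Weak classifier output: +1 if polarity*x < polarity*threshold, else -1. */
static int stump_vote(const stump_t *s, const int *features)
{
    int x = features[s->feature];
    return (s->polarity * x < s->polarity * s->threshold) ? 1 : -1;
}

/* Strong classifier: sign of the alpha-weighted sum of weak votes. */
int adaboost_classify(const stump_t *stumps, size_t num_stumps,
                      const int *features)
{
    long score = 0;
    assert(stumps != NULL && features != NULL);
    for (size_t i = 0; i < num_stumps; i++) {
        score += (long)stumps[i].alpha * stump_vote(&stumps[i], features);
    }
    return (score >= 0) ? 1 : -1;
}
```

Each weak classifier is independent of the others, so the votes can be computed in parallel and summed in an adder tree, which is one reason this kind of classifier maps well to dedicated hardware.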

The RTL generated by Synphony C Compiler was synthesized using Synopsys DC for a UMC 130nm library. The results show that the synthesized hardware has about 240,000 gates, excluding the area for SRAMs, and a total processing time of about 5.3 milliseconds at a 100 MHz clock frequency. This processing time meets the requirement for real-time processing.

Other Representative Designs:
  • JPEG VLC/VLD
  • Advanced image up/down scaler
  • Back-projection algorithm for medical imaging