Facial Expression Analysis with Deep Learning & Computer Vision

Gordon Cooper

Jan 23, 2017 / 5 min read

Recognizing facial expressions and emotions is a basic skill that is learned at an early age and is important to human social interactions. Humans can look at a person’s face and quickly recognize the common emotions of anger, happiness, surprise, disgust, sadness, and fear. Transferring this skill to a machine is a complex task. Researchers have devoted decades of engineering time to writing computer programs that recognize a feature with accuracy, only to have to start over again to recognize a slightly different feature.

What if, instead of programming a machine, you could teach it to recognize emotions with great accuracy?

Deep learning techniques are showing great promise in lowering error rates for computer vision recognition and classification. Implementing deep neural networks (Figure 1) in embedded systems can help give machines the ability to visually interpret facial expressions to almost human-like levels of accuracy.

Figure 1. Simple example of a deep neural network

A neural network can be trained to recognize patterns and is considered to be “deep” if it has an input and output layer and at least one hidden middle layer. Each node is calculated from the weighted inputs from multiple nodes in the previous layer. These weighting values can be adjusted to perform a specific image recognition task. This is referred to as the neural network training process.
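As a minimal sketch of the node calculation described above, the following pure-Python example computes each node as a weighted sum of the previous layer's outputs. The bias terms, the sigmoid activation, the layer sizes, and all weight values are illustrative assumptions; the article does not specify them.

```python
import math

def dense_layer(inputs, weights, biases):
    """Compute one layer: each node is a weighted sum of all inputs.

    The bias and sigmoid activation are assumptions made for illustration.
    """
    outputs = []
    for node_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-total)))  # squash to (0, 1)
    return outputs

# Hypothetical network: 3 input pixels -> 2 hidden nodes -> 1 output node.
pixels = [0.2, 0.8, 0.5]
hidden_weights = [[0.5, -0.3, 0.8], [-0.6, 0.9, 0.1]]
hidden_biases = [0.0, 0.1]
output_weights = [[1.2, -0.7]]
output_biases = [0.05]

hidden = dense_layer(pixels, hidden_weights, hidden_biases)
score = dense_layer(hidden, output_weights, output_biases)[0]
print(f"output score: {score:.3f}")
```

Changing any entry in `hidden_weights` or `output_weights` changes `score`, which is the lever that training adjusts.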

For example, to teach a deep neural network to recognize a photo showing happiness, it is presented with images of happiness as raw data (image pixels) on its input layer. Knowing that the result should be happiness, the network recognizes patterns in the picture and adjusts the node weights to minimize the errors for the happiness class. Each new annotated image showing happiness helps refine the weights. Trained with enough inputs, the network can then take in an unlabeled image and accurately analyze and recognize the patterns that correspond to happiness.
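The weight adjustment described above can be sketched with a single trainable node and plain gradient descent. This is a deliberately tiny stand-in, not the training procedure of any particular framework: the two-value "images", the labels, the learning rate, and the epoch count are all made-up assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Score between 0 and 1; higher means 'more likely happy'."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train(samples, labels, epochs=500, lr=0.5):
    """Adjust the weights after each labeled sample to reduce its error."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = predict(weights, bias, x) - y  # signed prediction error
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

# Hypothetical toy data: two pixel features per "image"; 1 = happy, 0 = not.
samples = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.9], [0.1, 0.7]]
labels = [1, 1, 0, 0]

weights, bias = train(samples, labels)
print(predict(weights, bias, [0.85, 0.15]))  # unlabeled input near the happy samples
print(predict(weights, bias, [0.15, 0.80]))  # unlabeled input near the other class
```

A real network repeats the same idea across many layers and millions of weights, which is why training is left to a framework running on CPUs and GPUs.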

Deep neural networks require a lot of computational horsepower to calculate the weighted values of all these interconnected nodes. Memory for data and efficient data movement are also important. Convolutional neural networks (CNNs) (Figure 2) are the current state-of-the-art for efficiently implementing deep neural networks for vision. CNNs are more efficient because they reuse a lot of weights across the image. They take advantage of the two-dimensional input structure of the data to reduce redundant computations.

Figure 2. Example of a Convolutional Neural Network architecture (or graph) for facial analysis
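To illustrate the weight reuse that makes CNNs efficient, here is a small pure-Python sketch of a 2D convolution in which one kernel is slid across the whole image. The 4x4 image and the 2x2 edge-detecting kernel are hypothetical values chosen to keep the arithmetic easy to follow.

```python
def conv2d(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1)."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k_h) for j in range(k_w))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Hypothetical 4x4 image: dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1], [-1, 1]]  # responds to a dark-to-bright vertical edge

feature_map = conv2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]

# Weight count: the shared kernel versus a fully connected layer that
# maps the same 16 inputs to the same 9 outputs.
conv_weights = len(kernel) * len(kernel[0])
dense_weights = (4 * 4) * (3 * 3)
print(conv_weights, dense_weights)  # 4 144
```

The same four weights produce all nine outputs, whereas a fully connected layer of the same shape would need 144 separate weights.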

Implementing a CNN for facial analysis requires two distinct and independent phases. The first is the Training Phase. The second is the Deployment Phase.

The Training Phase (Figure 3) requires a deep learning framework (Caffe or TensorFlow, for example) that uses CPUs and GPUs for the training calculations, along with the knowledge to use that framework. These frameworks often provide example CNN graphs that can be used as a starting point. The deep learning framework allows the graphs to be fine-tuned: layers may be added, removed, or modified to achieve the best possible accuracy.

Figure 3. CNN Training Phase

One of the biggest challenges in the Training Phase is finding the right labeled dataset to train the network. The accuracy of the deep network is highly dependent on the distribution and quality of the training data. Several options to consider for facial analysis are the emotion-annotated dataset from the Facial Expression Recognition Challenge (FREC) and the multi-annotated private dataset from VicarVision (VV).

The Deployment Phase (Figure 4), for real-time embedded design, can be implemented on an embedded vision processor such as a Synopsys DesignWare® EV6x Embedded Vision Processor with a programmable CNN engine. An embedded vision processor is the best choice for balancing performance with small area and lower power.

Figure 4. CNN Deployment Phase

While the scalar unit and vector unit are programmed using C and OpenCL C (for vectorization), the CNN engine does not have to be manually programmed. The final graph and weights (coefficients) from the Training Phase can be fed into a CNN mapping tool, after which the embedded vision processor’s CNN engine is configured and ready to execute facial analysis.

Images or video frames captured from a camera lens and image sensor are fed into the embedded vision processor. It can be difficult for a CNN to handle significant variations in lighting conditions or facial poses, so pre-processing the images makes the faces more uniform. The heterogeneous architecture of a sophisticated embedded vision processor and CNN engine allows the CNN engine to classify one image while the vector unit preprocesses the next (light normalization, image scaling, plane rotation, etc.) and the scalar unit handles the decision making (i.e., what to do with the CNN detection results).
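Two of the preprocessing steps named above, light normalization and image scaling, can be sketched in a few lines of Python. The specific methods used here (min-max contrast stretching and nearest-neighbor resampling) are assumptions chosen for brevity; the article does not say which algorithms run on the vector unit, and production code would be written in C or OpenCL C rather than Python.

```python
def normalize_lighting(image):
    """Stretch pixel values to the range [0, 1] (min-max normalization)."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:  # flat image: avoid dividing by zero
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / span for p in row] for row in image]

def scale_nearest(image, out_h, out_w):
    """Resize to out_h x out_w by picking the nearest source pixel."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

# Hypothetical grayscale patches: the same face region, dim and bright.
dim = [[10, 20], [30, 50]]
bright = [[110, 120], [130, 150]]

print(normalize_lighting(dim))     # [[0.0, 0.25], [0.5, 1.0]]
print(normalize_lighting(bright))  # identical after normalization

large = [[r * 4 + c for c in range(4)] for r in range(4)]
print(scale_nearest(large, 2, 2))  # [[0, 2], [8, 10]]
```

After normalization the dim and bright patches are identical, which is the kind of uniformity that makes the CNN's classification task easier.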

Image resolution, frame rate, number of graph layers, and desired accuracy all factor into the number of parallel multiply-accumulations needed and the performance requirements. Synopsys’ EV6x Embedded Vision Processors with CNN can run at up to 800MHz on 28nm process technologies and can perform up to 880 multiply-accumulate operations (MACs) simultaneously.
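A rough way to see how these factors interact is to count the multiply-accumulates in one convolutional layer and compare the result with the theoretical peak implied by the figures above. This sketch assumes that the 880 MACs are available on every clock cycle at 800MHz; the layer dimensions and the 30 frames-per-second rate are hypothetical, and real utilization depends on memory bandwidth and the graph mapping.

```python
def conv_layer_macs(out_h, out_w, out_channels, in_channels, k_h, k_w):
    """MACs for one convolutional layer on one frame."""
    return out_h * out_w * out_channels * in_channels * k_h * k_w

# Figures quoted in the article, treated as a theoretical peak.
MACS_PER_CYCLE = 880
CLOCK_HZ = 800_000_000
peak_macs_per_second = MACS_PER_CYCLE * CLOCK_HZ  # 704,000,000,000

# Hypothetical layer: 48x48 output, 16 input and 32 output channels, 3x3 kernel.
layer_macs = conv_layer_macs(48, 48, 32, 16, 3, 3)  # 10,616,832
frames_per_second = 30  # assumed frame rate
required_macs_per_second = layer_macs * frames_per_second

utilization = required_macs_per_second / peak_macs_per_second
print(f"layer MACs per frame: {layer_macs:,}")
print(f"share of peak at {frames_per_second} fps: {utilization:.4%}")
```

Summing `conv_layer_macs` over every layer in the graph gives a first-order estimate of whether a given resolution and frame rate fit within the processor's budget.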

Once the CNN is configured and trained to detect emotions, it can be more easily reconfigured to handle other facial analysis tasks such as determining an age range, identifying gender or ethnicity, and recognizing the presence of facial hair or glasses.

Summary

A CNN running on an embedded vision processor opens new applications for vision processing. Soon, having the electronics around us interpret our feelings will be commonplace, from toys detecting happiness to electronic teachers that can determine students’ level of understanding by identifying facial expressions. The combination of deep learning, embedded vision processing, and high-performance CNNs will bring this vision closer to reality.
