Posted by Baljeet Malhotra on October 25, 2017
Artificial Intelligence (AI) is revolutionizing the way we live, work and think. In recent times, computing machines have become intelligent enough to recognize real-world objects, recognize speech, learn programs, paint like an artist, or even dream like humans.
The security and reliability of software systems, which are enormously important to our modern economy, are also benefiting from advances in AI research. Although open source is no more or less secure than other software, the availability of its source code makes security vulnerabilities in open source easier to detect and exploit. Figure 1 below shows the number of vulnerabilities reported in the National Vulnerability Database (NVD). Note that there are many more vulnerabilities that never make it to NVD, a topic that I’ll address in my FLIGHT 2017 presentation with Nathan Zhang, a data scientist on my team.
Figure 1: Distribution of vulnerabilities reported in NVD by year
The recent exploitation of a vulnerability (CVE-2017-5638) in Apache Struts reminds us of the severe consequences that enterprises (as well as individuals) face when they don’t secure and manage the open source in their applications. As open source solutions expand into different industries and markets, the timely discovery and mitigation of publicly known vulnerabilities has become increasingly important. Unfortunately, the security experts who often discover these vulnerabilities (with the intention of mitigating the risks) are finding it extremely difficult to analyze them. For instance, to determine threat levels and exploitability factors, security experts are often required to determine: (1) access/authentication complexity, (2) confidentiality, integrity and availability impacts of vulnerabilities, and (3) numerical scores to quantify the items mentioned in (1) and (2). NVD is one of several good sources for vulnerability assessment methodologies.
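To make items (1) through (3) concrete, here is a minimal sketch that assumes the CVSS version 2 base equations, one scoring scheme under which NVD has published scores. This post does not name a specific scheme, so treat that choice as an assumption. The weights come from the CVSS v2 specification; the function and dictionary names are illustrative only.

```python
# Metric weights from the CVSS v2 specification (assumed scoring scheme).
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}      # Local, Adjacent, Network
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}   # High, Medium, Low
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}     # Multiple, Single, None
IMPACT = {"N": 0.0, "P": 0.275, "C": 0.660}             # None, Partial, Complete


def parse_vector(vector):
    """Turn 'AV:N/AC:L/Au:N/C:C/I:C/A:C' into a dict of metric -> value."""
    return dict(part.split(":") for part in vector.split("/"))


def cvss2_base_score(vector):
    """Compute the CVSS v2 base score for a base vector string."""
    m = parse_vector(vector)
    # Item (1): access and authentication complexity.
    exploitability = (20 * ACCESS_VECTOR[m["AV"]]
                      * ACCESS_COMPLEXITY[m["AC"]]
                      * AUTHENTICATION[m["Au"]])
    # Item (2): confidentiality, integrity and availability impacts.
    impact = 10.41 * (1 - (1 - IMPACT[m["C"]])
                      * (1 - IMPACT[m["I"]])
                      * (1 - IMPACT[m["A"]]))
    if impact == 0:
        return 0.0  # the specification zeroes the score when there is no impact
    # Item (3): a single numerical score combining (1) and (2).
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * 1.176, 1)


# A vector in which every metric takes its most severe value.
print(cvss2_base_score("AV:N/AC:L/Au:N/C:C/I:C/A:C"))  # 10.0
```

Even in this simplified form, an analyst has to settle six separate judgments per vulnerability before a score exists, which is part of why the analysis is slow to do by hand.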
Overall, vulnerability analysis is a time-consuming task that must nonetheless be done in a time-sensitive manner, without compromising the essential steps of analysis needed to mitigate the risks effectively. This situation is becoming worse due to the increased number of vulnerabilities that are being discovered (recall Figure 1). On a given day, our security experts at Synopsys could end up analyzing tens of vulnerabilities to make the consumers of affected open source solutions more secure. In this context, we are using AI solutions to help our security experts conduct vulnerability analysis at a large scale quickly and accurately. If computing machines (powered by AI solutions) can do this analysis independently and automatically, it will be incredibly time-effective and cost-effective. While this is a worthy goal, we first need to understand where the challenges lie.
An important part of AI-driven security solutions is training computing machines with real-world datasets. At Synopsys, we are fortunate to have the world’s largest database of open source software, supplemented by important pieces of metadata such as publicly known vulnerabilities, licenses, vendor information, and so on. Our data scientists and security experts are utilizing these data to build the next generation of open source security solutions. To train a machine, you essentially need to provide a relevant and sufficient amount of data to your algorithms so that they can continue to learn from the evolving data as new open source solutions become available and new vulnerabilities are discovered.
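As a rough illustration of an algorithm that can "continue to learn from the evolving data", the sketch below implements a small multinomial naive Bayes text classifier whose counts are updated one labeled example at a time. It is not a description of the models used at Synopsys: the class name, the severity labels, and the example descriptions are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class IncrementalNaiveBayes:
    """Multinomial naive Bayes whose counts can be updated one example at a time."""

    def __init__(self):
        self.label_counts = Counter()            # descriptions seen per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocabulary = set()                  # every word seen so far

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z]+", text.lower())

    def update(self, text, label):
        """Fold one newly labeled description into the model."""
        tokens = self.tokenize(text)
        self.label_counts[label] += 1
        self.word_counts[label].update(tokens)
        self.vocabulary.update(tokens)

    def predict(self, text):
        """Return the most probable label using Laplace-smoothed estimates."""
        tokens = self.tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            score = math.log(doc_count / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy descriptions; real training data would be far larger.
model = IncrementalNaiveBayes()
model.update("remote attacker executes arbitrary code via crafted header", "high")
model.update("remote code execution through crafted request", "high")
model.update("local user can read log file with verbose messages", "low")
model.update("minor information disclosure in local debug log", "low")
print(model.predict("crafted request allows remote code execution"))
```

Because `update` only increments counts, a newly disclosed vulnerability can be added without retraining from scratch, which is the property the paragraph above asks for.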
These constantly evolving data pose several challenges that need to be overcome before we can realize effective AI-driven security solutions. Many of these challenges stem from the fact that open source projects entail large volumes of structured and unstructured data that are difficult to find, manage and analyze. We are applying various Data Mining, Machine Learning and Natural Language Processing solutions to solve some of the most challenging problems related to open source security. Following are some examples of our AI-driven solutions.
Essentially, AI cannot fully automate the process of open source security or open source risk management. Nonetheless, we’ve seen success in experimenting with and implementing various AI-driven solutions that are stepping stones toward a fully automated open source risk management solution.
Do you want to know more about our AI-based approaches or get involved in our research projects? Contact us for more details.