Software Integrity Blog

 

Know your code—and know your stuff!

An open source audit digs into a codebase to see what’s inside. Find out what our audit services team unearthed in the 1,250+ codebases we reviewed in 2019.

Open source audit data: What we found in 1,250 codebases

The nature of open source use makes it hard to track. Open source in a codebase typically results from the collective decisions of individual developers. A developer faced with a gap in functionality might cast about the internet for a “puzzle piece”—an open source component, a code snippet—that fits. The result: A puzzle completed in less time, with less effort, than if your developers had to craft each piece from scratch.

But some developers are more savvy than others about vetting the components they ingest on their company’s behalf. And without proper vetting, those components can embed quality, security, and license issues into the finished project.

Tracking trends in open source use through audits

It’s a challenge to track open source within companies (though software composition analysis makes the job more manageable). It’s even harder to do so at the industry level. But understanding industrywide open source trends is essential to crafting best practices that keep your development organization ahead of the game.

So how can we get a complete picture of what’s going on in the industry? Through data aggregated from open source software audits. In an open source audit, the audit team pries open a codebase to see what’s inside. The results of a single audit are almost always surprising. And when we combine the data from thousands of audits, we see clear patterns in open source use that every development organization should be aware of.

Making sense of our open source audit data

My Black Duck Audit Services team analyzes more code for open source than anyone in the world, across all industries and technologies. Through brute force, for the last four years, we’ve been digging into codebases and aggregating anonymized data on code composition, legal issues, security issues, and other operational factors. Recently, working with the Synopsys Cybersecurity Research Center (CyRC), we published our 2020 Open Source Security and Risk Analysis report, a great bedtime read for anyone in software.

Below are some highlights of what we found in the more than 1,250 codebases we reviewed in open source audits in 2019. But you really should download the report to get more details and a breakout by industry. You may also want to check out our open source in M&A webinar, in which I put the results in the M&A context. “Phil really knows his stuff,” one participant commented at the end. But that’s shooting a compliment at the messenger. The reality is that Synopsys knows its stuff when it comes to open source.

Software composition

  • Virtually every codebase reviewed in an audit last year (99%) included some open source.
  • Most of the code in these codebases, 70%, was open source.
  • The average codebase contained about 445 open source components.

License risk

  • 73% of the codebases had at least one license issue.
  • 67% of the codebases contained components with license conflicts, most frequently GNU GPL conflicts.

Security risk

  • 75% of codebases under audit contained open source components with unpatched vulnerabilities.
  • The percentage of codebases containing high-risk vulnerabilities increased to 49% in 2019.
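For readers curious how summary figures like these can be rolled up from individual audits, here is a minimal sketch. It is not the Black Duck Audit Services methodology: the `AuditRecord` fields, the `summarize` helper, and the four toy records are all hypothetical, and the report may weight or define its percentages differently (for example, this sketch computes the open source share across all lines of code combined rather than averaging per codebase).

```python
from dataclasses import dataclass


@dataclass
class AuditRecord:
    """Hypothetical anonymized summary of one audited codebase."""
    total_lines: int
    open_source_lines: int
    component_count: int
    has_license_conflict: bool
    has_unpatched_vulnerability: bool


def share(records, predicate):
    """Percentage of codebases for which the predicate holds."""
    if not records:
        return 0.0
    return 100.0 * sum(1 for r in records if predicate(r)) / len(records)


def summarize(records):
    """Aggregate per-codebase records into report-style headline numbers."""
    total = sum(r.total_lines for r in records)
    oss = sum(r.open_source_lines for r in records)
    count = len(records)
    return {
        # Codebases containing any open source at all
        "pct_with_open_source": share(records, lambda r: r.open_source_lines > 0),
        # Open source share of all audited code, pooled across codebases
        "pct_code_open_source": 100.0 * oss / total if total else 0.0,
        # Mean number of open source components per codebase
        "avg_components": sum(r.component_count for r in records) / count if count else 0.0,
        "pct_with_license_conflict": share(records, lambda r: r.has_license_conflict),
        "pct_with_unpatched_vuln": share(records, lambda r: r.has_unpatched_vulnerability),
    }


# Toy data for illustration only; these are not figures from the report.
records = [
    AuditRecord(1000, 700, 400, True, True),
    AuditRecord(2000, 1500, 500, False, True),
    AuditRecord(1000, 0, 0, False, False),
    AuditRecord(1000, 600, 300, True, False),
]
summary = summarize(records)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Because each record carries only counts and flags, nothing about an individual codebase's contents needs to leave the audit for the aggregate to be computed, which is the general idea behind combining anonymized audit data.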

Yes, Synopsys, with the CyRC and Black Duck Audit Services team, knows its stuff. After you read the report, you’ll know your open source stuff too!

This post was originally published May 2019 and refreshed June 24, 2020.

 
