One of the biggest challenges when preparing for an open source audit is to determine the set of files to scan.
The short answer is that you should scan every software asset required to build your applications. But how do you identify and locate these files?
An application is typically created from source code and third-party libraries. We recommend analyzing all of these artifacts in a software audit, because all of them are likely to include open source components.
In-house developers who write source code often reuse code they find on the internet, whether in the form of snippets or whole files. But more importantly, applications often use third-party libraries or even entire frameworks that simplify software development. These third-party frameworks and libraries are often released as open source. Even when they’re not, they almost always contain OSS components.
In most cases, the source code is easily accessible. Typically, repositories are handled by a software configuration management (SCM) tool, and all developers have access to them.
But difficulties may arise when you try to locate the third-party libraries, also called dependencies. These are typically managed not by the SCM tool but by the package manager, which is often integrated into a build automation or CI/CD tool.
Some of the most popular build automation and CI/CD tools are GNU Make, MSBuild, Ant, Maven, Gradle, Jenkins, and Bamboo. But there are many more, and all work a little differently. On top of that, organizations use these tools in different ways, leading to a wide variety of scenarios.
In most cases, only a release engineer or a hands-on software architect who has an in-depth understanding of the build process and tools will be able to provide all the third-party artifacts needed for your software audit.
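Before that conversation, a quick inventory of build files can show which tools are in play and where dependencies are likely declared. The Python sketch below is only an illustration, not part of any audit tooling: the function name `find_manifests` and the file-name mapping are assumptions, and the mapping covers only conventional default file names for some of the build tools named above. It lists declaration files, not the third-party binaries themselves, which may still need to be resolved by the build.

```python
import os
from pathlib import Path

# Illustrative, non-exhaustive mapping of conventional build file names
# to the tool that usually reads them (an assumption; projects may differ).
MANIFEST_NAMES = {
    "Makefile": "GNU Make",
    "build.xml": "Ant",
    "pom.xml": "Maven",
    "build.gradle": "Gradle",
}
# MSBuild project files are recognized by extension rather than by name.
MANIFEST_SUFFIXES = {".csproj": "MSBuild"}


def find_manifests(root):
    """Return sorted (relative path, tool) pairs for build files under root."""
    root = Path(root)
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into version-control metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            tool = MANIFEST_NAMES.get(name) or MANIFEST_SUFFIXES.get(Path(name).suffix)
            if tool:
                rel = (Path(dirpath) / name).relative_to(root).as_posix()
                hits.append((rel, tool))
    return sorted(hits)
```

Calling `find_manifests("path/to/checkout")` on a local checkout returns a list that someone familiar with the build can then confirm or correct.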
Note that you can exclude some source files and libraries from the scope of an open source software audit. For example, you may not have to analyze files used as development or testing tools. That’s because they typically don’t ship with the application itself and don’t affect open source risk.
We also recommend excluding duplicated folders. Duplication will only inflate the cost of the software audit without adding value.
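One possible way to spot duplicated folders before handing over the code is to compare content hashes. The Python sketch below is a minimal illustration under stated assumptions: the helper names `folder_digest` and `find_duplicate_folders` are hypothetical, two folders count as duplicates only when they contain exactly the same relative file paths with byte-identical contents, and files are read fully into memory, which may be too slow for very large trees.

```python
import hashlib
import os
from collections import defaultdict
from pathlib import Path


def folder_digest(folder):
    """Hash the relative paths and contents of all files below folder."""
    folder = Path(folder)
    digest = hashlib.sha256()
    # Sort relative paths so the digest does not depend on traversal order.
    rel_paths = sorted(
        p.relative_to(folder).as_posix() for p in folder.rglob("*") if p.is_file()
    )
    for rel in rel_paths:
        digest.update(rel.encode("utf-8"))
        digest.update(b"\0")
        digest.update((folder / rel).read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def find_duplicate_folders(root):
    """Return groups of folders under root whose file trees are identical."""
    root = Path(root)
    groups = defaultdict(list)
    for dirpath, _dirnames, _filenames in os.walk(root):
        folder = Path(dirpath)
        if folder == root:
            continue
        # Folders without any files would all hash alike, so skip them.
        if not any(p.is_file() for p in folder.rglob("*")):
            continue
        groups[folder_digest(folder)].append(folder.relative_to(root).as_posix())
    # Subfolders of duplicated folders are reported as separate groups too.
    return sorted(sorted(paths) for paths in groups.values() if len(paths) > 1)
```

Each returned group lists folders with identical contents; keeping one folder per group and excluding the rest keeps the scan set smaller without losing any files.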
In an M&A scenario, it is extremely helpful for the target to involve someone with knowledge of the architecture and the build. If “deal awareness” is a concern, a cover story may help, as there are many reasons that one might want to extract the code. Perhaps a customer requires a code escrow or the board has asked a third party to look at the code.
Our team of consultants has a great deal of experience and can help you figure out which files to provide before your open source software audit. We encourage customers to reach out to us with any questions.
Emmanuel Tournier is a senior manager in the Black Duck audit services group at Synopsys Software Integrity Group. He started working as an open source consultant with Black Duck audit services in 2011 and has been running the open source team since 2016. He came to Black Duck Software (now Synopsys) after 12 years working as a software developer and project manager in the defense and aerospace industry and IT services.