This is the biggest difference between an audit and a scan. An automated scan tool, even one like Black Duck® (which, unlike most, can discover open source snippets), is still just that: an automated tool. To create the most accurate SBOM, experts must review the results of any automated scan. Automated tools can’t always identify every open source component, often because open source components have additional open source components as dependencies. Automated tools take their best guess, but human experts can apply multiple techniques and judgment to accurately identify a component and to find components that the tools are not designed to discover.
String searching algorithms look for strings within source code text. This is a key part of an audit because it can discover easy-to-miss pieces of open source and essential information like copyright references in the codebase. An example of a string search might be, “Search for a string that includes the word ‘copyright’ within five words of the word ‘incorporated’ but does not include the company name.” This capability uncovers copyrights in the software that may not belong to the developer. It also discovers references such as URLs that can identify code pulled from blogs, custom licenses, or code that requires a commercial license. Further, auditors may find evidence of code misuse by searching for phrases like “stolen from” or “taken from.”
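The example search above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular audit tool works: the company name `Example Corp`, the function names, and the line-by-line matching strategy are all assumptions made for the sketch.

```python
import re

# Hypothetical company name to exclude; an assumption for illustration only.
COMPANY_NAME = "Example Corp"

WORD = re.compile(r"[A-Za-z0-9']+")
# Phrases that may hint at copied code, taken from the examples in the text.
SUSPICIOUS = re.compile(r"\b(?:stolen|taken)\s+from\b", re.IGNORECASE)


def find_third_party_copyrights(text, company=COMPANY_NAME, window=5):
    """Return (line number, line) pairs where 'copyright' appears within
    `window` words of 'incorporated' and the company name is absent."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if company.lower() in line.lower():
            continue  # the copyright belongs to the developer's own company
        words = [w.lower() for w in WORD.findall(line)]
        copyright_at = [i for i, w in enumerate(words) if w == "copyright"]
        incorporated_at = [i for i, w in enumerate(words) if w == "incorporated"]
        if any(abs(c - i) <= window for c in copyright_at for i in incorporated_at):
            hits.append((lineno, line.strip()))
    return hits


def find_suspicious_phrases(text):
    """Return (line number, line) pairs containing 'stolen from' or 'taken from'."""
    return [
        (lineno, line.strip())
        for lineno, line in enumerate(text.splitlines(), start=1)
        if SUSPICIOUS.search(line)
    ]
```

Running both functions over each file in a codebase yields a list of lines for a human reviewer to examine. The matches are leads, not conclusions: an auditor still has to decide what each one means.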
Snippets are small pieces of source code that have been copied from other works. A snippet of open source software can easily find its way into an organization’s proprietary code. For example, a developer may find a useful function in an open source program and paste it into their own program. But because they did not use the entire component, only the most sophisticated automated tools can detect it. And even then, an expert eye is required to identify the component from which the code was copied. A small snippet of code may not create a security vulnerability, but it still carries license compliance obligations. Copyleft licenses in particular can be problematic and put proprietary IP at risk.
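To give a rough sense of why snippet detection is harder than matching whole components, here is a simplified sketch of one common general technique: hashing overlapping runs of tokens and measuring how many of a known snippet's hashes reappear in another file. It is not a description of how Black Duck or any other tool works; the function names, the tokenizer, and the window size `k=5` are assumptions chosen for illustration.

```python
import hashlib
import re

# Crude tokenizer: identifiers, integers, and single punctuation characters.
# Whitespace is ignored, so reindented copies still match.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def fingerprints(source, k=5):
    """Hash every run of k consecutive tokens in the source text."""
    tokens = TOKEN.findall(source)
    return {
        hashlib.sha1(" ".join(tokens[i:i + k]).encode("utf-8")).hexdigest()
        for i in range(len(tokens) - k + 1)
    }


def containment(snippet_source, candidate_source, k=5):
    """Fraction of the known snippet's fingerprints found in the candidate."""
    known = fingerprints(snippet_source, k)
    if not known:
        return 0.0
    return len(known & fingerprints(candidate_source, k)) / len(known)
```

A high containment score suggests the candidate file includes a copy of the known snippet, even when the copy is surrounded by unrelated code. A score like this only flags a likely match. It does not say which component the code originally came from or which license applies, which is where expert review comes in.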