Probably the simplest and most straightforward approach to static analysis is the Unix utility grep. Armed with a list of good search strings, grep can reveal quite a lot about a code base. The downside is that grep is rather lo-fi because it doesn’t understand anything about the files it scans: comments, string literals, declarations, and function calls are all just part of a stream of characters to be matched against. Better fidelity requires taking into account the lexical rules that govern the programming language being analyzed. By doing this, a tool can distinguish between a vulnerable function call, `gets(&buf);`, a comment, `/* never ever call gets */`, and an innocent and unrelated identifier, `int begetsNextChild = 0;`. Basic lexical analysis is the approach taken by early static analysis tools, including ITS4, Flawfinder, and RATS, all of which preprocess and tokenize source files (the same first steps a compiler would take) and then match the resulting token stream against a library of vulnerable constructs. Earlier, Matt Bishop and Mike Dilger built a special-purpose lexical analysis tool specifically to identify time-of-check to time-of-use (TOCTOU) flaws.
While lexical analysis tools are certainly a step up from grep, they produce a hefty number of false positives because they make no effort to account for the target code’s semantics. A stream of tokens is better than a stream of characters, but it’s still a long way from understanding how a program will behave when it executes. Although some security defect signatures are so strong that they don’t require semantic interpretation to be identified accurately, most are not so straightforward.
To increase precision, a static analysis tool must leverage more compiler technology. By building an abstract syntax tree (AST) from source code, such a tool can take into account the basic semantics of the program being evaluated.
With ASTs in hand, the next decision is the scope of the analysis. Local analysis examines the program one function at a time and doesn’t consider relationships between functions. Module-level analysis considers one class or compilation unit at a time, so it takes into account relationships between functions in the same module and considers properties that apply to classes, but it doesn’t analyze calls between modules. Global analysis involves analyzing the entire program, so it takes into account all relationships between functions.
The scope of the analysis also determines the amount of context the tool considers. More context is better when it comes to reducing false positives, but considering it can require a huge amount of computation. Researchers have explored many methods for making sense of program semantics. Some are sound, some aren’t; some are built to detect specific classes of bugs, while others are flexible enough to read definitions for what they’re supposed to detect. Let’s review some of the most recent tools:
- BOON applies integer range analysis to determine whether a C program can index an array outside its bounds. While capable of finding many errors that lexical analysis tools would miss, the checker is still imprecise: it ignores statement order, it can’t model interprocedural dependencies, and it ignores pointer aliasing.
- Inspired by Perl’s taint mode, CQual uses type qualifiers to perform a taint analysis, which detects format string vulnerabilities in C programs. CQual requires a programmer to annotate a few variables as either tainted or untainted and then uses type inference rules (along with pre-annotated system libraries) to propagate the qualifiers. Once the qualifiers are propagated, the system can detect format string vulnerabilities by type checking.
- The xg++ tool uses a template-driven compiler extension to attack the problem of finding kernel vulnerabilities in Linux and OpenBSD. It looks for locations where the kernel uses data from an untrusted source without checking it first, methods by which a user can cause the kernel to allocate memory and not free it, and situations in which a user could cause the kernel to deadlock.
- The Eau Claire tool uses a theorem prover to create a general specification-checking framework for C programs. It can help find common security problems like buffer overflows, file access race conditions, and format string bugs. Developers can use specifications to ensure that function implementations behave as expected.
- MOPS takes a model-checking approach to look for violations of temporal safety properties. Developers can model their own safety properties, and some have used the tool to check for privilege management errors, incorrect construction of chroot jails, file access race conditions, and ill-conceived temporary file schemes.
- Splint extends the lint concept into the security realm. By adding annotations, developers can enable the tool to find abstraction violations, unannounced modifications to global variables, and possible use-before-initialization errors. Splint can also reason about minimum and maximum array bounds accesses if it is provided with function pre- and postconditions.
Many static analysis approaches hold promise, but have yet to be directly applied to security. Some of the more noteworthy ones include ESP (a large-scale property verification approach), model checkers such as SLAM and BLAST (which use predicate abstraction to examine program safety properties [11,12]), and FindBugs (a lightweight checker with a good reputation for unearthing common errors in Java programs). Several commercial tool vendors are starting to address the need for static analysis, moving some of the approaches touched on here into the mainstream.
Good static analysis tools must be easy to use, even for non-security people. This means that their results must be understandable to normal developers who might not know much about security, and that the tools educate their users about good programming practice. Another critical feature is the kind of knowledge (the rule set) the tool enforces. The importance of a good rule set can’t be overstated.
In the end, good static checkers can help spot and eradicate common security bugs. This is especially important for languages such as C, for which a very large corpus of rules already exists. Static analysis for security should be applied regularly as part of any modern development process.
- D. Verdon and G. McGraw, “Risk Analysis in Software Design,” IEEE Security & Privacy, vol. 2, no. 5, 2004, pp. 79–84.
- G. McGraw, “Software Security,” IEEE Security & Privacy, vol. 2, no. 2, 2004, pp. 80–83.
- M. Bishop and M. Dilger, “Checking for Race Conditions in File Accesses,” Computing Systems, vol. 9, no. 2, 1996, pp. 131–152.
- D. Wagner et al., “A First Step Towards Automated Detection of Buffer Overrun Vulnerabilities,” Proc. 7th Network and Distributed System Security Symp. (NDSS 2000), Internet Soc., 2000, pp. 3–17.
- J. Foster, T. Terauchi, and A. Aiken, “Flow-Sensitive Type Qualifiers,” Proc. ACM Conf. Programming Language Design and Implementation (PLDI 2002), ACM Press, 2002, pp. 1–12.
- K. Ashcraft and D. Engler, “Using Programmer-Written Compiler Extensions to Catch Security Holes,” Proc. IEEE Symp. Security and Privacy, IEEE CS Press, 2002, pp. 131–147.
- B. Chess, “Improving Computer Security Using Extended Static Checking,” Proc. IEEE Symp. Security and Privacy, IEEE CS Press, 2002, pp. 118–130.
- H. Chen and D. Wagner, “MOPS: An Infrastructure for Examining Security Properties of Software,” Proc. 9th ACM Conf. Computer and Communications Security (CCS 2002), ACM Press, 2002, pp. 235–244.
- D. Larochelle and D. Evans, “Statically Detecting Likely Buffer Overflow Vulnerabilities,” Proc. 10th USENIX Security Symp. (USENIX ’01), USENIX Assoc., 2001, pp. 177–189.
- M. Das, S. Lerner, and M. Seigle, “ESP: Path-Sensitive Program Verification in Polynomial Time,” Proc. ACM Conf. Programming Language Design and Implementation (PLDI 2002), ACM Press, 2002, pp. 57–68.
- T. Ball and S.K. Rajamani, “Automatically Validating Temporal Safety Properties of Interfaces,” Proc. 8th Int’l SPIN Workshop on Model Checking of Software, LNCS 2057, Springer-Verlag, 2001, pp. 103–122.
- T.A. Henzinger et al., “Software Verification with BLAST,” Proc. 10th Int’l Workshop Model Checking of Software, LNCS 2648, Springer-Verlag, 2003, pp. 235–239.
- D. Hovemeyer and W. Pugh, “Finding Bugs is Easy,” to appear in Companion of the 19th Ann. ACM Conf. Object-Oriented Programming, Systems, Languages, and Applications, ACM Press, 2004.
Co-authored by Brian Chess, chief scientist at Fortify Software. His technical interests include static analysis, defect modeling, and Boolean satisfiability. He received a Ph.D. in computer engineering from the University of California, Santa Cruz. Contact him at email@example.com.