Gartner and static analysis

James McGovern recently wrote a post on Gartner's static analysis (SA) report. Among other things, he lamented the lack of actionable guidance within the report. That doesn't shock me; I can't say I expect implementation guidance from Gartner.

I can help James and the community out by offering some of that guidance myself, and I'll try to do so in a tool-independent way. This topic deserves more than a "blog entry" format (even admitting our propensity for longer entries ;-)), so if people want more information on the topic, they can ping me.

In essence, you want to know the following about static analysis tools:

  1. Who is going to run it (how will owners shepherd it)?
  2. How much effort is it going to take to run (what budget will I need to support it)?
  3. When do I run it, and how often (where does it fit within my SDL)?
  4. What is it going to find (how will the results enter and affect my organization)?
  5. What won't it find (how does it play with the rest of the assurance ecosystem)?

Market Adoption and Making Your Choice

Gartner seems underwhelmed by Microsoft's ability to influence the commercial static analysis market. We have always believed these technologies would make great features of a compiler/IDE, so we're also disappointed: Microsoft has great technologies but doesn't exploit them commercially. I still like their freely available FxCop, though this opinion isn't popular. Perhaps that's because I see a lot of value in FxCop as a unit testing framework that enables security testing, while other people compare it to glossy commercial tools like Fortify and Ounce run by security analysts in a very hands-off fashion. Who's considering the tool really does make a huge difference in how it will fare.
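To make the unit-testing angle concrete, here's a minimal sketch of the idea in Python: wrap an FxCop run in a test case so that high-severity findings fail the build. The FxCopCmd switches and report structure shown are from memory and may differ by version, and the paths are placeholders, so treat the whole thing as illustrative rather than a recipe.

    import subprocess
    import unittest
    import xml.etree.ElementTree as ET

    # Paths are placeholders; point them at your own FxCop install and build output.
    FXCOP_CMD = r"C:\Program Files\Microsoft FxCop\FxCopCmd.exe"
    ASSEMBLY = r"build\MyApp.dll"
    REPORT = r"build\fxcop-report.xml"

    class StaticAnalysisGate(unittest.TestCase):
        def test_no_high_severity_findings(self):
            # Run FxCop over the assembly and write an XML report.
            # (Switch names are assumptions; check FxCopCmd's help for your version.)
            subprocess.run([FXCOP_CMD, f"/file:{ASSEMBLY}", f"/out:{REPORT}"], check=True)
            tree = ET.parse(REPORT)
            # Treat any Error-or-worse issue as a failing test.
            bad = [issue for issue in tree.iter("Issue")
                   if issue.get("Level") in ("CriticalError", "Error")]
            self.assertEqual([], bad, f"{len(bad)} high-severity findings; see {REPORT}")

    if __name__ == "__main__":
        unittest.main()

The same pattern works with any analyzer that offers a command-line interface and machine-readable output.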

From my perspective, the most successful SA tool vendors have made answering the “who should adopt the tool” question more difficult as they’ve waffled on licensing models. By the seat? By the core? By the KLoC? I’ve gotten complaints from prospective tool buyers indicating that a leading tool vendor’s price was 10X another vendor’s. Before you ask “Which one?”: I’ve heard specific examples in opposite directions!

Certain tools will look better to different potential buyers. Experience has shown that developers respond positively to Coverity's C/C++ results. Application security groups like Fortify's and Ounce's products, and several years ago I saw QA managers in two very different firms go ga-ga over Klocwork's tool. I attribute these impressions not to whether a vendor focuses on quality or on security, but to how closely the vendor's conception of how roles interact with its tool (throughout the vulnerability management and software development life cycles) matches the organization's own structure. Until a few years ago, I don't believe SA vendors had fully conceived the "developer", "analyst", and "security manager" roles in their products. Ounce was the one exception: they've always staunchly believed themselves to be principally a security analyst's tool. Some vendors attempted to unify the roles into a single interface while others attempted to split the product into different configurations for each. This created confusion and friction between some adopting organizations and the tool vendor's license model.

In the end you want SA tools to reach as many developers as possible and to deliver low-latency scan results as early in the software's life cycle as possible. However, having successfully deployed tools in a variety of ways, I'm convinced central deployment is the way to go: Application Security should run the tool. This helps manage rule/configuration updates, as well as central measurement collection, risk management, and issue tracking. How do we get low latency and early life-cycle coverage from a central deployment? Treat the tool as a service. More on that later.

Notable exceptions to the central deployment model include business units possessing one or more of the following characteristics:

  • Acquisitions of previously successful, culturally-independent companies
  • High-quality product development teams
  • Fundamentally different SDL, development platform, language, and toolkits

For in-unit deployment, one can "run locally" but must "report centrally" to support security goals around measurement and risk management. And let's be clear: if a business unit has a wicked-good continuous integration/build-management group, then by all means that group should be tapped to integrate with the SA service. Synopsys and its clients have had success integrating both the front end (code submission) and the back end (bug/issue tracking) with Fortify and a variety of open source build/CI/bug-tracking products. This maintains a central model but reduces latency dramatically.
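To show what "treat the tool as a service" can look like on the front end, here's a hypothetical CI hook in Python. The scan-service URL, endpoints, and response fields are invented for illustration (they're not any particular vendor's API); the point is simply that a build job can hand its code to a central service and pick up results without a human in the loop.

    import time
    import requests  # third-party; pip install requests

    # Hypothetical central service run by the Application Security group.
    SCAN_SERVICE = "https://appsec.example.com/api/scans"

    def submit_build(app_name: str, archive_path: str) -> str:
        """Upload a source/build archive for scanning and return the scan id."""
        with open(archive_path, "rb") as fh:
            resp = requests.post(SCAN_SERVICE,
                                 data={"application": app_name},
                                 files={"archive": fh},
                                 timeout=60)
        resp.raise_for_status()
        return resp.json()["scan_id"]

    def wait_for_results(scan_id: str, poll_seconds: int = 60) -> dict:
        """Poll the central service until the scan finishes."""
        while True:
            resp = requests.get(f"{SCAN_SERVICE}/{scan_id}", timeout=30)
            resp.raise_for_status()
            status = resp.json()
            if status["state"] in ("complete", "failed"):
                return status
            time.sleep(poll_seconds)

    if __name__ == "__main__":
        scan_id = submit_build("billing-service", "build/billing-service-src.zip")
        results = wait_for_results(scan_id)
        print(f"Scan {scan_id}: {results['state']}, "
              f"{len(results.get('findings', []))} findings")

Run after every build by the CI group, something like this keeps latency low while the scanner, its rule packs, and its results stay central.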

How Much Effort

I'm sorely disappointed in the honesty of tool vendor sales folk regarding level of effort. One of my first analogies for SA tools is this: did your company buy Mercury products for load and performance testing? Think of your SA tool like Mercury: you'll need at least as much money to deploy and implement the tool organization-wide as the licenses cost you. Sorry.

Organizations with more than 200-300 developers should plan on the following:

  • 3-4 man-weeks for initial tuning and a pilot implementation
  • 4-16 man-hours per large application to integrate with the static analysis tool and assure appropriate configuration
  • 4-8 man-hours per 50-100 KLoC (language- and complexity-dependent) to triage results from a new-application scan
  • 1 person per year as tool shepherd (application integration, custom-rule creation, rule-pack maintenance and release, support)

Organizations get into trouble when their SA tool gains broad acceptance before they've learned to manage the submission and results triage/documentation process effectively. My data indicates code submission can burn 8-24 man-hours per application if security groups don't "remember" (or automate) the submission and scan setup process. Security groups doing scans centrally and hand-writing code review results can waste another 16 hours per application documenting their findings.

…think about it: if your organization allots a week per tool-assisted code review, then burns 24 hours getting the tool's first result set and another 16 hours documenting findings, there's precious little time left to actually review the code! Since I believe 4-8 hours per 50-100 KLoC is necessary just to triage, you can imagine how deep I think such code reviews are 😛 Decreasing the cost of submission and reporting pays dramatic dividends in review depth, and that is key to scaling SA efforts. Tool adopters will feel overwhelmed if they don't plan to solve the submission and reporting problems before a wider roll-out.
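One way to claw back those 16 hours of hand-written documentation is to generate the findings report and bug-tracker entries straight from the tool's output. The sketch below continues the hypothetical service from earlier: the endpoints, field names, and "confirmed" triage flag are invented, but most commercial tools expose enough through export formats or APIs to support something similar.

    import csv
    import requests  # third-party; pip install requests

    # Hypothetical endpoints; substitute your scan service and bug tracker.
    SCAN_SERVICE = "https://appsec.example.com/api/scans"
    BUG_TRACKER = "https://bugs.example.com/api/issues"

    def export_findings(scan_id: str, csv_path: str) -> None:
        """File a bug per confirmed finding and write a summary CSV."""
        resp = requests.get(f"{SCAN_SERVICE}/{scan_id}/findings", timeout=30)
        resp.raise_for_status()
        confirmed = [f for f in resp.json() if f.get("triage") == "confirmed"]

        with open(csv_path, "w", newline="") as out:
            writer = csv.DictWriter(out, fieldnames=["cwe", "severity", "file", "line", "summary"])
            writer.writeheader()
            for f in confirmed:
                writer.writerow({k: f.get(k, "") for k in writer.fieldnames})
                # One tracker issue per confirmed finding, tagged back to the scan.
                requests.post(BUG_TRACKER,
                              json={"title": f"[{f.get('cwe', 'CWE-?')}] {f.get('summary', '')}",
                                    "description": f"{f.get('file')}:{f.get('line')} (scan {scan_id})",
                                    "severity": f.get("severity", "medium")},
                              timeout=30).raise_for_status()

Even a crude version of this removes most of the re-typing and puts findings into the same tracker developers already watch.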

How Often

As often as possible. If you've got a Continuous Integration (CI) initiative, engage those folk. If you deploy your tool purely centrally, plan to scan each high-risk application at least once per release. And no, to my knowledge there isn't a good answer to the "can these tools do incremental (partial) scans?" question. You should always keep old results around, though, as they'll (along with other measures) help you understand whether more or fewer vulnerabilities are being caught during each scan. Remember, though: every time you add a rule to the scan, your vulnerability-finding bar moves and your metrics may be thrown off.
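A cheap way to keep trend metrics honest despite a changing rule pack is to key every scan's counts by the rule-pack version and compare only like with like. The record layout and numbers below are invented purely to illustrate the bookkeeping.

    from collections import defaultdict

    # Each record: (application, rule-pack version, scan date, findings, KLoC)
    scans = [
        ("billing", "rules-1.2", "2008-01-15", 120, 250),
        ("billing", "rules-1.2", "2008-04-10",  95, 260),
        ("billing", "rules-1.3", "2008-07-02", 140, 265),  # new rules start a new baseline
    ]

    trends = defaultdict(list)
    for app, rulepack, date, findings, kloc in scans:
        # Normalize per KLoC and bucket by rule pack so a rule change starts a
        # fresh trend line instead of polluting the old one.
        trends[(app, rulepack)].append((date, findings / kloc))

    for (app, rulepack), points in sorted(trends.items()):
        print(app, rulepack, ["%s: %.2f findings/KLoC" % p for p in points])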

Ounce, Fortify, and Klocwork seem pretty good at determining whether they've found a particular issue in a previous scan, including situations where code blocks have moved around a bit. That means running regression scans to determine whether an issue has been fixed is viable. I can't comment on Coverity's capabilities here.
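The vendors do this matching with far more robust fingerprints than anything shown here (they track paths and program structure, not just text), but the idea is easy to illustrate. The naive sketch below keys findings on rule, file, and a snippet of the offending line so a finding survives small code moves, then classifies each scan-over-scan finding as fixed, new, or recurring.

    def fingerprint(finding: dict) -> tuple:
        # Deliberately ignores line numbers so a finding survives small moves.
        return (finding["rule"], finding["file"], finding["snippet"].strip())

    def diff_scans(previous: list, current: list) -> dict:
        prev = {fingerprint(f) for f in previous}
        curr = {fingerprint(f) for f in current}
        return {
            "fixed": prev - curr,       # present last scan, gone now
            "new": curr - prev,         # introduced since the last scan
            "recurring": prev & curr,   # still open
        }

    last_scan = [{"rule": "SQL Injection", "file": "dao.java", "snippet": "stmt.execute(q)"}]
    this_scan = [{"rule": "SQL Injection", "file": "dao.java", "snippet": "stmt.execute(q)"},
                 {"rule": "XSS", "file": "view.jsp", "snippet": "out.print(name)"}]
    print({k: len(v) for k, v in diff_scans(last_scan, this_scan).items()})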

What’s it going to find

I'm impressed that Fortify has published its vulnerability taxonomy and a bit shocked that others haven't produced as public a resource. I'm also excited that vendors have agreed to report findings with CWE labels. Still, I'll say what I always say on this topic: test the tool on your own code. You simply don't know what it will find until you throw all your idiosyncratic build, language, toolkit, and platform nonsense at the particular tool you've selected.

What won’t it find

I'm continually shocked by how much one can find by running tool Y after tool X on the same piece of code (replace X and Y as you see fit). As I've written time and time again, the corner cases and exceptions missed by each engine are baffling. Let me put things in perspective: three of my clients have reported that their SA tool deployments found 17%, 33%, and 50%, respectively, of the total number of issues found by assessment efforts. If those percentages hold for your organization, then within the realm of coding bugs your job is at best 50% done once you've run the tool and triaged its results.
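If you want to see where your own tools stand, the overlap is worth measuring directly. This sketch is a simplification of my own devising (the crude key of CWE, file, and sink is not how any vendor correlates results): given findings from two sources (say, tool X and tool Y, or a tool and a manual assessment), it reports each source's share of the combined issue set.

    def key(finding: dict) -> tuple:
        # Crude correlation key; real correlation needs more context than this.
        return (finding["cwe"], finding["file"], finding["sink"])

    def coverage(findings_a: list, findings_b: list) -> None:
        a = {key(f) for f in findings_a}
        b = {key(f) for f in findings_b}
        union = a | b
        print(f"Source A: {len(a)}/{len(union)} ({len(a) / len(union):.0%}) of combined issues")
        print(f"Source B: {len(b)}/{len(union)} ({len(b) / len(union):.0%}) of combined issues")
        print(f"Found by both: {len(a & b)}")

    tool_x = [{"cwe": "CWE-89", "file": "dao.java", "sink": "execute"}]
    tool_y = [{"cwe": "CWE-89", "file": "dao.java", "sink": "execute"},
              {"cwe": "CWE-79", "file": "view.jsp", "sink": "print"}]
    coverage(tool_x, tool_y)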

Considering that Microsoft and Synopsys believe coding "bugs" account for only about 50% of the vulnerability space (the rest being design-level flaws), that means the tool covered at best 50% of that 50%, roughly 25% of the total, leaving you unaware of some 75% of your application's vulnerabilities at the end of triage. As I always say, "use the tool to facilitate code review and increase a reviewer's understanding of the code, not as a code reviewer that finds bugs automatically."

Good luck out there, I’d love to hear about your particular experiences.
-jOHN