Posted by David Benas on December 22, 2016
As software becomes more prevalent, a common assumption is that it is also becoming more feature-rich and reliable. However, most people in the software industry wouldn’t hesitate to point out how difficult it actually is to achieve fully working software.
In fact, when calculating software risk, a key assumption is that the software will at some point fail. For any software that is considered somewhat complex, the probability of failure is often a point of contention. Namely, what is the right methodology for assessing the risk of software that has gone through a vetted QA process and is, for all intents and purposes, considered “correct”?
From a general insurance point of view, risk falls into two distinct categories: pure risk and speculative risk. Pure risk is just that: risk whose failure condition always ends in a loss (e.g., the risk covered by homeowners insurance). Software also falls into this category. Because it is assumed to work, a failure can only result in a loss.
Speculative risk is more complex and can result in either loss or profit (e.g., stock market trading). In either case, the premium assessed generally scales with the probability of the failure condition. Calculating that probability is a complex process that involves a large amount of prior data and careful mathematics. At the end of the day, the premium for risk insurance is a function of the failure probability, the maximum financial loss, and the profit on whatever is being insured.
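As a rough illustration of that last point, the sketch below treats a premium as the expected loss plus a loading proportional to the insured profit. The formula itself, the function name estimate_premium, and the loading_rate parameter are all assumptions made for illustration. Real actuarial pricing is considerably more involved.

```python
def estimate_premium(failure_probability: float,
                     max_financial_loss: float,
                     insured_profit: float,
                     loading_rate: float = 0.05) -> float:
    """Hypothetical premium: expected loss plus a loading on insured profit.

    The formula and loading_rate are illustrative assumptions, not an
    actuarial standard.
    """
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    # Expected loss: chance of the failure condition times the worst-case cost.
    expected_loss = failure_probability * max_financial_loss
    # Loading: an assumed margin that scales with the profit being insured.
    return expected_loss + loading_rate * insured_profit
```

Under these assumptions, a 2% failure probability, a maximum loss of 1,000,000, and an insured profit of 200,000 would give a premium of roughly 30,000.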
Software risk calculations are often glossed over or disregarded completely. In part, this is because black box software testing is inherently incomplete. No matter how exhaustive the testing is, there is always a possibility that something wasn’t taken into account or was missed by a sleep-deprived QA engineer.
In traditional insurance, the massive amount of available data, together with the human tendency to behave predictably within those models, makes risk calculations highly reliable. Knowledge of software failure conditions, however, is limited to what software developers have experienced or submitted in bug reports. Additionally, the history of failing software covers only a brief period of time, especially in comparison to other types of insurance risk tables.
Software is also quite distinct. Even if two pieces of software have the same functionality, it is reasonable to assume that they differ substantially in design. Applying the same black box testing methodology to different software therefore doesn’t make much sense. As such, the set of black box tests that can be run against software is limited by developer experience and creativity, and often is (and should be) unique to each piece of software.
Software, for the most part, is broken in some way. The further you venture beyond simple “Hello world!” programs, the more difficult (if not impossible) it becomes to achieve 100% unbreakable, bug-free software. Treat every piece of software (intensively QA tested or not) as if it could one day malfunction or break.
Regardless of whether the code itself is broken, the environment, timing windows, etc. could all be a root cause of something going wrong. While the addition of black box testing and inputting known-bad parameters increases the reliability of software, it can never be perfect. The degree of imperfection that exists likely increases proportionally to the relative complexity of the project. All software that seems completely correct, to that end, is probably not.
To accurately assess the risk of software given these constraints, we must use a more unified model. The model should take into consideration all the aforementioned shortcomings of current software risk analysis. It must also draw on the experience the insurance industry has in its own risk assessments.
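The insurance industry considers three main elements when assessing premiums and considering risk: the probability of a failure event, the maximum financial loss, and the profit on what is being insured.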
Maximum financial loss and profit are two distinct values that most companies, given some time and cooperation between departments, could establish. Narrowing down the probability of a failure event depends on the testing methodology.
In software, the probability of failure could be estimated as the number of failed black box tests out of the total number of black box tests conducted. On its own, however, this measure ignores the shortcomings of black box testing discussed above. Since all software is bound to have undiscovered bugs that testing doesn’t identify, a level of confidence in the testing itself should be taken into account. That is, given a set of tests, what percentage of the set of all bugs that could exist does it cover?
A time-boxed penetration test cannot find all security vulnerabilities in a piece of software. Similarly, QA and black box testing of the software cannot find all bugs or unintended functionality that may someday result in financial loss.
The risk of software that has gone through vetted assurance processes and seems bug-free is wholly dependent on the confidence in the testing process. The confidence in this process should never be 100%. It can be as high as 99% if the company’s own history provides precedent for such a figure.
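One way to picture this is to blend the observed black box failure rate with a pessimistic assumption about whatever the tests did not cover. The sketch below is only an illustration: the weighting formula, the function names, and the uncovered_failure_rate parameter are assumptions, and only the 99% ceiling and the failed-over-total ratio come from the discussion above.

```python
MAX_CONFIDENCE = 0.99  # ceiling from the discussion above: never 100%


def observed_failure_rate(failed_tests: int, total_tests: int) -> float:
    """Fraction of black box tests that failed."""
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    if not 0 <= failed_tests <= total_tests:
        raise ValueError("failed_tests must be between 0 and total_tests")
    return failed_tests / total_tests


def adjusted_failure_probability(failed_tests: int,
                                 total_tests: int,
                                 confidence: float,
                                 uncovered_failure_rate: float = 1.0) -> float:
    """Hypothetical blend of tested and untested behavior.

    confidence is the assumed share of possible bugs the tests cover.
    uncovered_failure_rate is an assumed (pessimistic by default)
    failure rate for everything the tests do not cover.
    """
    if confidence < 0.0:
        raise ValueError("confidence must not be negative")
    # Cap the confidence so that it can never reach 100%.
    confidence = min(confidence, MAX_CONFIDENCE)
    observed = observed_failure_rate(failed_tests, total_tests)
    return confidence * observed + (1.0 - confidence) * uncovered_failure_rate
```

Even with zero failing tests and a claimed confidence of 100%, this sketch reports a failure probability of about 1%, which reflects the point that software that seems completely correct probably is not.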
Even mildly complex software has vulnerabilities, bugs, and/or flaws. Black box testing is not sufficient to call a piece of software “correct.” To accurately assess the risk of software, the confidence in the testing process must be taken into account. The higher the justified confidence in the testing process, the lower the assessed risk of the software at hand.