Sorry, not available in this language yet

Detecting Spectre vulnerability exploits with static analysis

Mar 21, 2018 / 8 min read

Table of Contents

Background
Spectre
Using static analysis to detect Spectre
Mitigation

In the last few months, Spectre (CVE-2017-5753 and CVE-2017-5715) has emerged as a new kind of vulnerability. In the interest of helping the development community actively defend against these exploits, the Synopsys Software Integrity Group is releasing a checker that can identify code patterns that are vulnerable to the Spectre attack. We examined what kinds of code Spectre can exploit and how static analysis can help detect them. In this article, we’ll discuss some real-world examples we found and share techniques for mitigating the effects of Spectre.

Background

First, let’s take a brief look at a couple of concepts necessary to understand the Spectre vulnerability. Readers familiar with the details of the Spectre vulnerability (branch prediction, speculative execution, and cache timing attacks) can skip to the “Using static analysis” section below.

Branch prediction

When the processor comes across a conditional branch instruction, it tries to predict which branch will be executed. Its prediction is based on many factors, such as the range of values used in the condition, the history of previously executed instructions, and the number of times the branch was executed in the past.

Speculative execution

As an optimization, processors speculatively execute instructions ahead of time, assuming that the predicted branch will be taken before actually evaluating the condition. If at a later point the processor realizes that the branch was not taken, it will simply discard the results and carry on with the correct branch instruction sequences. Consider the following code snippet:


if (x < array1_size) {
  val = array1[x];
}

One might assume that val = array1[x] gets executed only if x is less than array1_size. However, if the value of array1_size is not in the cache, the processor must wait until the value is read from the main memory, which takes significantly longer. Instead of being idle, the processor may predict that the condition is true and speculatively execute the body of the if-statement, even if x is outside array1’s bounds. The Spectre vulnerability relies on the fact that speculative execution can affect the cache state.

Cache timing attacks

Cache timing attacks are used to determine whether a value is in the cache by measuring how long it takes to read from a memory address. If the value is in the cache, the read will be fast, and if not, it will be slow. An attacker can populate the cache with their own data by reading a very large array, and then run the victim’s code that accesses sensitive information. To load its own data into the cache, the victim’s code must evict some of the attacker’s data. The attacker can now run their own code again and, by timing memory accesses, infer what memory locations the victim has accessed. Prime and probe and flush and reload are well-known cache timing attacks used as data extraction mechanisms in the Spectre attacks.

Spectre

Spectre is an information leakage vulnerability that uses the speculative execution and data caching mechanisms of the processor to extract sensitive data from the victim process. Recall that speculative execution based on branch prediction can cause updates to the data cache. If the predicted branch is not taken, the processor discards the output from the registers as well as other state information but does not discard the updates to the cache. An attacker can exploit this by mistraining the processor to take the wrong branch and then use the fact that the result of that speculative execution stays in the cache to extract the sensitive data. There are two variants of this attack technique: variant 1, bounds check bypass (CVE-2017-5753), and variant 2, branch target injection (CVE-2017-5715). In this post, we’ll focus on variant 1 because it’s easier to exploit. It relies on a more commonly occurring pattern in user- and system-level C/C++ programs compared to variant 2.

In the following code sample from the Spectre white paper, both array1 and array2 are arrays of bytes:

 
  if (x < array1_size) {
  v = array2[array1[x]*256]             
}

To be able to exploit this code, the attacker must be able to control the value of x. The attacker must also manipulate the cache state so that the following conditions hold:

array1_size is not in the cache
array1’s address is in the cache
the location of the secret data to be extracted is in the cache
array2’s address is not in the cache

Let’s assume that the attacker has managed to mistrain the branch prediction algorithm to expect the if-condition to be true. The attacker then provides a maliciously chosen value for x that is outside the bounds of array1 such that array1[x] points to sensitive data. Since array1_size is not cached, the processor must wait for array1_size to be read from the main memory; meanwhile, it will speculatively execute the array2[array1[x]*256] memory read. Since array2 also is not in the cache, the read will update a cache location that depends on the array1[x]*256 value. Once the processor finishes the load of array1_size, it will roll back the speculative execution, but the change to the cache state will remain. The attacker can use cache timing attacks to observe which cache location changed during speculative execution and thus infer the value of array1[x]. By varying the value of x, the attacker can thus infer the value at any memory location in the victim process.

Using static analysis to detect Spectre

Next, let’s see how static analysis can help detect code vulnerable to the Spectre attack. To help mitigate the effects of the Spectre attack, we have extended the Coverity static analysis tool to identify code instances that may be vulnerable to the bounds check bypass variant of Spectre. Our goal is to provide development teams with a checker that efficiently identifies exploitable code patterns without overwhelming the user with too many defect reports. We worked with development teams from prominent hardware vendors, as well as kernel developers, to make sure the analysis results were meaningful and useful.

The AUDIT.SPECULATIVE_EXECUTION_DATA_LEAK checker finds code patterns that might make a sensitive process (e.g., kernel, hypervisor, browser) vulnerable to the Spectre attack. The checker looks for an execution path on which the following events occur:

a value x is compared to some variable (e.g., x < array1_size)
x is used as an index to access memory and read some value y
y is then used to access another memory location

Pattern variations

While the sequence of events above corresponds directly to the pattern presented in the Spectre white paper, our checker does more than just syntactic matching. There are many ways to write equivalent code, and we attempted to be robust to variations on the pattern in variant 1. The following are some of the techniques we used to find more variations:

Track aliasing across assignments (e.g., val = x)
Track values through binary operations (e.g., x + 1 < array1_size)
Conditional statements other than if-statements (e.g., for (i=0;i+x<array1_size;i++) {})
Dereferences and pointer arithmetic (e.g., char *p = arr1 + index)
Memory accessing APIs (e.g., memcmp)
Interprocedural memory access

These techniques enabled us to detect several of the additional Spectre variants shown here, such as these:

  
void victim_function_v12(size_t x, size_t y) {
  if ((x + y) < array1_size)
    temp &= array2[array1[x + y] * 512];
}


void victim_function_v11(size_t x) {
  if (x < array1_size)
    temp = memcmp(&temp, array2 + (array1[x] * 512), 1);
}

Limitations

Since program analysis is an undecidable problem in general, static analysis cannot precisely detect all the exploitable code patterns. To address this inherent limitation, we had to make some heuristic decisions about what the checker would report. Inspired by the security mantra “It’s better to get all than lose some,” for this checker we tried to err on the side of overapproximating the set of possible vulnerable code locations. On the other hand, reporting too many issues would overwhelm the developer, who might not be able to address them all. Below are some of the trade-offs we made:

Taint tracking

A particularly challenging task is deciding whether a value can be attacker controlled. While Coverity has taint tracking capabilities, doing this precisely would be slow and would require modeling of APIs that could yield user-controllable data. For this reason, we do not check whether the index value can be controlled by an attacker. It’s left to the developer’s discretion to examine the potentially vulnerable locations reported by the tool and decide whether the index is user-controllable.

Interprocedural support

While the tool detects interprocedural dereferences, it does not track whether a comparison is happening interprocedurally. Inferring that a function is an array bounds check is nontrivial, and reporting on any comparison would result in a lot of defects. Therefore, we do not report an issue in the following code:


int is_x_safe(int x) { if (x < array1_size) return 1; return 0; }
void vulnerable(int x) {
  if (is_x_safe(x))
    temp &= array2[array1[x] * 512];
}

Example

We ran the Coverity AUDIT.SPECULATIVE_EXECUTION_DATA_LEAK checker on several real-life codebases. Recall that the checker doesn’t guarantee that the code is exploitable. Rather, it’s meant to flag problematic code patterns. It’s up to the developer to examine the defect report and decide the appropriate remediation. In our experiments, the checker reported on average 1 defect for every 35,000 lines of code. This is significantly less than the total number of branching instructions in the code and a manageable number of defects for developers to examine.

The following example is based on one of the defects the checker found in a real codebase. We chose this example to highlight our checker’s features. The checker can detect patterns spreading over many lines of code that would be difficult to identify by code review. For brevity, the code has been shortened to show only relevant parts:

If the x parameter to victim_function can be controlled by an attacker, the code is potentially vulnerable to the bounds check bypass variant of Spectre. x + offset is compared to data->header.maxIndex on line 35. Assuming the attacker can mistrain the branch predictor, the array access on line 36, index = array1[x + offset], would be speculatively executed even if x + offset is out of bounds. This could lead to sensitive data being stored in the index variable. Recall that an attacker cannot access it directly but must use it as an index to access memory. index is later used as an index on line 48: data->array2[index * 512]. If the attacker can ensure that data->array2[index * 512] is not in the cache, the cache location corresponding to the sensitive value index will be updated. Using cache timing attacks, the attacker may be able to retrieve part of the sensitive data.

Mitigation

Identifying the vulnerable code is only half of the job. Figuring out the correct way to protect against the effects of the attack is equally important.

The root cause of the attack is the fact that speculative execution does not revert the changes to the cache. A natural fix is to disable speculative execution by adding special instructions that serialize the instructions after the comparison statement. This tells the processor to execute the instructions in program order: “Wait until the comparison statement has been executed to determine which branch to take.” Intel has issued a white paper that provides a way of doing this. In the following example, the _mm_lfence()instruction ensures that the if-statement will not be speculatively executed:


if (x < array1_size) {

  _mm_lfence();

  v= array2[array1[x] * 256];

}

But adding an _mm_lfence()instruction after every conditional branch would result in impractical slowdowns. Therefore, it’s important to add these instructions only to code that is vulnerable to Spectre. The Coverity AUDIT.SENSITIVE_DATA_LEAK checker addresses this need by helping developers find these vulnerable code patterns. Read about the latest updates to the AUDIT.SPECULATIVE_DATA_LEAK checker.