Software Integrity

 

Swift: Close to greatness in programming language design, Part 2

Swift: Close to greatness in programming language design, Part 2
Ahead of Coverity Static Analysis support for the Swift programming language, we are examining design decisions in the language from the perspective of defect patterns detectable with static analysis. To kick things off, I recommend reading Part 1 in this series if you have not already.

Defect patterns continued: More basics

Now we consider additional defect patterns that I would classify as “basic,” because they are clearly avoidable with good language design and relatively easy to detect with static analysis. However, these are generally more common in code written by experienced programmers than the patterns we saw in Part 1.

Avoided: Conversion after operation

Expressions in C-derived languages are generally typed from the inside out, which can lead to unexpected bugs, trivialized in these examples:

long mult(int x, int y) {
    return x * y; // might overflow 'int' before conversion to 'long'
}

or

double frac(int x, int y) {
    return x / y; // computes integer quotient before conversion to floating point
}

This affects the most popular statically-typed C derivatives: C++, Java, C#, Objective C/C++, but Swift’s interesting typing rules seem to exclude dangerous implicit conversions like this. I don’t fully understand these Swift rules yet, nor do I know how difficult it is to work with them day-to-day, but I’m impressed so far.

Swift goes a step further by checking for arithmetic overflow by default. This helps to ensure overflows are detectable in testing and unlikely to lead to security issues in production.

Swift also gets bonus points here for bounds-checking shift amounts rather than applying obscure (or undefined) semantics when shifting by full number of bits (or more). This handling of shifts doesn’t make more programs correct, but importantly makes bugs much more detectable in testing.

Avoided: Lost parameter mutation

I’m not so much talking about this:

void swap(int a, int b) {
    int tmp = b;
    b = a;
    a = tmp;
}

where someone really doesn’t know how to pass by reference, but more like this (C#):

void update(SomeType obj) {
    obj.field++;
}

where the code is fine if SomeType is a reference type (class) but useless if SomeType is a value type (struct). C++ largely avoids this confusion by making pointers and references explicit, while Java, JavaScript, and others avoid it by ensuring all record types are always accessed by reference.

Like C#, Swift has value-typed structs in addition to implicit reference types for classes, but Swift makes constant bindings for parameters, meaning the compiler rejects the code if they are modified. It is even necessary to mark struct functions as mutating if they might modify the receiver object, so you can’t easily make this mistake inter-procedurally.

I really applaud this design choice of Swift (immutable parameter bindings). Although bugs like this are rare, modifying parameter variables is almost universally confusing.

Susceptible: What was I going to do with that expression?

Java and C# wisely only allow a select few expressions to be used as statements, because most expressions are only useful for their result. Consider this real JavaScript code:

"test: textmate style format strings" : function() {
   var fmt = snippetManager.tmStrFormat;
   snippetManager.tmStrFormat("hello", {
       guard: "(..)(.)(.)",
       flag:"g",
       fmt: "a\\UO\\l$1\\E$2"
   }) == "aOHElo";
},

I believe they meant for this unit test to make an assertion but it doesn’t. Or, (more real JavaScript)

if (!args.eatSpace()) { return 'Invalid arguments'; }
...
if (args.eatSpace() && args.match(/\/.*\//)) { 'patterns not supported'; }

I think they meant to return that string.

Java and C# thankfully exclude these kinds of errors by seriously restricting the set of valid expressions that can be used as statements. Swift unfortunately joins JavaScript, PHP, Python, and Ruby (and in many cases C/C++/Objective-C/C++) in allowing these kinds of errors. Yes, there’s a compiler warning in Swift and many other languages, but it’s amazing how often these are ignored in the real world.

Swift unfortunately compounds the problem by allowing semicolons to be elided at the end of a line, like JavaScript. This can lead to surprisingly bone-headed errors, as in this simplified example from real JavaScript code, which is also accepted as Swift:

var html = "<table class='modification'>" +
      "<tbody>" +
        trs
      "</tbody>" +
    "</table>";

Note the missing ‘+’ after trs. Why do they allow this? Allowing semicolons to be elided should be mostly OK in a C-like language–but only if expressions legal as statements are minimized and unreachable code (see below) is disallowed.

But before praising Java and C# too much, they are susceptible to at least one significant class of the “throwing away expression result” bug, as in this great bug in some open source code:

public void remove() {
    new OperationNotSupportedException();
}

Creating a new object and throwing it away is not typically a bug in a garbage-collected language, but here they clearly forgot to write ‘throw.’ We can catch these in C# or Java with high sensitivity and accuracy because of the strict type hierarchy on throwable objects, unlike C++ for example. I hope that the Error protocol in Swift enables a similar check.

Finally on this subject, I have to completely admonish Ruby. As someone who has reviewed tons of defect reports on open source code in many languages, I must judge Ruby a “complete disaster” when it comes to these kinds of bugs. Not only is it easy to miss an assertion, but Ruby’s automatic returning of the “last value” computed in a function makes it super easy to mix up what gets returned, such as by adding code after a nested ‘end’ at the end of the function.

Avoided: Not the operator you think it is

This case is basically one of the few opportunities to pick on Python for more than being a dynamic language. What does the following print in Python?

x = 42
print(++x)
print(x)

If you guessed something involving 43, you were wrong. Python doesn’t have a ++ operator, but this code is legal. That’s because Python does have a unary + operator and they didn’t bother to ensure not to parse that as part of a ++! Ruby has the same flaw.

Newer versions of Swift do not have a built-in ++ either, but at least they still parse and reject that well-known idiom:

Col 9: '++' is unavailable: it has been removed in Swift 3

Susceptible: Unreachable code

Now for some admonishment of C/C++, C#, JavaScript, PHP, Python, Ruby, and Swift. Only Java is spared (mostly) because the Java language disallows any statements that are unreachable (according to the most basic control flow rules that assume expressions may take any value; use “if (false)” all you want). The rest of these languages have a compiler warning at best. This stuff matters, as in this Python code:

try:
    return self.new_api.add_format(book_id, fmt, stream, replace=replace, run_hooks=False, dbapi=self)
except:
    raise
else:
    self.notify('metadata', [book_id])

Doesn’t the ‘else’ get run if no exception is caught? Only if control reaches the end of the try block, which never happens in this code.

And I can’t stop pointing out the terrible interaction we see with JavaScript semicolon insertion:

function IsDigit( e )
{
    e = e || event ;
    var iCode = ( e.keyCode || e.charCode ) ;
    return 
           (
             ( iCode >= 48 && iCode <= 57 )          // Numbers
             || (iCode >= 37 && iCode <= 40)         // Arrows
             || iCode == 8           // Backspace
             || iCode == 46          // Delete
           ) ;
}

Spot the error? “return” is parsed as a complete statement with an unreachable expression-statement after it. Really. Luckily, Swift’s static typing should exclude this and most related cases.

Brief aside: it is possible to have legally unreachable code in Java, but that code has to be an expression not a full statement:

for (int i = 0; i < max; i++) { // i++ not reachable
    if (check(i)) {
        step();
        break;				     
    } else {
        fail();
        break;
    }
}

Why would languages like C# and Swift allow unreachable statements, as in the beloved goto fail bug? I don’t know, but I have heard that people sometimes utilize unreachable code during debugging. I don’t believe that it’s that useful, especially in a language like Swift where comments nest nicely. If you want to get rid of some code temporarily, comment it out or put it in an if(false) or something. That doesn’t seem too much to ask for some real protection against serious bugs. On the other hand, people being let down by their programming language is a real part of the money generator for Coverity static analysis.

On a somewhat related note, the Never type in Swift enables marking functions as never returning, so you don’t have to write extraneous return statements and such just to appease the compiler. C/C++ programmers know this kind of feature as noreturn. Wouldn’t it be great though if Swift rejected unreachable code after a call to a Never function? In my testing, this only rises to the level of a compiler “note,” not even a warning.

Susceptible: Continue while false

The ‘continue’ construct can be confusing in C-derived languages, because the termination condition is still checked before proceeding to the next loop iteration. Our classic example of misuse stemming from this confusion looks like this (in Swift here; applies to do-while loops in many languages):

var permissive : Bool = false;
repeat {
    if (!tryIt(permissive) && !permissive) {
        permissive = true;
        continue; // doesn't actually try again
    }
} while (false)

I would fix this like so:

for permissive in [false, true] {
    if (tryIt(permissive)) {
        // done
        break;
    }
}

If you really do intend to break out of a while(false) loop, you should use ‘break’ instead of ‘continue’.

Continue reading with Part 3 of this series.