Fuzzing Bitcoin with the Defensics SDK, part 2: Fuzz the Bitcoin protocol

This is the second of two articles that describe how to use the Defensics SDK to fuzz Bitcoin. In the previous article, you saw how to set up a test bed for bitcoind: two containers, alice and bob, each running a bitcoind instance that can communicate with the other. In this article, you’ll learn how to create a data model for the Bitcoin network protocol, then use this model in the Defensics SDK to fuzz bitcoind.

Bitcoin message header model

I started fuzzing with the version message, which is the first message a peer sends to another peer to announce itself. As you can see in the Wireshark capture in the previous article, a version message sent by one peer gets a version message in response, as well as a verack.

The Bitcoin network protocol is well-documented here: https://bitcoin.org/en/developer-reference#p2p-network.

Every Bitcoin message has a common header, which consists of four fields:

  • A magic number that identifies the network, either mainnet, testnet, or regtest
  • A command string
  • A payload length
  • A payload checksum
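
To make the byte layout of these four fields concrete, here is a minimal sketch in plain Python 3. It is independent of the Defensics SDK; the function name pack_header is hypothetical, and the field widths (4, 12, 4, and 4 bytes) are taken from the protocol documentation linked above.

```python
import struct

# Regtest start string, the same magic number used in this article's model.
REGTEST_MAGIC = bytes.fromhex("fabfb5da")

def pack_header(command, payload_size, checksum):
    """Pack the four header fields into their 24-byte wire format."""
    # 4s  : magic number
    # 12s : command string (struct pads short strings with NUL bytes)
    # <I  : payload length as a little-endian unsigned 32-bit integer
    # 4s  : payload checksum
    return struct.pack("<4s12sI4s", REGTEST_MAGIC, command, payload_size, checksum)

# Header of an empty-payload message such as verack.
header = pack_header(b"verack", 0, bytes.fromhex("5df6e0e2"))
print(len(header))  # 24
```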

Here is one way to represent the Bitcoin header in Defensics SDK BNF format:

octet = 0x00-0xff
start-string = 0xfabfb5da  # Magic number for regtest
command-string = 12(octet) # Message name
payload-size = 0x00000000-0xffffffff
checksum = (0x5df6e0e2 | 0x00000000-0xffffffff)
bitcoin-header = (start-string command-string payload-size checksum)

The definition of checksum includes a default value, which is the checksum value for an empty payload.
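
The default value can be checked outside the SDK. The sketch below, in plain Python 3, computes the Bitcoin checksum (the first four bytes of a double SHA256, explained later in this article) for an empty payload; the helper name checksum is only illustrative.

```python
import hashlib

def checksum(payload):
    """First four bytes of SHA256(SHA256(payload))."""
    first = hashlib.sha256(payload).digest()
    return hashlib.sha256(first).digest()[:4]

# An empty payload yields the default used in the BNF above.
print(checksum(b"").hex())  # 5df6e0e2
```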

By itself, this doesn’t do much. It doesn’t calculate the payload size or the checksum, and it doesn’t even have a command defined.

Bitcoin version message model

Defining the version message payload is similarly straightforward.

version-field = (0x7e110100 | 4(octet))
services = 0x0d00000000000000
timestamp = 0x0000000000000000
addr-recvservices = 0x0100000000000000
addr-recvipaddress = 0x00000000000000000000000000000000
addr-recvport = 0x0000
addr-transservices = 0x0100000000000000
addr-transipaddress = 0x00000000000000000000000000000000
addr-transport = 0x0000
nonce = 0x0102030405060708
user-agentbytes = 0x00-0xff
user-agent = '/Satoshi:0.14.1/'
start-height = 0x01000000
relay = 0x00

bitcoin-payload = (version-field services timestamp
    addr-recvservices addr-recvipaddress addr-recvport
    addr-transservices addr-transipaddress addr-transport
    nonce user-agentbytes user-agent start-height relay)

Then creating the entire version message is just a matter of combining the Bitcoin header with the version payload:

version-out = (bitcoin-header bitcoin-payload)
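
As a cross-check of the model, the same default payload can be assembled by hand. This is a sketch in plain Python 3, not part of the test suite. It assumes the user-agent length byte is set to 16 (the BNF only gives a range for user-agentbytes), and the function name build_version_payload is hypothetical.

```python
import struct

USER_AGENT = b"/Satoshi:0.14.1/"

def build_version_payload():
    """Assemble the default field values from the BNF model, in order."""
    # Network address: services (little-endian), 16-byte IP, big-endian port.
    net_addr = struct.pack("<Q", 1) + b"\x00" * 16 + struct.pack(">H", 0)
    return b"".join([
        struct.pack("<i", 70014),           # version-field (0x7e110100 on the wire)
        struct.pack("<Q", 0x0d),            # services
        struct.pack("<q", 0),               # timestamp
        net_addr,                           # addr-recv*
        net_addr,                           # addr-trans*
        bytes.fromhex("0102030405060708"),  # nonce
        bytes([len(USER_AGENT)]),           # user-agentbytes (assumed: actual length)
        USER_AGENT,                         # user-agent
        struct.pack("<i", 1),               # start-height
        b"\x00",                            # relay
    ])

payload = build_version_payload()
print(len(payload))  # 102
```

The 102-byte total is consistent with the "ProcessMessages(version, 102 bytes)" entries in the debug log shown later in this article.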

Pulling the model into the Defensics SDK

Once the definitions are outlined in the BNF, pulling them into the Defensics SDK is easy. Let’s say all the BNF is defined in a file resources/bitcoin.bnf. In a Python test suite, we can pull in these definitions very simply.

import sys

from com.synopsys.defensics.sdk import ControlTools

ct = ControlTools.get(sys.argv)
ef = ct.factory()

ef.readTypes("resources/bitcoin.bnf")

version_out = ef.field("version-out", "version-out")

Once we’ve pulled in these definitions, we can modify them or fix things that need fixing. For example, the line below sets the command string to 'version', padded with zero bytes so the field is exactly 12 bytes long.

version_out.search('command-string').element().set("'version' 0x0000000000")
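
The padding arithmetic is easy to get wrong by hand: 'version' is 7 bytes, so 5 zero bytes are needed. A small helper along these lines (plain Python 3; pad_command is a hypothetical name, not an SDK function) computes the padded value for any command:

```python
def pad_command(name, width=12):
    """Return the command name as ASCII, right-padded with NUL bytes."""
    raw = name.encode("ascii")
    if len(raw) > width:
        raise ValueError("command name longer than %d bytes" % width)
    return raw.ljust(width, b"\x00")

print(pad_command("version"))  # b'version\x00\x00\x00\x00\x00'
```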

Assembling the whole message sequence

Remember that the version message sent by Defensics will get a version and verack in response. We can model this exchange in the Defensics SDK using a message sequence. But first, we need a model for the version and verack messages we expect to receive. We’ll loosely define a generic response message in the BNF, as follows:

bitcoin-any-payload = 0..n(octet)
bitcoin-any = (bitcoin-header bitcoin-any-payload)

Bitcoin messages travel over TCP, so we use a TCP injector to create the message sequence as follows:

io = ct.injector().tcp(fs1.getValue(), int(fs2.getValue()))

sequence = ef.sequence([
    io.send("version-out", version_out),
    io.receive("version-in", bitcoin_any),
    io.receive("version-verack", bitcoin_any),
])
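
On the wire, the two expected responses arrive back to back in the TCP stream, and the payload-size field in each header is what separates one message from the next. The following plain Python 3 sketch shows that framing logic under the header layout described earlier. The name split_message is hypothetical, and this is not how the Defensics SDK itself parses input.

```python
import struct

HEADER_LEN = 24  # magic (4) + command (12) + payload size (4) + checksum (4)

def split_message(buf):
    """Return (command, payload, rest), or None if buf holds no complete message."""
    if len(buf) < HEADER_LEN:
        return None
    _magic, command, size, _checksum = struct.unpack("<4s12sI4s", buf[:HEADER_LEN])
    end = HEADER_LEN + size
    if len(buf) < end:
        return None  # payload not fully received yet
    return command.rstrip(b"\x00"), buf[HEADER_LEN:end], buf[end:]

# Two empty-payload messages back to back, as they might appear in the stream.
verack = (bytes.fromhex("fabfb5da") + b"verack".ljust(12, b"\x00")
          + bytes(4) + bytes.fromhex("5df6e0e2"))
first = split_message(verack + verack)
second = split_message(first[2])
print(first[0], second[0])
```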

Opening a can of whoop ass

All this modeling so far does not result in impressive test cases. In particular, the payload size field and the payload checksum field will not be correct. Because these are probably the very first fields examined by bitcoind, test cases are likely to be immediately discarded.

To make our test cases look believable, and to accomplish the best testing possible, we need the size and checksum fields to be correct.

In the Defensics SDK, rules are used for cases like this where certain fields need to behave in certain ways.

The length field is the easiest to handle: a built-in rule from the Defensics SDK takes care of it.

rf = ef.rule()
rf.length(version_out, "bitcoin-payload", "payload-size") \
    .format("int-lsb-32bit")

This creates a length rule that calculates the length of bitcoin-payload and places it in payload-size, formatting it as a 32-bit integer, least significant byte first (little-endian).
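
For reference, this is what such an encoding looks like in plain Python 3, assuming that "int-lsb-32bit" denotes an unsigned 32-bit little-endian integer as the description suggests. The helper name encode_length is illustrative only.

```python
import struct

def encode_length(n):
    """Encode n as an unsigned 32-bit integer, least significant byte first."""
    return struct.pack("<I", n)

# A 102-byte payload is encoded as 0x66 followed by three zero bytes.
print(encode_length(102).hex())  # 66000000
```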

The checksum is more challenging because it cannot be addressed using a built-in rule. The checksum is the first four bytes of the SHA256 digest of the SHA256 digest of the payload. That’s not a typo. First you calculate the SHA256 digest value of the payload. Then you calculate the SHA256 of that digest value. Then you take the first four bytes of the result and use that for the checksum.

I defined a custom rule in the Defensics SDK as follows:

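import binascii
import hashlib

# CustomRule comes from the Defensics SDK (its import is not shown here).

class SHA256x2(CustomRule):
    def __init__(self, payload, checksum):
        self.payload = payload    # name of the element to hash
        self.checksum = checksum  # name of the element that receives the checksum

    def handle(self, engine, element):
        payload = element.search(self.payload).element()
        encoded = payload.encode()

        # Bitcoin checksum: first four bytes of SHA256(SHA256(payload)).
        sha1 = hashlib.sha256(encoded).digest()
        sha2 = hashlib.sha256(sha1).digest()
        hexs = binascii.hexlify(sha2[:4])
        checksum = element.search(self.checksum).element()
        checksum.set('0x' + hexs)

        return element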

Using the rule was very simple:

rf.custom(SHA256x2('bitcoin-payload', 'checksum'),
          CustomRuleMode.POST_EVALUATE,
          version_out)

This is one of the places where we’re really cashing in on the value of generational fuzzing. Even when other parts of the test cases are anomalized beyond recognition, the rules we’ve defined in the data model ensure that the payload size and checksum fields are correctly set. Test cases delivered to the bitcoind target will be scrutinized, then pass through to further parsing code after the size and checksum are verified to be correct.
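
To see why these two fields act as a gate, consider the kind of check a receiver can apply before parsing the payload. The sketch below is plain Python 3 and only illustrates the idea. It is not bitcoind's actual code, and header_is_consistent is a hypothetical name.

```python
import hashlib
import struct

def header_is_consistent(message):
    """True if the stated payload size and checksum match the actual payload."""
    if len(message) < 24:
        return False
    _magic, _command, size, checksum = struct.unpack("<4s12sI4s", message[:24])
    payload = message[24:]
    if len(payload) != size:
        return False
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return digest[:4] == checksum

# A well-formed empty-payload message passes; corrupting the checksum fails.
good = (bytes.fromhex("fabfb5da") + b"verack".ljust(12, b"\x00")
        + bytes(4) + bytes.fromhex("5df6e0e2"))
bad = good[:-1] + b"\x00"
print(header_is_consistent(good), header_is_consistent(bad))  # True False
```

A test case that fails a check like this never reaches the deeper parsing code, which is why keeping both fields correct matters even when the rest of the message is anomalized.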

Putting it all together

I haven’t shown you all the source code, just the most important parts. Once you have everything together, you can load the test suite in Defensics and use it very much like you can use any other Defensics test suite.

Assuming Defensics is on the same network as the test bed virtual machine, you just need to tell Defensics the IP address and port number of the target. If you use port 18444, it is mapped to the alice container.

By default, Defensics will use implicit TCP instrumentation, which means Defensics assumes the target is still healthy as long as it can keep opening up the TCP port. If we did manage to kill bitcoind, Defensics would no longer be able to open the port and would flag an error.
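
The idea behind this kind of instrumentation can be sketched in a few lines of plain Python 3: try to open a TCP connection and treat a failure as a sign that the target is down. This only illustrates the concept. It is not how Defensics implements it, and target_is_alive is a hypothetical name.

```python
import socket

def target_is_alive(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections and timeouts both count as "not alive".
        return False
```

For the test bed in this article, a call such as target_is_alive(target_ip, 18444) would probe the alice container, where target_ip stands for the address of the test bed virtual machine.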

If you want some more information as you are testing, use the following command to observe output to bitcoind’s debug log. As with any fuzz testing, you’ll typically see a mix of messages. Sometimes bitcoind will report the received test case, and sometimes it will complain about one thing or another. Monitoring the log is a good way to confirm that test cases are being received and processed by the target.

root@alice:~# tail -f ~/.bitcoin/regtest/debug.log
2017-07-06 15:12:18 ProcessMessages(version, 102 bytes) FAILED peer=37725
2017-07-06 15:12:18 receive version message: : version 70014, blocks=2139062143, us=[::]:0, peer=37726
2017-07-06 15:12:18 receive version message: : version 70014, blocks=16843009, us=[::]:0, peer=37727
2017-07-06 15:12:18 receive version message: : version 70014, blocks=0, us=[::]:0, peer=37728
2017-07-06 15:12:18 receive version message: : version 70014, blocks=0, us=[::]:0, peer=37729
2017-07-06 15:12:18 receive version message: : version 70014, blocks=16843009, us=[101:101:101:101:101:101:101:101]:257, peer=37730
2017-07-06 15:12:18 receive version message: : version 70014, blocks=0, us=[::]:0, peer=37731
2017-07-06 15:12:18 receive version message: /Satoshi:0.14.1/: version 70014, blocks=858993459, us=[::]:0, peer=37732
2017-07-06 15:12:18 ProcessMessages(version, 91 bytes): Exception 'CDataStream::read(): end of data: iostream error' caught, normally caused by a message being shorter than its stated length
2017-07-06 15:12:18 ProcessMessages(version, 91 bytes) FAILED peer=37733
2017-07-06 15:12:18 receive version message: /Satoshi:0.14.1/: version 70014, blocks=1, us=[::]:0, peer=37734
2017-07-06 15:12:18 receive version message: /Satoshi:0.14.1/: version 70014, blocks=1, us=[::]:0, peer=37735
2017-07-06 15:12:18 ProcessMessages(version, 92 bytes): Exception 'CDataStream::read(): end of data: iostream error' caught, normally caused by a message being shorter than its stated length
2017-07-06 15:12:18 ProcessMessages(version, 92 bytes) FAILED peer=37736
2017-07-06 15:12:18 ProcessMessages(version, 97 bytes): Exception 'CDataStream::read(): end of data: iostream error' caught, normally caused by a message being shorter than its stated length
2017-07-06 15:12:18 ProcessMessages(version, 97 bytes) FAILED peer=37737
2017-07-06 15:12:18 receive version message: /Satoshi:0.14.1/: version 70014, blocks=1, us=[::]:0, peer=37738
2017-07-06 15:12:18 ProcessMessages(version, 100 bytes): Exception 'CDataStream::read(): end of data: iostream error' caught, normally caused by a message being shorter than its stated length
2017-07-06 15:12:18 ProcessMessages(version, 100 bytes) FAILED peer=37739
2017-07-06 15:12:18 ProcessMessages(version, 101 bytes): Exception 'CDataStream::read(): end of data: iostream error' caught, normally caused by a message being shorter than its stated length
2017-07-06 15:12:18 ProcessMessages(version, 101 bytes) FAILED peer=37740
2017-07-06 15:12:18 socket send error Broken pipe (32)
2017-07-06 15:12:18 ProcessMessages(version, 95 bytes): Exception 'CDataStream::read(): end of data: iostream error' caught, normally caused by a message being shorter than its stated length
2017-07-06 15:12:18 ProcessMessages(version, 95 bytes) FAILED peer=37741
2017-07-06 15:12:18 receive version message: /Satoshi:0.14.1/: version 70014, blocks=1, us=[::]:0, peer=37742
2017-07-06 15:12:18 receive version message: : version 70014, blocks=0, us=[::]:1792, peer=37743
2017-07-06 15:12:18 ProcessMessages(version, 87 bytes): Exception 'CDataStream::read(): end of data: iostream error' caught, normally caused by a message being shorter than its stated length
2017-07-06 15:12:18 ProcessMessages(version, 87 bytes) FAILED peer=37744
2017-07-06 15:12:18 receive version message: : version 70014, blocks=0, us=[::]:0, peer=37745
2017-07-06 15:12:18 receive version message: /Satoshi:0.14.1/: version 70014, blocks=1, us=[::]:0, peer=37746
2017-07-06 15:12:18 receive version message: /Satoshi:0.14.1/: version 70014, blocks=1, us=[::]:0, peer=37747
2017-07-06 15:12:18 receive version message: /Satoshi:0.14.1/: version 70014, blocks=1, us=[::]:0, peer=37748
...
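
When the log scrolls by this quickly, a rough tally can be easier to read than the raw lines. The following plain Python 3 sketch counts accepted and failed version messages. The regular expressions are based only on the line formats visible in the excerpt above, and tally is a hypothetical helper.

```python
import re
from collections import Counter

FAILED = re.compile(r"ProcessMessages\((\w+), (\d+) bytes\) FAILED")
RECEIVED = re.compile(r"receive version message")

def tally(lines):
    """Count log lines reporting a failed or a received version message."""
    counts = Counter()
    for line in lines:
        if FAILED.search(line):
            counts["failed"] += 1
        elif RECEIVED.search(line):
            counts["received"] += 1
    return counts

sample = [
    "2017-07-06 15:12:18 receive version message: /Satoshi:0.14.1/: version 70014, blocks=858993459, us=[::]:0, peer=37732",
    "2017-07-06 15:12:18 ProcessMessages(version, 91 bytes): Exception 'CDataStream::read(): end of data: iostream error' caught, normally caused by a message being shorter than its stated length",
    "2017-07-06 15:12:18 ProcessMessages(version, 91 bytes) FAILED peer=37733",
    "2017-07-06 15:12:18 receive version message: /Satoshi:0.14.1/: version 70014, blocks=1, us=[::]:0, peer=37734",
]
result = tally(sample)
print(result["received"], result["failed"])  # 2 1
```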

Future directions

I hope you’ve enjoyed this romp through fuzzing Bitcoin using the Defensics SDK. You can see how the Defensics SDK can bring the power of generational fuzzing to any type of software.

In the specific case of bitcoind, you could take this testing further as follows:

  1. Improve the data models. For example, the fields inside the version message are cursorily specified. Use the protocol specification to make them more “real.”
  2. Model and test other messages in the Bitcoin protocol.
  3. Improve failure monitoring. For example, you could run the bitcoind target with valgrind or ASAN to catch memory errors.
