In this post I like to introduce FuzzyVM [fuzzɛvm] a guided differential fuzzing framework for Ethereum Virtual Machines (EVM). It found two consensus issues on the Ethereum mainnet.
Ethereum, contrary to most other blockchain network, has multiple implementations of the Ethereum protocol. This allows the network to be resilient in the case of attacks or bugs as demonstrated in the Shanghai attacks. Having multiple implementations also us to test implementations against each other to find differences.
These differences are often critical to the health of the network as all nodes have to agree on the current state of the network. If a node does not agree on the same state as the rest, it can not validate blocks and is therefore forked off the network. These consesus issues are critical for the health of the network. A consensus issue for example forked off all OpenEthereum nodes after the Berlin hardfork. This issue resulted in most block explorers being stuck. Fortunately the issue was quickly resolved and the nodes were updated to follow the canonical chain.
The EVM is an integral part of an Ethereum node as it executes the state transition. Smart Contracts in Ethereum are usually written in a high level language like Solidity, Vyper or Fe. The high level language gets compiled to EVM bytecode which is deployed on-chain. This EVM bytecode is executed if a transaction is send to the contract. The EVM provides over 80 Opcodes (like ADD, SUB, CALL) as well as 9 complex functions as precompiled contracts. These precompiled contracts (also called precompiles) implement functions like recovering the signer from an ECDSA signature or verifying a bilinear pairing on the BN256 curve.
The Ethereum Foundation maintains a set of test cases in the ethereum/tests repository. These tests can be executed by new implementations to determine if the EVM was implemented correctly. However, these tests can not cover all edge cases of the EVM. FuzzyVM uses guided fuzzing to create tests that cover more than the ethereum/tests. Guided fuzzing is a technique that measures the code coverage that a test achieved and mutates the test accordingly in order to increase the code coverage. This allows FuzzyVM to “learn” interesting inputs that trigger behaviour in the EVM. FuzzyVM currently tests go-ethereum, OpenEthereum, Besu, Nethermind and TurboGeth against each other.
Architecture
FuzzyVM consist of two processes, the generation and the execution process as shown in Figure 1. The generation process creates new state tests in the ethereum/tests format with the help of go-fuzz and goevmlab. It uses different generation strategies to create interesting programs. The EVM provides precompiled contracts, that implement complex functions like signature recovery or pairings. A normal fuzzer would rarely create valid inputs to these functions. FuzzyVM employs generators to create inputs to these precompiles that passes all precondition checks.
The execution phase executes the tests on the EVM implementations with the help of goevmlab. Most clients provide a standalone program for testing the EVM implementations as shown in Figure 2. During execution the EVMs produce a trace for every opcode that is executed. These traces are collected by FuzzyVM and compared against each other. Differences in the output are logged and the tests are saved for manual review later.
FuzzyVM makes it easy to introduce new generation strategies. This allows us to extend it for testing new EIP’s once a chainspec for them exist, which makes introducing new changes to the Ethereum Virtual Machine safer.
Trophies
FuzzyVM found two consensus crashes in Ethereum mainnet clients as well as multiple non-critical issues in the test binaries. Both bugs were privately disclosed to the client teams.
The CALLDATACOPY, CODECOPY, EXTCODECOPY and RETURNDATACOPY opcodes consume three items from the stack; the destination, source and length of the data that should be copied. The Nethermind client halted execution when the length was zero for these four opcodes. Thus calling them with length set to zero would alter the execution and provide an invalid state root. FuzzyVM created tests that triggered this behaviour with the CALLDATACOPY and CODECOPY opcodes. The flaw in the other two opcodes was found during manual review.
The second critical flaw found by FuzzyVM concerns the ModExp precompile in the Besu client. The ModExp precompile consumes six parameters from the stack; the base length, modulus length, exponent length, base, modulus and exponent. If the base length and the modulus length are zero, Besu would read the exponent length and the exponent anyway. If the exponent was set to a high number it could overflow causing the client to crash. Thus all Besu nodes that would receive a block containing a transaction that trigger this bug would crash.
Conclusion
Satoshi Nakamoto once wrote that he doesn’t believe “a second, compatible implementation of Bitcoin will ever be a good idea”. I disagree. Ethereum’s client diversity is unmatched in the crypto-space and makes Ethereum resilient against attacks. Having different clients also allows us to test the different clients against each other to find undefined behaviour in the Ethereum protocol. FuzzyVM provides an easy way to test EVM implementations and we hope that more EVM implementations can be added to the mix. It allows us to fuzz-test new EIPs and to find regressions in the existing clients.
Next Steps
The goal of FuzzyVM is making Ethereum safer, thus we would like to add more ETH1 clients to the fuzzing mix. EVM clients can join the fuzzing efforts, if they implement EIP-3155 which describes a common format for EVM traces (it still needs some work). If you are a client developer and want to have your implementation fuzzed, you can drop me an E-Mail at marius@ethereum.org. In the future we would like to add more fuzzing strategies to the mix that can test more assumptions as well as create strategies for new and upcoming EIPs. Contributions are very welcome, you can find the repository at github.com/MariusVanDerWijden/FuzzyVM