Why I am against EOF in Pectra


The EVM Object Format (EOF) is an upgrade to the EVM that has been worked on for roughly four years and is included in Pectra. I am not convinced that it is very useful and have thus argued against its inclusion in the past. I’m writing this post to discuss some of the advantages and disadvantages of EOF and why I think that it shouldn’t be included.

Throughout this post, I will refer to the current EVM as “legacy EVM”. This is just to make the distinction between the EOF EVM and the current EVM easier. It should not be read as “the current EVM is outdated”.

Advantages

EOF brings several advantages.

Stack Too Deep

One common annoyance for Solidity programmers has been the “stack too deep” issue. Since the EVM only supports DUP1-DUP16 and SWAP1-SWAP16 as opcodes to move stack elements around, you can’t have more than 16 local variables or function parameters in Solidity. Developers have to structure their code around this limit.

EOF introduces DUPN, SWAPN and EXCHANGE, three opcodes that take an 8-bit immediate, which makes it possible to have up to 256 local variables or function parameters. However, DUPN, SWAPN and EXCHANGE are not exclusive to EOF. They were proposed before EOF and could easily be added to the legacy EVM without it.
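To make the reach limit concrete, here is a minimal sketch of a stack with depth-limited dup and swap operations. The function names and the `max_depth` parameter are my own illustration, not part of any client or specification, and the sketch ignores how the real opcodes encode their immediate.

```python
from typing import List

LEGACY_MAX_DEPTH = 16       # reach of DUP1-DUP16 / SWAP1-SWAP16
IMMEDIATE_MAX_DEPTH = 256   # reach with an 8-bit immediate, as described above


def dup(stack: List[int], n: int, max_depth: int) -> None:
    """Push a copy of the n-th item from the top (n=1 is the top itself)."""
    if not 1 <= n <= max_depth:
        raise ValueError(f"depth {n} is not reachable (max {max_depth})")
    if n > len(stack):
        raise IndexError("stack underflow")
    stack.append(stack[-n])


def swap(stack: List[int], n: int, max_depth: int) -> None:
    """Swap the top item with the item n positions below it."""
    if not 1 <= n <= max_depth:
        raise ValueError(f"depth {n} is not reachable (max {max_depth})")
    if n + 1 > len(stack):
        raise IndexError("stack underflow")
    stack[-1], stack[-1 - n] = stack[-1 - n], stack[-1]


# 20 hypothetical local variables; the last list element is the top of the stack.
stack = list(range(20))

try:
    dup(stack, 20, LEGACY_MAX_DEPTH)      # too deep for the legacy opcodes
except ValueError as err:
    print("legacy:", err)

dup(stack, 20, IMMEDIATE_MAX_DEPTH)       # fine with an 8-bit immediate
print("top after deep dup:", stack[-1])
```

With the legacy limit, the variable at depth 20 is simply unreachable, which is what the compiler reports as “stack too deep”.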

Remove JUMPDEST analysis

Probably the biggest advantage of EOF is that the JUMPDEST analysis is performed at deploy time instead of at runtime. The JUMPDEST analysis checks which parts of the bytecode are data and which are valid jump destinations. This needs to be done so that a JUMP cannot land inside the immediate data of, for example, a PUSH32.
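As a rough illustration of what this analysis does, here is a minimal sketch in Python. The function name is my own and real clients typically build a bitmap instead of a set, but the idea is the same: walk the code once and skip over PUSH immediates.

```python
from typing import Set

JUMPDEST = 0x5B
PUSH1 = 0x60
PUSH32 = 0x7F


def valid_jumpdests(code: bytes) -> Set[int]:
    """Return the offsets in `code` that are valid JUMP targets."""
    dests = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            dests.add(pc)
        elif PUSH1 <= op <= PUSH32:
            # Skip the 1..32 immediate bytes: they are data, not opcodes.
            pc += op - PUSH1 + 1
        pc += 1
    return dests


# PUSH2 0x5b5b followed by a real JUMPDEST: only offset 3 is a valid target,
# even though the byte 0x5b also appears at offsets 1 and 2.
example = bytes([0x61, 0x5B, 0x5B, 0x5B])
print(valid_jumpdests(example))
```

The scan is linear in the code size, but in the legacy EVM it has to be redone (or cached) for every piece of code that gets executed, including initcode.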

This JUMPDEST analysis can be cached in theory, but there have been attacks in the past where nodes could have been DoS’d by creating contracts with a lot of initcode over and over again. In order to combat these DoS issues, the maximum initcode size was introduced in EIP-3860.

In theory the JUMPDEST analysis limits the maximum size of a contract that can be deployed. In my opinion however, the max code size can be increased significantly without running into these issues again. The DoS back then was possible because the initcode size was unlimited; EIP-3860 capped it and introduced additional gas costs for the JUMPDEST analysis.

Static program analysis

As long as the EVM contains dynamic jumps, the analysis of a program is O(n^2) in the worst case. EOF removes dynamic jumps, which reduces the analysis to O(n) in the worst case.

I’m not convinced that the difference is significant though, since EVM programs are inherently small. Even in the worst case of 24576 bytes, the analysis needs ~600 million steps. Assuming 1000ns per step (which is quite a lot; for reference, KECCAK takes <700ns on my machine), the analysis can be done in about 10 minutes.
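The back-of-the-envelope numbers can be reproduced with a few lines of Python. The constants are the ones from the paragraph above; the 1000ns per step is an assumed upper bound, not a measurement.

```python
MAX_CODE_SIZE = 24576    # bytes, the worst-case program size used above
NS_PER_STEP = 1000       # assumed cost of one analysis step, in nanoseconds

steps = MAX_CODE_SIZE ** 2                 # quadratic worst case
seconds = steps * NS_PER_STEP / 1e9

print(f"{steps:,} steps")                  # roughly 600 million
print(f"{seconds / 60:.1f} minutes")       # roughly 10 minutes
```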

Smaller program size

One advantage of EOF is smaller program size. The added structure of EOF allows compilers to optimize how they structure their code, which might reduce the program size by 2-5% according to Charles Cooper from Vyper.

Move runtime checks to deploy time

EOF moves a lot of runtime checks to deploy time. For example the checks for stack overflow and stack underflow are done at deploy time. This would in theory make the execution faster.
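To give an idea of what such a deploy-time check looks like, here is a heavily simplified sketch. It only handles straight-line code, where each instruction is described by a hypothetical (pops, pushes) pair; the actual EOF stack validation also has to deal with branches and function calls, which this sketch ignores.

```python
from typing import List, Tuple

STACK_LIMIT = 1024  # maximum EVM operand stack height


def validate_stack(instrs: List[Tuple[int, int]], start_height: int = 0) -> bool:
    """Return True if the straight-line code can never under- or overflow."""
    height = start_height
    for pops, pushes in instrs:
        if height < pops:
            return False          # would underflow at runtime
        height = height - pops + pushes
        if height > STACK_LIMIT:
            return False          # would overflow at runtime
    return True


PUSH = (0, 1)   # pushes one item
ADD = (2, 1)    # pops two items, pushes one

print(validate_stack([PUSH, PUSH, ADD]))   # True
print(validate_stack([PUSH, ADD]))         # False: ADD needs two items
```

Once code has passed a check like this at deploy time, the interpreter could skip the per-opcode height checks when executing it.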

I see two issues in practice. First, we will still need to maintain the un-optimized opcodes for the legacy EVM, which means duplicating a lot of logic: once with the additional checks, once without. Second, the speed improvements for the average contract are most likely minuscule, since most of the latency of these checks is probably optimized away by the processor when loading memory.

I would really like to see the speed improvements of a mainnet block with and without these checks.

Address Space Extension

EOF does not allow EXTCALL, EXTDELEGATECALL or EXTSTATICCALL to call addresses where any of the upper 12 bytes are set. This restriction allows us to use some of these bytes in the future as a bitmask for address space extension.

However, this is not enforced for legacy contracts, which makes it impossible to move to address space extension for the legacy EVM anyway.
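The check itself is simple. A minimal sketch, assuming the call target arrives as a 256-bit stack word represented as a Python integer (the function name is hypothetical):

```python
ADDRESS_BITS = 160   # 20-byte addresses
WORD_BITS = 256      # 32-byte stack words


def is_valid_eof_call_target(stack_word: int) -> bool:
    """True if none of the upper 12 bytes of the 32-byte word are set."""
    if not 0 <= stack_word < (1 << WORD_BITS):
        raise ValueError("not a 256-bit stack word")
    return (stack_word >> ADDRESS_BITS) == 0


print(is_valid_eof_call_target((1 << 160) - 1))   # True: largest 20-byte address
print(is_valid_eof_call_target(1 << 160))         # False: bit 160 is set
```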

Gas and code observability

EOF also gets rid of the GAS opcode and replaces CALL, DELEGATECALL and STATICCALL with variants that do not allow the user to specify how much gas they would like to send with the call. This makes it easier for us to change gas prices for opcodes in the future. It also gets rid of all code-inspection-related opcodes for EOF, like EXTCODEHASH, CODESIZE etc. Removing code introspection allows us to change or replace bytecode in the future, but I highly doubt that this will ever realistically happen.

I’m pretty sure that getting rid of gas observability will impact a few users, but they can still work around this by calling into a legacy contract which does the gas observation.

Disadvantages

Now that I’ve talked about the advantages of EOF (and admittedly already gave some of my thoughts about it), I want to discuss some of the disadvantages of EOF that I see.

Complexity

The biggest issue I see is that EOF is extremely complex. It introduces RJUMP, RJUMPI, RJUMPV, CALLF, RETF, JUMPF, EOFCREATE, RETURNCONTRACT, DATALOAD, DATALOADN, DATASIZE, DATACOPY, DUPN, SWAPN, EXCHANGE, RETURNDATALOAD, EXTCALL, EXTDELEGATECALL and EXTSTATICCALL.

All in all, EOF introduces 19 new opcodes. Additionally, it removes 16 opcodes present in the existing EVM from EOF and changes the behavior of multiple opcodes in both versions. These changes have already resulted in two consensus issues in Besu and evmone (used by Erigon) even before EOF made it to mainnet. Luckily these issues were found by Martin Holst Swende before they made it into a release.

All these new opcodes must be tested, especially because some of these (like EOFCREATE and RETURNCONTRACT) interact with each other.

Additionally, EOF introduces code and stack validation. This means that the code is verified before it is deployed. Code validity therefore becomes part of consensus, which means that bugs in the code or stack validation will result in consensus bugs. Even worse, if the code and stack validation spec has a bug, a faulty contract could be deployed, which would result in undefined behavior if called.

Impact analysis

As far as I know, no impact analysis has been done yet. While compiler teams have indicated that they like the changes, they have not published any reliable numbers on gas reductions, code size decrease or potential speed increases.

A change like EOF is really only useful if users (big protocols) move away from their existing contracts and migrate over to EOF contracts. There need to be very good reasons for protocols to switch to a new contract though, like significant gas cost reductions.

Security

As far as I know, there has been no analysis yet into the worst cases for the 19 new opcodes. I started to do some benchmarks for the code and stack validation, but they never went very far. I am not yet convinced that we have found the worst cases in the implementations.

The test suite provided by Danno, Ypsilon and the Testing team is quite comprehensive as far as I can tell.

Conclusion

I don’t think that EOF should go into Pectra, because the drawbacks strongly outweigh the potential benefits.

The biggest improvement that I see is the SWAPN opcode, as it directly impacts the developer experience. The DUPN opcode could be emulated by a SWAPN DUP1 SWAPN chain. However, the DUPN opcode is not exclusive to EOF; we could just ship it to the legacy EVM.

EOF is only really useful if you don’t have to care about the existing EVM. That’s why I would propose that L2 projects that have no or very little existing state adopt EOF first. The code for it has been written in almost every client already, so it would not be hard for L2 teams to take ownership of it and move it over the finish line.

I personally believe that L1 should stop shipping small incremental improvements and should focus on big unlocks like 4844 and Verkle. These changes will really enable new things to be built on Ethereum, while EOF in my opinion only makes existing things slightly faster or slightly cheaper.

If we decide to ship EOF on L1, I would hope for more analysis on the impact of EOF and the security aspects of it. We usually modify 1-2 opcodes per hardfork and we require a very thorough analysis (like with EIP-6780). I have not seen nearly the same amount of scrutiny with this proposal.

I know that people have been working on this and related changes for over four years now, but I don’t think that we should take personal feelings into consideration when deciding on the roadmap of Ethereum.