B003: Maximal Gas Optimizations
In the past, projects rarely cared about gas-optimized code and focused their code reviews entirely on security. However, with gas spikes becoming more prevalent than ever, gas optimization has become a trending topic that a lot of folks are interested in.
In this article we will attempt to explain some hidden gas costs the EVM incurs and how to avoid them by coding in a gas-centric way. Before we explain these costs, we need to shed a little more light on how exactly gas operates in Ethereum and briefly explain why it is necessary.
Transaction Fees
Each transaction on Ethereum and any EVM-based blockchain consumes gas for its execution. While simple transactions between two accounts have a set gas cost, complex smart-contract-based transactions can consume an arbitrary amount of gas which is dictated by the EVM statements the transaction executes.
While the gas consumed by a transaction is important, on its own it does not indicate the actual fee of the transaction. To calculate the final fee, the gas consumed needs to be multiplied by the gas price of the transaction, which is defined by the transaction’s submitter.
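The multiplication described above can be sketched in Solidity itself. The contract and function names below are hypothetical and exist purely for illustration, and the 50 gwei gas price is an assumed figure rather than a real market value.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper contract, for illustration only.
contract FeeExample {
    // fee (in wei) = gas consumed * gas price (in wei per unit of gas)
    function transactionFee(uint256 gasUsed, uint256 gasPrice) public pure returns (uint256) {
        return gasUsed * gasPrice;
    }

    // A plain transfer between two accounts consumes 21,000 gas. At an
    // assumed gas price of 50 gwei the fee is 1,050,000 gwei (0.00105 ether).
    function simpleTransferFee() external pure returns (uint256) {
        return transactionFee(21_000, 50 gwei);
    }
}
```

The same amount of gas therefore becomes more or less expensive purely as a function of the price the submitter is willing to pay per unit.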
The above combination leads to a dynamic market that defines what the ideal gas price for a transaction should be for it to be accepted by the blockchain’s miner network. As miners are directly awarded the transaction fees, they are incentivized to include the transactions with the highest gas price first.
Although the above auction-style system is meant to change soon, the core idea will remain the same whereby each unit of gas needs to be paid for using actual funds.
The rationale behind gas is simple: a complex transaction performs state changes on the blockchain that are expensive to execute in hardware terms and, as such, an organic throttling mechanism needs to be in place to prevent denial-of-service attacks and similar abuse.
Compilation Optimizer
This is a widely misunderstood tool that most people fail to utilize properly. Although two optimizers are currently available in the latest version of Solidity, we will focus on the “old” optimizer that operates on the opcode level.
The optimizer attempts to simplify the opcode representations by reducing the number of instructions required to reach the same final result and additionally removes unused code. As the optimizer does not operate at the code level, it will not be able to pick up redundancies such as reading the length value of a storage array during a for loop.
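As a minimal sketch of that redundancy, assume a hypothetical contract with a storage array named values (neither the contract nor its functions come from a real project). The first loop reads the array length from storage in its condition on every iteration, while the second reads it once into a local variable.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract, for illustration only.
contract LengthCaching {
    uint256[] private values;

    function push(uint256 value) external {
        values.push(value);
    }

    // The loop condition reads values.length from storage on every iteration.
    function sumNaive() external view returns (uint256 total) {
        for (uint256 i = 0; i < values.length; i++) {
            total += values[i];
        }
    }

    // The length is read from storage once; the loop condition then
    // compares against a local (stack) variable.
    function sumCached() external view returns (uint256 total) {
        uint256 length = values.length;
        for (uint256 i = 0; i < length; i++) {
            total += values[i];
        }
    }
}
```

Measuring both functions in a gas-cost unit test is the safest way to confirm how much the cached variant saves for a given compiler version and optimizer configuration.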
The “runs” value is meant to represent how many times each code segment is expected to be run over the lifetime of the contract, and it expresses the trade-off between contract bytecode size (low value) and contract runtime efficiency (high value). Due to this, a lot of people insert a ridiculously large “runs” value, but this is not necessarily correct.
A value of 400–1000 is usually ideal given that the numbers above this range may actually result in a higher gas cost due to having expanded the code to too many EVM statements. On the other hand, a low value will not have expanded the code at all leading to small bytecode size but unoptimized gas consumption.
Overall, there is no “golden number” and projects should play around with this number along with their gas-cost unit tests to identify the ideal trade-off that should be specified for their particular needs.
Costly Notions
Our primary focus will be on how to rigorously optimize the gas cost consumed by smart contract transactions. Before moving forward to short, tangible examples, we will showcase how a seemingly simple code example can be optimized in multiple ways.
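As a starting point, consider a minimal sketch of the kind of code this section optimizes. It is a hypothetical reconstruction: the Registry contract, its Entry struct, and every field and function name are assumptions chosen to match the costs listed below, not code taken from a real project.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract; all names are assumptions made for illustration.
contract Registry {
    struct Entry {
        bytes8 id;         // duplicates the key used for the mapping
        address owner;
        uint256 timestamp; // Unix timestamp
    }

    mapping(bytes8 => Entry) private entries;

    function register(bytes8 id) external {
        entries[id].id = id;
        entries[id].owner = msg.sender;
        entries[id].timestamp = block.timestamp;
    }

    function get(bytes8 id) external view returns (bytes8, address, uint256) {
        // Three separate mapping lookups, each keyed by the bytes8 identifier.
        return (entries[id].id, entries[id].owner, entries[id].timestamp);
    }
}
```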
The code segment above, although completely valid, contains a lot of redundancy that significantly increases its gas cost. Broken down, these are its hidden gas costs:
- Contains three mapping lookups
- Reads a full storage slot three times
- Upcasts a bytes8 three times
Mapping Lookups
Let’s deal with the first issue. Within Solidity, a mapping entry is located by computing the keccak256 hash of its key together with the storage slot of the mapping. This means that, regardless of the key type, a keccak256 operation will be performed each time a particular entry is requested. To avoid this, a “pointer” to the entry should be stored in a local storage variable to ensure that the lookup occurs only once.
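A minimal sketch of the “pointer” approach follows, reusing the same hypothetical Entry struct and entries mapping (all names are assumptions for illustration). The entry is looked up once and every field is then accessed through the local storage reference.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract; all names are assumptions made for illustration.
contract PointerRegistry {
    struct Entry {
        bytes8 id;
        address owner;
        uint256 timestamp;
    }

    mapping(bytes8 => Entry) private entries;

    function register(bytes8 id) external {
        // One lookup; the storage reference is reused for every write.
        Entry storage entry = entries[id];
        entry.id = id;
        entry.owner = msg.sender;
        entry.timestamp = block.timestamp;
    }

    function get(bytes8 id) external view returns (bytes8, address, uint256) {
        // One lookup instead of three.
        Entry storage entry = entries[id];
        return (entry.id, entry.owner, entry.timestamp);
    }
}
```

Depending on the compiler version and optimizer settings, some repeated lookups may already be de-duplicated, so the saving is best confirmed with a gas-cost unit test.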
Storage Access
By far, storage reads and writes are the most costly operations within Solidity. A programmer’s primary goal when reducing gas costs should be to minimize the total storage reads and writes performed by a particular function. This can be done by avoiding data redundancy, such as the struct containing the key used for the mapping, or by applying data optimization techniques.
One of those is Solidity’s inherent tight-packing mechanism. When variables are declared at the contract level or the struct level, their declaration order impacts the way they will be packed under the same storage slot. Overall, if two consecutively declared variables can fit in a single 32-byte (256-bit) storage slot, they will be packed into one by the compiler automatically.
This may not always lead to a reduction in gas cost, as will be explained in the next section. In this instance, however, we already have an address type declared in our struct that occupies 160 bits. As timestamp is meant to represent a Unix timestamp, it is safe to use the remaining 96 bits for storing the timestamp.
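The packed layout can be sketched as follows, again with hypothetical names. The redundant key is dropped from the struct, and the timestamp is narrowed to uint96 so that it shares a single 256-bit slot with the 160-bit address.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract; all names are assumptions made for illustration.
contract PackedRegistry {
    struct Entry {
        address owner;    // 160 bits
        uint96 timestamp; // 96 bits, packed into the same slot as owner
    }

    mapping(bytes8 => Entry) private entries;

    function register(bytes8 id) external {
        // Both fields are written together and occupy one storage slot.
        // The explicit conversion truncates, which is safe for Unix timestamps.
        entries[id] = Entry(msg.sender, uint96(block.timestamp));
    }
}
```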
As a final storage optimization, we will change our local Entry declaration from storage to memory. When a struct is declared as memory, all its variables are immediately read from storage and copied to memory, thus minimizing storage reads if the whole struct is meant to be utilized.
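A minimal sketch of the memory copy, under the same assumed names: the packed struct is copied from storage into memory in one step, and both fields are then read from the memory copy.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract; all names are assumptions made for illustration.
contract MemoryRegistry {
    struct Entry {
        address owner;
        uint96 timestamp;
    }

    mapping(bytes8 => Entry) private entries;

    function register(bytes8 id) external {
        entries[id] = Entry(msg.sender, uint96(block.timestamp));
    }

    function get(bytes8 id) external view returns (address, uint96) {
        // The whole struct is copied from storage to memory here; the
        // fields below are read from memory rather than from storage.
        Entry memory entry = entries[id];
        return (entry.owner, entry.timestamp);
    }
}
```

This choice only pays off when most of the struct is actually used; if a function needs a single field of a large struct, a storage reference avoids copying the fields it never touches.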
Type Upcasting
The final point to optimize here is data upcasting. The EVM is built to handle 32-byte data, meaning that any smaller type is emulated using upcasting and downcasting techniques that clean up the extraneous bits.
As a result of these specialized operations, operations generally cost more when performed on less-than-256-bit data types, unless the smaller type allows variables to be packed into a shared storage slot as shown earlier. In most contract systems I have observed, the usage of smaller data types is usually unwarranted and done for verbosity reasons; however, I personally advise against that, as gas optimizations should take precedence over code verbosity.
To further optimize our code segment, the bytes8 identifier can simply be changed to bytes32, thus preventing the upcasting operations and ensuring that the gas cost consumed by processing it in the keccak256 operation of the mapping lookups is as small as it can be.
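Putting the three changes together, a finalized version might look like the sketch below. As before, the contract and its names are hypothetical and serve only to illustrate the techniques discussed in this article.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract; all names are assumptions made for illustration.
contract OptimizedRegistry {
    struct Entry {
        address owner;    // 160 bits
        uint96 timestamp; // 96 bits, packed into the same slot as owner
    }

    // A full-width bytes32 key requires no upcasting before hashing.
    mapping(bytes32 => Entry) private entries;

    function register(bytes32 id) external {
        // Single lookup and a single storage slot written.
        entries[id] = Entry(msg.sender, uint96(block.timestamp));
    }

    function get(bytes32 id) external view returns (address, uint96) {
        // Single lookup; the packed struct is copied to memory once.
        Entry memory entry = entries[id];
        return (entry.owner, entry.timestamp);
    }
}
```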
Comparison
To illustrate how simple statements can significantly affect the gas cost of a function, we will compare the finalized code above with the initial bullet list:
- Contains three mapping lookups → Contains a single mapping lookup
- Reads a full storage slot three times → Reads a full storage slot once
- Upcasts a bytes8 three times → Contains no upcasting
Conclusion
A lot more optimizations are available within Solidity, yet they remain relatively unknown or at least not widely applied, most likely due to the low gas costs that Ethereum had when the first Solidity developers joined the ecosystem.
In this article we showcased three ways code can be optimized; however, there are numerous other ways to optimize code in Solidity. If more interest is garnered, I will create a follow-up post that dives straight into optimizations and showcases some more hidden ones, such as using access control guarantees or avoiding library redundancy.