
Ethereum Storage Roadmap: Challenges and Opportunities
Author: EthStorage
Abstract
-
Growing storage demands pose significant challenges for Ethereum nodes.
-
Due to storage constraints, some clients have started pruning historical data, leading to inconsistent storage behaviors among full nodes in the network.
-
To ensure consistency across all clients, historical data pruning is being standardized through EIP-4444 and EIP-4844.
-
As a result, recovering the latest L1 or L2 state by replaying historical data now relies on centralized, out-of-protocol services, prompting exploration of more decentralized and Ethereum-aligned solutions.
-
Ethereum Portal Network is a lightweight, decentralized peer-to-peer (P2P) network designed for all types of Ethereum data, including historical data. It is built for resource-constrained devices and provides Ethereum JSON-RPC services. The history and beacon networks are nearly production-ready.
-
The EthStorage Network is an incentivized modular storage network dedicated to EIP-4844 BLOB data. To store BLOBs, users call the L1 storage contract’s put() method, paying ETH as storage fees while recording the BLOB hash on-chain. Over time, these storage fees are gradually distributed to storage providers who submit off-chain BLOB storage proofs. The EthStorage testnet is currently running on Ethereum’s Sepolia testnet, with multiple community participants successfully proving their local storage.
-
Future plans include developing a decentralized Ethereum state network, implementing storage proofs for variable-sized data, and enabling decentralized access directly from browsers.
Acknowledgments: We thank Piper Merriam from EF, Karthik Raju from Polychain, and Qiang from EthStorage for their feedback on this article.
Background
On October 22, 2023, Péter Szilágyi, lead developer of Go-Ethereum (Geth), expressed deep concerns on Twitter. He pointed out that while the Geth client retains all historical data, other Ethereum clients like Nethermind and Besu can be configured to delete certain historical Ethereum data (e.g., historical blocks and headers). This creates behavioral inconsistency across clients and places unfair pressure on Geth. This sparked intense discussion and debate within the Ethereum community regarding storage issues in the Ethereum roadmap.

Storage Challenges
Why have Nethermind and Besu chosen to stop storing historical data? What drives this decision? From our perspective, two main reasons stand out:
-
Storage requirements for Ethereum clients are becoming increasingly high.
-
There is no in-protocol incentive or penalty for storing Ethereum's historical data.
The first reason stems from rising storage demands when running an Ethereum client. To better understand these requirements, the pie chart below shows the storage distribution of a fresh Geth node as of block 18,779,761 on December 13, 2023.

As shown:
-
Total storage size: 925.39 GB
-
Historical data (blocks / transaction receipts): ~628.69 GB
-
State data in Merkle Patricia Trie (MPT): ~269.74 GB
The second reason is the lack of in-protocol incentives or penalties for storing historical blocks. While the protocol requires nodes to store all historical data, it fails to provide any mechanism to encourage storage or penalize non-compliance. Storing and sharing historical data becomes purely altruistic, allowing node operators to freely delete or modify historical data without consequences. In contrast, validator nodes must locally maintain and update the complete state to avoid slashing due to proposing/voting on invalid blocks.
Therefore, when storage costs become a major burden, it's unsurprising that some node operators choose to prune historical data. Without historical data, node clients can significantly reduce storage costs—from approximately 1TB down to around 300GB.
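As a quick check on these figures, the short sketch below recomputes the shares and the post-pruning footprint from the three numbers quoted above. The variable names are ours and purely illustrative; the "other" bucket is simply whatever the two itemized categories do not account for.

```python
# Figures quoted above for a fresh Geth node at block 18,779,761 (all in GB).
TOTAL_GB = 925.39
HISTORY_GB = 628.69   # historical blocks and transaction receipts
STATE_GB = 269.74     # state held in the Merkle Patricia Trie


def share(part_gb: float, total_gb: float = TOTAL_GB) -> float:
    """Return part_gb as a percentage of total_gb."""
    return 100.0 * part_gb / total_gb


# Whatever is not itemized in the breakdown above.
other_gb = TOTAL_GB - HISTORY_GB - STATE_GB
# Footprint of the same node once historical data is dropped.
pruned_gb = TOTAL_GB - HISTORY_GB

print(f"history: {share(HISTORY_GB):.1f}% of total")
print(f"state:   {share(STATE_GB):.1f}% of total")
print(f"other:   {other_gb:.2f} GB")
print(f"after pruning history: {pruned_gb:.2f} GB")
```

Historical data makes up roughly two thirds of the node's footprint, and removing it leaves a little under 300GB, which matches the reduction described above.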

Illustration: Nethermind configuration running a node without historical blocks — currently saves about 460GB in storage
With upcoming Ethereum data availability (DA) upgrades, storage challenges will intensify. The path toward scaling Ethereum DA begins with EIP-4844 in the Dencun upgrade, which introduces fixed-size binary large objects (BLOBs) and a separate fee market for them, priced through blobGasPrice. Each BLOB is 128KB, and EIP-4844 allows up to 6 BLOBs per block. To scale data throughput further, Ethereum plans to adopt 1D Reed-Solomon coding, initially supporting 32 BLOBs per block and eventually reaching 256 BLOBs per block at full scale.
If Ethereum DA operates at full capacity (256 BLOBs per block), the Ethereum DA network is expected to receive approximately 80 TB of DA data annually—a volume far beyond the storage capabilities of most nodes.
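The 80 TB estimate can be reproduced with a few lines of arithmetic. The sketch below assumes a 12-second block time, reads 128KB as 128 KiB, and assumes every block is full; the function names are ours.

```python
BLOB_BYTES = 128 * 1024              # one BLOB, reading 128KB as 128 KiB
BLOCK_TIME_SECONDS = 12              # assumed block time
SECONDS_PER_YEAR = 3600 * 24 * 365


def annual_da_bytes(blobs_per_block: int) -> int:
    """Bytes of BLOB data produced in one year if every block is full."""
    blocks_per_year = SECONDS_PER_YEAR // BLOCK_TIME_SECONDS
    return blobs_per_block * BLOB_BYTES * blocks_per_year


def to_tib(num_bytes: int) -> float:
    """Convert bytes to tebibytes (1 TiB = 1024**4 bytes)."""
    return num_bytes / 1024**4


# The three capacity levels mentioned above: EIP-4844, initial and full scaling.
for blobs in (6, 32, 256):
    print(f"{blobs:>3} BLOBs/block: {to_tib(annual_da_bytes(blobs)):.2f} TiB/year")
```

At 256 BLOBs per block this comes to just over 80 TiB per year, while the EIP-4844 limit of 6 BLOBs per block stays under 2 TiB per year.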

Ethereum Storage Roadmap and Its Implications

Vitalik's tweet posting the Ethereum roadmap, which mentions the "Purge," the part primarily related to storage
Rising storage costs have drawn attention from researchers in the Ethereum ecosystem. To address this and ensure consistency across all clients, proposals are being developed to formally define historical data pruning. Two key proposals are:
-
EIP-4444: Limit Historical Data in Execution Clients: This proposal allows clients to delete historical blocks older than one year. Assuming an average block size of 100KB, the upper limit for historical block data would be approximately 250 GB (100KB * (3600 * 24 * 365) / 12, assuming block time = 12 seconds).
-
EIP-4844: Shard Blob Transactions: EIP-4844 discards BLOBs older than about 18 days. Compared to EIP-4444, this is a more aggressive approach, limiting historical BLOB size to about 100 GB ((18 * 3600 * 24) * 128KB * 6 / 12, assuming block time = 12 seconds).
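Both bounds follow from the same formula: data per block multiplied by the number of blocks in the retention window. The sketch below evaluates it under the assumptions stated in the two items above (100KB average block size, 6 BLOBs of 128KB per block, 12-second block time), reading KB as KiB and GB as GiB. The helper names are ours.

```python
KIB = 1024
GIB = 1024**3


def blocks_in(days: int, block_time_seconds: int = 12) -> int:
    """Number of blocks produced in the given number of days."""
    return days * 24 * 3600 // block_time_seconds


def retained_gib(bytes_per_block: int, retention_days: int) -> float:
    """Upper bound on retained data, in GiB, for a fixed retention window."""
    return bytes_per_block * blocks_in(retention_days) / GIB


# EIP-4444: one year of blocks at an assumed average of 100 KiB per block.
eip4444_gib = retained_gib(100 * KIB, retention_days=365)
# EIP-4844: 18 days of BLOBs at the maximum of 6 x 128 KiB per block.
eip4844_gib = retained_gib(6 * 128 * KIB, retention_days=18)

print(f"EIP-4444 block history bound: {eip4444_gib:.1f} GiB")
print(f"EIP-4844 BLOB history bound:  {eip4844_gib:.1f} GiB")
```

The results land at roughly 250 GiB and 95 GiB, consistent with the rounded figures of 250 GB and 100 GB quoted above.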
What are the implications of deleting historical data across all clients? A primary issue is that new nodes can no longer sync to the latest state via “full sync” mode—where transactions are executed sequentially from genesis to the latest block. Instead, we must rely on “snap sync” or “state sync” methods to directly download the latest state from existing Ethereum nodes. This method is already implemented in Geth and serves as the default syncing mechanism.
Similarly, this consequence extends to all Layer 2 (L2) systems: new L2 nodes can no longer reach the latest L2 state by replaying L2 blocks from the L2 genesis to the present. Moreover, since L1 nodes do not maintain L2 state, an L2 “snap sync” cannot obtain the latest L2 state from L1, which violates a critical L2 assumption: that L2 inherits Ethereum’s security guarantees. The anticipated workaround depends on third parties such as Infura, Etherscan, or the L2 projects themselves to store historical L2 data or state snapshots. These are centralized solutions sustained by out-of-protocol, indirect incentives.
The core questions we aim to explore are:
-
Can we find better decentralized solutions for storage and access?
-
Is it possible to build solutions with direct incentives aligned with Ethereum—such as those built atop L1 contracts?
-
Building on all of the above, can we offer a fully decentralized, in-protocol, directly incentivized solution for Ethereum storage?
Solutions
Solution 1: Ethereum Portal Network
The Ethereum Portal Network is a lightweight, decentralized access network for connecting to the Ethereum protocol. It provides Ethereum JSON-RPC interfaces such as eth_call and eth_getBlockByNumber, translating JSON-RPC requests into P2P queries over a Distributed Hash Table (DHT), similar to the IPFS network. Unlike IPFS, which allows storage of any data type and is vulnerable to spam, the Portal P2P network exclusively hosts Ethereum data such as historical block headers and transaction data. This is enabled by built-in light-client validation techniques within the Portal network.
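From an application's point of view, a Portal client looks like any other Ethereum JSON-RPC endpoint. The sketch below builds and sends a standard eth_getBlockByNumber request using only the Python standard library. The endpoint URL is an assumption (a client exposing HTTP on a local port); adjust it to however your client is configured. The helper names are ours.

```python
import json
import urllib.request

# Assumed local HTTP endpoint of a Portal client; not taken from the article.
PORTAL_RPC_URL = "http://127.0.0.1:8545"


def build_request(method: str, params: list, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request object."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def call(method: str, params: list, url: str = PORTAL_RPC_URL, timeout: float = 10.0):
    """Send one JSON-RPC request over HTTP and return its 'result' field."""
    body = json.dumps(build_request(method, params)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    if "error" in reply:
        raise RuntimeError(f"JSON-RPC error: {reply['error']}")
    return reply["result"]


def get_block_by_number(number: int, full_transactions: bool = False):
    """Fetch a block; block numbers are hex-encoded quantities in JSON-RPC."""
    return call("eth_getBlockByNumber", [hex(number), full_transactions])


# Usage (requires a running client at PORTAL_RPC_URL):
#   block = get_block_by_number(1_000_000)
#   print(block["hash"])
```

The client resolves such a request by looking up the header and body in the DHT and validating them before replying, so the caller does not need to know which peers hold the data.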
A key feature of the Portal Network is its lightweight design and compatibility with resource-constrained devices. It can operate on nodes with only a few megabytes of storage and low memory, promoting decentralization. Even smartphones or Raspberry Pi devices could potentially join the network and contribute to Ethereum data availability.
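A small node can participate because it only takes responsibility for a slice of the data. The simplified sketch below shows one way such a scheme can work, based on our understanding of the Portal design: content and nodes share a 256-bit ID space, distance is the XOR of the two IDs, and a node stores only content within its advertised radius. Deriving the content ID with SHA-256 and the exact radius rule are assumptions made for illustration, not a restatement of the specification.

```python
import hashlib

ID_SPACE = 2**256  # nodes and content share a 256-bit identifier space


def content_id(content_key: bytes) -> int:
    """Map a content key to a 256-bit ID (SHA-256 is assumed here)."""
    return int.from_bytes(hashlib.sha256(content_key).digest(), "big")


def distance(node_id: int, cid: int) -> int:
    """XOR distance between a node ID and a content ID."""
    return node_id ^ cid


def should_store(node_id: int, cid: int, radius: int) -> bool:
    """A node keeps content whose distance falls within its radius."""
    return distance(node_id, cid) <= radius


def coverage_fraction(radius: int) -> float:
    """Fraction of the ID space, and so of uniformly spread content, a node covers."""
    return (radius + 1) / ID_SPACE


# A node with a small storage budget advertises a small radius.
small_radius = 2**248 - 1
print(f"share of network data held: {coverage_fraction(small_radius):.4%}")
```

Shrinking the radius shrinks the node's share of the data proportionally, which is what lets devices with very limited storage still contribute.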
Portal Network development aligns with the Ethereum client diversity philosophy, with clients implemented in Rust, JavaScript, and Nim. The Beacon Network and History Network are already available, while the State Network is under active development. Notably, the Portal Network does not provide direct incentives for data storage—all nodes operate altruistically.

Illustration: Trin, the Rust-based Portal Network client, running with a 100MB storage limit
Solution 2: EthStorage Network
The EthStorage Network is a decentralized, incentivized storage network specifically designed for storing EIP-4844 BLOBs, funded by the Ethereum Foundation's Ecosystem Support Program (ESP).
-
Minimal Trust: Unlike existing solutions requiring centralized data bridges, EthStorage relies on Ethereum’s consensus and a 1/m trust model involving permissionless EthStorage storage nodes. The BLOB storage process works as follows: users sign a transaction carrying the BLOB and invoke the storage contract’s put(key, blob_idx) method. The contract then records the BLOB hash on-chain. Storage providers subsequently download and store the BLOB directly from the Ethereum DA network, bypassing data bridge dependencies.
-
Aligned Storage Costs and Incentives: When calling the put() method, the transaction must include a storage fee (via msg.value) deposited into the contract. After successful off-chain storage and verification of storage proofs by nodes, this fee is gradually distributed to storage providers over time. Compared to Ethereum’s current one-time payment model (paid to proposers), EthStorage’s time-based fee distribution follows a discounted cash flow model—assuming storage costs decrease relative to ETH price over time. This innovation aligns fees with long-term storage contributions.
-
Proof of Storage: Inspired by data availability sampling, EthStorage performs sampling over saved BLOBs across time intervals. To efficiently verify samples on-chain, EthStorage leverages smart contracts and recent advances in SNARK technology.
-
Permissionless Operation: Any storage node in EthStorage can earn rewards by storing data and regularly submitting storage proofs on-chain.
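To make the time-based fee distribution more concrete, the sketch below models one possible discounted payout schedule. It is an illustrative assumption, not the contract's actual formula: the payment rate starts at rate0 per unit of time and decays exponentially with a hypothetical parameter decay, so the upfront fee is the integral of that rate over all future time, and each valid storage proof releases the portion covering the interval since the previous proof.

```python
import math


def upfront_fee(rate0: float, decay: float) -> float:
    """Fee paid at put() time: the integral of rate0 * exp(-decay * t) from 0 to infinity."""
    return rate0 / decay


def payout(rate0: float, decay: float, t_start: float, t_end: float) -> float:
    """Share of the fee released for proving storage over [t_start, t_end]."""
    if t_end < t_start:
        raise ValueError("t_end must not precede t_start")
    return (rate0 / decay) * (math.exp(-decay * t_start) - math.exp(-decay * t_end))


# Hypothetical parameters: 1 unit of fee per year initially, decay of 0.5 per year.
RATE0, DECAY = 1.0, 0.5
yearly = [payout(RATE0, DECAY, year, year + 1) for year in range(10)]

print(f"upfront fee:        {upfront_fee(RATE0, DECAY):.4f}")
print(f"paid out in 10 yrs: {sum(yearly):.4f}")
print(f"first-year payout:  {yearly[0]:.4f}")
print(f"tenth-year payout:  {yearly[-1]:.4f}")
```

Under this model the payouts shrink each year yet never exhaust the deposit, so a provider who keeps proving storage keeps earning, while the user pays once.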
From a modular blockchain perspective, EthStorage acts as a storage Layer 2 for Ethereum, charging storage fees instead of transaction fees. By indexing BLOB hashes on-chain, EthStorage functions as a modular storage layer for Ethereum, enhancing storage scalability and reducing costs (targeting ~1000x improvement).
In terms of development, EthStorage has been integrated with EIP-4844 on Ethereum’s Sepolia testnet. We’ve conducted stress tests between EthStorage and the Sepolia testnet, including writing hundreds of gigabytes of BLOBs into EthStorage. Over 100 community participants have joined the network and successfully proven their local storage.
The main advantage of the EthStorage Network is providing decentralized, direct incentives on top of Ethereum—a pioneering feature as far as we know. However, its limitation lies in being specifically designed for fixed-size BLOBs.

Dashboard of EthStorage on Ethereum Sepolia testnet
Looking Ahead
Although Ethereum storage has yet to receive widespread attention, it holds significant importance within the Ethereum ecosystem. With the rapid growth of the Ethereum network, the storage and accessibility of Ethereum data have become critical challenges. Both the Portal Network and EthStorage Network are still in early stages, and several important long-term directions deserve focus:
-
Decentralized, Low-Latency Access to Ethereum State Data: Accessing Ethereum state in a decentralized and verifiable manner is crucial but challenging. Using traditional DHT models, querying account information often requires multiple round-trips to internal trie nodes stored across different P2P nodes, resulting in high latency. Leveraging the structure of the state tree to accelerate access is key. The upcoming State Network in the Ethereum Portal Network aims to solve this problem.
-
Integration of Portal and EthStorage Networks: The Portal Network can seamlessly extend to support BLOB data. The EthStorage team has partially implemented this functionality. The next step is unifying these networks to provide a decentralized JSON-RPC network enabling programmable access to BLOBs via smart contracts. By combining application logic in contracts with scalable BLOB storage from EthStorage, we can enable new dApps on Ethereum—such as dynamic decentralized websites (e.g., decentralized Twitter/YouTube/Wikipedia).
-
Decentralized Browser Access: Similar to the ipfs:// protocol for accessing data in IPFS, the web3 industry needs a native Ethereum access protocol to allow browsers to directly retrieve Ethereum data, unlocking the vast potential of rich Ethereum data. This includes diverse domains—from token ownership and account balances to NFT images and dynamic decentralized websites—all enhanced by smart contracts and future Ethereum storage capabilities. In this space, the web3:// protocol defined by ERC-4804/6860 is actively being developed and promoted to achieve this goal.
-
Advanced Proofs of Storage for Variable-Sized Data: Beyond fixed BLOBs, exploring advanced storage proofs for variable-sized data (e.g., historical blocks or even state objects) is essential. Developing sophisticated algorithms will enhance the adaptability of storage solutions.
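The latency concern raised in the first item above can be estimated with a rough model. Because account keys are hashed, the hexary state trie is close to balanced, so its depth grows with the base-16 logarithm of the number of accounts. If every trie node on the path has to be fetched from a different peer, one account lookup costs about that many sequential round-trips. The account count and round-trip time below are hypothetical inputs, and the model ignores extension and leaf node details.

```python
def approx_trie_depth(num_accounts: int, branching: int = 16) -> int:
    """Smallest depth at which a balanced trie can hold num_accounts leaves."""
    depth, capacity = 0, 1
    while capacity < num_accounts:
        capacity *= branching
        depth += 1
    return depth


def sequential_lookup_ms(num_accounts: int, round_trip_ms: float) -> float:
    """Latency when each trie level needs its own network round-trip."""
    return approx_trie_depth(num_accounts) * round_trip_ms


# Hypothetical inputs: 250 million accounts, 100 ms per DHT round-trip.
ACCOUNTS, RTT_MS = 250_000_000, 100.0
print(f"trie depth:    {approx_trie_depth(ACCOUNTS)} levels")
print(f"lookup time:   {sequential_lookup_ms(ACCOUNTS, RTT_MS):.0f} ms")
```

Any scheme that fetches several levels of the path at once, or addresses leaves directly, reduces the number of sequential round-trips, which is why exploiting the trie structure matters for a state network.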
Through these efforts, we hope to collectively contribute to the Ethereum roadmap and lay the foundation for decentralized storage solutions in the future Ethereum ecosystem.