What is a blockchain oracle?

2023.10.30

An oracle is not itself a data source, but a tool that retrieves and verifies external data, then forwards it to smart contracts.

2023.10.30 - 06:28:09

Blockchain oracles act as bridges between blockchains and the external world, enabling smart contracts to access off-chain data.

An oracle is a third-party service that retrieves and verifies external information, then transmits it to smart contracts running on blockchains.

By giving smart contracts a way to interact with off-chain data, oracles extend what those contracts can do.

Without oracles, smart contracts would be confined to on-chain data and unable to obtain external information.

Take a basic example: Alice and Bob bet on a horse race. Both players can lock their funds into a smart contract that distributes the money to the winner based on the real-world race result.

Although smart contracts cannot directly interact with the outside world, a third-party oracle can retrieve the race outcome by querying a trusted API and transmit the result to the smart contract, which then determines the winner and allocates the funds accordingly.
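The horse-race bet can be sketched in a few lines of Python. This is only an illustration of the flow, not a real contract: the `RaceBet` class, the `oracle-1` identifier, and the horse names are all hypothetical, and the payout rule (winner takes the pot, refund if nobody picked the winner) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class RaceBet:
    """Toy escrow standing in for a smart contract (hypothetical, not a real contract API)."""
    oracle: str                                  # the only party allowed to report the result
    stakes: dict = field(default_factory=dict)   # bettor -> (horse, amount)
    settled: bool = False

    def place_bet(self, bettor: str, horse: str, amount: int) -> None:
        if self.settled:
            raise RuntimeError("bet already settled")
        self.stakes[bettor] = (horse, amount)

    def report_result(self, sender: str, winning_horse: str) -> dict:
        """Called by the oracle once the off-chain race result is known."""
        if sender != self.oracle:
            raise PermissionError("only the oracle may report the result")
        if self.settled:
            raise RuntimeError("bet already settled")
        self.settled = True
        pot = sum(amount for _, amount in self.stakes.values())
        winners = [b for b, (horse, _) in self.stakes.items() if horse == winning_horse]
        if not winners:
            # Assumed rule: nobody picked the winner, so every stake is refunded.
            return {b: amount for b, (_, amount) in self.stakes.items()}
        share = pot // len(winners)
        return {b: share for b in winners}


bet = RaceBet(oracle="oracle-1")
bet.place_bet("alice", "Thunder", 100)
bet.place_bet("bob", "Comet", 100)
payouts = bet.report_result("oracle-1", "Thunder")  # the oracle relays the race result
```

The contract never looks at the race itself. It only trusts whatever the designated oracle reports, which is why the reliability of that oracle matters so much.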

Oracles serve as bridges between the external world and the world of smart contracts.

Note that oracles themselves are not data sources, but tools that retrieve, verify, and forward external data to smart contracts. They can transmit various types of information, such as price data, payment confirmations, or sensor measurements.

Moreover, oracles must preserve the inherent characteristics of smart contracts—trustlessness and decentralization—while transmitting this data.

Essentially, this is the core problem oracles aim to solve: ensuring the reliability, authenticity, and trustworthiness of off-chain data serving smart contracts, while eliminating single points of failure and vulnerabilities.

Types of Oracles

There are many types of blockchain oracles available, each serving different purposes.

We can classify oracles based on data source type (hardware or software), direction of information flow (inbound or outbound), and trust model (centralized or decentralized). Each oracle type has unique features and advantages.

Hardware Oracles: Collect data from the physical world, such as information from motion sensors or RFID tags.

Software Oracles: Gather data from digital sources like websites, servers, or databases. Often used to provide real-time data such as exchange rates or price changes.

Inbound Oracles: Primarily transmit off-chain or real-world data onto the blockchain. Can trigger specific actions based on off-chain events.

Outbound Oracles: Send blockchain data to the external world. Can provide updates about on-chain events to external systems.

Centralized Oracles: Operated and managed by a single entity, relying on a single data source. This introduces risks due to single points of failure, making smart contracts vulnerable to attacks.

Decentralized Oracles: Utilize multiple data sources and consensus mechanisms to deliver more reliable and tamper-resistant data. Minimize counterparty risk and enhance the credibility of information used by smart contracts.

Human Oracles: Individuals with specialized expertise who act as data sources. They collect information, verify its validity, and input it into smart contracts. Human oracles can use cryptographic techniques to authenticate their identity and provide trustworthy data.

Contract-Specific Oracles: Designed for specific smart contracts to meet their unique needs. However, they require additional effort to operate and maintain, and may lack general applicability.

Computational Oracles: Perform complex computations and return results on-chain. These computations are often too difficult or expensive to execute directly on-chain. Such oracles are particularly valuable under network gas constraints and high computational costs.

Decentralized Oracles

Blockchain oracles are essential for any sophisticated and valuable smart contract service.

The use cases for blockchain oracles span numerous industries, including geolocation tracking (supply chain analytics, IoT), sports (prediction markets), weather (travel, agriculture), time and interval data (automation), and our primary focus—financial and capital market-related data.

Decentralized Finance (DeFi) aims to bring the world more efficient, transparent, and fair financial markets.

To achieve this, DeFi applications need reliable, trustless access to a broad range of data: asset prices (from cryptocurrencies to real estate), benchmark reference data (interest rates, funding rates), volatility, and market impact data, among others.

Indeed, since the "DeFi Summer" of 2020, the rapid expansion of the DeFi industry has highlighted an urgent need for comprehensive, accessible, and robust oracle market data.

Additionally, oracle infrastructure must deliver high-quality data, seamlessly integrate with any L1/L2 blockchain, and scale to meet the growing demands of an increasingly complex DeFi ecosystem.

In DeFi, price feed oracles remain the most prominent and widely discussed type. The history of price feed oracle design is nearly as long as that of smart contracts, yet existing architectures still reveal limitations.

In the following discussion, we will focus on several key questions:

Why do we need blockchain and price feed oracles, and why are they important?
What are the requirements for current oracle designs, and are they effective?
What alternative designs could address existing issues?

Clearly, oracles will continue to play a critical role in blockchain. However, existing oracle networks have already revealed flaws that prevent DeFi from scaling to the level it requires.

Traditional oracle solutions typically rely on intermediaries (nodes) to validate and aggregate data, leading to latency, opaque data sources, and cross-chain scalability issues driven by cost.

Currently, a new oracle network architecture is emerging—one focused on a "pull" rather than a "push" model, incentivizing highly credible data owners and creators to publish their data directly.

Why Do We Need Price Oracles?

The main category of these oracles is known as price feed oracles, which provide pricing data for assets such as cryptocurrencies, stocks, and commodities.

To illustrate their importance, consider a few examples:

Derivatives Protocols: Must provide traders with accurate asset prices and liquidate undercollateralized positions promptly.

DEX Aggregators: Source liquidity from various decentralized exchanges, requiring accurate oracle price data to identify optimal prices and execute trades with minimal slippage.

Stablecoins: Crypto-collateralized stablecoins need oracle data to ensure positions are sufficiently collateralized and maintain their peg to the underlying asset.

Lending Protocols: These protocols often operate using dynamic interest rates, a function of current asset prices. Delayed or inaccurate price data can harm overall protocol liquidity health, especially during periods of price volatility.
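To see why price accuracy matters for collateralized positions, consider a minimal sketch of a liquidation check. The function names, the 150% minimum ratio, and all amounts are hypothetical values chosen for illustration, not parameters of any particular protocol.

```python
def collateral_ratio(collateral_amount: float, oracle_price: float, debt_value: float) -> float:
    """Value of the collateral (priced by the oracle) divided by the outstanding debt."""
    return (collateral_amount * oracle_price) / debt_value


def is_liquidatable(collateral_amount: float, oracle_price: float,
                    debt_value: float, min_ratio: float = 1.5) -> bool:
    """A position can be liquidated once its ratio falls below the required minimum."""
    return collateral_ratio(collateral_amount, oracle_price, debt_value) < min_ratio


# A position with 10 units of collateral backing 12,000 USD of debt.
stale_view = is_liquidatable(10, oracle_price=2000.0, debt_value=12_000.0)  # uses an old price
fresh_view = is_liquidatable(10, oracle_price=1700.0, debt_value=12_000.0)  # uses the current price
```

With the stale price the position looks healthy, while the fresh price shows it is undercollateralized. A protocol relying on the stale value would miss the liquidation.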

We cannot rely solely on a single data source, as this creates a centralized point of failure—contrary to the spirit of DeFi. Instead, we need tamper-resistant, timely data.

This is easier said than done. Due to the critical role oracles play in DeFi, they frequently become prime targets for attacks. Nevertheless, having reliable and robust data sources is crucial for any DeFi project.

This is why oracles are often referred to as the backbone of DeFi. As the DeFi space continues to evolve and expand, the need to quickly and reliably access attack-resistant data will grow increasingly vital.

Now that we understand the background of oracles, let’s examine existing oracle architectures.

The Current State of Price Oracles

A common oracle network design, known as the “reporter oracle network,” relies on multiple independent nodes operating as intermediaries between off-chain data sources (such as market data experts or public APIs) and blockchain applications (end users).

In reporter networks, intermediary nodes retrieve data from off-chain sources and then deliver it—the “last mile”—to the target blockchain.

These nodes also handle data aggregation, validation, and proof generation.

For example, suppose 100 nodes are tasked with retrieving the BTC price at a given moment.

They will retrieve prices from various sources (e.g., each node might use around 30 sources on average), then aggregate their reported prices to produce a single average or median value.

Most nodes will likely report the correct price, while a small number may report incorrect values due to poor data sources.

Finally, the oracle network aggregates the majority of node reports and publishes the result as the correct data.

To keep nodes operational and honest, economic incentives are typically employed.

Nodes reporting accurate prices receive token-based rewards, while those submitting incorrect data face penalties through mechanisms like slashing.
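The aggregation and incentive steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the node names, the reported prices, and the 1% tolerance band used to decide who is rewarded or slashed are all hypothetical, and real networks use more elaborate rules.

```python
from statistics import median


def aggregate_reports(reports: dict, tolerance: float = 0.01):
    """Take the median of node reports, then split nodes by distance from it.

    Nodes within `tolerance` (as a fraction of the median) are marked for a
    reward; the rest are marked for slashing.
    """
    agg = median(reports.values())
    rewarded = sorted(n for n, p in reports.items() if abs(p - agg) <= tolerance * agg)
    slashed = sorted(n for n in reports if n not in rewarded)
    return agg, rewarded, slashed


reports = {
    "node-a": 30010.0,
    "node-b": 29990.0,
    "node-c": 30000.0,
    "node-d": 30005.0,
    "node-e": 45000.0,  # a node fed by a bad data source
}
price, rewarded, slashed = aggregate_reports(reports)
```

Using the median rather than the mean means the single bad report from `node-e` does not drag the published price away from what the other nodes observed.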

This oracle design offers several key advantages:

Security: Multiple data sources and intermediary nodes make it difficult for any party to manipulate the network and influence the final price output.

Data Sources: A wide array of sources ensures the oracle accesses diverse pricing information, generally improving accuracy and reliability.

Blockchain-Agnostic: Any blockchain network can adopt this design, as long as it supports node deployment for block validation.

However, this design also has drawbacks.

Having multiple nodes verify data, aggregate it, and reach consensus is inefficient. Existing oracle deployments update data roughly every 15 minutes, which is far too slow for globally scalable blockchains.

If frequent price updates are required across many asset pairs, associated network costs (e.g., ETH gas fees) rise rapidly, reducing the number of available asset pairs.

Without substantial gas subsidies, network congestion cannot be resolved. The rising gas costs required to support a growing node network must ultimately be borne or subsidized by users.

This limitation severely hampers the scalability of reporter networks in supporting more data or users.
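A rough back-of-the-envelope calculation shows how quickly these costs grow. Every number below (50 feeds, 100,000 gas per update, a gas price of 30 gwei) is a hypothetical assumption for illustration, not a measurement of any real oracle network.

```python
def daily_push_cost_gwei(feeds: int, updates_per_day: int,
                         gas_per_update: int, gas_price_gwei: int) -> int:
    """Total daily cost, in gwei, of pushing every update for every feed on-chain."""
    return feeds * updates_per_day * gas_per_update * gas_price_gwei


GWEI_PER_ETH = 1_000_000_000

# One update every 15 minutes is 96 updates per day.
every_15_min = daily_push_cost_gwei(50, 96, 100_000, 30)
# One update per second is 86,400 updates per day.
every_second = daily_push_cost_gwei(50, 86_400, 100_000, 30)

cost_15_min_eth = every_15_min / GWEI_PER_ETH
cost_ratio = every_second // every_15_min
```

Under these assumptions, pushing 50 feeds every 15 minutes costs 14.4 ETH per day, and moving to one update per second multiplies that by 900. The cost scales linearly with both the number of feeds and the update frequency.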

Furthermore, data sources in reporter networks are often opaque. Data is typically aggregated off-chain in a non-transparent manner and published on-chain without full visibility—directly contradicting blockchain's goal of transparency.

Thus, while the entities operating data-providing nodes are known, their ultimate data sources remain unknown. This is particularly concerning during periods of high volatility, when data sources may update infrequently or lack granularity.

In fact, upstream data providers may not even know their data is being used to secure smart contract value, further compromising data quality and reliability.

This doesn’t even address data legality: Some data providers prohibit their data from being relayed to public ledgers, as they wish to restrict distribution to subscribers only.

Reporter network designs are specifically tailored for publicly accessible on-chain data—and have played a significant role in advancing DeFi to its current state.

However, as we strive to bring DeFi to billions globally, addressing the limitations of traditional oracle architectures is crucial.

In a previous article, we compared reporter oracle networks with newer oracle architectures, emphasizing the need for more transparent, economical, and scalable oracle solutions.

Future price oracles must be ready to scale to all trading pairs familiar in traditional finance (TradFi) and support every blockchain developers choose to build on.

The Pyth Oracle Network introduces a publisher-based oracle network design that rethinks the types of data price oracles should retrieve, the data sources they select, and the relationship between data owners and users.

Let’s explore this new architecture.

Rethinking Price Oracles

The financial data industry is massive. The largest U.S. exchanges earn billions annually just from selling market data. Given this observation, it may be wise to reconsider some fundamental assumptions about the data sources for price oracles.

For instance, public price data exists online, provided by free aggregators like Yahoo! Finance or Google Finance.

This data isn't very granular—for example, U.S. stock prices are often delayed by 15 minutes or more due to regulations.

Meanwhile, much valuable data is closely guarded by institutions: accurate, timely information holds immense value. Exchanges and data terminal services like Bloomberg or Refinitiv understand this and charge substantial subscription fees.

The implicit assumption behind reporter oracle networks is that all data needed by blockchains is freely available online. By incentivizing intermediary nodes to collect, verify, and transmit data, DeFi can track global market movements.

In reality, valuable financial data is restricted to a few privileged parties and not easily accessible. Rewarding nodes for retrieving and relaying data works for certain data types but fails for capital market data where speed is critical and information is a core competitive advantage.

This approach is also constrained by quality, efficiency, and even legal limitations in supporting larger node networks.

Pyth Network takes a fundamentally different approach: the oracle network incentivizes highly credible parties—owners and creators of valuable data—to voluntarily and directly publish their data to the oracle network.

On-chain programs use price aggregation mechanisms to eliminate outliers, while cross-chain bridges sign and verify all price data sent to target blockchains.
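One way an aggregation step can blunt outliers is sketched below. It is an illustrative assumption, not a description of the actual on-chain program: each publisher submits a price with a confidence interval and casts three votes (price minus confidence, price, price plus confidence), and the aggregate is the median of all votes. The publisher values are made up.

```python
from statistics import median


def aggregate_with_confidence(quotes: list) -> float:
    """Median over votes cast by each (price, confidence) quote.

    Each quote contributes three votes: price - conf, price, price + conf.
    """
    votes = []
    for price, conf in quotes:
        votes.extend([price - conf, price, price + conf])
    return median(votes)


quotes = [
    (100.0, 0.5),
    (100.2, 0.3),
    (99.9, 0.4),
    (150.0, 1.0),  # an outlier publisher
]
aggregate = aggregate_with_confidence(quotes)
```

Because the outlier's three votes all sit at the top of the sorted list, the median stays inside the range reported by the other three publishers.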

In this publisher oracle network, data providers run their own nodes and publish data directly on-chain.

This design eliminates reliance on intermediary nodes, resulting in higher-quality data, greater gas efficiency, and ultimately, higher scalability for the oracle network to support thousands of price feeds.

First-Party Data Sources

Trusted institutions supplying data to Pyth Network are called data providers or “publishers.” Data providers are typically established institutions with abundant high-quality data, including global exchanges, market makers, and trading firms.

Some of the most notable include Cboe, Jane Street, Optiver, Binance, OKX, QCP Capital, Two Sigma, Wintermute, and CMS. There are currently over 80 data providers in the network.

All these data providers are first-party sources: they create and thus own the price data they supply, either because they are order-execution venues (where traders intend to trade) or traders themselves (executing trades at specified prices).

In reporter networks, nodes must search for or purchase data from other intermediaries or first-party sources, making them third-party data sources.

First-party data ensures data quality and network security. Every data provider’s contribution to a Pyth price feed means individual sources are accountable for the quality of their input.

Additionally, data providers’ reputations—and the damaging impact malicious attacks would have on their entire business—serve as a strong deterrent against traditional oracle attack vectors.

It’s also clear these institutions possess far higher-quality data than what simple web scraping or public aggregators can offer.

Moreover, since these data sources own their data, they can distribute it to blockchain applications without intellectual property concerns.

Diving Deeper: How Pyth Works

The Pyth Network protocol enables first-party data providers to publish their proprietary price information on-chain for public use.

The protocol facilitates interactions among three parties:

Data Providers: Reputable institutions submit price data directly to Pyth’s on-chain oracle program. For any price feed product (e.g., BTC/USD), multiple providers publish data to ensure accuracy and robustness.

Pyth Oracle Program: Runs on the Pythnet application chain. It securely and transparently aggregates submitted data to generate a composite price.

Users: Consumers of Pyth’s data. Typically decentralized applications such as Synthetix, Ribbon, and CAP Finance.

Pythnet Application Chain

In August 2022, Pyth Network launched Pythnet, an application-specific blockchain that enables Pyth data to be aggregated and published across blockchains via the Wormhole cross-chain bridge.

Built on Solana technology but eventually separated from the Solana mainnet, Pythnet allows data providers to submit data for aggregation. Through Wormhole, aggregated prices can be transmitted to over 20 blockchains. This architectural choice delivers extraordinary scalability advantages.

New price feeds published on Pyth Network become instantly available across all 20+ supported blockchains.

This is highly beneficial for builders aiming to expand their applications to new blockchains, allowing immediate access to the same markets and assets as the original chain.

Additionally, Pyth’s unique architecture enables rapid deployment to new Wormhole-supported blockchains—at a pace of about one per month.

In contrast, competing oracle networks often face technical delays, limiting their ability to expand to new chains. For example, one oracle network took nearly two years from announcement to launch on Solana.

Pull, Don’t Push

Pyth Network operates via a "pull" oracle model, where users actively request—or "pull"—the data they need into their local blockchain environment.

In contrast, traditional oracle solutions use a "push" model, where price data is automatically pushed on-chain at preset intervals, even when no one is actively using the updates.

Pyth’s pull oracle design offers the following advantages:

Gas Efficiency: Users pay only when they need data. Gas is not wasted on unused price updates. Moreover, if one entity pulls Pyth price data on-chain, every participant on that chain can use the update.

High-Frequency Price Updates: Pyth price feeds update faster than once per second—faster than most block times. Such frequency would be impossible if every price had to be pushed on-chain.

Low Latency: Users access the latest pulled price data instead of being forced to use the most recently pushed update.

Reliability: During market volatility, pushed updates may compete with other transactions for network bandwidth. Pyth’s pull updates can be bundled into users’ valuable transactions.

Scalability: Pyth can scale to thousands of new price feeds without increasing gas costs. Costs arise only when users pull data.

While the benefits of the pull model are numerous, the most critical is that the pull oracle (on-demand update) model delivers the scalability essential for DeFi’s future.
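The pull flow can be sketched in Python as follows. This is a toy model under loud assumptions: `PullOracleContract`, `sign_update`, and the shared `ORACLE_KEY` are hypothetical, and an HMAC with a shared key stands in for the public-key signatures a real cross-chain message would carry.

```python
import hashlib
import hmac
import json

ORACLE_KEY = b"demo-key"  # hypothetical shared secret; real systems use public-key signatures


def sign_update(feed: str, price: float, publish_time: int) -> dict:
    """Off-chain: the oracle network signs a price update that anyone can fetch."""
    payload = json.dumps({"feed": feed, "price": price, "publish_time": publish_time},
                         sort_keys=True)
    sig = hmac.new(ORACLE_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}


class PullOracleContract:
    """Toy on-chain contract: stores only the updates that users choose to pull in."""

    def __init__(self, max_age_s: int = 60):
        self.max_age_s = max_age_s
        self.latest = {}  # feed -> (price, publish_time)

    def update_price(self, update: dict) -> None:
        """Called inside the user's own transaction; the user pays for this step."""
        expected = hmac.new(ORACLE_KEY, update["payload"].encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, update["sig"]):
            raise ValueError("invalid signature")
        data = json.loads(update["payload"])
        current = self.latest.get(data["feed"])
        if current is None or data["publish_time"] > current[1]:
            self.latest[data["feed"]] = (data["price"], data["publish_time"])

    def get_price(self, feed: str, now: int) -> float:
        """Any participant on the chain can read an update once it has been pulled."""
        if feed not in self.latest:
            raise LookupError("no price pulled yet")
        price, publish_time = self.latest[feed]
        if now - publish_time > self.max_age_s:
            raise ValueError("price is stale; pull a fresh update first")
        return price


contract = PullOracleContract()
update = sign_update("BTC/USD", 34500.0, publish_time=1000)
contract.update_price(update)                   # the user pulls the update on-chain
price = contract.get_price("BTC/USD", now=1010)
```

Nothing is written on-chain until someone needs the price, so the number of feeds the oracle offers has no effect on gas spent. The staleness check in `get_price` is what pushes consumers to pull a fresh update when the stored one is too old.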

Further Considerations for Improvement

Although Pyth has proven capable of consistently delivering high-quality data across over 20 blockchain networks, a recurring critique argues that its reliance on institutional data sources may lead to excessive centralization.

It’s important to note that Pyth has many data providers, meaning any single provider’s error has minimal impact on any given price feed.

Manipulating a price feed would require a majority of providers to submit incorrect data. Our whitepaper discusses the network’s resistance to data provider collusion in greater detail.
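The majority requirement is easy to check numerically if we assume, for illustration only, that the feed is a plain median of provider reports (the real aggregation is more involved). The function name and the prices are hypothetical.

```python
from statistics import median


def feed_with_colluders(honest_price: float, attack_price: float,
                        total: int, colluders: int) -> float:
    """Median feed when `colluders` of `total` providers all report `attack_price`."""
    reports = [attack_price] * colluders + [honest_price] * (total - colluders)
    return median(reports)


minority_attack = feed_with_colluders(100.0, 200.0, total=9, colluders=4)
majority_attack = feed_with_colluders(100.0, 200.0, total=9, colluders=5)
```

With four of nine providers colluding, the feed still reports the honest price. Only when a fifth joins does the median flip to the attackers' value.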

While reliance on “trusted” institutions in Pyth Network is a valid criticism, Pyth’s approach brings significant advantages to DeFi while preventing oracle manipulation or collusion by data sources.

We will continue pushing innovation and improvement in oracle solutions across performance, security, and decentralization—a challenging balance—and aim to maintain leadership in this space.

The Path Forward

Price feed oracles are the backbone of DeFi, providing accurate, timely data so critical applications can securely and correctly trade, safeguard, and transfer assets.

Past designs were built on the premise that intermediary nodes could be incentivized to collect and agree upon public information in a trustless manner and submit aggregated results.

This approach has merits but also flaws, including transmission delays, opaque data sources, distribution rights considerations, and overall limitations on oracle network scalability.

Ongoing innovation in decentralized finance—even if the public takes time to recognize what the industry is creating—has significantly advanced DeFi infrastructure.

Pyth Network introduces a faster, more reliable, and more secure way to access financial data that most blockchain developers previously couldn’t obtain. Pyth Network has seen substantial growth in:

250+ available price feeds
25 million+ daily price updates
$50 billion+ total secured transaction volume
150+ integrated applications
20+ supported blockchains

Pyth price feeds are permissionless. Developers can start integrating directly from developer documentation and explore use cases such as how Synthetix perpetual contracts utilize Pyth price data.

Other well-known Pyth users include Ribbon Finance, Venus, and CAP Finance.

As the DeFi ecosystem continues to grow, Pyth Network’s role in delivering trusted, real-time data becomes increasingly vital for securing and stabilizing blockchain networks and enabling broader industry expansion.

Author

Pyth Network

@PythNetwork
