Discussion on the Principles and Technical Details of Ordinal Inscription Protocol

2024.02.02

How are the sats within UTXOs actually tracked, where exactly in the script is the inscription data stored, and why do BRC20 transfers require two separate operations?

2024.02.02 - 09:21:06

Author: @hicaptainz

Over the past two weeks, while researching the BTC ecosystem and various inscription projects, I noticed a lack of articles that clearly explain the underlying principles and technical details—such as how transactions are initiated during inscription minting, how sats within UTXOs are tracked, where exactly the inscribed content is stored in scripts, and why BRC20 transfers require two steps. Without understanding these technical aspects, it's difficult to differentiate between protocols like BRC20, BRC420, Atomicals, Stamps, and Runes. This article dives into the fundamentals of the Bitcoin blockchain to answer these questions.

BTC Block Structure

At its core, blockchain is a multi-user accounting technology—in computer science terms, a distributed database. Records (transactions) over a given time period form a block, and the ledger expands sequentially based on time.

We’ve used an Excel spreadsheet to illustrate how blockchain works. Each file represents a blockchain, with individual sheets representing blocks numbered sequentially from 560331, 560332, up to the latest 560336. Block 560336 packages recent transactions.

The main body of each block uses double-entry bookkeeping: one side lists input addresses ("debit"), the other output addresses ("credit"). The value column shows the amount of BTC associated with each address. The total input amount exceeds the total output amount; the difference is the transaction fee paid to miners (the accountants).

The block header in this illustration contains the previous block’s height and hash, the current timestamp, and a nonce. (A real Bitcoin header stores a version, the previous block’s hash, a Merkle root of the transactions, a timestamp, the difficulty target, and the nonce; the height is implied by the chain rather than stored in the header.) In this decentralized system, who gets the right to write the next block? That is determined by the nonce and its resulting hash. Miners compete by repeatedly hashing a candidate block header with different nonces; the first miner to find a hash below the network’s difficulty target earns the right to add that block and receives both the block reward and the transaction fees.

Finally, there's the script area, which can support extended applications. For example, the OP_RETURN opcode acts like a memo field. Note that in actual blocks, script data is embedded within input and output fields rather than existing as a separate section. For instance, the script attached to an input is called the unlocking script (ScriptSig), which supplies the private key signature authorizing the spend, while the output carries a locking script (ScriptPubKey), which sets the conditions for redeeming the BTC (typically “only the owner of the corresponding private key can spend”).
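The nonce competition described above can be sketched as a toy proof-of-work loop. This is a simplified illustration, not Bitcoin's real header format: the `header_prefix` bytes, the `difficulty_bits` parameter, and the function names are assumptions made for the example. Only the double SHA-256 and the "hash must fall below a target" rule mirror what Bitcoin does.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Bitcoin hashes block headers with SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header_prefix: bytes, difficulty_bits: int, max_nonce: int = 2**32):
    """Try nonces until the hash, read as a little-endian integer, is below the target.

    `difficulty_bits` is a toy stand-in for the real compact difficulty target.
    Returns (nonce, hash_hex) or None if no nonce in range works.
    """
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        candidate = header_prefix + nonce.to_bytes(4, "little")
        digest = double_sha256(candidate)
        if int.from_bytes(digest, "little") < target:
            # Explorers display the hash byte-reversed, hence the leading zeros.
            return nonce, digest[::-1].hex()
    return None


# Hypothetical header contents; a real header is an 80-byte binary structure.
result = mine(b"prev-hash|merkle-root|timestamp", difficulty_bits=12)
```

With 12 difficulty bits the loop needs about 4,096 attempts on average; the real network requires vastly more, which is why mining is expensive.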

The two diagrams above show the raw input and output data structures. At the execution level, scripts appear as auxiliary parameters of the transaction data. The signature data that the unlocking script (ScriptSig) supplies to authorize spending is what SegWit later calls "witness data".

Segregated Witness and Taproot

Although the Bitcoin network has operated for over a decade without major incidents, transaction costs have occasionally spiked to unsustainable levels. As a result, developers have long debated how best to scale the network to handle growing transaction volume.

In 2017, this debate peaked, splitting the Bitcoin development community: one faction supported implementing SegWit via soft fork, while the other advocated direct block size increases (“big block” approach).

As previously mentioned, unlocking scripts carry "witness data" in the form of private key signatures. Could we move this witness data out of the main transaction data to effectively increase transaction capacity per block? Segregated Witness (SegWit) was activated in August 2017 precisely to achieve this. It splits transaction data into two parts: basic transaction data (Transaction Data) and signature data (Witness Data). The latter is serialized in a separate witness structure that is excluded from the transaction ID and is not sent to legacy nodes; the block commits to it through a hash placed in the coinbase transaction.

Technically, SegWit means witness data no longer counts against Bitcoin’s original 1MB block size limit as legacy nodes measure it. Instead, blocks are limited by a new “block weight” metric of 4 million weight units, in which each non-witness byte costs 4 units and each witness byte costs 1. This discount supports arbitrary data in the witness and lets a block carry considerably more data without requiring a hard fork. Consequently, effective capacity increased, and signature-related fees decreased. Before SegWit, Bitcoin’s block limit was 1MB; after SegWit, non-witness data is still effectively capped at 1MB, while a block made up almost entirely of witness data can approach 4MB.
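To make the "block weight" discount concrete, here is a small sketch of the weight arithmetic defined by SegWit (non-witness bytes count four times, witness bytes once). The function names and the example byte counts are illustrative assumptions, not taken from any particular transaction.

```python
WITNESS_SCALE_FACTOR = 4          # each non-witness byte costs 4 weight units
MAX_BLOCK_WEIGHT = 4_000_000      # consensus limit on total block weight


def tx_weight(non_witness_bytes: int, witness_bytes: int) -> int:
    """Weight = 4 * non-witness bytes + 1 * witness bytes."""
    return non_witness_bytes * WITNESS_SCALE_FACTOR + witness_bytes


def virtual_size(non_witness_bytes: int, witness_bytes: int) -> int:
    """Virtual bytes (vbytes) are weight divided by 4, rounded up; fees are quoted per vbyte."""
    return (tx_weight(non_witness_bytes, witness_bytes) + 3) // 4


# Hypothetical inscription-like transaction: small body, large witness.
weight = tx_weight(non_witness_bytes=150, witness_bytes=400_000)
vsize = virtual_size(non_witness_bytes=150, witness_bytes=400_000)

# The same 400,150 bytes with no witness discount would weigh four times as much.
undiscounted = tx_weight(non_witness_bytes=400_150, witness_bytes=0)
```

In this example the transaction occupies 400,150 raw bytes but is charged as roughly 100,150 virtual bytes, which is why bulky data is far cheaper to place in the witness.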

Taproot was implemented in November 2021 and consists of three Bitcoin Improvement Proposals (BIPs): Taproot, Tapscript, and Schnorr Signatures. Taproot aims to enhance user privacy, reduce fees, and enable more complex transactions (via new opcodes), expanding Bitcoin’s application scope.

These upgrades were crucial enablers for Ordinals NFTs, allowing NFT data to be stored in a Taproot script-path spend (within the witness space). This made structured storage of arbitrary witness data much easier, laying the foundation for the "ord" standard. With relaxed data requirements, a single transaction could theoretically fill an entire block, including both transaction and witness data, up to the 4 million weight unit limit (close to 4MB when almost all of it is witness data), greatly expanding the types of media that can be stored on-chain.

One might ask: if strings can be placed in scripts, aren’t there restrictions? What if those scripts get executed accidentally? Could invalid code prevent block confirmation? This brings us to the OP_FALSE instruction. OP_FALSE (also written OP_0, shown as “0” in Bitcoin script) pushes a false value onto the stack, so the OP_IF that follows skips everything up to the matching OP_ENDIF. The inscription data sits inside that skipped branch, where it is carried in the script but never executed, similar to a comment in a high-level programming language.
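As a rough sketch of how such an unexecuted OP_FALSE OP_IF ... OP_ENDIF block can be assembled, the snippet below serializes an inscription "envelope" as raw script bytes. The layout (an `ord` marker, a content-type field tagged with the byte `01`, then the body after an empty push, split into pushes of at most 520 bytes) follows the convention used by the ord reference software as I understand it; treat the exact field layout as an assumption. The function names are hypothetical, and a real reveal script also contains a public key and OP_CHECKSIG, which are omitted here.

```python
OP_FALSE = 0x00
OP_IF = 0x63
OP_ENDIF = 0x68
OP_PUSHDATA1 = 0x4C
OP_PUSHDATA2 = 0x4D
MAX_PUSH = 520  # largest single data push allowed in Bitcoin script


def push(data: bytes) -> bytes:
    """Encode one data push with the appropriate length prefix."""
    n = len(data)
    if n == 0:
        return bytes([OP_FALSE])  # an empty push is OP_0
    if n <= 75:
        return bytes([n]) + data
    if n <= 255:
        return bytes([OP_PUSHDATA1, n]) + data
    if n <= MAX_PUSH:
        return bytes([OP_PUSHDATA2]) + n.to_bytes(2, "little") + data
    raise ValueError("data push exceeds 520 bytes")


def envelope(content_type: str, body: bytes) -> bytes:
    """Wrap content in OP_FALSE OP_IF ... OP_ENDIF so it is carried but never executed."""
    script = bytes([OP_FALSE, OP_IF])
    script += push(b"ord")                                  # protocol marker
    script += push(b"\x01") + push(content_type.encode())   # field 1: content type
    script += bytes([OP_FALSE])                             # field 0: body follows
    for i in range(0, len(body), MAX_PUSH):
        script += push(body[i:i + MAX_PUSH])                # body in chunks of <= 520 bytes
    script += bytes([OP_ENDIF])
    return script


hello = envelope("text/plain;charset=utf-8", b"Hello, world!")
```

Because the interpreter skips the whole OP_IF branch, the pushed bytes can be any file content without affecting whether the script succeeds.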

UTXO Transfer Model

So far, we've discussed Bitcoin’s mechanics from a computer data structure perspective. Now let’s examine the UTXO model from a financial standpoint.

UTXO stands for Unspent Transaction Output, literally meaning “unspent outputs.” Practically, it refers to leftover funds after a transaction. Why does Bitcoin use this concept? To understand that, we need to compare two accounting models: account-balance vs. UTXO.

Having lived in centralized systems for so long, we’re accustomed to the account-balance model. When user A sends $100 to user B, banks check if A has sufficient balance. If yes, they deduct $100 from A’s account and add it to B’s—completing the transfer.

However, Bitcoin’s accounting algorithm doesn’t track balances directly. The blockchain only records individual transactions—not final account balances (which would require centralized servers, defeating decentralization). Suppose user A initially holds 1000 units. If A sends 100 units to B, the ledger records:

Transaction 1: User A sends 100 units to User B

Transaction 2: User A sends 900 units to themselves (UTXO)

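Although Transaction 2 appears just like any other, functionally it serves as a balance indicator, showing that A retains 900 units post-transfer. (In Bitcoin itself, these two entries are two outputs of a single transaction; they are listed separately here for clarity.)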

Why go through all this trouble? Because Bitcoin can only record transactions, not account balances. Without UTXOs, calculating a balance would require summing all incoming and outgoing transactions—an extremely resource-intensive process. UTXOs elegantly solve this by eliminating the need to trace back every historical transaction when computing balances.

A key feature of UTXOs is their indivisibility: like a coin or banknote, a UTXO cannot be partially spent and must be consumed whole. So how do users combine inputs or receive change during transactions? Think of them like physical coins (in fact, whenever you see “UTXO,” mentally replace it with “coin”).

For example, Xiao Ming wants to send 1 BTC to Xiao Gang. He needs to gather enough input value. His wallet finds a prior UTXO worth 0.9 BTC, which is insufficient. But since multiple inputs are allowed, he adds another 0.2 BTC UTXO. Thus, this transaction has two inputs. It also has two outputs: one sending 1 BTC to Xiao Gang, and another returning 0.1 BTC to Xiao Ming as change (this example ignores transaction fees).

In essence, Xiao Ming has two coins—one valued at 0.9 and another at 0.2. To pay exactly 1 BTC, he hands over both coins, and Xiao Gang returns 0.1 BTC as change. This accounting model avoids explicit balance calculation by using the “change” mechanism.
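The Xiao Ming example can be sketched in a few lines. This is a minimal illustration that picks the largest coins first; real wallets use more elaborate coin-selection strategies, and the function name and the use of plain integers (amounts in sats) are assumptions for the example.

```python
def select_coins(utxo_values, amount, fee=0):
    """Pick whole UTXOs (largest first) until they cover amount + fee.

    Returns (selected_values, change). UTXOs are spent whole, so any
    excess comes back to the sender as a new change output.
    """
    selected, total = [], 0
    for value in sorted(utxo_values, reverse=True):
        if total >= amount + fee:
            break
        selected.append(value)
        total += value
    if total < amount + fee:
        raise ValueError("insufficient funds")
    return selected, total - amount - fee


# Xiao Ming holds coins of 0.9 BTC and 0.2 BTC and pays 1 BTC (values in sats).
inputs, change = select_coins([90_000_000, 20_000_000], amount=100_000_000)
```

Here both coins are consumed as inputs and 10,000,000 sats (0.1 BTC) return as change, matching the example above.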

Ordinal Protocol’s Numbering System

The Ordinal protocol sparked the recent surge in the BTC ecosystem by breaking down homogeneous BTC into smallest units—sats—and assigning each sat a unique ordinal number. How is this achieved?

We know BTC has a fixed supply of 21 million, and each BTC can be divided into 100 million units (sats). These sats, like whole BTCs, are fungible tokens (FTs). The Ordinal protocol assigns each sat a serial number (ordinal).

Earlier, when discussing block structure, we noted that transaction data includes input/output addresses and values. Each block contains two types of transactions: the block reward transaction and ordinary fee-paying transactions. Ordinary transactions always have inputs and outputs, but the block reward creates BTC out of thin air, so it has no prior output to spend as its input. This special case is called a “coinbase transaction.” All 21 million BTC originate from coinbase transactions, which appear first in every block’s transaction list.

The Ordinal protocol defines two rules:

Numbering: Sats are numbered in the order they are mined
Transfer: Follows FIFO (First-In, First-Out), moving from transaction inputs to outputs

The first rule is straightforward—it implies numbering originates solely from coinbase transactions. For example, if the first block rewards 50 BTC, its sats are numbered [0;1;2;...;4,999,999,999]; the second block, also rewarding 50 BTC, assigns numbers [5,000,000,000;5,000,000,001;...;9,999,999,999].
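The numbering rule can be written down directly. The sketch below computes the ordinal of the first sat mined in a given block, using Bitcoin's subsidy schedule (50 BTC, halving every 210,000 blocks). The function names are illustrative, and block heights are counted from 0, so the "first block" in the text is height 0.

```python
COIN = 100_000_000            # sats per BTC
HALVING_INTERVAL = 210_000    # blocks between subsidy halvings


def subsidy(height: int) -> int:
    """Block reward in sats at a given height (integer halving, reaching 0 eventually)."""
    halvings = height // HALVING_INTERVAL
    return 0 if halvings >= 64 else (50 * COIN) >> halvings


def first_ordinal(height: int) -> int:
    """Ordinal number of the first sat created by the block at `height`."""
    start, epoch, remaining = 0, 0, height
    while remaining > 0:
        blocks = min(remaining, HALVING_INTERVAL)
        start += blocks * subsidy(epoch * HALVING_INTERVAL)
        remaining -= blocks
        epoch += 1
    return start


def block_range(height: int):
    """Half-open range [first, last + 1) of ordinals minted in the block."""
    first = first_ordinal(height)
    return first, first + subsidy(height)
```

For instance, `block_range(0)` gives `(0, 5_000_000_000)` and `block_range(1)` gives `(5_000_000_000, 10_000_000_000)`, matching the two ranges listed above.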

The tricky part lies in ordering individual sats within a UTXO containing many sats. This is governed by Rule 2. Let’s consider a simplified example:

Assume BTC’s smallest unit is 1, and only 10 blocks have been mined, each rewarding 10 BTC—totaling 100 BTC. We assign these a sequence from 0–99. Without any transactions, we only know Block 1’s 10 BTC are numbered 0–9, Block 2’s are 10–19, ..., and Block 10’s are 90–99. Since no spending occurs, there are no outputs—so we can only assign ranges of 10 BTC each.

Now suppose the 10 BTC from Block 2 is spent in a transaction with two outputs: one paying 3 BTC to someone else and another returning 7 BTC as change. In the output list, assume the 7 BTC change ranks first (assigned sats 10–16), followed by the 3 BTC sent to someone else (sats 17–19). This output ordering determines the specific sat sequence contained in each UTXO.

Note: It's individual sats that are numbered, not UTXOs! Since UTXOs are the smallest indivisible transaction units, sats exist only within UTXOs. A UTXO contains a contiguous range of sats (or several ranges, if it was funded from multiple inputs), and the numbering can only be redistributed when the UTXO is spent to create new outputs.
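The first-in, first-out rule can be sketched as a function that pours the sat ranges of the inputs, in order, into the outputs, in order. This is a minimal illustration using half-open `(start, end)` ranges; the function name and data layout are assumptions, and anything left over after the last output is treated as the fee.

```python
def transfer(input_ranges, output_values):
    """Assign sat ranges from inputs to outputs in FIFO order.

    input_ranges: list of half-open (start, end) ranges, in input order.
    output_values: list of output amounts, in output order.
    Returns (ranges_per_output, leftover_fee_ranges).
    """
    queue = list(input_ranges)
    outputs = []
    for value in output_values:
        assigned, need = [], value
        while need > 0:
            if not queue:
                raise ValueError("outputs exceed inputs")
            start, end = queue[0]
            take = min(need, end - start)
            assigned.append((start, start + take))
            need -= take
            if start + take == end:
                queue.pop(0)                      # input range fully consumed
            else:
                queue[0] = (start + take, end)    # keep the unspent remainder
        outputs.append(assigned)
    return outputs, queue


# Block 2's reward (sats 10..19) spent into a 7-unit and a 3-unit output.
outs, fee = transfer([(10, 20)], [7, 3])
```

Here the first output receives `(10, 17)`, that is sats 10 to 16, and the second receives `(17, 20)`, sats 17 to 19, as in the simplified example.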

Ordinal supports various formats for expressing these numbers: integer notation, decimal notation, degree notation, percentile notation, and alphabetical names.

Once sats have unique ordinals, inscription becomes possible. As noted earlier, any type of file (text, image, video) can be placed in the witness space, subject to the 4 million weight unit block limit, where it is stored as raw bytes (usually displayed as hexadecimal) in the Taproot script area. Thus, one UTXO corresponds to one Taproot script area. This UTXO contains many sats (a sequential set); to avoid creating dust outputs that nodes refuse to relay by default, inscription UTXOs conventionally carry at least 546 sats (the exact dust threshold depends on the output type). To simplify tracking, the Ordinal protocol stipulates that “the first sat in the set represents the binding relationship” (as stated in the whitepaper: the ID of the first sat in the first output). So a UTXO containing sats 17–19 is represented simply by sat #17, linked to the inscribed content.

Minting and Transferring Ordinal Assets

Clearly, Ordinal NFTs involve uploading files into the Taproot script area of the segregated witness zone and binding them to a sat sequence, thereby issuing NFT assets on the Bitcoin chain. But here's a question: the witness script contains both input unlocking scripts and output locking scripts—which one stores the content? The correct answer is both. This leads us to the commit-reveal mechanism in blockchain technology.

The commit-reveal mechanism ensures fairness and transparency when submitting hidden information (e.g., votes or bids) and revealing it later. It operates in two phases: Commit and Reveal.

1. **Commit Phase**: Users submit a commitment to their data, typically its hash (cryptographic digest), to the blockchain. Because a cryptographic hash is one-way, the original data cannot feasibly be recovered from it, ensuring confidentiality during submission.

2. **Reveal Phase**: At a later time, users reveal the original data along with any required metadata (like salt or nonce). The network verifies whether hashing the revealed data matches the previously submitted hash. If matched, the data is accepted as valid.
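The two phases above can be illustrated with a generic hash commitment. This is a sketch of the general commit-reveal idea using SHA-256 and a random salt, not the Taproot-specific commitment Bitcoin uses for inscriptions; the function names are assumptions for the example.

```python
import hashlib
import hmac
import secrets


def commit(data: bytes, salt: bytes) -> str:
    """Commit phase: publish only the hash of salt + data."""
    return hashlib.sha256(salt + data).hexdigest()


def verify(commitment: str, data: bytes, salt: bytes) -> bool:
    """Reveal phase: anyone can recompute the hash and compare it."""
    return hmac.compare_digest(commitment, commit(data, salt))


salt = secrets.token_bytes(16)          # random salt prevents guessing short inputs
commitment = commit(b"my sealed bid: 42", salt)

honest = verify(commitment, b"my sealed bid: 42", salt)     # matching reveal
tampered = verify(commitment, b"my sealed bid: 99", salt)   # altered reveal
```

A matching reveal verifies, while any change to the data or salt produces a different hash and is rejected.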

Earlier, we established that inscribed content must be bound to a UTXO (an output), so it should reside in the output’s locking script. However, full nodes must locally store and propagate the entire UTXO set. Imagine uploading 10,000 videos (each 4MB) directly into UTXO locking scripts—full nodes would require enormous storage and bandwidth, potentially crashing the network. Therefore, the only viable solution is placing content in the input’s unlocking script and having it “point” to an output.

Thus, minting Ordinal assets requires two steps (wallets often merge these steps, constructing both commit and reveal transactions together, giving users the impression of a single step and saving fees).

During minting, the user first creates a “commit” transaction (from address A to B) whose Taproot output commits, via a hash, to a script containing the file. Since only this short commitment is stored in the output’s locking script, it consumes minimal space in the full node’s UTXO database. Next, the user creates a new transaction (from B back to A), called the “reveal” transaction. Its input references the UTXO from the commit transaction, and its unlocking script contains the original inscribed file. In the words of the whitepaper: “First, in the commit phase, create a Taproot output committing to a script containing the inscription. Second, in the reveal transaction, spend the output from the commit transaction to reveal the inscription on-chain.”

In transfer, Ordinal NFTs differ slightly from BRC20. Ordinal NFTs are transferred whole, just like regular BTC transfers, by sending the UTXO-bound NFT directly to the recipient. But BRC20 involves custom-amount transfers and thus requires two steps: first, an “Inscribe TRANSFER” transaction, then a “Transfer TRANSFER” transaction. The first step resembles Ordinal NFT minting and implicitly includes a commit-reveal pair. The second step mirrors a standard Ordinal NFT transfer, sending the BRC20 asset (bound to a UTXO) to the receiver. Some wallets construct all three transactions (grandparent-parent-child) simultaneously to save time and fees.

In summary, the commit transaction binds the inscription (hash of the content) to a numbered sat (UTXO), while the reveal transaction displays the actual content. Together, this parent-child transaction pair completes NFT minting.

P2TR and an Example

Our technical discussion on minting isn’t complete yet. One might wonder: how does the reveal transaction verify the inscription data in the commit transaction? And why must users send between their own two addresses A and B during construction? When minting, we don’t even see two wallets. Here enters P2TR—a key upgrade introduced by Taproot.

P2TR (Pay-to-Taproot) is a new Bitcoin output type enabled by the Taproot upgrade. It allows users to spend BTC using either a single public key or more complex scripts (e.g., multisig wallets or smart contracts), enhancing privacy and flexibility. This is achieved using Merkleized Abstract Syntax Trees (MAST) and Schnorr signatures, enabling efficient encoding of multiple spending conditions within a single output.

Define Spending Conditions

To create a P2TR transaction, users first define spending conditions—such as a single public key or a complex script—specifying requirements for spending BTC (e.g., multisig or contract logic).

Generate Taproot Output

Then, a Taproot output is generated, including a single public key (representing the spending condition). This public key derives from a combination of the user’s public key and the script hash via a process called “tweaking.” This makes the output indistinguishable from a standard public key on-chain, enhancing privacy.
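The hashing side of "tweaking" can be sketched with the tagged hashes that the Taproot specification (BIP 340/341) defines. The snippet computes a leaf hash for one script and the tweak value derived from an internal key and that leaf. It deliberately stops short of the elliptic-curve step (output key = internal key + tweak × G), which needs a secp256k1 library. The example script bytes and the 32-byte internal key are placeholders, not real values.

```python
import hashlib


def tagged_hash(tag: str, msg: bytes) -> bytes:
    """BIP 340 tagged hash: SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    tag_hash = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_hash + tag_hash + msg).digest()


def compact_size(n: int) -> bytes:
    """Bitcoin's variable-length integer encoding."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "little")
    return b"\xff" + n.to_bytes(8, "little")


def tapleaf_hash(script: bytes, leaf_version: int = 0xC0) -> bytes:
    """Hash of a single script leaf (0xC0 is the Tapscript leaf version)."""
    return tagged_hash("TapLeaf", bytes([leaf_version]) + compact_size(len(script)) + script)


def taptweak(internal_key_x: bytes, merkle_root: bytes) -> bytes:
    """Tweak committed into the output key; with one leaf, the root is the leaf hash."""
    return tagged_hash("TapTweak", internal_key_x + merkle_root)


# Placeholder values for illustration only (not a valid public key or a real script).
internal_key_x = bytes(range(1, 33))
script = bytes.fromhex("0063036f726468")
tweak = taptweak(internal_key_x, tapleaf_hash(script))
```

Because the tweak depends on the script, changing even one byte of the script yields a different output key, which is what lets the commit output bind to the inscription without exposing it.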

Spend Bitcoin

When spending BTC, users can either use their single public key (if conditions are met) or reveal the original script and provide necessary signatures/data. This is done via Tapscript, allowing more efficient and flexible execution of spending conditions.

Verify Transaction

Miners and nodes validate the transaction by checking provided Schnorr signatures and data against the spending conditions. If satisfied, the transaction is valid and the BTC can be spent.

Enhanced Privacy and Flexibility

Since P2TR transactions only disclose necessary spending conditions upon use, they maintain high privacy. Moreover, MAST and Schnorr signatures allow efficient encoding of multiple conditions, supporting complex and flexible transactions without increasing overall size.

This explains how the commit-reveal mechanism works under P2TR. Let’s walk through a real-world example.

Using the blockchain explorer https://www.blockchain.com/, let’s analyze the minting process of an Ordinal image NFT, covering both commit and reveal stages.

First, the commit transaction hash is (2ddf90ddf7c929c8038888fc2b7591fb999c3ba3c3c7b49d54d01f8db4af585c). Notice that this transaction’s output does not display inscription data (it actually contains the hex hash of the image file), and the webpage shows no inscription info. The output address (bc1p4mtc.....) is a temporary address generated via “tweaking” (representing script unlock conditions) and shares a private key with the main Taproot address (bc1pg2mp...). The second UTXO in this transaction is the change return. This achieves binding between the inscription content and the sats in the first UTXO.

Next, we examine the reveal transaction with hash (e7454db518ca3910d2f17f41c7b215d6cba00f29bd186ae77d4fcd7f0ba7c0e1). Here, the Ordinals inscription information is visible. The input address is the temporary output (bc1p4mtc.....) from the previous transaction, and its unlocking script contains the original image’s hex data. The output of 0.00000546 BTC (546 sats) sends the NFT to the owner’s main Taproot address (bc1pg2mp...). Based on FIFO and the rule “binding to the first sat in the first output,” even though the number of sats changes across UTXOs, the bound sat number remains unchanged. Hence, we can locate this inscription at sat 1893640468329373.

(https://ordinals.com/sat/1893640468329373)

These two transactions (parent and child) are typically submitted together by wallets to the mempool, so the user pays in a single step, and they often end up in the same block (as seen in our example, both are recorded in block 790468). Miners and nodes then verify that the Schnorr signature and the script revealed in the reveal transaction’s input match the commitment stored in the commit transaction’s output locking script. If they match, the reveal transaction is valid, the UTXO is spent, and both transactions become permanently recorded on the Bitcoin blockchain, with the NFT image successfully stored and displayed. If they do not match, the reveal transaction is rejected and the inscription fails.

BRC20 Protocol and Indexers

With the Ordinal protocol, inscribing text creates text NFTs (akin to Loot on Ethereum), images become image NFTs (like PFPs on Ethereum), and music becomes audio NFTs. But what if we inscribe code—specifically, code defining a fungible token issuance?

BRC20 leverages the Ordinal protocol to deploy, mint, and transfer tokens by setting inscriptions as JSON-formatted data. This JSON includes code snippets describing token attributes such as total supply, maximum mint per transaction, and unique ticker. As discussed in our previous article, BRC20 tokens are semi-fungible tokens (SFTs)—meaning they can behave as NFTs in some contexts and FTs in others. How is this control achieved? Through indexers.
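For illustration, the JSON payloads that BRC20 inscribes look roughly like the following. The field names (`p`, `op`, `tick`, `max`, `lim`, `amt`) follow the published BRC20 convention, with numbers encoded as strings; the helper function names and the example ticker and amounts are assumptions for this sketch.

```python
import json


def _payload(**fields) -> str:
    """Serialize compactly; BRC20 encodes numeric values as strings."""
    return json.dumps({"p": "brc-20", **fields}, separators=(",", ":"))


def brc20_deploy(tick: str, max_supply: int, limit_per_mint: int) -> str:
    return _payload(op="deploy", tick=tick, max=str(max_supply), lim=str(limit_per_mint))


def brc20_mint(tick: str, amount: int) -> str:
    return _payload(op="mint", tick=tick, amt=str(amount))


def brc20_transfer(tick: str, amount: int) -> str:
    return _payload(op="transfer", tick=tick, amt=str(amount))


deploy_json = brc20_deploy("ordi", 21_000_000, 1_000)
mint_json = brc20_mint("ordi", 1_000)
transfer_json = brc20_transfer("ordi", 100)
```

Each string is inscribed as ordinary text content; the chain stores it like any other inscription, and meaning is assigned only when an indexer parses it.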

An indexer is essentially an accountant that categorizes and records received information in a database. Under the Ordinal protocol, indexers track input/output flows to determine how ordered sats move across addresses. In the BRC20 protocol, indexers gain an additional role: recording token balance changes across addresses.
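The balance-keeping role of a BRC20 indexer, and the reason a transfer takes two steps, can be sketched with a toy ledger. This is a heavily simplified model under my own assumptions, not the rules of any production indexer: the class and method names are hypothetical, and it ignores ticker validation, decimals, ordering within blocks, and many edge cases.

```python
from collections import defaultdict


class ToyBrc20Indexer:
    """Tracks 'available' and 'transferable' balances per (ticker, address)."""

    def __init__(self):
        self.tokens = {}                       # tick -> {"max", "lim", "minted"}
        self.available = defaultdict(int)
        self.transferable = defaultdict(int)
        self.pending = {}                      # inscription id -> (tick, owner, amt)

    def deploy(self, tick, max_supply, limit):
        if tick in self.tokens:
            return False                       # first deploy wins
        self.tokens[tick] = {"max": max_supply, "lim": limit, "minted": 0}
        return True

    def mint(self, tick, address, amt):
        token = self.tokens.get(tick)
        if token is None or amt <= 0 or amt > token["lim"]:
            return False
        amt = min(amt, token["max"] - token["minted"])   # cap at remaining supply
        if amt == 0:
            return False
        token["minted"] += amt
        self.available[(tick, address)] += amt
        return True

    def inscribe_transfer(self, inscription_id, tick, owner, amt):
        """Step 1: lock part of the owner's balance into a transfer inscription."""
        if amt <= 0 or self.available[(tick, owner)] < amt:
            return False
        self.available[(tick, owner)] -= amt
        self.transferable[(tick, owner)] += amt
        self.pending[inscription_id] = (tick, owner, amt)
        return True

    def send_inscription(self, inscription_id, receiver):
        """Step 2: moving the inscription's UTXO credits the receiver, once."""
        entry = self.pending.pop(inscription_id, None)
        if entry is None:
            return False
        tick, owner, amt = entry
        self.transferable[(tick, owner)] -= amt
        self.available[(tick, receiver)] += amt
        return True


idx = ToyBrc20Indexer()
idx.deploy("ordi", 21_000_000, 1_000)
idx.mint("ordi", "alice", 1_000)
idx.inscribe_transfer("insc-0", "ordi", "alice", 300)
idx.send_inscription("insc-0", "bob")
```

After these calls Alice has 700 available and Bob has 300. The first step fixes the amount inside an inscription, and the second is an ordinary UTXO transfer that the indexer interprets as moving that amount.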

From an accounting perspective, BRC20 tokens exist across a three-layer database system:

Layer 1: Accountant = BTC miners, Database = “chain database”, Asset = BTC (FT)
Layer 2: Accountant = Ordinal indexer, Database = “relational database”, Asset = numbered sats (NFT)
Layer 3: Accountant = BRC20 indexer, Database = “relational database”, Asset = BRC20 tokens (FT)

When viewing BRC20 in “pieces,” we adopt the Ordinal indexer’s view: it sees each piece as an NFT. When thinking in divisible “units” (especially after depositing to centralized exchanges), we take the BRC20 indexer’s (or exchange server’s) view, where it behaves as an FT. Thus, the existence of semi-fungible tokens (SFTs) stems from layered accounting systems.

Blockchain is just a distributed database, hence the emergence of miners as a collective accounting force maintaining the “chain database” (only chain databases enable true decentralization). Yet ironically, we’ve returned to centralized “relational databases.” This explains why the creators of the Ordinal protocol, BRC20 protocol, and Unisat wallet recently clashed fiercely over indexer upgrades: the accountants disagree. Yet after decades of industry development, we’ve accumulated experience in “decentralization.” Can indexers replace relational databases with chain databases? Can fraud proofs or ZKPs ensure security and decentralization? Will Bitcoin’s DA (data availability) demands spill over to other DA layers, promoting cross-chain ecosystem growth and integration? I see many possibilities ahead.


Author

Gametaverse
