Stage 2 of 6 · Estimated 4–6 weeks
This stage takes the foundational primitives from Stage 1 and assembles them into a working understanding of blockchain protocols. You will learn how transactions are created, validated, and ordered into blocks; how nodes communicate to form a coherent network; how consensus mechanisms (Proof of Work, Proof of Stake) achieve agreement among untrusted participants; and how forks, reorganizations, and chain selection rules maintain a single coherent history.
This is where theory becomes protocol. Every blockchain — Bitcoin, Ethereum, Solana, Cosmos — is a specific instantiation of the primitives you learned in Stage 1. Understanding the protocol layer means understanding why design decisions were made, not just what they are. Engineers who skip this stage build dApps without understanding the substrate they run on — they are surprised by reorgs, confused by finality, and vulnerable to consensus-level attacks.
You can trace a transaction from creation to finality across the entire protocol stack. You can explain the security model of PoW and PoS from first principles. You can reason about fork choice rules, chain reorganizations, and finality delays. You can analyze a blockchain's data model and understand why it constrains or enables certain application patterns.
Blockchain as State Machine
├── State, transitions, determinism
└── Accounts vs UTXO models
│
▼
Transactions
├── Structure and encoding
├── Signature validation
├── Nonce management
├── Fee mechanisms
└── Transaction lifecycle
│
▼
Blocks
├── Block structure and headers
├── Block production
├── Block validation rules
└── Block propagation
│
▼
Nodes & Networking
├── Full nodes, light nodes, archive nodes
├── Mempool dynamics
├── Block gossip
└── Sync strategies (full, fast, snap)
│
▼
Consensus Mechanisms
├── Proof of Work
│ ├── Mining difficulty
│ ├── Hash rate and security
│ └── Energy and hardware
├── Proof of Stake
│ ├── Validator selection
│ ├── Slashing
│ └── Finality (Casper FFG)
└── Fork choice rules
│
▼
Forks & Reorganizations
├── Soft forks vs hard forks
├── Chain reorganizations
├── Finality models (probabilistic vs deterministic)
└── Governance and upgrade mechanisms
│
▼
Token Standards & Data Models
├── Native tokens vs smart contract tokens
├── ERC-20, ERC-721, ERC-1155
├── UTXO vs Account model (deep comparison)
└── State growth and pruning
A blockchain is a replicated, deterministic state machine. This is the most precise and useful mental model.
State: The complete snapshot of all relevant data at a given point in time. In Ethereum, this is the set of all accounts (each with balance, nonce, code, and storage). In Bitcoin, this is the set of all unspent transaction outputs (UTXOs).
State transition function: A deterministic function that takes the current state and a transaction, and produces a new state:
S' = STF(S, tx)
"Deterministic" means: given the same state and the same transaction, every node in the network must compute exactly the same new state.
Block: A batch of transactions applied sequentially to transition the state:
S_{n+1} = apply(S_n, Block_n)
where apply processes each transaction in the block in order.
Blockchain: A sequence of blocks, each referencing the previous block, forming a chain of state transitions from genesis (initial state) to the current state.
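The definitions above can be sketched in a few lines of Python. This is a minimal illustration, assuming a toy state that maps addresses to integer balances; the names `Tx`, `stf`, and `apply_block` are hypothetical and do not come from any real client.

```python
from typing import Dict, List, NamedTuple

class Tx(NamedTuple):
    sender: str
    recipient: str
    amount: int

State = Dict[str, int]  # toy state: address -> balance (an assumption)

def stf(state: State, tx: Tx) -> State:
    """S' = STF(S, tx): pure function, never mutates its input."""
    if tx.amount < 0 or state.get(tx.sender, 0) < tx.amount:
        raise ValueError("invalid transaction")
    new_state = dict(state)
    new_state[tx.sender] = new_state.get(tx.sender, 0) - tx.amount
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.amount
    return new_state

def apply_block(state: State, block: List[Tx]) -> State:
    """S_{n+1} = apply(S_n, Block_n): apply transactions in block order."""
    for tx in block:
        state = stf(state, tx)
    return state

genesis = {"alice": 10}
block = [Tx("alice", "bob", 5), Tx("bob", "charlie", 3)]
final_state = apply_block(genesis, block)
```

Because `stf` depends only on its arguments, any node that replays the same block from the same starting state arrives at the same result.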
The UTXO Model (Bitcoin)
State = set of unspent transaction outputs. Each UTXO is (txid, output_index, value, locking_script). A transaction consumes UTXOs (inputs) and creates new UTXOs (outputs). The sum of input values must equal the sum of output values plus fees.
Key property: UTXOs are atomic — they are either fully spent or fully unspent. There is no concept of a "balance" at the protocol level; a wallet's balance is the sum of its UTXOs.
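A rough sketch of the UTXO rules described above. It assumes a simplified UTXO of (value, owner), where the owner string stands in for the locking script, and it skips signature and script checks entirely. `apply_utxo_tx` and `balance` are illustrative names, not Bitcoin Core functions.

```python
from typing import Dict, List, Tuple

OutPoint = Tuple[str, int]                 # (txid, output_index)
UTXOSet = Dict[OutPoint, Tuple[int, str]]  # -> (value, owner); owner replaces locking_script

def apply_utxo_tx(utxos: UTXOSet, txid: str, inputs: List[OutPoint],
                  outputs: List[Tuple[int, str]]) -> Tuple[UTXOSet, int]:
    """Spend `inputs`, create `outputs`; return (new UTXO set, implied fee)."""
    if len(set(inputs)) != len(inputs):
        raise ValueError("duplicate input")
    if any(op not in utxos for op in inputs):
        raise ValueError("input missing or already spent")
    total_in = sum(utxos[op][0] for op in inputs)
    total_out = sum(value for value, _ in outputs)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs")
    fee = total_in - total_out             # inputs = outputs + fee
    new_utxos = {op: out for op, out in utxos.items() if op not in inputs}
    for index, out in enumerate(outputs):
        new_utxos[(txid, index)] = out
    return new_utxos, fee

def balance(utxos: UTXOSet, owner: str) -> int:
    """A wallet-level view: the protocol itself only knows UTXOs."""
    return sum(value for value, who in utxos.values() if who == owner)

utxos = {("coinbase0", 0): (10, "alice")}
utxos_after, fee = apply_utxo_tx(
    utxos, "tx1", [("coinbase0", 0)], [(5, "bob"), (4, "alice")]
)
```

Alice's 10-coin output is consumed whole; the 4 coins she keeps come back as a new change output, and the remaining 1 coin is the fee.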
The Account Model (Ethereum)
State = mapping from addresses to account objects. Each account has: {nonce, balance, codeHash, storageRoot}. Externally Owned Accounts (EOAs) have no code; Contract Accounts have code that executes when called.
Key property: State is mutable in place — a transaction updates the sender's and receiver's balance directly.
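For comparison, here is a minimal sketch of a transfer in the account model. It assumes accounts with only a nonce and a balance (codeHash, storageRoot, gas, and signatures are left out), and it returns a new mapping instead of mutating in place purely to keep the example easy to test. `Account` and `transfer` are hypothetical names.

```python
from dataclasses import dataclass, replace
from typing import Dict

@dataclass(frozen=True)
class Account:
    nonce: int = 0
    balance: int = 0

def transfer(state: Dict[str, Account], sender: str, recipient: str,
             value: int, nonce: int) -> Dict[str, Account]:
    """Debit the sender, credit the recipient, and bump the sender's nonce."""
    src = state.get(sender, Account())
    if nonce != src.nonce:
        raise ValueError("nonce does not match the sender's account nonce")
    if value < 0 or src.balance < value:
        raise ValueError("insufficient balance")
    new_state = dict(state)
    new_state[sender] = replace(src, nonce=src.nonce + 1, balance=src.balance - value)
    dst = new_state.get(recipient, Account())
    new_state[recipient] = replace(dst, balance=dst.balance + value)
    return new_state

state0 = {"alice": Account(nonce=0, balance=10)}
state1 = transfer(state0, "alice", "bob", 5, nonce=0)
```

Replaying the same transfer against `state1` fails because the account nonce has moved on, which is how the account model prevents a signed transaction from being applied twice.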
The state machine model is the foundation for reasoning about correctness, determinism, and consensus. If the state transition function is not perfectly deterministic, nodes diverge and the chain forks. Understanding which model (UTXO vs account) a chain uses shapes everything: transaction construction, parallel execution, privacy, and smart contract capability.
| Tool | Purpose |
|---|---|
| Bitcoin Core (bitcoind) | Reference Bitcoin UTXO implementation |
| Geth (go-ethereum) | Reference Ethereum account-model implementation |
| btcd (Go) | Bitcoin full node in Go, well-documented |
| ethereumjs-vm | JavaScript EVM implementation for study |
time.time() in a state
transition). Show how two nodes processing the same block arrive at
different states.
Implement both UTXO and Account models side by side. Process the same set of logical transfers (Alice sends Bob 5 coins, Bob sends Charlie 3 coins, etc.) through both models. Compare: state representation size, transaction size, ease of implementing multisig, and parallel validation capability. Write a comparison report.
Transaction Structure
A transaction is a signed instruction to transition the state. In Ethereum, a transaction (Type 2, EIP-1559) contains:
| Field | Description |
|---|---|
| chainId | Network identifier (prevents replay across chains) |
| nonce | Sequential counter per sender account |
| maxPriorityFeePerGas | Tip to the validator (EIP-1559) |
| maxFeePerGas | Maximum total fee per gas |
| gasLimit | Maximum gas units for execution |
| to | Recipient address (or empty for contract creation) |
| value | Amount of ETH to transfer (in wei) |
| data | Calldata (function selector + arguments, or contract bytecode) |
| accessList | Optional list of addresses and storage keys the transaction expects to touch (EIP-2930) |
| v, r, s | ECDSA signature components (recovery identifier plus the r and s values) |
The unsigned fields are RLP-encoded (for a Type 2 transaction, prefixed with the type byte 0x02), the result is hashed with Keccak-256, and the hash is signed with the sender's private key. The sender's address is not explicitly included in the transaction; it is recovered from the signature.
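The two fee fields in the table interact with the block's base fee, which EIP-1559 defines but the table does not list. The sketch below shows how the price actually paid per gas is derived; `effective_gas_price` and the `base_fee` parameter name are illustrative choices, and all amounts are in wei.

```python
GWEI = 10**9

def effective_gas_price(base_fee: int, max_fee_per_gas: int,
                        max_priority_fee_per_gas: int):
    """Return (price paid per gas, tip per gas) under EIP-1559 rules."""
    if max_fee_per_gas < base_fee:
        raise ValueError("maxFeePerGas is below the block's base fee")
    # The tip is capped by whatever headroom maxFeePerGas leaves above the base fee.
    tip = min(max_priority_fee_per_gas, max_fee_per_gas - base_fee)
    return base_fee + tip, tip

price, tip = effective_gas_price(30 * GWEI, 50 * GWEI, 2 * GWEI)
total_fee = price * 21_000  # 21,000 gas is the cost of a plain ETH transfer
```

The base-fee portion is burned and only the tip goes to the block producer, so a high maxFeePerGas does not by itself mean a high payment to the validator.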
Transaction Lifecycle
Block Structure
A block consists of:
The block header is the most critical component for consensus — nodes can verify headers without processing the full block body.
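Header-only verification can be illustrated with a toy chain. This sketch assumes headers are plain dictionaries hashed as SHA-256 over canonical JSON; real Ethereum headers are hashed with Keccak-256 over their RLP encoding, and the field set here is deliberately reduced. All function names are hypothetical.

```python
import hashlib
import json

def header_hash(header: dict) -> str:
    """Toy header hash: SHA-256 over a canonical JSON encoding (an assumption)."""
    encoded = json.dumps(header, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def make_chain(length: int) -> list:
    """Build `length` linked headers, each committing to its parent's hash."""
    headers, parent = [], "00" * 32
    for number in range(length):
        header = {"number": number, "parentHash": parent, "txRoot": f"root-{number}"}
        headers.append(header)
        parent = header_hash(header)
    return headers

def verify_header_chain(headers: list):
    """Return the index of the first header with a broken parent link, or None."""
    for i in range(1, len(headers)):
        if headers[i]["parentHash"] != header_hash(headers[i - 1]):
            return i
    return None

chain = make_chain(5)
tampered = list(chain)
tampered[2] = dict(tampered[2], txRoot="tampered")
```

Changing any field of header 2 changes its hash, so the break shows up at header 3, whose parentHash no longer matches. No block body was needed to detect it.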
Fee Mechanisms
Understanding transaction structure enables you to construct, sign, and decode transactions at the byte level. Understanding the block lifecycle reveals the incentive dynamics that drive block production, transaction ordering, and MEV extraction. Understanding fee mechanisms is essential for building applications where transaction cost predictability matters.
| Tool | Purpose |
|---|---|
| ethers.js / web3.js | Construct, sign, and send Ethereum transactions |
| cast (Foundry) | CLI tool for decoding and sending transactions |
| bitcoin-cli | Bitcoin Core RPC for transaction operations |
| Etherscan / Blockchair | Block explorers for transaction inspection |
| Flashbots Protect | Private transaction submission (MEV protection) |
Using ethers.js, construct an Ethereum transaction from scratch (set all fields manually), sign it, serialize it, and decode the raw bytes. Verify the signature and recover the sender address.
Build a transaction explorer. Given a block number, fetch the block, decode all transactions, and display: sender, recipient, value, gas used, and effective fee. For each transaction, verify the signature and confirm the sender matches. Implement in JavaScript using ethers.js.
Node Types
Full Node: Downloads and validates every block and transaction. Maintains the current state. Does not necessarily store all historical state.
Archive Node: A full node that additionally stores all historical state at every block. Enables queries like "What was this account's balance at block 5,000,000?"
Light Node (Light Client): Downloads only block headers and uses Merkle proofs to verify specific data on demand.
Validator/Miner Node: A full node that additionally participates in consensus — proposing blocks (PoW mining or PoS proposing) and voting (PoS attesting).
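The Merkle-proof check a light client performs can be sketched as follows. This assumes a plain binary Merkle tree with single SHA-256 hashing; Bitcoin uses double SHA-256 and Ethereum uses Merkle Patricia tries, so treat this as an illustration of the idea rather than either protocol's format. `verify_merkle_proof` is a hypothetical helper.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_merkle_proof(leaf: bytes, proof, root: bytes) -> bool:
    """`proof` is a list of (sibling_hash, sibling_is_left) from leaf level to root."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

# Build a four-leaf tree by hand; a full node would do this for every block.
l0, l1, l2, l3 = (h(tx) for tx in (b"tx0", b"tx1", b"tx2", b"tx3"))
n01, n23 = h(l0 + l1), h(l2 + l3)
root = h(n01 + n23)  # the light client only needs this value from the header

# Proof that tx2 is in the tree: its sibling leaf, then the opposite subtree.
proof_tx2 = [(l3, False), (n01, True)]
```

The proof contains two hashes for four leaves and grows logarithmically with the number of transactions, which is why header-only clients stay cheap.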
Mempool Dynamics
The mempool is each node's local pool of unconfirmed transactions. Key dynamics:
Sync Strategies
Peer Management
Nodes maintain peer connections with scoring systems:
Node architecture determines who can participate in the network, what they can verify, and how much resource commitment is required. The shift from PoW to PoS changed validator economics but didn't change the fundamental role of full nodes. Light clients are critical for mobile wallets and embedded devices but come with weaker security guarantees, because they trust that the headers they follow were validated by others. Understanding sync strategies matters when running infrastructure.
| Tool | Purpose |
|---|---|
| Geth | Ethereum execution client (Go) |
| Nethermind | Ethereum execution client (C#) |
| Prysm / Lighthouse / Teku / Lodestar | Ethereum consensus clients |
| Bitcoin Core | Bitcoin full node |
| nodewatch.io | Ethereum node distribution tracker |
| Ethernodes | Ethereum client diversity dashboard |
curl to make
JSON-RPC calls to your node. Fetch the latest block, get an account
balance, and trace a transaction.
Build a block header sync tool. Connect to an Ethereum node via RPC and download the last 1,000 block headers. Verify the chain: each header's parentHash must match the hash of the previous header. Report any gaps or inconsistencies. Measure download time and compare header-only sync vs. full block download.
Proof of Work (PoW) is a consensus mechanism where block producers (miners) compete to find a value (nonce) such that:
H(block_header || nonce) < target
where H is a cryptographic hash function and target is a number that determines the difficulty. A smaller target means more hashes must be tried on average, and therefore higher difficulty.
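The inequality above translates directly into a search loop. This is a minimal sketch that assumes single SHA-256 and an 8-byte big-endian nonce appended to an opaque header; Bitcoin's real header layout and double hashing are not reproduced. `pow_hash` and `mine` are illustrative names.

```python
import hashlib

def pow_hash(header: bytes, nonce: int) -> int:
    """H(header || nonce), interpreted as a 256-bit integer."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")

def mine(header: bytes, target: int, max_nonce: int = 2**32) -> int:
    """Return the first nonce whose hash falls below `target`."""
    for nonce in range(max_nonce):
        if pow_hash(header, nonce) < target:
            return nonce
    raise RuntimeError("no valid nonce in range")

# 12 leading zero bits: about 2**12 = 4096 attempts on average.
target = 1 << (256 - 12)
nonce = mine(b"toy-header", target)
```

Halving `target` doubles the expected number of attempts, while checking a claimed nonce always costs a single hash. That asymmetry is what makes the work cheap to verify.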
Mining Process
H(header || nonce) for each.
Difficulty Adjustment
Hash Rate and Security
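The network's total hash rate determines the cost of a 51% attack. An attacker who controls more than 50% of the hash rate will eventually produce a longer chain than the honest network and can rewrite history; with a smaller share, success is still possible but becomes exponentially less likely as the targeted block gains confirmations. The cost of the attack is the capital cost of the necessary mining hardware plus the electricity to run it.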
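The relationship between hash-rate share and attack success can be made concrete with the gambler's-ruin result from the Bitcoin whitepaper: an attacker with share q who is z blocks behind ever catches up with probability (q/p)^z, where p = 1 - q. The function name below is an assumption, and the whitepaper's full calculation additionally weights this by how far the attacker progressed while the confirmations accumulated.

```python
def catch_up_probability(q: float, z: int) -> float:
    """Chance that an attacker with hash-rate share q erases a z-block deficit."""
    p = 1.0 - q
    if q >= p:
        return 1.0          # with half or more of the hash rate, success is certain
    return (q / p) ** z

for share in (0.10, 0.25, 0.45):
    row = [catch_up_probability(share, z) for z in (1, 3, 6)]
    print(share, ["%.6f" % value for value in row])
```

This exponential decay in z is the reasoning behind waiting for several confirmations before treating a PoW payment as settled.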
Mining Pools
Individual miners have high variance in rewards (may mine for months without finding a block). Pools aggregate hash power and distribute rewards proportionally, reducing variance. Pool protocols: Stratum v1, Stratum v2.
ASIC Resistance
Some PoW algorithms (Ethash, RandomX) attempt to resist Application-Specific Integrated Circuits (ASICs), for example by requiring large memory access patterns that are inefficient on specialized hardware. This is a contentious design choice: ASIC resistance favors commodity hardware such as GPUs and CPUs (wider participation) but may reduce total hash rate (weaker security).
PoW was the original blockchain consensus mechanism and remains the security model for Bitcoin (the largest blockchain by market cap). Understanding PoW is essential for reasoning about mining economics, 51% attack costs, and why the industry moved toward PoS (energy efficiency, different security model).
| Tool | Purpose |
|---|---|
| cpuminer | Simple CPU miner for educational use |
| hashcat | High-speed hashing (demonstrates SHA-256 throughput) |
| Bitcoin mining calculator | Estimate profitability given hash rate and electricity cost |
| Stratum V2 | Modern mining pool protocol |
SHA256(header + nonce) starts with a specified number of
zero bits. Measure how nonce attempts scale with difficulty.
Build a complete PoW blockchain. Extend your Stage 1 capstone (data layer) with:
Proof of Stake (PoS) replaces computational work with economic stake as the Sybil resistance mechanism. Instead of mining, validators lock (stake) cryptocurrency as collateral. The right to propose and validate blocks is allocated based on stake weight.
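Stake-weighted allocation can be sketched as a weighted random draw. The example assumes a shared seed that every node already agrees on (production systems derive this from an on-chain randomness source) and uses Python's `random` module as a stand-in for the protocol's own sampling. `select_proposer` and the validator names are hypothetical.

```python
import hashlib
import random
from collections import Counter

def select_proposer(stakes: dict, seed: bytes, slot: int) -> str:
    """Pick one validator for `slot`, with probability proportional to stake."""
    validators = sorted(stakes)                  # canonical order on every node
    weights = [stakes[v] for v in validators]
    digest = hashlib.sha256(seed + slot.to_bytes(8, "big")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))  # per-slot deterministic RNG
    return rng.choices(validators, weights=weights, k=1)[0]

stakes = {"val-a": 32, "val-b": 64, "val-c": 0, "val-d": 160}
counts = Counter(select_proposer(stakes, b"shared-seed", slot) for slot in range(2000))
```

Every node that knows the seed and the slot number computes the same proposer, and over many slots each validator's share of proposals tracks its share of the total stake.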
Validator Selection
Different systems use different selection mechanisms:
Ethereum's PoS (Gasper = Casper FFG + LMD-GHOST)
Ethereum's PoS combines two mechanisms:
LMD-GHOST (Latest Message Driven Greedy Heaviest Observed SubTree): Fork choice rule. Instead of following the longest chain, nodes follow, at each fork, the branch whose subtree carries the most stake-weighted attestations, counting only each validator's most recent attestation. This provides fast chain growth.
Casper FFG (Friendly Finality Gadget): Finality mechanism. Every 32 slots (~6.4 minutes, one "epoch"), validators vote on checkpoint blocks. A checkpoint becomes "justified" when validators holding at least 2/3 of the total stake vote for it. When two consecutive checkpoints are justified, the first becomes "finalized": reverting it would require at least 1/3 of the total stake to be slashed.
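The justification and finalization rule can be sketched with integer arithmetic. This is a deliberate simplification: it takes one aggregate vote weight per epoch, whereas real Casper FFG votes are source-to-target checkpoint links and the specification has additional finalization cases. `is_justified` and `update_finality` are illustrative names.

```python
def is_justified(attesting_stake: int, total_stake: int) -> bool:
    """Supermajority test: at least 2/3 of total stake, without floating point."""
    return attesting_stake * 3 >= total_stake * 2

def update_finality(votes_per_epoch, total_stake: int):
    """Return (justified epochs, finalized epochs) for a list of vote weights."""
    justified, finalized = set(), set()
    for epoch, attesting_stake in enumerate(votes_per_epoch):
        if is_justified(attesting_stake, total_stake):
            justified.add(epoch)
            if epoch - 1 in justified:   # two consecutive justified checkpoints
                finalized.add(epoch - 1)
    return justified, finalized

# Total stake 300: epochs 0, 1, and 3 reach the 2/3 threshold; epoch 2 does not.
justified, finalized = update_finality([200, 250, 199, 210], total_stake=300)
```

Epoch 3 is justified but finalizes nothing, because the checkpoint before it missed the threshold. A single weak epoch therefore delays finality rather than breaking it.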
Slashing
Validators who violate protocol rules lose a portion of their stake:
Delegation and Liquid Staking
PoS is now the dominant consensus mechanism for smart contract platforms (Ethereum, Cosmos, Solana, Polkadot, Cardano). Understanding its security model — what attacks it defends against, what economic assumptions it requires, and where it differs from PoW — is essential for any blockchain engineer.
| Tool | Purpose |
|---|---|
| Ethereum Staking Launchpad | Guide for becoming an Ethereum validator |
| Lido | Liquid staking protocol |
| ethdo | CLI for Ethereum staking operations |
| Beaconcha.in | Ethereum beacon chain explorer |
| Rated.network | Validator performance analytics |
3f + 1 stake (at least 2/3 honest by
stake) for safety, matching BFT bounds?
Build a simplified PoS consensus simulator. 16 validators with varying stake weights. For each slot: (a) select a proposer weighted by stake, (b) have all validators attest to the proposed block, (c) implement a simple fork choice (heaviest chain). Run for 100 epochs. Track: proposer distribution, attestation agreement rates, and simulated rewards. Then introduce 4 Byzantine validators that double-attest—show that the slashing mechanism detects them.
Forks
A fork occurs when two valid blocks reference the same parent — the chain "splits" into two competing branches.
Natural forks: In PoW, two miners may find valid blocks nearly simultaneously. Nodes that receive different blocks first follow different chains. This resolves when the next block extends one of the forks (the other becomes an "orphan" or "stale" block).
Soft forks: A backward-compatible protocol change. Old nodes still accept new blocks (they don't enforce the new rules). New nodes reject blocks that violate the new rules. Soft forks tighten the rules.
Hard forks: A backward-incompatible protocol change. Old nodes reject new blocks that follow the new rules. The network splits unless all nodes upgrade.
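The difference between tightening and loosening rules can be shown with a toy validity rule. The block-size limits below are made-up numbers chosen only for illustration; the three functions are hypothetical and model a single rule rather than a full validation pipeline.

```python
def legacy_valid(block_size: int) -> bool:
    return block_size <= 1_000      # rule enforced by non-upgraded nodes

def soft_fork_valid(block_size: int) -> bool:
    return block_size <= 500        # tightened rule: a subset of legacy-valid blocks

def hard_fork_valid(block_size: int) -> bool:
    return block_size <= 2_000      # loosened rule: allows blocks legacy nodes reject

sizes = range(0, 2_501, 100)

# Every block that new soft-fork nodes produce is still accepted by old nodes.
soft_is_subset = all(legacy_valid(s) for s in sizes if soft_fork_valid(s))

# Blocks that hard-fork nodes accept but old nodes reject: the source of a split.
rejected_by_old_nodes = [s for s in sizes if hard_fork_valid(s) and not legacy_valid(s)]
```

As long as block producers follow the tightened rule, old and new nodes stay on one chain. The first block of size 1,100 or more under the loosened rule forces non-upgraded nodes onto a different branch.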
Chain Reorganization
A reorganization (reorg) occurs when a node discovers a heavier/longer chain that diverges from its current chain at some past block. The node abandons its current chain tip and switches to the new fork. All transactions that were in the abandoned blocks but not in the new fork are returned to the mempool.
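The bookkeeping a node performs during a reorg can be sketched as follows. The example assumes each chain is a list of (block_id, transaction ids) from genesis to tip and that both chains share the same genesis block. The function name is hypothetical, and the re-validation a real node applies to returned transactions is omitted.

```python
def reorganize(old_chain, new_chain):
    """Return (reorg depth, transactions to put back in the mempool)."""
    fork = 0
    limit = min(len(old_chain), len(new_chain))
    while fork < limit and old_chain[fork][0] == new_chain[fork][0]:
        fork += 1                                   # walk up to the last common block
    abandoned = old_chain[fork:]
    adopted_txs = {tx for _, txs in new_chain[fork:] for tx in txs}
    back_to_mempool = [
        tx for _, txs in abandoned for tx in txs if tx not in adopted_txs
    ]
    return len(abandoned), back_to_mempool

old_chain = [("g", []), ("a1", ["t1"]), ("a2", ["t2", "t3"])]
new_chain = [("g", []), ("b1", ["t2"]), ("b2", ["t4"]), ("b3", [])]
depth, returned = reorganize(old_chain, new_chain)
```

Here the node drops two blocks. Transaction t2 survives because the new branch also includes it, while t1 and t3 go back to pending status and may or may not be mined again.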
Reorg depth matters:
Finality Models
Forks and reorgs are not theoretical concerns — they happen regularly and have real consequences. Applications must handle reorgs gracefully: a payment confirmed in block N may be reversed if a reorg replaces that block. Exchange deposit policies (requiring N confirmations) are directly informed by reorg risk. Protocol upgrades (soft and hard forks) are the mechanism for blockchain evolution.
| Tool | Purpose |
|---|---|
| Fork Monitor (bitcoin.ninja) | Real-time Bitcoin fork detection |
| Reorg tracker (blockchair.com) | Multi-chain reorg monitoring |
| eth_getBlockByNumber | RPC call to track block changes |
| Bitcoin Core getchaintips | Detect alternative chain tips |
Build a fork visualizer. Given a blockchain with forks (use your simulator from Topic 4 or fetch real data), render a tree visualization showing the main chain and orphaned forks. For each fork, show: depth, number of transactions affected, and time to resolution. Highlight any transactions that appeared in the abandoned fork but not in the canonical chain.
Native Tokens vs. Smart Contract Tokens
Every blockchain has a native token (BTC, ETH, SOL) used for transaction fees and consensus incentives. This token exists at the protocol level — its transfer is a primitive operation of the state transition function.
Smart contract tokens are implemented as state within a smart contract. The "token" is just a mapping (address → balance) maintained by contract code. Token transfers are contract calls, not protocol-level operations.
Ethereum Token Standards
ERC-20 (Fungible Tokens): A standard interface for fungible tokens. Functions: transfer, transferFrom, approve, balanceOf, totalSupply, allowance. Every ERC-20 token exposes the same interface, and from the protocol's perspective all of them are ordinary contract calls; the VM doesn't know or care about the "meaning" of the token.
ERC-721 (Non-Fungible Tokens): Each token has a unique tokenId. Functions: ownerOf, transferFrom, approve, safeTransferFrom. Ownership is a mapping (tokenId → address).
ERC-1155 (Multi-Token Standard): Supports both fungible and non-fungible tokens in a single contract. Enables batch transfers. More gas-efficient for games and applications with many token types.
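The ERC-20 data model described above fits in a short Python class. This is a sketch of the bookkeeping only, assuming the caller's identity is passed in explicitly (on-chain it would be msg.sender) and recording events as plain tuples. `ToyERC20` is a hypothetical name and is not derived from any audited implementation.

```python
from collections import defaultdict

class ToyERC20:
    def __init__(self, supply: int, owner: str):
        self.total_supply = supply
        self.balances = defaultdict(int, {owner: supply})   # address -> balance
        self.allowances = defaultdict(int)                  # (owner, spender) -> amount
        self.events = []

    def _move(self, sender: str, recipient: str, amount: int) -> None:
        if amount < 0 or self.balances[sender] < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] -= amount
        self.balances[recipient] += amount
        self.events.append(("Transfer", sender, recipient, amount))

    def transfer(self, caller: str, recipient: str, amount: int) -> None:
        self._move(caller, recipient, amount)

    def approve(self, caller: str, spender: str, amount: int) -> None:
        self.allowances[(caller, spender)] = amount
        self.events.append(("Approval", caller, spender, amount))

    def transfer_from(self, caller: str, owner: str, recipient: str, amount: int) -> None:
        if self.allowances[(owner, caller)] < amount:
            raise ValueError("allowance exceeded")
        self._move(owner, recipient, amount)     # raises before any state changes
        self.allowances[(owner, caller)] -= amount

token = ToyERC20(100, "alice")
token.transfer("alice", "bob", 30)
token.approve("alice", "dex", 50)
token.transfer_from("dex", "alice", "carol", 20)
```

The allowance table is what lets a third party such as an exchange contract move tokens on a user's behalf. It is also why an unlimited approval stays dangerous long after the user has stopped interacting with that contract.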
UTXO vs. Account Model — Deep Comparison
| Dimension | UTXO (Bitcoin) | Account (Ethereum) |
|---|---|---|
| State representation | Set of unspent outputs | Map of addresses to accounts |
| Transaction model | Consumes and creates UTXOs | Debits sender, credits receiver |
| Parallelism | High (UTXOs are independent) | Lower (shared state per account) |
| Privacy | Better (new UTXO per transaction) | Worse (persistent address) |
| Smart contracts | Limited (Script) | Rich (EVM) |
| State growth | Limited to the UTXO set (spent outputs can be discarded) | Unbounded (accounts and storage persist) |
| Double-spend prevention | UTXO can only be spent once | Nonce ordering per account |
State Growth and Pruning
As the blockchain processes more transactions, the state grows. In Ethereum, every new account and every new storage slot increases the state trie. This creates a "state bloat" problem:
Token standards define the interface layer for the entire DeFi and NFT ecosystem. Understanding why ERC-20's approve/transferFrom pattern creates a specific security surface (infinite approvals, front-running) enables better dApp design. Understanding state growth is critical for long-term protocol sustainability: a blockchain that grows without bound eventually becomes impractical to operate.
safeTransferFrom function checks whether the
recipient can handle NFTs, preventing accidental locks.
| Tool | Purpose |
|---|---|
| OpenZeppelin Contracts | Audited ERC-20, ERC-721, ERC-1155 implementations |
| erc20-watcher | Monitor ERC-20 token events |
| Dune Analytics | On-chain analytics and token data queries |
| Ethereum State Size dashboards | Monitor state growth |
approve/transferFrom pattern
instead of a simpler transferTo function? What problem
does it solve, and what new problem does it create?
transfer call and a
approve + transferFrom call through the
code. Identify where balances change and where events are emitted.
gettxoutsetinfo). Compare the UTXO set
memory footprint to Ethereum's state.
Build a multi-token system. Implement a simplified token registry (in Python, not Solidity—this is about the data model, not the smart contract platform) that supports: (a) creating new fungible token types, (b) minting tokens, (c) transferring tokens between addresses, (d) querying balances. Then extend it to support non-fungible tokens with unique IDs. Compare the data structures required for each type.
Extend your PoW blockchain to include basic P2P networking. Two nodes communicate over TCP. Node A mines a block and gossips it to Node B. Node B validates and adds it. Implement basic fork resolution (longest chain wins).
Build a tool that connects to an Ethereum RPC endpoint, fetches a range of blocks, and produces a report: average gas used, transaction count distribution, fee distribution (base fee vs. priority fee), and ERC-20 transfer events decoded from logs.
Build a dashboard (web or CLI) that compares PoW and PoS across dimensions: energy consumption (estimated), finality time, throughput (TPS), hardware requirements, and minimum stake/investment to attack. Use real data from Bitcoin and Ethereum.
Simulate a mempool with 1,000 pending transactions of varying gas prices. Implement a block producer that greedily fills a block (30M gas limit). Compare strategies: highest gas price first, highest total fee first, and include MEV bundles. Measure producer revenue under each strategy.
Build a simplified but fully functional blockchain node that integrates everything from this stage:
Deliverable: Two nodes running on different ports, independently mining, gossiping blocks, and converging on the same chain. A client can submit a transaction to either node and see it reflected in both.
| Mistake | Reality |
|---|---|
| "Transactions are instant on the blockchain" | Transactions are pending until included in a block. Even after inclusion, they are not final until sufficient confirmations (PoW) or finalization (PoS). |
| "Miners/validators can steal your funds" | Miners can reorder, delay, or censor transactions. They cannot forge signatures or spend funds they don't control. |
| "PoS is less secure than PoW" | Different security model, not weaker. PoS has economic finality (slashing); PoW has thermodynamic security (energy cost). |
| "More TPS = better blockchain" | TPS is meaningless without context: what is the decentralization? What is the finality model? What are the hardware requirements to run a node? |
| "The mempool is a single place" | Every node has its own mempool. There is no canonical global mempool. |
| "Hard forks always create two chains" | Hard forks only create persistent splits when a significant minority refuses to upgrade (e.g., ETH/ETC). Most hard forks are coordinated upgrades where the old chain dies. |
At this stage, internalize: