Recently I have tried to understand blockchains and how they actually work. It seems the internet is full of high-level descriptions that always stop before actually explaining how the different blockchain variants really work. What is proof-of-work in practice, and what does proof-of-stake actually mean? What actually happens when you deploy a blockchain node in different variants, how does the blockchain “network” reach “consensus”? What is the blockchain “network”? What is “concensus”? What is “byzantine fault tolerance” and how does it relate to blockchains?
I started with looking into Bitcoin, since it is the most popular and has the most information available. There is a nice book about it even, with reasonable amounts of implementation level details as well. The book is named “Mastering Bitcoin: Programming the Open Blockchain 2nd Edition”. Just in case the Amazon link breaks. I am sure there are many other great books and resources but I really liked that one. The Naivecoin tutorial I discussed before is also good stuff. So generally, resources are there for understanding Bitcoin style PoW blockchains, also at more detailed level. But let’s see what I think, and you can then correct me. I start with PoW and then give some PoS assumptions.
Proof of Work (PoW)
Proof of Work simply refers to running a specific hash function for long enough to find a set of values that produces a hash value matching some given criterion. An example using a generic block structure, mostly imitating Bitcoin block structure:
- prev_hash: Previous block hash
- merkle_root: Hash for data in this block (Merkle root)
- timestamp: Current timestamp
- diff_target: Difficulty target
- nonce: A value that is changed to try to find different hashes
- transaction_list: List of transactions included in this block
I use the above properties as an example for hashing. Additionally, there would be a number of more generic meta-data, such as magic number, block size, version. Such metadata are used in parsing to find expected data size of block (e.g., bytes to read from socket), and to verify the block data is valid (correct magic number, version number, ..). For more details on actual Bitcoin block elements, see e.g., here.
To build a hash for PoW, the above list of data can be all hashed together. So it becomes something like this:
block_hash = SHA256(prev_hash+merkle_root+timestamp+diff_target+nonce+transaction_list)
This is slightly simplified, but serves the purpose. Here SHA256 is a specific hash function, and various other hash functions are often used as well. The resulting block_hash can be used to verify the contents of the block, since to get the same hash value you need to have the exact same block contents as input for the hash function. This protects from anyone changing a processed block (e.g., modifying existing transactions already in the blockchain). Since the attacker would have to be able to re-calculate a new hash for this block, as well as all blocks coming after it (so the longer/deeper the block is in the chain, the “safer” it is from modification as it is buried deeper under new blocks).
The blockchain itself is just a long list of blocks, each referencing to the previous block (with prev_hash). The term chain refers to each block linking to previous one this way. This connection is enforced by including the prev_hash in the next block along with block_hash of the block itself. So you cannot just change the order of the blocks in the chain, as the linked prev_hash values would no longer match.
Look, I made a pretty picture:
Here, the block itself contains the block_hash, prev_hash, and block_data. So block_hash is calculated over all the rest of the data in the block (including prev_hash). Prev_hash links the block to the previous block in the chain, as it is the block_hash of the previous block. Block_data contains all the rest of the data, and is of course included in the block_hash, meaning block_hash verifies the integrity of all this data. Because changing block_data (or prev_hash) would change block_hash, and thus invalidate the whole chain (links would no longer be valid as prev_hash would differ from block_hash of previous block). Cute.
Anyway. What is proof of work? Anyone can calculate a block hash using the above hashing formula. But not just any hash will be accepted by the blockchain network. The acceptance criteria is defined in the software designed to run the blockchain network. The most common one is the difficulty based criteria. The Naivecoin tutorial has a good and practical description of one. The difficulty is calculated using an algorithm with several parameters. A typical example:
- current_difficulty: Current mining (PoW) difficulty.
- adjustment_interval: Number of blocks to produce until difficulty should be adjusted.
- target_interval: The target time that should be spent between generating blocks.
The blockchain always targets to produce blocks with an interval time of target_interval between them. For example, Bitcoin uses a target_interval value of 10 minutes, and a adjustment_interval of 2016 blocks. With a good difficulty value, 2016 blocks should take 2 weeks to generate. This is 2016*10/60/24=14 days. 2016 is the number of blocks in adjustment interval, 10 is for 10 minutes per block, 60 is for minutes in hour, 24 for hours in a day. If mining 2016 blocks takes less time than 14 days, the difficulty will be adjusted higher. If the mining time is more than 14 days, the difficulty is adjusted lower. This way the blockchain network adjusts its parameters over time to keep the amount of coins mined close to the specified rate. Many blockchains use shorter than 10 minute time_interval and shorter than 14 days adjustment_interval. But generally, the idea is again the same. This is the basic idea, which can be built on by other difficulty adjustment algorithms such as LWMA, which seems to be quite popular with many altcoins currently.
The process of mining in PoW systems is to try hashing different block configurations (thus different hash inputs), until you get one that passes the difficulty check. The nonce parameter is there specifically for this purpose, to let miners try different values to get a hash that passes the difficulty check. Additionally, one can play to some extent with the choice of transactions to include as well as the exact timestamp used. These all change the input to the hash function, and thus the produced hash value. But nonce is the intended main value to change. Additional considerations can also be applied, such as extra_nonce etc. but the main concept is the same.
What is the difficulty exactly? It is simply a reference to a hash value that needs to be found. With higher difficulty, the target hash value is set to be harder to find. With lower difficulty, it is set to be easier to find. This is based on the fact that a hash value is just a very large number, even if usually represented as a hex-string. Examples:
- Difficulty target set to 100k: You need to find a hash value that is less than 100k.
- Difficulty target set to 1M: You need to find a hash value that is less than 1M.
Out of these two, the 100k difficulty target is higher in difficulty, because there are much fewer possible target values (100000). 1M is a much “easier” difficulty target since there are 10x the possible accepted hash output values (100000 vs 1 million). Thus, a higher difficulty target means a lower difficulty (i.e., easier to find a suitable hash and thus create a valid block). Most confusing, I am sure. There can be variants of this, such as using a number of leading zeroes required for the hash output instead of under a specific number, but the concept is still the same.
The hashing algorithm presented far above (block_hash=…) produces a single value. And we cannot control what value it produces, since hash-functions are one-way and no-one has been able to figure a way to reverse them (directly generate input to give specific hash output). So the only way to find a hash matching the target difficulty is to keep trying different input combinations until we hit one which produces a suitable hash value (below the minimum difficulty target). The “Proof of Work” is then just proof that you have generated huge numbers of these hashes, enough to find one matching the difficulty criteria. Hashing is the work, the proof is producing a hash matching the difficulty target.
Peer to Peer (P2P) consensus
Once someone finds a block (produces an accepted hash..), the block needs to be added to the blockchain. This happens when the finding node propagates the new block with the valid block hash to its peers, which propagate it further to their peers, and so on for the whole network. The peers can all check that the data in the block produces a valid hash in relation to the difficulty target, and that the data in the block is verified (prev_hash matches the block_hash on previous block added to the chain, transactions in transaction_list are valid, etc.). This of course requires all nodes to be connected in a peer-to-peer fashion.
The peer-to-peer network is established and continously updates itself over time. When a blockchain node boots up, it typically establishes a connection to the blockchain network via a pre-configured set of seed nodes. These can be static IP addresses stored in the client, or a set of pre-defined DNS server addresses. Once the node has connected this way to its first set of peers, these peers provide it with additional peer nodes to connect to. The node can request more peers from these peers, and so on. Connections to nodes may drop over time, and the node can use peer lists it receives from others to find new ones.
The initial synchronization of the blockchain state happens by the nodes sending their confirmed blocks, transactions, and other blockchain state information to each other and any new peer joining the network. Once nodes are synchronized, the new mined blocks are added to the blockchain as soon as some node finds them. New “money” is created by the mining node adding a “coinbase” transaction to any new block it finds (gets an accepted hash for), giving itself the number of coins agreed in the network specification for finding a new block.
So this is how distributed consensus is achieved in PoW networks. But how about PoS networks? Not too different (or so I think), but the way of creating and assigning new blocks has to be different. That is the whole point, of course..
Proof of Stake (PoS)
The above information on Proof of Work is actually relatively simple to find on the internets and books. The Bitcoin book I linked above is one great source for such information. This is maybe because Bitcoin is such a well known and long-standing project, it has a huge market for information, and so there a lot of information about it. For Proof of Stake, there is much less information, and I cannot name a single coin in PoS side that would be such a “canonical” project. Ethereum is talking about making the switch from PoW to PoS, and some proposals for this in the form of “Casper” proposals are available. However, these are written in a very theoretical manner, and digress all over the place. Not so great for someone like me, who just wants a good understanding at a bit more detailed level.
So what do I think is Proof of Stake? At a high level it is always described as distributing new coins based on the amount of coins you already have. So each round (staking interval) gives a set of new coins to existing owners, based on their “stake” in the network. The stake would be the number of coins you already have, and are willing to “stake”. And quite often this is referred to as requiring the users wallet to be online to be “staking”, that is to be able to receive these rewards. So one might expect that the wallet is connected to other peers in a similar peer-to-peer network as for PoW, but the network somehow agrees to give new coins pediodically to someone “staking” instead of using a PoW hashing contest. But how does this staking actually happen? How does a distributed and assumably decentralized network ever agree on who gets the coins?
I found some Reddit post (which I lost the link for) and a couple of StackExchange posts (post1, post2). And the Blackcoin and Peercoin whitepapers. These all seem to describe the same thing, even if with a bit of a different twist. In any case, what they describe makes sense, so I will go with that. There is also Ouroboros whitepaper, Tendermint whitepaper, and the Ethereum PoS FAQ. But these drift off in way too academic way for me.
The algorithm I get from these is very similar to that of PoW hashing. The main difference is that, there is no field in the block data that is hashed that can be easily varied to such a large extent that is done in the PoW implementations. Instead, only a selected set of core values in the block is selected as the state given as input to the hash function. This set of core values is called the “kernel”. Since the data selected for this input has only a limited amount of possible values, and thus hash outputs, it makes no sense for the “miner” to spend a lot of effort and energy on trying different values. Mainly because there are only a few values to try at each point in time. This is how you try to escape the hashing contest of PoW coins, and save the planet by reducing energy consumption.
So in this scenario, I would assume each “staker” takes their set of kernel values, along with a timestamp, hashes these, and if they get a value that fits the difficulty target of the network, they can be selected to “mint” new coins. The timestamp is masked to remove the lowest bits, in order to make it unfeasible to try a large number of close-to-each-other timestamps. The number of coins owned is used as a way to weight the hash outcome higher, and to increase the likelihood to “mint” the new coins on each round based on the number of coins staked. A number of previous blocks hashes is also included in the equation to make it more difficult to try to come up with different hash values for future blocks.
From the posts I linked above, I get the more sensible explanation of how the node to create a block is selected. So the hash you end up with from your kernel data is compared against the network difficulty (similar to PoW), and the difficulty is also adjusted at similar intervals as with PoW. Some of the early coins (e.g., the Peercoin whitepaper) also discuss adding coin age into the equation to favor coins that have not been selected recently. Later on the Blackcoin whitepaper sees this as a potential problem (even possible attack vector) and removes age from the equation. And from there I guess it has diverged to all sorts of academic exercise, that are simple but made to look complex.
In any case, I guess there are other ways to do different parts of PoS. But the process in general as I described above seems like it would make sense. Please do correct and tell me how it really works.. 🙂
Something I wonder after all this is, why is it always said you need to have your wallet online to “stake”? I don’t see a need, other than to have something around to calculate your kernel hash. You could just grab the kernel input data from the wallet, and keep a separate program online to calculate the PoS hashes. This way there would be much less risk of wallet being compromised, which is often cited as a risk of PoS staking. I must be missing something again?