We’re working on changing the Tezos consensus algorithm from the current Emmy+ algorithm, to a new algorithm called Tenderbake. We’d like to discuss this development, and explain why we’re considering it, and what advantages it will bring.
Tenderbake and Emmy+ belong to different algorithm families:
- Emmy+ is a Nakamoto style algorithm, whereas
- Tenderbake is a classical BFT-style algorithm.
So moving to Tenderbake would be a significant development on the Tezos network.
We made every effort to keep this blog post self-contained, but just in case, you might also find this glossary of the technical terms useful!
Tenderbake has quick, deterministic finality
From the point of view of the user, Tenderbake’s killer feature, relative to Emmy+, is that it offers deterministic finality: a block that has just been appended to the chain of some node is known to be final once it has two additional blocks on top of it, regardless of network latency.
Tenderbake is also fast, in the sense that it has a small (quick) time to finality: under typical good network conditions, and making a standard assumption of an attacker (“byzantine”) stake of at most 33% — meaning that at most 1/3 of the network is trying to undermine correct behaviour:
- in Tenderbake, one would expect to wait less than 1 minute for a block to be considered final, whereas
- in Emmy+, one would expect to wait at least 6 minutes.
How this estimate is made will become clearer below, when we introduce the notion of round duration.
We now expand on some of these claims:
Probabilistic finality (Nakamoto style)
Blockchains are decentralized, so consensus must likewise be decentralized. So suppose there exist blocks \(b\), \(b_1\), and \(b_2\) such that some participants on the blockchain’s network believe that \(b_1\) immediately follows \(b\), whereas others believe that \(b_2\) immediately follows \(b\): there is no central authority to enforce consensus on which participants are “right” and which are “wrong”, and we say the blockchain is forked, or in a forked state.
In Emmy+, like in all Nakamoto-style consensus algorithms, forks can have arbitrary length. However, forked states become exponentially unstable and tend to collapse down to a single branch (assuming decent bounds on network latency).
When a fork collapses to a single branch, we say that its blocks have reached finality. So, we say that Emmy+ has probabilistic finality because forks of arbitrary length are possible but they collapse with probability that increases suitably rapidly with fork length.
Deterministic finality (classical BFT-style)
In Tenderbake, forks are impossible. Or to put it slightly differently: forks collapse down to finality after just two blocks, always and regardless of network latency. This is deterministic finality. How this is achieved, is the topic of this post.
- The head of the blockchain is a candidate block to be agreed upon, and
- its parent is a block whose non-consensus operations have been agreed, but the consensus operations might still change (on rare occasions).
Tenderbake is safe under asynchrony
From the point of view of a security analyst, Tenderbake has an advantage over Emmy+: no fork is possible, regardless of network delays, even during an asynchronous period.
Let’s explain. Our network model assumes partial synchrony, which splits time into two kinds of periods:
- During a synchronous period, there’s a global bound on the delivery delay for any network message.
This is (hopefully) the usual state of affairs. Messages are delivered promptly (for a certain global value of “promptly”).
- During an asynchronous period, network performance may be degraded: but, every asynchronous period is finite.
This is (hopefully) an exceptional state of affairs.
This is a realistic scenario because in the real world,
- networks work …
- … until they don’t, and then action is taken to fix the problem and service is restored in finite time.
So it is important for a consensus algorithm to
- function efficiently during synchrony, and just as important to
- (perhaps degrade but then) recover gracefully from asynchrony.
So back to our security analyst: Tenderbake guarantees that even if the network degrades during an asynchronous period, once synchrony is restored our blockchain will just pick up where it left off. Nakamoto-style consensus does not have this guarantee, and may emerge from an asynchronous period having developed long forks.
Nakamoto is “live”, classical BFT is “safe”.
As is so often the case in consensus, design boils down to making trade-offs: given that some asynchronous periods will eventually occur, do we prefer to be “live” or “safe” during them?
Nakamoto-style consensus algorithms favour being live: blocks are always produced, even in an asynchronous period. Classical BFT-style algorithms favour being safe: production of blocks pauses during an asynchronous period.
Our network model when designing Tenderbake assumes that any period of asynchrony is finite — we assume that somebody is going to rush to fix the broken cable, protect against the DoS attack, and/or bring the server back online. So it makes sense to be safe and (essentially) just wait for the network to come back online.
From Tendermint to Tenderbake: a journey
The starting point of our work was Tendermint, one of the first classical BFT-like blockchain algorithms.
Tenderbake is ‘just’ a version of Tendermint adapted for the Tezos blockchain, but the adjustments required are substantive, as we discuss below. In summary:
- Tenderbake is tailored to match the Tezos architecture by using only communication primitives and network assumptions which Tezos supports.
- In particular, Tenderbake makes weaker network assumptions than Tendermint, at the price of adding the extra assumption that participants have loosely synchronized clocks1 — which is fine, because Tezos uses them.
The Tezos architecture
When we adapted Tendermint consensus to the Tezos architecture to arrive at Tenderbake, we had to account for the following structural features of Tezos:
Tezos’ self-amending feature means there is a separation between
- the shell, which handles low-level communication, and
- the economic protocol, which actually does the blockchain stuff and which can be updated by a Tezos amendment.
The Tezos shell is for our purposes immutable, so our Tenderbake implementation has to fit into the more easily-updated economic protocol. In particular, we can use only communication primitives supported by the Tezos shell.
Furthermore, Tezos distinguishes between
- a node, which is on the peer-to-peer network and whose job is to manage communication with other nodes and the validation and the storage of blocks, and
- a baker, who produces blocks and communicates with just one node.
We recall that bakers hold delegates’ keys and therefore for safety they are sandboxed — not directly exposed to the peer-to-peer network. So nodes manage communication and shield bakers from network attack, and bakers hold secrets and bake blocks into the blockchain. This arrangement affects what network assumptions we can make.
So, treating these points in turn:
Tendermint uses three types of consensus messages: proposals, prevotes, and precommits. We must map these to the following available shell communication primitives:
- an operation; and
- a block, which consists of
- some block contents which is a list of operations, and
- a block header which contains a hash of the block contents (for efficiency and as a checksum for data integrity).4
- Tenderbake maps Tendermint proposals to the native Tezos action of proposing Tezos blocks, and
- Tenderbake maps Tendermint prevotes and precommits to Tezos operations.5
Bakers are semi-isolated from the network for their own safety, as discussed above, so Tenderbake must weaken the network assumptions of Tendermint3 because those assumptions are too strong to be guaranteed by the Tezos peer-to-peer layer.
Tendermint assumes reliable broadcast: if a correct validator receives a message then all correct validators receive it. Notably, this requires that messages may be delayed, but messages may not be lost, during an asynchronous period.
Tenderbake does not assume communication is reliable. Messages sent during an asynchronous period might never arrive. In fact, Tenderbake is designed to work without any additional assumption except partial synchrony. The technical jargon is: it assumes a best-effort broadcast primitive. Best-effort broadcast means that if a correct validator sends a message during a synchronous period, then all correct (non-byzantine) validators receive it. Best-effort broadcast is the broadcast primitive implicit in the definition of partial synchrony above.
The Tenderbake consensus algorithm in brief
- The level of a block is the number of blocks since the genesis block, where the genesis block is at level 0.7
- The fundamental unit of identity in Tezos is a cryptographic key. A delegate is then a cryptographic key whose owner can participate in the consensus algorithm and in the governance process by virtue of registering (the public part of) their key on the chain, holding at least one roll (= 8,000 tez), and having been active recently.
Tenderbake is executed for each new block level by validators, which are delegates selected at random based on their stake,8 in the same way as endorsers are selected in Emmy+.
The validators’ task is to agree on which block to add next. In Tendermint, this process is
- started by each validator emitting a proposal message, which is just an abstract message in the algorithm proposing a block, and
- continued by validators voting on which proposal to accept, by a voting mechanism which we will describe in more details shortly; what matters for now is that voting takes place so voting messages must be communicated.
Tenderbake has to choose a concrete representation for this process. The natural way to propose a block is for a validator to use the native Tezos proposal mechanism to actually propose its preferred block as the next block to add to the next level in the Tezos blockchain. Validators can then vote on the proposed blocks by communicating their votes across the Tezos network as Tezos operations.9
So schematically, Tenderbake acts as follows:
- a validator injects a candidate block (representing a proposal) and consensus operations (representing votes) into the node to which it is attached, which then
- diffuses those blocks and consensus operations to other nodes of the network, and thus
- communicates them to the validators attached to those nodes,
- to carry out voting on which block to accept.
We now consider how the voting process works, in more detail.
Levels are composed of rounds
For each level, Tenderbake proceeds in rounds. Each round represents an attempt by the validators to agree on some block for the current level.
Each round has an associated duration. Round durations are set to increase so that for any possible message delay and/or asynchronous period (when the network may be slow or unreliable), there is a round that is longer.
Each round has three phases:
- a block proposal phase;
- a preendorsement voting phase, and
- an endorsement voting phase.
In more detail:
- In the block proposal phase, one of the validators is designated as the round’s proposer. The proposer’s task is to propose a candidate block (more on how this is chosen in a moment).
- In the preendorsement phase, validators send a supporting vote (a preendorsement) on the candidate block’s contents. So, this is a vote in support of the non-consensus operations10 that it contains.
Concretely, a preendorsement is a Tezos operation containing a tuple \((x,bc,r)\) with intended semantics
“I, validator \(x\), hereby endorse block contents \(bc\) for round \(r\)“.
- In the endorsement phase, provided validators have observed a quorum11 of preendorsements for the block contents, validators send a confirmation vote (an endorsement) for the contents of the candidate block and for the preendorsement quorum.
Concretely, an endorsement has the intended semantics
“I, validator \(x\), hereby affirm to have observed a quorum of preendorsemeents for block contents \(bc\) at round \(r\)“.
First round vs. subsequent rounds
In the first round, the proposer is free to propose some block contents taken from its node’s mempool — this being a pool of pending operations that the node has accumulated, but which have not yet been baked into a block on the blockchain. In subsequent rounds, the proposer must propose the contents of the block candidate from the previous round, provided a quorum of preendorsements was observed for the block contents proposed in the previous round; otherwise, the proposer is free to choose from its node’s mempool, as for the first round.
If a validator observes an endorsement quorum at its current round for that round’s candidate block, then the validator considers that agreement has been reached on the block candidate’s contents for level \(\ell\),12 and the validator can move to level \(\ell+1\). Each block candidate at level \(\ell+1\) will have to include the endorsement quorums of level \(\ell\), as a proof that agreement was indeed reached at level \(\ell\).13
If a validator does not observe an endorsement quorum at its current round for that round’s candidate block, then the validator considers that agreement has not been reached, and the validator loops back into a next round. This may be caused by:
- a byzantine proposer which does not make any proposal, or proposes two candidate blocks, or
- network delays: (pre)endorsements are sent but just don’t arrive in time, so that reaching a quorum times out.
A final round
Eventually there will be a round with
- a correct (non-byzantine) proposer who proposes precisely one candidate block as required, and
- the round is long enough so that any asynchronous periods have passed and network messages arrive in time, so that
- a quorum is observed and a candidate block for that round is agreed upon,15
and our loop terminates with consensus.14
How long does this all take?
Assuming good network conditions, validators will agree on a block after just one round, so that each level lasts one round. Also, finality is in two blocks, so that finality is achieved after two levels = two blocks.16 Thus, assuming good network conditions, the estimated time for block finality is one minute at most.
On first experiments on a private testnet for Tenderbake, the duration of the first round was set to 15s, so we confirmed that this one-minute estimate is experimentally accurate.
Why do we need (pre)endorsements?
Tenderbake has two voting phases: preendorsements and endorsements. Why vote twice?
These algorithms are genuinely subtle, and this particular design is typical of classical BFT-style consensus algorithms, going back for example to the seminal 1984 DLS (Dwork-Lynch-Stockmeyer) paper. We offer intuition as to why having two phases can be helpful, which — to be clear — is not intended as a definitive explanation:
Intuitively, having two phases helps to ensure agreement and progress even if the network gets partitioned or participants crash.
Suppose we had only one voting phase (so no preendorsements). Then either we allow validators to change their vote in later rounds, or we don’t:
- If we allow validators to change vote in later rounds…
… then suppose only one validator \(v\) sees an endorsement quorum and decides on some block \(b\), but then it crashes. The other validators are not aware of \(v\)‘s decision, so they may later vote on a different block. Thus, the agreement property may be broken.
- If we don’t allow validators to change vote in later rounds…
… then suppose the proposer for a round is Byzantine and it proposes two blocks \(b\) and \(b'\); then some validators vote on \(b\), and some vote on \(b'\), and neither \(b\) nor \(b'\) gathers a quorum of votes. Then there will be no agreement, because validators have to stick with the votes forever. Thus, the progress property is broken.
This is solved by using two voting phases. Indeed, with two voting phases, participants only endorse a block once they are confident that a preendorsement quorum already exists for it. This adds stability: a fragmented network might delay consensus, but it cannot enable a fragmented consensus.
To quote Pierre Chambart, answering the question “Why preendorsements?”:
Ça permet de mesurer si tu es dans la majorité, avant de prendre la décision de voter pour de vrai. (J’imagine la gueule d’une vraie élection si les gens ne voulaient voter que pour le vainqueur).
This allows you to know you’re voting with the majority, before casting your final vote. (Imagine what a real election would look like, if people only wanted to vote for the winner.)
The interested reader can also watch Ittai Abraham’s excellent tutorials:
- Byzantine fault tolerance, state machine replication and blockchains and
- The HotStuff approach to BFT (Hotstuff is a successor to Tendermint).
Where are we?
- We proved on paper that Tenderbake is correct, in a sense made formal by Theorems 5 and 6 of that paper.
- Then, we worked on a prototype to test our approach. This prototype took the form of a “demo” economic protocol, which includes just the consensus algorithm and a basic account system (but nothing more: e.g. no rolls, delegation, smart contracts).
- We are now close to a fully-featured economic protocol (a modified version of Delphi using Tenderbake instead of Emmy+).
- A private testnet is running a modified version of Delphi using Tenderbake instead of Emmy+. After fixing some observed issues, we plan to make this testnet public.
So that’s it! Tenderbake is classical BFT-like whereas Emmy+ is Nakamoto-like. Emmy+ has probabilistic finality making it more live but less safe, whereas Tenderbake has deterministic finality making it more safe but less live. Tenderbake also has noticeably quicker time to finality.
We’re testing it now: so watch this space.
Recall our network model assumes that we are always either: in a synchronous period, when there exists a global bound \(\delta\) such that for every pair of nodes on the Tezos network, messages get delivered within time \(\delta\); or in an asynchronous period, when they don’t, but asynchronous periods only ever last finite time.
Say a network has loosely synchronized clocks when for every synchronous period there exists some constant \(\rho\) such that and for any blockchain participant, the time error (also called “clock drift”; the difference between the real time and the participant’s local clock) is bounded by \(\rho\).
The values of \(\delta\) and \(\rho\) in the two paragraphs above are a priori unknown.2
During an asynchronous period, Tenderbake does not assume any bound on clock drift. This is reasonable: if network delays are unbounded then it is reasonable to suppose that timing messages might be arbitrarily delayed. Conversely, during a synchronous period when there is a global bound on message delays, we may assume that timing messages arrive after a bounded delay, so that clock drift is also bounded.
Technical note: There is a stability assumption in the implemented system, that the duration of the first round must be greater than the clock drift \(\rho\). This guarantees that a validator can ignore all messages that do not match its current round, which ensures that message buffers remain bounded, and the validator cannot be spammed with irrelevant messages. ↩
‘Unknown’ here means that \(\delta\) and \(\rho\) are parameters: an external observer of the system might be able to observe the system and calculate or deduce a value of \(\delta\) or \(\rho\), but no algorithm within the system itself is allowed to depend them having any particular value. See also a similar discussion for partial synchrony. ↩
The block header contains other useful metadata, including: a hash of its corresponding block contents as mentioned above, a hash of its predecessor block, the level, the round, and a timestamp. This absolutely doesn’t matter for this blog post, but if you’re reading this footnote then you’re probably the kind of person who would want to know. You asked your parents to read the encyclopaedia to you at bedtime instead of “made-up stories”, too, didn’t you? ↩
A Tezos “operation” is just the basic unit of information in Tezos. An operation is labelled by a kind (called a “pass” in the source code) which is an integer such that \(0\) indicates “this operation is a consensus message” and any other number indicates “this may be something else”. It is up to the protocol to decide how to interpret operations — an operation is first created, then injected into some node’s mempool, transmitted on the network, and finally possibly (but not necessarily) inserted in a block. So there is nothing particularly fancy about encoding prevotes and precommits: they are just data which can be encoded as operations6 and — maybe, maybe not — baked into a block on the blockchain. ↩
We could map Tendermint prevotes to blocks (instead of operations), but it would be inefficient. The algorithm requires a quorum of prevotes, so we’d need \(2f+1\) blocks to reach a quorum, where \(n\) is the number of validators. With \(150\) validators, prevotes would generate \(100\) blocks just to generate consensus on the next ‘real’ block to be added.
It might be possible to map block proposals to operations. This was discussed, but it would require changing the shell: a proposal would have to be a special bundled operation that behaves almost like a block; when a node sees a proposal operation it would need to ask its neighbors to provide the operations contained in the bundle (the bundle operation would not actually contain operations but just pointers to, i.e. hashes of, operations). Changing the shell is anyway a big deal, and there’s clearly an easier way of just using the native Tezos block proposal mechanism, so proposals are blocks and not operations. ↩
If blocks were years then a block of level 1984 would be the year 1985 AD. Fun fact: our calendar goes from 1 BC to 1 AD. There is no year zero. ↩
The running economic protocol makes a random choice from the available delegates (weighted by their stake) to be validators for the current block level. To be quite precise, at each level, \(n\) rolls are selected at random and their owners are the validators for that level. ↩
Non-consensus operations means operations other than preendorsements and endorsements. Notably, non-consensus operations include transactions, which for a consensus algorithm are just data, but which for the end user are the whole point of the exercise. By block contents we intend all operations other than preendorsements and endorsements. (Pre)endorsement operations contain the hash of the putative “block contents”, not the operations themselves. ↩
An aside on quorums and the quorum intersection property.
Suppose for simplicity that everybody has the same stake, so we can ignore stake-based weightings. Suppose there are \(n=3f+1\) validators, where \(f\) is the number of incorrect, Byzantine validators — the standard assumption is that at most one third of validators are Byzantines, so taking \(n=3f+1\) is the worst case. A quorum is a set of \(2f+1\) signatures. Then by the pigeonhole principle, quorums intersect at \(f+1\) signatures, and given the bound \(f\) on Byzantine actors, it follows that there is at least one signature of a correct validator. This is called the quorum intersection property. ↩
By the quorum intersection property, there cannot be two endorsement quorums on two different block contents at the same round, because then a correct validator would have endorsed two different block contents at the same round, which is forbidden by the algorithm. This observation is at the core of the proof that there cannot be two different blocks agreed upon at the same level. ↩
Anyone can check that the endorsements are produced by the right delegates, that is, by the validators for the corresponding level. This is because validators are known well in advance and their public keys are available on-chain. ↩
For each validator, there is a final round, but final rounds may differ for different validators, and different validators may execute the loop a different number of times. So: rounds are a per validator quantity. ↩
The property that agreement is eventually reached is expressed precisely by a termination property — which in the context of a blockchain becomes a progress property (each level terminates, so we make progress). An attacker could try to delay the network at a rate calculated to be just a little more than the increase in round durations, so that for each round \(r\) the proposer at round \(r\) times out, though our network model assumes this cannot go on forever: any asynchronous period must be bounded. Following our discussion above of how classical BFT-style consensus algorithms (like Tendermint and Tenderbake) favour safety over liveness, we can draw the following chain of informal entailments: “asynchronous periods assumed bounded” \(\Rightarrow\) “we can safely err on the side of safety” \(\Rightarrow\) “deterministic finality”. ↩
Tendermint has immediate finality, meaning deterministic finality after just one block: once a block that was proposed in a proposal message is agreed upon, it is final. Tenderbake requires the marginally slower deterministic finality after two blocks because of its stronger timing assumptions: in order for validators to synchronize in their current round, proposal messages contain a round identifier, and in Tenderbake, proposal messages are encoded as blocks, thus blocks contain round identifiers — however, validators may take their decision at different rounds, so their decision rounds may differ! Thus, Tenderbake has more to agree on than Tendermint: its committee of validators must first agree on the “true” decision round to be able to agree on which is the “true” block. ↩