¹¹institutetext: Univ Rennes, Inria, CNRS, IRISA ²²institutetext: The University of Sydney ³³institutetext: Fantom Research

Reusable Formal Verification
of DAG-based Consensus Protocols

Nathalie Bertrand 11 0000-0002-9957-5394 Pranav Ghorpade 22 0009-0001-0421-4490 Sasha Rubin 22 0000-0002-3948-129X Bernhard Scholz 33 0000-0002-7672-7359 Pavle Subotić 33 0000-0002-6536-3932

Abstract

DAG-based consensus protocols are being adoption by blockchain companies to decrease energy footprints and improve security. A DAG-based consensus protocol collaboratively constructs a partial order of blocks of transactions and produces linearly ordered blocks. The ubiquity and strategic importance of blockchains call for formal proof of the correctness of key components, namely, consensus protocols.

This paper presents a safety-proven formal specification of two DAG-based protocols. Our specification highlights several dissemination, DAG construction, and ordering variations that can be combined to express the two protocols. The formalization requires a refinement approach for modeling the consensus. In an abstract model, we first show the safety of DAG-based consensus on leader blocks and then further refine the specification to encompass all blocks for all processes. The TLA+ specification for a given protocol consists of 492-732 lines, and the proof system TLAPS verifies 2025-2294 obligations in 6-8 minutes.

Keywords:

Formal verification Theorem Proving TLA+ Consensus Blockchain

1 Introduction

Cryptocurrencies, and by extension decentralized finance (DeFi), provide a decentralized alternative to conventional banking and financial institutions, changing how financial services are accessed, managed, and implemented. Blockchains are the core technology behind cryptocurrency, and a core component of modern blockchains is the consensus protocol. These protocols coordinate processes so that they agree on the state of the blockchain. Early blockchain consensus protocols assume degrees of synchrony in the environment to ensure safety and liveness [34, 8, 28, 21]. Recently, a family of asynchronous probabilistic consensus protocols have been introduced that are based on Directed Acyclic Graphs (DAGs) [24, 25, 4, 18, 16, 22]. These protocols report high performance while guaranteeing Byzantine Fault Tolerance (BFT), utilize processes fairly, and exhibit low communication complexity. Several leading blockchains have adopted DAG-based protocols as their main consensus mechanism [3, 16, 23]. Given the trillions of dollars locked in various blockchains [19], manipulating the blockchain’s consensus protocol is a natural attack vector. Notably, double-spending attacks [27] attempt to exploit unsafe protocols by discovering inputs and environment settings that cause the blockchain to have inconsistent states, resulting in a unit of currency being used twice. While consensus protocols are almost exclusively designed with the intent to be safe and thus mitigate such attacks, ad-hoc software design makes it difficult to ensure that a consensus protocol is safe for all inputs. For instance, well-established, state-of-the-art blockchains are unsafe [35].

A rigorous method to ensure safety is to employ formal verification, and thus construct mathematical objects describing the system behavior and mathematical proof that the consensus protocol model is correct for all possible inputs and configurations. While such proof efforts exist for a small number of consensus protocols [37, 9, 38, 7, 5, 2], the vast majority of consensus protocols operate with an informal assumption of safety. This is because formal models of protocols are seen as tedious, time-consuming, and complex to prove since there is a lack of compositional building blocks that can be used for proving the correctness of a new protocol. While reuse is often supported at the specification level, proofs for the given properties are not compositional and require a rewrite of the proof from scratch. On the other hand, forgoing modeling and safety proofs is not a safe option because the number of possible interleavings in an asynchronous environment is prohibitive. A testing regime for a software implementation might not reveal deficiencies in the protocol design itself, and malicious behaviors of Byzantine processes may not be fully understood and covered by test cases. Generally, testing and traditional model checking do not uphold correctness for participating processes and their configurations and behavior.

In this paper, we present an industrial case study where proven models of consensus protocols are used to drive development. In our case study, we formally verify the safety of two DAG-based protocols: DAG-Rider [24] and Cordial Miners [25]. Our experience shows that while these (and many other) DAG-based protocols collectively operate on the same basic principles, the nuances introduced to improve performance can impact the protocol’s correctness. Thus, the challenge is to produce specifications and proofs that become compositional and extensible for existing and new DAG-based protocols. We create compositional specifications that encapsulate commonalities between protocols and allow for the reuse of proofs. This relies on equality by abstraction [39]: Choosing the right abstractions for DAG-based protocols allows us to decompose protocol properties and proofs into components. Thus, our work is an important step towards creating specifications and a collection of proofs that can be largely reused and require only small changes to handle DAG-based protocols.

We have implemented an open-source formal specification and proof of DAG-rider and Cordial Miners, publicly available at [20]. Our specification is in the Temporal Logic of Actions (TLA+) [29]. We perform proofs using the TLA+ proof system (TLAPS) [12], a proof system for mechanically checking proofs written in TLA+. Our approach applies to other popular DAG-based protocols, including BullShark [22], Aleph [18], Hashgraph [4], Lachesis [16] with limited effort. Additionally, apart from formally proving the safety of DAG-Rider and Cordial Miners, our proof effort has led us to discover many similarities between DAG-based consensus protocols, that are not immediately obvious.

The rest of the paper is structured as follows. In Section 2, we give an introduction to DAG-based consensus protocols. In Section 3, we describe the abstractions that render verification modular and reusable. In Section 4, we present the TLA+ specifications of our framework and explain how we instantiate TLA+ specifications for two popular DAG-based protocols in our framework. Section 5 discusses how we prove safety compositionally. In Section 6, we evaluate our specification and proof. In Section 7 we discuss how other DAG protocols can be specified and verified in our framework. We present related work in Section 8 and conclude in Section 9 with our experience and future work.

2 DAG-based Consensus Protocols in Blockchains

DAG-based consensus protocols [24, 25, 4, 18, 16, 22] solve the Byzantine Atomic Broadcast (BAB) problem [33, 10, 15], which is to totally order blocks (a block is a sequence of messages representing transactions, e.g., exchange of assets) in the presence of a certain proportion of Byzantine processes. They assume a network of $n$ processes where up to $f$ are byzantine [30] with $n\geq 3f+1$ . Processes can create, sign, and send messages to each other. It is assumed that every message sent from one correct process to another is eventually received. In addition, each process can sequentially output (aka “deliver”) messages. Thus, each process outputs a sequence of messages. In particular, a DAG-based consensus protocol solves the Ordering Consensus Problem [25] by ensuring the outputs satisfy two properties: safety and liveness. Informally, safety states that the output sequence of any two correct processes is consistent, i.e., one is the prefix of the other; liveness states that every message sent by a correct process is eventually output by all the correct processes with probability one. Note that safety and liveness essentially imply BAB.¹¹1BAB also requires message uniqueness (called non-equivocation in Blockchain). DAG-based protocols are constructed in two phases: (1) a communication phase also referred to as the DAG construction phase, and (2) an ordering phase. The rest of our description focuses on two specific protocols called DAG-Rider [24] and Cordial Miners [25]. However, our description largely extends to other DAG-based protocols as well; we elaborate on this in Section 7.

Communication phase.

In the communication phase, processes create and broadcast vertices in rounds. Effectively, a vertex consists of the following: the creator’s ID, a round number, a set of references to vertices from previous rounds (empty for round 0 vertices), and the block that the creator is proposing. Each process builds a Directed Acyclic Graph (DAG), called its local DAG. Vertices in process $p_{j}$ ’s local DAG are the vertices that process $p_{j}$ created, as well as those received from other processes. Outgoing edges in the local DAG from vertex $v$ go to vertices $v^{\prime}$ referenced by $v$ . These are necessary to ensure that processes understand what other processes see in their local DAG. A vertex is only added to $p_{j}$ ’s local DAG once its references have already been added; in the meantime, the vertex is stored in a buffer. A process $p_{j}$ is said to have completed round $r$ once its local DAG contains at least $n-f$ vertices with round number $r$ . Process $p_{j}$ can only create and broadcast a vertex in round $r$ once it has completed round $r-1$ . Each local DAG has a natural layered structure: the vertices in round zero form the bottom layer, and those in round $r$ have edges to (some of) those in lower layers — this is illustrated in Fig. 1(a), representing the local DAG of each of the $N=4$ processes $P1$ to $P4$ . Thus, each local DAG imposes a partial order on its vertices. Notice that the local DAGs of different processes do not have to be equal or even subgraphs of each other. However, by the above construction, if a vertex $v$ appears in two local DAGs, then the sub-DAG rooted at $v$ (aka “causal history of $v$ ”) is the same in both local DAGs.

Ordering phase.

From their local DAG, which represents a partial order of vertices, the task of each process is to, independently and with no additional communication, totally order the vertices, and output a sequence of blocks. Totally ordering the partially-ordered DAG can be achieved by some topological sort of the DAG. The challenge is to ensure that all correct processes order the vertices identically. For this if processes could agree on some “anchor” vertices, then they can linearize the so-far unsorted vertices in the causal history of anchor vertices by a deterministic procedure like topological sort — agreed upon prior — for producing the linear order. In Fig. 1(b), the first anchor is labeled $1$ , and the second is $2$ . We observe that the difference in causal history between anchors, indicated by the vertices in the shaded regions, can now be totally ordered independently by a pre-defined topological sort for each process. We note that, as a consequence of the FLP impossibility result [17] and the fact that the rest of the protocol is completely asynchronous and deterministic, the anchors must be chosen at random. To do this, each process divides its local DAG into a fixed number of consecutive rounds called waves ( $4$ in DAG-Rider, $5$ in Cordial Miners). The processes use a Global Perfect Coin (GPC) to agree on a random leader vertex in each wave. The waves are used to allow the processes to commit certain leader vertices as anchors. Thus reaching a consensus on the sequence of committed wave leaders (aka anchors) is sufficient to obtain a consistent ordering of vertices in the DAGs. Blocks in these ordered vertices can then be sequentially output, ensuring consistent output sequences (safety).

Refer to caption — (a) Communication phase: building local DAGs for each processes and our grid abstraction depicted for P1.

At a high level, various DAG-based protocols share the same core aspects. However, at the component level, they exhibit subtle yet important differences. For instance, DAG-Rider [24], Aleph [18], and BullShark [22] rely on a reliable broadcast to disseminate blocks in the communication phase. As a consequence, from a given round one may receive at most one vertex from any other process so that non-equivocation — also known as forking — is immediate. However, use of reliable broadcast can be costly in terms of message complexity and latency, so for efficiency reasons, protocols such as Cordial Miners [25], Hashgraph [4], and Lachesis [16] use unreliable communication to disseminate blocks. As a consequence, a process may receive several blocks created in the same round by a faulty process, leading to equivocations. This also complicates the commit wave leader procedure as a wave can have several leader vertices, and committing a unique one adds extra difficulty.

3 Abstractions towards Generic and Simple Specifications

In this section, we describe the overall architecture of our specifications and proofs. The challenge in creating a reusable formalization is to devise the appropriate abstractions to make the specifications and the proofs compositional. The key to the abstraction lies in the representation of the network state itself. The processes in the network may have received different sets of messages, and hence might have different local DAGs (as described in Section 2). Our network state abstraction stores a local DAG for each process. Each local DAG is stored as a two-dimensional array: an index $(q,r)$ in the array stores vertices $v$ created by process $q$ at round $r$ , and we call $(q,r)$ the position of the vertex $v$ in the local DAG. Some positions can be empty; for instance, in the local DAG of $p_{2}$ in in Figure 1(a), the position $(p_{1},4)$ is empty, representing that $p_{2}$ has not heard (yet) from process $p_{1}$ at round $4$ .

Note that the use of reliable broadcast ensures the presence of at most one vertex at each position of any local DAG, whereas, with unreliable broadcast communication, local DAGs may have more than one vertex in each position. To handle both these cases in a uniform way, and to exploit compositionality, we break the problem of agreeing on committed wave leader vertices into two subproblems: (1) agree on a vertex for every position (since a position may contain multiple vertices), (2) agree on committed wave leader positions. With this decomposition, we can separate the specification of the consensus protocols into two specifications. The communication specification encompasses the communication phase in which the DAGs are constructed, as well as the protocol that agrees on a vertex for every position.²²2Thus, we make the subproblem “agree on a vertex for every position” part of the communication phase specification (even if it is described in the ordering phases in certain protocols, such as Cordial Miners). The ordering specification encompasses the protocol that agrees on committed wave leader positions.

Toward a uniform presentation of the ordering specification, we perform an abstraction of the local DAGs. Instead of recording all the positions in a local DAG, we can focus on wave leader positions only. Since each wave has a unique leader position, we construct and store a wave-DAG. Vertices of the wave-DAG are only the wave leader vertices constructed in the communication phase. Edges of the wave-DAG represent: (1) the transitive closure of strong edges for DAG-Rider, and (2) the ratification relation for Cordial Miners between the representative vertex of wave leader positions. On the other hand, we provide separate communication specifications for reliable broadcast and unreliable broadcast. Unreliable broadcast is a basic form of communication and can be implemented in various ways. Thus, towards a uniform presentation of unreliable broadcast communication, we abstract out actual communication. We represent the reception of a vertex by process $p$ from process $q$ as process $p$ non-deterministically copying a vertex from the local DAG of process $q$ . Non-determinism here allows us to represent all possible interleavings that a specific communication implementation would yield. For reliable broadcast, we assume that the network will ignore all but the first message from a process with a given round. Finally, in the refinement step, for each communication specification, we instantiate the ordering specification from it and perform a synchronous product of the two specifications. Although the specifications do not have the same set of actions, we use a function that maps actions in the communication specification to the actions in the ordering specification. This gives us complete specifications and proof of safety for protocols; DAG-Rider and Cordial Miners as shown in Fig. 2.

Miscellaneous Simplifying Abstractions.

In our specification, a vertex in a position not only represents a block but the entire causal history of this vertex in the DAG (inspired by [24]). This abstraction significantly helps us to provide formal proofs of safety, as it implies that agreeing on anchor vertex in our abstraction equates to agreeing on the causal history of the anchor vertex. Since we focus on safety properties in this paper, we simplify the protocol specification by removing features that can be ignored for safety and are only required for liveness. In particular, random coin tosses (used by the GPC) can be replaced with deterministic ones. The choice of leader position for each wave is thus represented by a deterministic function (a parameter to the specification, not a fixed function) that maps wave IDs to positions. Not only does this simplify the specification by overcoming the impossibility of encoding random choices in TLA+, but this also allows us to obtain a unique specification for DAG protocols, even if they differ in the way they propose random wave leaders, as is the case for instance for BullShark [22], Aleph [18], and Lachesis [16].

We employ a procedural abstraction that ignores all states that are internal to a procedure of the pseudo-code of the protocol and only represents the input/output behaviour of each procedure. For instance, in the $\mathtt{wave\_ready}$ procedure described in algorithm 3 in DAG-Rider [24], the relevant variables are $\mathtt{decidedWave}$ , $\mathtt{deliveredVertices}$ , and $\mathtt{leadersStack}$ , but neither the loop variable $\mathtt{w^{\prime}}$ nor the auxiliary variable $\mathtt{v^{\prime}}$ are important in our specification. We also use memoization to efficiently compute the values taken by recursive functions, by introducing a fresh state variable that stores the needed information to evaluate recursive functions in a single step.

4 Specifications of DAG-based Consensus Protocols

In this section, we discuss in detail our TLA+ specifications and how to compose them to get specifications of DAG-Rider and Cordial Miners [20]:

•

ReliableBcastCommunicationSpecification.tla
•

UnreliableBcastCommunicationSpecification.tla
•

LeaderConsensusSpecification.tla

4.1 Specifying Communication in TLA+

4.1.1 Specifying Reliable Broadcast Communication

Reliable broadcast ensures three crucial properties, namely validity, agreement and integrity [24]. Validity and agreement are only necessary to prove the liveness of the consensus. Since we focus on the safety properties of consensus, we simplify the specification of reliable broadcast into a weaker version that does not guarantee validity nor agreement, yet ensures integrity. Integrity in our setting states that every correct process is delivered at most one vertex with a round, from a process. In our weakened specification, we assume that only the first broadcast vertex of a process with a round $r$ can be delivered, thus ensuring integrity.

Our state variables are shown in Listing 1. The variable broadcastNetwork stores the sets of vertices to be delivered for each process, whereas broadcastRecord tracks processes that have broadcasted a vertex in each round. Whenever any process broadcasts vertex $v$ with round $r$ , then $v$ is only added to the broadcastNetwork provided the broadcastRecord indicates no previous broadcast by that process with round $r$ . For each process $p$ , blocksToPropose stores the sequence of blocks proposed by $p$ but whose vertex is not yet created and broadcasted by $p$ ; dag and round, respectively store the local DAG and recently completed round for $p$ ; finally buffer stores vertices received via reliable broadcast that are not yet added to the DAG of $p$ . Variable faulty stores a set of faulty processes.

Listing 1: State variables of Reliable broadcast communication.

⬇

broadcastNetwork \in [ProcessSet \cup {"History"}->SUBSET(TaggedVertexSet)]

broadcastRecord \in [ProcessSet->[RoundSet -> BOOLEAN]]

blocksToPropose \in [ProcessSet->Seq(BlockSet)]

dag \in [ProcessSet->[RoundSet->[ProcessSet->VertexSet \cup NilVertexSet]]]

round \in [ProcessSet->RoundSet]

buffer \in [ProcessSet->SUBSET(VertexSet)]

faulty \in SUBSET(ProcessSet)

The DAG construction is encoded by a state machine with four types of transitions. Propose adds a new block to blocksToPropose; ReceiveVertex picks a vertex to be delivered, verifies its creator’s identity and checks if it has enough parent vertices ( $2f+1$ ) to add it to the buffer. Under specific conditions AddVertex transfers a vertex from the buffer to the dag; NextRound increments round whenever a new round is completed. Additionally, it creates a new vertex for the new next round in the dag and broadcasts it to other processes (i.e. tries to add to the broadcastNetwork).

The safety of the reliable broadcast communication phase is formalized as a state invariant DagConsistency as shown in Listing 2. This invariant states that, at every position, the added vertex is the same for all correct processes.

Listing 2: DagConsistency Invariant. Here VertexSet is set of all possible vertices in the local DAG.

⬇

DagConsistency ==

\A p, q \in ProcessSet, r \in RoundSet, o \in ProcessSet :

(r /= 0 /\ dag[p][r][o] \in VertexSet /\ dag[q][r][o] \in VertexSet ) => dag[p][r][o] = dag[q][r][o]

4.1.2 Specifying Unreliable Broadcast Communication

Recall from section 3, with an unreliable broadcast, processes can have more than one vertex in the same position in their local DAG. Thus specifying unreliable broadcast communication has the additional challenge of specifying a protocol for picking a unique representative vertex for every position. This is done using the concept of Ratification [25].

In unreliable broadcast communication as mentioned in Section 3, we abstract out the actual communication. This abstraction eliminates the need for variables broadcastNetwork and buffer. Similarly, we also abstract away block proposals by non-deterministically choosing to add a vertex, containing a block in its DAG. This eliminates variable blocksToPropose. Finally, we also remove variable round that stores the recently completed round as this can be computed by simply looking at the local DAG. As shown in Listing 3, our unreliable broadcast communication specification only has two state variables, namely, dag and faulty. Variable dag in this case is defined slightly differently. First, at each position, the sets of vertices along with a ratified vertex are stored.

Listing 3: State variables of unreliable broadcast communication.

⬇

dag \in [ProcessSet->[RoundSet->[ProcessSet->[ vertices: SUBSET(VertexSet), ratifiedVertex: VertexSet \cup {"Nil"} ]]]]

faulty \in SUBSET(ProcessSet)

The communication phase here performs two tasks: constructing local DAGs and computing a ratified vertex for each position. These are encoded in the TLA+ state machine using AddVertex and RatifyVertex. For every correct process $p$ ; AddVertex adds a vertex $v$ to the DAG of $p$ If; 1) source of $v$ is $p$ and a round of $v$ is a new round of the DAG construction. Vertex $v$ has $2f+1$ parents in the previous round of the DAG of $p$ ; or 2) source of $v$ is some other process $q$ such that $v$ is in DAG of $q$ . Additionally, all the parents of $v$ are in the DAG of $p$ with at least $2f+1$ of them in the round previous to that of $v$ . Lastly, RatifyVertex ratifies a vertex for a position and stores it in the dag, as prescribed in the Cordial Miners [25].

The safety of the unreliable broadcast communication phase is formalized as a state invariant RatificationConsistency as shown in Listing 4. This invariant states that for every correct process, at every position the ratified vertex is the same as for all other correct processes.

Listing 4: RatificationConsistency invariant.

⬇

RatificationConsistency ==

\A p \in ProcessorSet, q \in ProcessorSet, r \in RoundSet, o \in ProcessorSet: ( p \notin faulty /\ dag[p][r][o].ratifiedVertex =/ "Nil" /\ q \notin faulty /\ dag[q][r][o].ratifiedVertex =/ "Nil" ) =>

dag[p][r][o].ratifiedVertex = dag[q][r][o].ratifiedVertex

4.2 Specifying the Ordering in TLA+

Towards a uniform presentation of the ordering phase in which protocols use a global perfect coin to obtain an order on transactions from local DAG structures, we consider an abstraction of the DAG structures local to each process. As explained in Section 3, reaching a consensus on the sequence of committed wave-leaders is sufficient to obtain an ordering of vertices in the DAGs. Thus each process has a wave-DAG. Vertices of the wave-DAG are only the wave leader vertices in the DAG constructed in the communication phase. Edges of the wave-DAG represent: (1) the transitive closure of edges for DAG-Rider, and (2) the ratification relation for Cordial Miners.

As shown in Listing 5, the state variables include the LeaderRelation to represent a snapshot of the wave-DAG: For every process $p$ and wave ID $w$ LeaderRelation stores information about the presence of the leader vertex of wave $w$ in the wave-DAG of $p$ , along with the set of all other waves, wave $w$ has an edge pointing to. Four extra variables are added: decidedWave stores the last decided wave by $p$ , leaderSeq stores the Sequence of waves committed in increasing order by the current and the last decided wave and for every wave $w$ , commitWithRef stores the sequence of waves that $w$ will commit if decided by process $p$ , and faulty stores IDs of faulty processes.

Listing 5: State variables of ordering phase.

⬇

leaderRelation \in [ ProcessSet->[WaveSet->[exists: BOOLEAN, edges: SUBSET(WaveSet)]]]

decidedWave \in [ProcessSet->WaveSet \cup {0}]

leaderSeq \in [ProcessSet->[current: Seq(WaveSet),last: Seq(WaveSet)]]

commitWithRef \in [ProcessSet->[WaveSet->Seq(WaveSet)]]

faulty \in SUBSET(ProcessSet)

The state machine performs two tasks, each represented by a transition. First, CreateNewWave constructs the wave-DAG. It adds a new wave with its edges to LeaderRelation, and it computes and stores in commitWithRef the sequence of waves that this wave would commit if decided. Second, DecideWave decides a wave and further commits waves. It updates decidedWave with a new wave ID, additionally, it updates leaderSeq for the newly decided wave (by querying to commitWithRef). The two transitions are only enabled from states that satisfy some specific conditions, illustrated in the code snippet in Listings 6 and 7.

Listing 6: CreateNewWave transition.

⬇

CreateNewWaveTn(p, w, E) ==

/\ p \notin faulty =>

(/\ leaderRelation[p][w].exists = FALSE \* 1

/\ \A x \in E : leaderRelation[p][x].exists = TRUE \* 2

/\ \A x \in E : x < w \* 3

/\ \A q \in ProcessSet: \* 4

q \notin faulty /\ leaderRelation[q][w].exists = TRUE

=> leaderRelation[q][w].edges = E /\ \A q \in ProcessSet: \* 5

q \notin faulty /\ decidedWave[q] # 0

/\ decidedWave[q] < w => decidedWave[q] \in E)

/\ commitWithRef’=[commitWithRef EXCEPT ![p][w] =

IF E = {} THEN <<w>> ELSE Append(commitWithRef[p][max(E)], w)]

/\ leaderRelation’ = [leaderRelation

EXCEPT ![p][w] = [exists |-> TRUE, edges |-> E]]

/\ UNCHANGED <<decidedWave, leaderSeq, faulty>>

Listing 7: DecideWave transition.

⬇

DecideWaveTn(p, w) ==

/\ p \notin faulty =>

(/\ leaderRelation[p][w].exists=TRUE \* 1

/\ w >= decidedWave[p] \* 2

/\ \A x \in WaveSet: \* 3

x > w => leaderRelation[p][x].exists=FALSE

/\ \A q \in ProcessSet, x \in WaveSet: \* 4

q \notin faulty /\ x > w /\ leaderRelation[q][x].exists = TRUE

=> w \in leaderRelation[q][x].edges)

/\ decidedWave’ = [decidedWave EXCEPT ![p] = w]

/\ leaderSeq’ = [leaderSeq EXCEPT ![p]= [current |-> commitWithRef[p][w],

last |-> leaderSeq[p].current]]

/\ UNCHANGED <<commitWithRef, leaderRelation, faulty>>

The safety properties the ordering specification should meet are formalized as two-state invariants LeaderConsistency and LeaderMonotonicity as shown in Listing 8. On the one hand, LeaderConsistency informally states that the sequences of committed leaders of any two correct processes are a prefix one of the other. On the other hand, LeaderMonotonicity intuitively states that wave commits happen monotonically concerning the order in which waves are decided. This implies that the order in which waves were committed by the last decided wave will not be altered by the new decided wave.

Listing 8: LeaderConsistency and LeaderMonotonicity invariants.

⬇

LeaderConsistency ==

\A p, q \in ProcessorSet:

p \notin faulty /\ q \notin faulty /\ decidedWave[p] <= decidedWave[q] =>

IsPrefix(leaderSeq[p].current, leaderSeq[q].current)

LeaderMonotonicity ==

\A p \in ProcessorSet: p \notin faulty =>

IsPrefix(leaderSeq[p].last, leaderSeq[p].current)

4.3 Complete Protocol Specification

Our complete specification is a refinement of the ordering phase, including the communication phase. More precisely, we use the interface refinement construct of TLA+ that allows one to obtain a lower-level specification by instantiating the variables of a higher-level specification. The use of this construct is shown in Listing 9. The refinement and the interactions between the two phases are instantiated with GPC, that gives a candidate leader position for each wave. Whenever a vertex is added (ratified in the case of Cordial Miners) at the candidate leader position for a wave in the communication phase, CreateNewWave is instantiated in the ordering phase. Similarly, whenever a wave is ready and can be decided in the local DAG constructed by the communication phase (DAG-Rider and Cordial Miners use different criteria for this), DecideWave is instantiated on the ordering specification. For the rest of the actions in the communication phase, the state of the ordering phase remains unchanged. Thus, the complete specification additionally requires specifying GPC and modifying actions in communication specification to instantiate ordering specification.

Listing 9: Instantiation of ordering phase in concrete protocol specification.

⬇

LeaderConsensus ==

INSTANCE LeaderConsensusVerification

WITH NumWaves <- w,

NumProcesses <- n,

f <- f,

commitWithRef <- commitWithRef,

decidedWave <- decidedWave,

leaderRelation <- leaderRelation,

leaderSeq <- leaderSeq,

faulty <- faulty

Details of specifying GPC.

GPC ensures unpredictability, agreement, termination, and fairness when choosing candidate wave-leaders. Intuitively GPC samples uniformly at random a process ID for each wave, together with a holding mechanism to force unpredictability if only Byzantine processes query the global coin. Similarly to how we specified reliable broadcast, we relax the guarantees of the global perfect coin to obtain a compact specification. Our specification does not guarantee unpredictability and need not imply fairness, however it ensures agreement and termination. Dropping unpredictability and fairness enables us to model the global perfect coin as an arbitrary function, common to all correct processes, from the set of waves WaveSet to the set of processes ProcessSet. In a sense, this relaxation of the global perfect coin simulates an increased power for Byzantine processes. However, we observe it is sufficient to prove safety, thus establishing that the consensus is safe, independent of the function ChooseLeader.

5 Safety Proofs in TLAPS

In this section, we discuss the invariants, theorems, and proof technique for formally proving the safety of the specifications described in section 4. The invariants and proofs for each specification can be found in their corresponding Verification.tla files [20].

5.0.1 Theorems and Invariants

To prove the DAGConsistency invariant (Listing 2) we identify and formalize 6 new invariants (Inv1, …, Inv6). Intuitively, Inv2 and Inv3 formalize the correctness of our weaker Reliable Broadcast abstraction. Inv1 states that every vertex in the dag of a correct process is in its buffer. Inv4 states that every message in the buffer of a correct process is in the history of broadcastNetwork. Inv5 states that the history of broadcastNetwork has at most one message for each round and source process. Finally, Inv6 states that vertices in the dag of a correct process are placed at the right positions, i.e., they have consistent round and source with respect to the position. Fig. 3 illustrates the dependencies between these six invariants. Inv1, Inv2, Inv3, and Inv6 are proven independently whereas Inv4 and Inv5 are proven assuming invariants Inv2 and Inv3 hold, respectively. Finally, DagConsistency is proved assuming Inv1, Inv4, Inv5 and Inv6.

To prove the RatificationConsistency invariant (Listing 4) we identify, formalize and prove 6 more invariants. Similarly, to prove the safety of the ordering specification, namely LeaderConsistency and LeaderMonotonicity (Listing 8) invariants, we identify, formalize and prove ten further invariants. Dependencies of all these invariants are depicted in Figure 3.Finally, to prove the complete safety of DAG-Rider and Cordial Miners we prove the Safety Theorem described in Listing 10.

Listing 10: Complete safety theorem for DAG-Rider and Cordial Miners specification.

⬇

THEOREM DAGRiderSpec => [](DagConsistency /\ LeaderConsistency /\ LeaderMonotonicity)

THEOREM CordialMinersSpec => [](RatificationConsistency /\ LeaderConsistency /\ LeaderMonotonicity)

5.0.2 Proof Strategy

The proofs in TLAPS follows a standard hierarchical structure and use proofs by induction and proofs by contradiction. Here, we simply give a high-level outline of the proof strategy for proving invariants. We outline the proof for an inductive invariant (Inv1) of the ordering specification in Fig. 11. To prove that this invariant holds on all reachable states of the ordering phase specification, we state the following: $\texttt{LEMMA Inv1correctness}=\texttt{Spec}\rightarrow\Box\texttt{Inv1}.$ Here Spec is a TLA+ macro name for the ordering specification, it has the form $\texttt{Spec}=\texttt{Init}\land\Box\texttt{[Next]}_{\texttt{vars}}$ , where vars, Init, Next are TLA+ macro names for set of variables, the initial state, and the possible actions leading to the next state. The lemma is proved by induction. The induction basis $\texttt{Init}\rightarrow\texttt{Inv1}$ is trivial and TLAPS proves it automatically. The induction step is captured by $\texttt{Inv1}\land\Box\texttt{[Next]}_{\texttt{vars}}\rightarrow\texttt{Inv1'}\enspace,$ where $\texttt{[Next]}_{\texttt{vars}}$ represents that either the state machine performs a Next step, or remains idle, and Inv1’ states that the invariant holds on the state after performing $\texttt{[Next]}_{\texttt{vars}}$ . These two cases are reflected in the code snippet as $\mathtt{{<}2{>}1}$ and $\mathtt{{<}2{>}2}$ . Then, Next is a disjunction of three cases, each of which is proved separately in $\mathtt{{<}3{>}1}$ , $\mathtt{{<}3{>}2}$ , and $\mathtt{{<}3{>}3}$ , and these lower-level goals are proven using the same strategy of identifying even lower-level goals. This process is somewhat mechanical, and TLAPS allows one to write proofs in such a fashion, i.e. breaking it into lower-level proof obligations. Ultimately, the lowest-level proof obligations are proved by breaking down the definition of the terms in the statements.

Listing 11: Proof structure of a specific invariant for the ordering specification.

⬇

LEMMA Inv1Lem == Spec => []Inv1

<1>1 Init => Inv1

...

<1>2 ASSUME StateType, StateType’, Inv1, [Next]_vars

PROVE Inv1’

<2>1 ASSUME StateType, StateType’, Next, Inv1

PROVE Inv1’

<3>1 ASSUME NEW p \in ProcessSet, NEW w \in WaveSet, NEW E \in SUBSET(WaveSet),

UpdateWaveTn(p, w, E)

PROVE Inv1’

...

<3>2 ASSUME NEW p \in ProcessSet, NEW w \in WaveSet, UpdateDecidedWaveTn(p, w)

PROVE Inv1’

...

<3>3 ASSUME NEW p \in ProcessSet, ProcessFailureTn(p)

PROVE Inv1’

...

<3> QED BY <3>1, <3>2, <3>3, <2>1 DEF Next

<2>2 ASSUME UNCHANGED vars, Inv1

PROVE Inv1’

...

<2> QED BY <2>1, <2>2

<1> QED BY <1>1, <1>2, TypeLem, PTL DEF Spec

6 Evaluation

Overall, the effort to understand, specify, and prove DAG-Rider and Cordial Miners required approximately 9 and 3 person-months respectively, distributed across 5 people. We evaluate the performance of our formal proof using TLA+ and TLAPS. Table 1 lists metrics including the sizes of the specifications for the respective communication phases (reliable communication for DAG-Rider and unreliable communication for Cordial Miners) and the common ordering phase. It also reports on metrics related to proof complexity, such as the size of proof files, the maximum depth of the proof, the maximum branching in the tree-like proof structure and the number of base proof obligations. Finally, the table displays the verification time for proving the safety of each phase.

Table 1: Specification and proof metrics. Performed on a 2.10 GHz CPU with 8 GB of memory, running Windows 11 and TLAPS v1.4.5. In the table DR is DAG-Rider and CM is Cordial Miners.

Metric $\backslash$ Phase	Ordering	RB Comm.	unRB Comm.	DR	CM
Size of spec. (# loc)	272	403	160	272+460	432+60
Size of proof (# loc)	822	460	594	521	657
Max level of proof tree nodes	9	10	9	10	9
Max degree of proof tree nodes	7	7	8	7	8
# obligations in TLAPS	1302	633	895	1303+722	1303+991
Time to check by TLAPS (s)	125	49	68	125+224	125+360

Remarkably, the safety of both DAG-based protocols can be proved within minutes. The modularity of our specifications is a clear advantage since DAG-Rider’s and Cordial Miner’s specifications share the same ordering phase specification in the common file LeaderConsensusSpecification.tla, and the corresponding same safety proof in LeaderConsensusVerification.tla. As far as the communication phase is concerned, as we found with these two case studies, even if the communication mechanisms differ, the effort to adapt the specification and the proofs was minimal (approximately 3 times less) when moving from DAG-Rider to Cordial Miners. Lastly, the glue specification needed was around 60 lines for both DAG-rider and Cordial Miners.

7 Discussion

We now explain how to specify and prove safety properties for other DAG-based protocols using our framework, namely BullShark [22], Aleph [18], Hashgraph [4], and Lachesis [16]. For BullShark, there are two variants: partially synchronous and eventually synchronous. Using our framework with no additional effort we have formally verified the eventually synchronous BullShark, the specification and proofs can be found in [20]. Eventually synchronous BullShark and DAG-Rider have three key differences: (i) It has additional timeouts in the DAG construction — this only refines our RB communication specification and thus requires no additional work for safety; (ii) wave-leaders are predetermined instead of randomized — we already handle arbitrary deterministic wave-leader functions, and thus this also requires no additional work; (iii) the wave size is two instead of four. We created an eventually synchronous BullShark specification file that is similar to the DAG-Rider specification file except that the wave size is two instead of four. Additionally, we had to update some of the proofs from handling arithmetic modulo of two to four. The changes required for BullShark took us $1$ person months. Handling partially synchronous BullShark and the other protocols listed requires more work, mainly on the Ordering Specification, which reduces the entire effort by at least $50\%$ . To see that handling the Communication Specification is light, note that partially synchronous BullShark has same communication as eventually synchronous BullShark and thus RB communication file can be used here as well without any change. For Hashgraph/Lachesis, one would only refine the unRB communication specification to handle the unstructured DAG and use the fact that the unRB communication specification already specifies the structure of the so-called witness-DAG/root-DAG (thus our existing proofs prove non-equivocation), while for Aleph we can use our RB communication file without change [18].

8 Related work

Parameterized verification is the problem of verifying a protocol for arbitrarily many processes [6]. For consensus protocols that tolerate Byzantine failures, this means verifying the protocol is correct for all values of the parameters $n,f$ with $n\geq 3f+1$ . In comparison, traditional model-checking only handles finite-state systems, and thus finitely many values of the parameters $n,f$ . For instance, the Tendermint consensus protocol [28] has been modeled in TLA+, and certain properties, such as termination, were model-checked for small values of the parameters [7]. Our work can be seen as a contribution to the parameterized verification of distributed protocols by machine-checkable proofs. An alternative approach is to use parameterized model-checkers, e.g., ByMC [26]. For instance, Safety and Liveness of the consensus algorithm used in the Red Belly Blockchain [13] has been verified for all values of the parameters $n,f$ with $n\geq 3f+1$ [5]. Although the algorithm only broadcasts binary values, that work verified both safety and liveness (under a weak fairness assumption).

Other BFT consensus protocols have been verified using interactive theorem provers. The work in [37] uses the IVy interactive theorem prover [1] to formally verify a variant of the Moonshine consensus protocol [37]. The work in [32] uses IVy and Isabelle/HOL [36] to verify the Stellar Consensus Protocol [31]. The Algorand [21] consensus protocol has been verified using Coq [11]. The safety of several non-Byzantine protocols such as variants of Paxos [9, 38] have been verified using interactive theorem proving. However, these protocols are not robust to Byzantine faults and they have limited application in blockchains.

To the best of our knowledge, the only other work specific to DAG-based consensus protocols is an unpublished technical report by Crary [14] describing a formal verification of a DAG-based consensus algorithm, Hashgraph [4], with the theorem prover Coq. In contrast to that approach, our TLA+ specifications of DAG-based consensus protocols are lower-level; they are closer to actual implementations of the protocols and therefore require fewer manual abstractions. Clear advantages of our approach are its separation of concerns between the communication phase and the ordering phase, and the reusability for several DAG-based consensus protocols.

9 Conclusion and Experience

Overall the effort to specify and prove three DAG-based consensus protocols required approximately 13 person months. However, with modularization, we found that the turn-around time for specifying and proving Cordial Miner’s protocol was significantly reduced, taking 3 person months. Specifying and proving Eventually synchronous BullShark took even less time, taking just 1 person month. We believe several other protocols can be modeled and proven with equally reduced efforts. We argue that while the effort required of our approach is arguably more than that of typical pen-and-paper proofs, machine-checkable specifications and proofs forced us to be much more rigorous and define our specification as a precise abstraction of the source code. In the future, we can leverage our proven specification as a model for model-driven test-case generation [40] which would not be possible with paper&pen proofs. Thus we not only were able to prove the safety of our conceptual model of the protocol but also ensure that the implementation corresponded to the model. We are currently expanding the specifications and proofs to support additional improved DAG-based protocols designed from insights obtained during the specification and proving process.

References

[1] Ahn, K.Y., Denney, E.: Testing first-order logic axioms in program verification. In: Fraser, G., Gargantini, A. (eds.) Tests and Proofs - 4th International Conference, TAP@TOOLS 2010, Málaga, Spain, July 1-2, 2010. Proceedings. Lecture Notes in Computer Science, vol. 6143, pp. 22–37. Springer (2010). https://doi.org/10.1007/978-3-642-13977-2_4, https://doi.org/10.1007/978-3-642-13977-2_4
[2] Aminof, B., Rubin, S., Stoilkovska, I., Widder, J., Zuleger, F.: Parameterized model checking of synchronous distributed algorithms by abstraction. In: Dillig, I., Palsberg, J. (eds.) Verification, Model Checking, and Abstract Interpretation - 19th International Conference, VMCAI 2018, Los Angeles, CA, USA, January 7-9, 2018, Proceedings. Lecture Notes in Computer Science, vol. 10747, pp. 1–24. Springer (2018). https://doi.org/10.1007/978-3-319-73721-8_1, https://doi.org/10.1007/978-3-319-73721-8_1
[3] Aptos Foundation: Understanding Aptos: A comprehensive overview (2024), https://messari.io/report/understanding-aptos-a-comprehensive-overview
[4] Baird, L., Luykx, A.: The hashgraph protocol: Efficient asynchronous BFT for high-throughput distributed ledgers. In: Proceedings of COINS 2020. pp. 1–7. IEEE (2020). https://doi.org/10.1109/COINS49042.2020.9191430
[5] Bertrand, N., Gramoli, V., Konnov, I., Lazic, M., Tholoniat, P., Widder, J.: Holistic verification of blockchain consensus. In: Scheideler, C. (ed.) 36th International Symposium on Distributed Computing, DISC 2022, October 25-27, 2022, Augusta, Georgia, USA. LIPIcs, vol. 246, pp. 10:1–10:24. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022). https://doi.org/10.4230/LIPICS.DISC.2022.10, https://doi.org/10.4230/LIPIcs.DISC.2022.10
[6] Bloem, R., Jacobs, S., Khalimov, A., Konnov, I., Rubin, S., Veith, H., Widder, J.: Decidability of Parameterized Verification. Synthesis Lectures on Distributed Computing Theory, Morgan & Claypool Publishers (2015). https://doi.org/10.2200/S00658ED1V01Y201508DCT013, https://doi.org/10.2200/S00658ED1V01Y201508DCT013
[7] Braithwaite, S., Buchman, E., Konnov, I., Milosevic, Z., Stoilkovska, I., Widder, J., Zamfir, A.: Formal specification and model checking of the tendermint blockchain synchronization protocol (short paper). In: Bernardo, B., Marmsoler, D. (eds.) 2nd Workshop on Formal Methods for Blockchains, FMBC@CAV 2020, July 20-21, 2020, Los Angeles, California, USA (Virtual Conference). OASIcs, vol. 84, pp. 10:1–10:8. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020). https://doi.org/10.4230/OASICS.FMBC.2020.10, https://doi.org/10.4230/OASIcs.FMBC.2020.10
[8] Buterin, V.: Ethereum white paper: A next generation smart contract & decentralized application platform (2013), https://github.com/ethereum/wiki/wiki/White-Paper
[9] Chand, S., Liu, Y.A., Stoller, S.D.: Formal verification of multi-paxos for distributed consensus. In: Fitzgerald, J.S., Heitmeyer, C.L., Gnesi, S., Philippou, A. (eds.) FM 2016: Formal Methods - 21st International Symposium, Limassol, Cyprus, November 9-11, 2016, Proceedings. Lecture Notes in Computer Science, vol. 9995, pp. 119–136 (2016). https://doi.org/10.1007/978-3-319-48989-6_8, https://doi.org/10.1007/978-3-319-48989-6_8
[10] Coelho, P., Junior, T.C., Bessani, A., Dotti, F., Pedone, F.: Byzantine fault-tolerant atomic multicast. In: 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). pp. 39–50 (2018). https://doi.org/10.1109/DSN.2018.00017
[11] Coquand, T., Huet, G.P.: The calculus of constructions. Inf. Comput. 76(2/3), 95–120 (1988). https://doi.org/10.1016/0890-5401(88)90005-3, https://doi.org/10.1016/0890-5401(88)90005-3
[12] Cousineau, D., Doligez, D., Lamport, L., Merz, S., Ricketts, D., Vanzetto, H.: TLA + proofs. In: Proceedings of FM 2012. Lecture Notes in Computer Science, vol. 7436, pp. 147–154. Springer (2012). https://doi.org/10.1007/978-3-642-32759-9_14, https://doi.org/10.1007/978-3-642-32759-9_14
[13] Crain, T., Natoli, C., Gramoli, V.: Red belly: A secure, fair and scalable open blockchain. In: 42nd IEEE Symposium on Security and Privacy, SP 2021, San Francisco, CA, USA, 24-27 May 2021. pp. 466–483. IEEE (2021). https://doi.org/10.1109/SP40001.2021.00087, https://doi.org/10.1109/SP40001.2021.00087
[14] Crary, K.: Verifying the hashgraph consensus algorithm. Tech. rep., CMU (2021), https://arxiv.org/abs/2102.01167
[15] Cristian, F., Aghili, H., Strong, R., Dolev, D.: Atomic broadcast: From simple message diffusion to byzantine agreement. Information and Computation 118(1), 158–179 (1995). https://doi.org/https://doi.org/10.1006/inco.1995.1060, https://www.sciencedirect.com/science/article/pii/S0890540185710607
[16] Fantom Foundation: Lachesis aBFT (2024), https://docs.fantom.foundation/technology/lachesis-abft
[17] Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. Journal of the ACM 32(2), 374–382 (1985)
[18] Gagol, A., Lesniak, D., Straszak, D., Swietek, M.: Aleph: Efficient atomic broadcast in asynchronous networks with Byzantine nodes. In: Proceedings of the 1st ACM Conference on Advances in Financial Technologies, AFT 2019, Zurich, Switzerland, October 21-23, 2019. pp. 214–228. ACM (2019). https://doi.org/10.1145/3318041.3355467, https://doi.org/10.1145/3318041.3355467
[19] Geko, C.: Ethereum white paper: A next generation smart contract & decentralized application platform (2024), https://www.coingecko.com/en/chains
[20] Ghorpade, P.: Reusable TLA+ specification and Proofs for DAG based Consensus Protocols. https://github.com/pranavg5526/Reusable-Formal-Verification-of-DAG-based-Consensus (2024), [Online; accessed 26-April-2024]
[21] Gilad, Y., Hemo, R., Micali, S., Vlachos, G., Zeldovich, N.: Algorand: Scaling Byzantine agreements for cryptocurrencies. In: Proceedings of SOSP 2017. pp. 51–68. ACM (2017). https://doi.org/10.1145/3132747.3132757
[22] Giridharan, N., Kokoris-Kogias, L., Sonnino, A., Spiegelman, A.: Bullshark: Dag bft protocols made practical. Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security (2022), https://api.semanticscholar.org/CorpusID:246015617
[23] Hedra: Streamlining consensus (2024), https://hedera.com/blog/streamlining-consensus-throughput-and-lower-latency-with-about-half-the-events
[24] Keidar, I., Kokoris-Kogias, E., Naor, O., Spiegelman, A.: All you need is DAG. In: Proceedings of PODC 2021. pp. 165–175. ACM (2021). https://doi.org/10.1145/3465084.3467905, https://doi.org/10.1145/3465084.3467905
[25] Keidar, I., Naor, O., Poupko, O., Shapiro, E.: Cordial miners: Fast and efficient consensus for every eventuality. In: Oshman, R. (ed.) 37th International Symposium on Distributed Computing, DISC 2023, October 10-12, 2023, L’Aquila, Italy. LIPIcs, vol. 281, pp. 26:1–26:22. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023). https://doi.org/10.4230/LIPICS.DISC.2023.26, https://doi.org/10.4230/LIPIcs.DISC.2023.26
[26] Konnov, I.V., Lazic, M., Veith, H., Widder, J.: A short counterexample property for safety and liveness verification of fault-tolerant distributed algorithms. In: Castagna, G., Gordon, A.D. (eds.) Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, Paris, France, January 18-20, 2017. pp. 719–734. ACM (2017). https://doi.org/10.1145/3009837.3009860, https://doi.org/10.1145/3009837.3009860
[27] Kumar, A., Kumar Sah, B., Mehrotra, T., Rajput, G.K.: A review on double spending problem in blockchain. In: 2023 International Conference on Computational Intelligence and Sustainable Engineering Solutions (CISES). pp. 881–889 (2023). https://doi.org/10.1109/CISES58720.2023.10183579
[28] Kwon, J.: Tendermint: Consensus without mining. https://tendermint.com/docs/tendermint.pdf (2014)
[29] Lamport, L.: Specifying Systems, The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley (2002), http://research.microsoft.com/users/lamport/tla/book.html
[30] Lamport, L., Shostak, R., Pease, M.: The byzantine generals problem. ACM Transactions on Programming Languages and Systems (TOPLAS) 4(3), 382–401 (1982)
[31] Lokhava, M., Losa, G., Mazières, D., Hoare, G., Barry, N., Gafni, E., Jove, J., Malinowsky, R., McCaleb, J.: Fast and secure global payments with stellar. In: Brecht, T., Williamson, C. (eds.) Proceedings of the 27th ACM Symposium on Operating Systems Principles, SOSP 2019, Huntsville, ON, Canada, October 27-30, 2019. pp. 80–96. ACM (2019). https://doi.org/10.1145/3341301.3359636, https://doi.org/10.1145/3341301.3359636
[32] Losa, G., Dodds, M.: On the formal verification of the stellar consensus protocol. In: Bernardo, B., Marmsoler, D. (eds.) 2nd Workshop on Formal Methods for Blockchains, FMBC@CAV 2020, July 20-21, 2020, Los Angeles, California, USA (Virtual Conference). OASIcs, vol. 84, pp. 9:1–9:9. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2020). https://doi.org/10.4230/OASICS.FMBC.2020.9, https://doi.org/10.4230/OASIcs.FMBC.2020.9
[33] Milosevic, Z., Hutle, M., Schiper, A.: On the reduction of atomic broadcast to consensus with byzantine faults. In: 2011 IEEE 30th International Symposium on Reliable Distributed Systems. pp. 235–244 (2011). https://doi.org/10.1109/SRDS.2011.36
[34] Nakamoto, S.: Bitcoin: A peer-to-peer electronic cash system (2008)
[35] Neu, J., Tas, E.N., Tse, D.: Two more attacks on proof-of-stake ghost/ethereum. In: Proceedings of the 2022 ACM Workshop on Developments in Consensus. p. 43–52. ConsensusDay ’22, Association for Computing Machinery, New York, NY, USA (2022). https://doi.org/10.1145/3560829.3563560, https://doi.org/10.1145/3560829.3563560
[36] Nipkow, T., Paulson, L.C., Wenzel, M.: Isabelle/HOL - A Proof Assistant for Higher-Order Logic, Lecture Notes in Computer Science, vol. 2283. Springer (2002). https://doi.org/10.1007/3-540-45949-9, https://doi.org/10.1007/3-540-45949-9
[37] Praveen, M., Ramesh, R., Doidge, I.: Formally verifying the safety of pipelined moonshot consensus protocol. CoRR abs/2403.16637 (2024). https://doi.org/10.48550/ARXIV.2403.16637, https://doi.org/10.48550/arXiv.2403.16637
[38] Schultz, W., Dardik, I., Tripakis, S.: Formal verification of a distributed dynamic reconfiguration protocol. In: Popescu, A., Zdancewic, S. (eds.) CPP ’22: 11th ACM SIGPLAN International Conference on Certified Programs and Proofs, Philadelphia, PA, USA, January 17 - 18, 2022. pp. 143–152. ACM (2022). https://doi.org/10.1145/3497775.3503688, https://doi.org/10.1145/3497775.3503688
[39] Soffer, P.: Refinement equivalence in model-based reuse: Overcoming differences in abstraction level. J. Database Manag. 16, 21–39 (07 2005)
[40] Wang, D., Dou, W., Gao, Y., Wu, C., Wei, J., Huang, T.: Model checking guided testing for distributed systems. In: Proceedings of the Eighteenth European Conference on Computer Systems. p. 127–143. EuroSys ’23, Association for Computing Machinery, New York, NY, USA (2023). https://doi.org/10.1145/3552326.3587442, https://doi.org/10.1145/3552326.3587442

Reusable Formal Verification of DAG-based Consensus Protocols

Abstract

Keywords:

1 Introduction

2 DAG-based Consensus Protocols in Blockchains

Communication phase.

Ordering phase.

3 Abstractions towards Generic and Simple Specifications

Miscellaneous Simplifying Abstractions.

4 Specifications of DAG-based Consensus Protocols

4.1 Specifying Communication in TLA+

4.1.1 Specifying Reliable Broadcast Communication

4.1.2 Specifying Unreliable Broadcast Communication

4.2 Specifying the Ordering in TLA+

4.3 Complete Protocol Specification

Details of specifying GPC.

5 Safety Proofs in TLAPS

5.0.1 Theorems and Invariants

5.0.2 Proof Strategy

6 Evaluation

7 Discussion

8 Related work

9 Conclusion and Experience

References

Reusable Formal Verification
of DAG-based Consensus Protocols