cluster/tla/consensus.tla (289 lines of code) (raw):
`^\Large\bf
TLA+ Model of an improved Zen consensus algorithm with reconfiguration capabilities ^'
-------------------------------------------------------------------------------------
-------------------------------- MODULE consensus -----------------------------------
\* Imported modules used in this specification
EXTENDS Naturals, FiniteSets, Sequences, TLC
----
\* `^\Large\bf Constants ^'
\* The specification first defines the constants of the model, which amount to values or sets of
\* values that are fixed.
CONSTANTS Values
\* Set of node ids (all master-eligible nodes)
CONSTANTS Nodes
\* The constant "Nil" denotes a place-holder for a non-existing value
CONSTANTS Nil
\* RPC message types
CONSTANTS
Join, \* only request is modeled
PublishRequest,
PublishResponse,
Commit, \* only request is modeled
Catchup \* only response is modeled
\* Publish request types
CONSTANTS
Reconfigure,
ApplyCSDiff
----
\* `^\Large\bf Variables ^'
\* The following describes the variable state of the model.
\* Set of requests and responses sent between nodes.
VARIABLE messages
\* node state (map from node id to state)
VARIABLE firstUncommittedSlot
VARIABLE currentTerm
VARIABLE currentConfiguration
VARIABLE currentClusterState
VARIABLE lastAcceptedTerm
VARIABLE lastAcceptedValue
VARIABLE joinVotes
VARIABLE electionWon
VARIABLE publishPermitted
VARIABLE publishVotes
----
\* set of valid configurations (i.e. the set of all non-empty subsets of Nodes)
ValidConfigs == SUBSET(Nodes) \ {{}}
\* quorums correspond to majority of votes in a config
IsQuorum(votes, config) == Cardinality(votes \cap config) * 2 > Cardinality(config)
\* checks whether two configurations only have intersecting quorums
IntersectingQuorums(config1, config2) ==
/\ \lnot IsQuorum(config1 \ config2, config1)
/\ \lnot IsQuorum(config2 \ config1, config2)
\* initial model state
Init == /\ messages = {}
/\ firstUncommittedSlot = [n \in Nodes |-> 0]
/\ currentTerm = [n \in Nodes |-> 0]
/\ currentConfiguration \in {[n \in Nodes |-> vc] : vc \in ValidConfigs} \* all agree on initial config
/\ currentClusterState \in {[n \in Nodes |-> v] : v \in Values} \* all agree on initial value
/\ lastAcceptedTerm = [n \in Nodes |-> Nil]
/\ lastAcceptedValue = [n \in Nodes |-> Nil]
/\ joinVotes = [n \in Nodes |-> {}]
/\ electionWon = [n \in Nodes |-> FALSE]
/\ publishPermitted = [n \in Nodes |-> FALSE]
/\ publishVotes = [n \in Nodes |-> {}]
\* Send join request from node n to node nm for term t
HandleStartJoin(n, nm, t) ==
/\ t > currentTerm[n]
/\ LET
joinRequest == [method |-> Join,
source |-> n,
dest |-> nm,
slot |-> firstUncommittedSlot[n],
term |-> t,
laTerm |-> lastAcceptedTerm[n]]
IN
/\ currentTerm' = [currentTerm EXCEPT ![n] = t]
/\ publishPermitted' = [publishPermitted EXCEPT ![n] = TRUE]
/\ electionWon' = [electionWon EXCEPT ![n] = FALSE]
/\ joinVotes' = [joinVotes EXCEPT ![n] = {}]
/\ publishVotes' = [publishVotes EXCEPT ![n] = {}]
/\ messages' = messages \cup { joinRequest }
/\ UNCHANGED <<currentClusterState, currentConfiguration,
firstUncommittedSlot, lastAcceptedValue, lastAcceptedTerm>>
\* node n handles a join request and checks if it has received enough joins (= votes)
\* for its term to be elected as master
HandleJoinRequest(n, m) ==
/\ m.method = Join
/\ m.term = currentTerm[n]
/\ \/ /\ m.slot < firstUncommittedSlot[n]
\/ /\ m.slot = firstUncommittedSlot[n]
/\ (m.laTerm /= Nil => lastAcceptedTerm[n] /= Nil /\ m.laTerm <= lastAcceptedTerm[n])
/\ joinVotes' = [joinVotes EXCEPT ![n] = @ \cup { m.source }]
/\ electionWon' = [electionWon EXCEPT ![n] = IsQuorum(joinVotes'[n], currentConfiguration[n])]
/\ IF electionWon'[n] /\ publishPermitted[n] /\ lastAcceptedTerm[n] /= Nil
THEN LET publishRequests == { [method |-> PublishRequest,
source |-> n,
dest |-> ns,
term |-> currentTerm[n],
slot |-> firstUncommittedSlot[n],
value |-> lastAcceptedValue[n]] : ns \in Nodes }
IN
/\ messages' = messages \cup publishRequests
/\ publishPermitted' = [publishPermitted EXCEPT ![n] = FALSE]
ELSE
/\ UNCHANGED <<messages, publishPermitted>>
/\ UNCHANGED <<currentClusterState, currentConfiguration, currentTerm, publishVotes,
firstUncommittedSlot, lastAcceptedValue, lastAcceptedTerm>>
\* client causes a cluster state change v
ClientRequest(n, v) ==
/\ electionWon[n]
/\ publishPermitted[n]
/\ LET
publishRequests == { [method |-> PublishRequest,
source |-> n,
dest |-> ns,
term |-> currentTerm[n],
slot |-> firstUncommittedSlot[n],
value |-> [type |-> ApplyCSDiff,
val |-> (currentClusterState[n] :> v)]
] : ns \in Nodes }
IN
/\ publishPermitted' = [publishPermitted EXCEPT ![n] = FALSE]
/\ messages' = messages \cup publishRequests
/\ UNCHANGED <<currentClusterState, currentConfiguration, currentTerm, electionWon,
firstUncommittedSlot, lastAcceptedValue, lastAcceptedTerm, joinVotes, publishVotes>>
\* change the set of voters
ChangeVoters(n, vs) ==
/\ electionWon[n]
/\ publishPermitted[n]
/\ LET
publishRequests == { [method |-> PublishRequest,
source |-> n,
dest |-> ns,
term |-> currentTerm[n],
slot |-> firstUncommittedSlot[n],
value |-> [type |-> Reconfigure, val |-> vs]] : ns \in Nodes }
IN
/\ publishPermitted' = [publishPermitted EXCEPT ![n] = FALSE]
/\ messages' = messages \cup publishRequests
/\ UNCHANGED <<currentClusterState, currentConfiguration, currentTerm, electionWon,
firstUncommittedSlot, lastAcceptedValue, lastAcceptedTerm, joinVotes, publishVotes>>
\* handle publish request m on node n
HandlePublishRequest(n, m) ==
/\ m.method = PublishRequest
/\ m.slot = firstUncommittedSlot[n]
/\ m.term = currentTerm[n]
/\ lastAcceptedTerm' = [lastAcceptedTerm EXCEPT ![n] = m.term]
/\ lastAcceptedValue' = [lastAcceptedValue EXCEPT ![n] = m.value]
/\ LET
response == [method |-> PublishResponse,
source |-> n,
dest |-> m.source,
success |-> TRUE,
term |-> m.term,
slot |-> m.slot]
IN
/\ messages' = messages \cup {response}
/\ UNCHANGED <<currentClusterState, currentConfiguration, currentTerm,
electionWon, firstUncommittedSlot, publishPermitted, joinVotes, publishVotes>>
\* node n commits a change
HandlePublishResponse(n, m) ==
/\ m.method = PublishResponse
/\ m.slot = firstUncommittedSlot[n]
/\ m.term = currentTerm[n]
/\ publishVotes' = [publishVotes EXCEPT ![n] = @ \cup {m.source}]
/\ IF IsQuorum(publishVotes'[n], currentConfiguration[n])
THEN
LET
commitRequests == { [method |-> Commit,
source |-> n,
dest |-> ns,
term |-> currentTerm[n],
slot |-> firstUncommittedSlot[n]] : ns \in Nodes }
IN
/\ messages' = messages \cup commitRequests
ELSE
UNCHANGED <<messages>>
/\ UNCHANGED <<currentClusterState, currentConfiguration, currentTerm, electionWon,
firstUncommittedSlot, lastAcceptedValue, lastAcceptedTerm,
publishPermitted, joinVotes>>
\* apply committed change to node n
HandleCommitRequest(n, m) ==
/\ m.method = Commit
/\ m.slot = firstUncommittedSlot[n]
/\ m.term = lastAcceptedTerm[n]
/\ firstUncommittedSlot' = [firstUncommittedSlot EXCEPT ![n] = @ + 1]
/\ lastAcceptedTerm' = [lastAcceptedTerm EXCEPT ![n] = Nil]
/\ lastAcceptedValue' = [lastAcceptedValue EXCEPT ![n] = Nil]
/\ publishPermitted' = [publishPermitted EXCEPT ![n] = TRUE]
/\ publishVotes' = [publishVotes EXCEPT ![n] = {}]
/\ IF lastAcceptedValue[n].type = Reconfigure THEN
/\ currentConfiguration' = [currentConfiguration EXCEPT ![n] = lastAcceptedValue[n].val]
/\ electionWon' = [electionWon EXCEPT ![n] = IsQuorum(joinVotes[n], currentConfiguration'[n])]
/\ UNCHANGED <<currentClusterState>>
ELSE
/\ Assert(lastAcceptedValue[n].type = ApplyCSDiff, "unexpected type")
/\ Assert(DOMAIN(lastAcceptedValue[n].val) = {currentClusterState[n]}, "diff mismatch")
/\ currentClusterState' = [currentClusterState EXCEPT ![n] = lastAcceptedValue[n].val[@]] \* apply diff
/\ UNCHANGED <<currentConfiguration, electionWon>>
/\ UNCHANGED <<currentTerm, joinVotes, messages>>
\* node n captures current state and sends a catch up message
SendCatchupResponse(n) ==
/\ LET
catchupMessage == [method |-> Catchup,
slot |-> firstUncommittedSlot[n],
config |-> currentConfiguration[n],
state |-> currentClusterState[n]]
IN
/\ messages' = messages \cup { catchupMessage }
/\ UNCHANGED <<currentClusterState, currentConfiguration, currentTerm,
lastAcceptedValue, electionWon, firstUncommittedSlot, publishPermitted,
joinVotes, lastAcceptedTerm, publishVotes>>
\* node n handles a catchup message
HandleCatchupResponse(n, m) ==
/\ m.method = Catchup
/\ m.slot > firstUncommittedSlot[n]
/\ firstUncommittedSlot' = [firstUncommittedSlot EXCEPT ![n] = m.slot]
/\ lastAcceptedTerm' = [lastAcceptedTerm EXCEPT ![n] = Nil]
/\ lastAcceptedValue' = [lastAcceptedValue EXCEPT ![n] = Nil]
/\ publishPermitted' = [publishPermitted EXCEPT ![n] = TRUE]
/\ electionWon' = [electionWon EXCEPT ![n] = FALSE]
/\ currentConfiguration' = [currentConfiguration EXCEPT ![n] = m.config]
/\ currentClusterState' = [currentClusterState EXCEPT ![n] = m.state]
/\ joinVotes' = [joinVotes EXCEPT ![n] = {}]
/\ publishVotes' = [publishVotes EXCEPT ![n] = {}]
/\ UNCHANGED <<currentTerm, messages>>
\* crash/restart node n (loses ephemeral state)
RestartNode(n) ==
/\ electionWon' = [electionWon EXCEPT ![n] = FALSE]
/\ publishPermitted' = [publishPermitted EXCEPT ![n] = FALSE]
/\ joinVotes' = [joinVotes EXCEPT ![n] = {}]
/\ publishVotes' = [publishVotes EXCEPT ![n] = {}]
/\ UNCHANGED <<messages, firstUncommittedSlot, currentTerm, currentConfiguration,
currentClusterState, lastAcceptedTerm, lastAcceptedValue>>
\* next-step relation
Next ==
\/ \E n, nm \in Nodes : HandleStartJoin(n, nm, currentTerm[n] + 1)
\/ \E m \in messages : HandleJoinRequest(m.dest, m)
\/ \E n \in Nodes : \E v \in Values : ClientRequest(n, v)
\/ \E m \in messages : HandlePublishRequest(m.dest, m)
\/ \E m \in messages : HandlePublishResponse(m.dest, m)
\/ \E m \in messages : HandleCommitRequest(m.dest, m)
\/ \E n \in Nodes : RestartNode(n)
\/ \E n \in Nodes : \E vs \in ValidConfigs : ChangeVoters(n, vs)
\/ \E n \in Nodes : SendCatchupResponse(n)
\/ \E n \in Nodes : \E m \in messages : HandleCatchupResponse(n, m)
----
\* main invariant:
StateMachineSafety ==
\A n1, n2 \in Nodes :
firstUncommittedSlot[n1] = firstUncommittedSlot[n2] =>
/\ currentClusterState[n1] = currentClusterState[n2]
/\ currentConfiguration[n1] = currentConfiguration[n2]
OneMasterPerTerm ==
\A n1, n2 \in Nodes :
/\ electionWon[n1]
/\ electionWon[n2]
/\ currentTerm[n1] = currentTerm[n2]
/\ IntersectingQuorums(currentConfiguration[n1], currentConfiguration[n2])
=> n1 = n2
LogMatching ==
\A n1, n2 \in Nodes :
/\ firstUncommittedSlot[n1] = firstUncommittedSlot[n2]
/\ lastAcceptedTerm[n1] = lastAcceptedTerm[n2]
=> lastAcceptedValue[n1] = lastAcceptedValue[n2]
SingleNodeInvariant ==
\A n \in Nodes :
/\ (lastAcceptedTerm[n] = Nil) = (lastAcceptedValue[n] = Nil)
/\ lastAcceptedTerm[n] /= Nil => (lastAcceptedTerm[n] <= currentTerm[n])
/\ electionWon[n] = IsQuorum(joinVotes[n], currentConfiguration[n]) \* cached value is consistent
/\ electionWon[n] /\ publishPermitted[n] => lastAcceptedValue[n] = Nil
LogMatchingMessages ==
\A m1, m2 \in messages:
/\ m1.method = PublishRequest
/\ m2.method = PublishRequest
/\ m1.slot = m2.slot
/\ m1.term = m2.term
=> m1.value = m2.value
SafeCatchupMessages ==
\A m1, m2 \in messages:
/\ m1.method = Catchup
/\ m2.method = Catchup
/\ m1.slot = m2.slot
=> m1.config = m2.config /\ m1.state = m2.state
\* State-exploration limits
StateConstraint ==
/\ \A n \in Nodes: currentTerm[n] <= 3
/\ \A n \in Nodes: firstUncommittedSlot[n] <= 2
/\ Cardinality(messages) <= 15
====================================================================================================