Verification of Durable Linearizability of a Persistent Queue

This Website documents KIV-proofs done for the paper [FM19].

The formal proofs make use of a library that contains a formalization of much of the refinement theory of IO-Automata from [LV95]. The specifications imported from this library contain

the definitions of an IO-Automaton as a pair of a start- and and step-function.
the definition of IO-Automata refinement.
the definition of a forward simulation together with a proof that it implies refinement.

The proofs benefit from an automated transformation of programs to a standard IO-Automaton, that is programmed in KIV. Based on this transformation, the proofs consist of three major parts, that are part of the main project.

As a first step, the programs in specification DMSQaux are annotated with assertions. When transforming to an IO-Automaton the assertions generate proof obligations, which imply that the automaton satisfies a (quite complex) invariant. Getting all these assertions right, was the most complex part of the verification.
The second step, in specification DMSQ, augments the automaton with the global transitions of flushing values from volatile to persistent memory and by adding a step for crash and recovery. It is proved that these extra steps still preserve the invariant from the first step.
That the resulting automaton refines the abstract specification DQ (the automaton IDQ given in [FM19], Figure 3), is proved in DMSQ-refines-DQ via a forward simulation. The forward simulation itself is defined as predicate abs in specification abs

We describe the construction of the automaton, and the three proof steps in the following subsections.

Construction of a standard automaton from programs

The main part of the specification DMSQaux is the definition of two procedures ENQ and DEQ that follow the description of Algorithm 2 in [FM19]. Please ignore the RUN routine. It is defined solely to introduce the three extra program counter values S, I and C (the final automaton constructed will override the steps of RUN). Note that the line numbers (= program counter values) are slightly different than the ones given in the paper. Assertions which hold at a specific line number are given with /* and */ or below the code if they hold for ranges of program counter values. The syntax is similar to the one of the paper, but there are some small discrepancies.

All programs can refer to the global variables specified under "state variables".
Accessing the value stored under a reference r in volatile/persistent store is explicitly given as vs[r]/ps[r]. vs and ps map a finite number of allocated references to cells, that contain three elements. A value stored in the queue is stored under vs[r].val, an optional threadid (either None or Some(tid)) under vs[r].opttid, the pointer to the next cell is vs[r].nxt.
Queue-references are stored in vs and ps. All other references are stored separately. The volatile/persistent head is stored in vhead/phead. Note that the tail of the queue is stored in volatile memory only, and reconstructed by traversing the queue from phead on a crash.
References to return values are allocated and stored in a separate part of the heap. The volative and persistent versions of this heap are called vrvHp and prvHp. It maps return value references to optional values (either Val(value) or Empty). The return values array which stores references (rvref) to return values for each thread is modelled as two functions vretRefs and pretRefs. Splitting the heap into two parts and not storing the return values array or the head in it avoids an untyped, flat heap. As a disadvantage of the typing scheme, flushing vhead, vretRefs(tid), vrHp(rvref) as well flushing a next-pointer/a value/ a threadid of a queue-cell all look different.
The atomic steps of the program are parallel assignments (the individual assignments are separated with a comma, while statements are separated with a semicolon), the evaluation of tests, allocating a new cell (by choosing an arbitrary one not yet allocated at line E2). Instead of a "while(true)"-loop with return statements in the middle of the code a "while (not success)"-loop is used, where success is a boolean variable, that starts as false and is set to true by CAS instructions. A "CAS(old, global, new)" is encoded as "if* old=global then success:=true, global := new else success := false". The test of an "if*" is evaluated together with the following assignment in one atomic step.
Variables "first" and "last" in the code of [FM19] are named "fst" and "lst" here, since "last" is a keyword in KIV.
The programs use four auxiliary variables:
- qrl is the list of references to cells in the heap that store elements of the abstract queue (the abstraction function will specify that ps[r].val for the references r in qrl gives the abstract queue).
- orl contains all references that have alreadly been dequeued in persistent store. orl includes the initial dummy cell. qrl and orl are updated at persistence points at E11, E14, D15, D21 and D27.
- owns maps references r, that have been allocated but have not been enqueued in persistent store to the thread that allocated them. This ensures that they are different from each other and from queue references, which have owns(r) = None. A freshly allocated cell has owns(r) = Some(tid). When it is enqueued in volatile store (by the CAS at E9), its owner changes to Weak(r). When the reference is flushed (either at E11 by the thead enqueuing it, or by another thread helping at E14 or D15), i.e. when it is enqueued in persistent store, the owner becomes None.
- rvowns maps references rvref, that have been allocated by thread tid to either Pending(tid) or Claimed(tid, val). The latter case is used, when a successful CAS at D19 has fixed the return value val the dequeue will return and store in vrvHp[vretRefs(tid)]. One of the critical features of the algorithm is that as soon as rvowns(rvref) = Claimed(tid,val), a helping thread can overwrite vrvHp[rvref] with val. This is possible even if the thread tid has finished the dequeue already and started a new one. In this case rvref will be no longer equal to retRefs(tid), and the helping thread will not really help, but overwrite an obsolete old reference.

The declaration of the programs generates an IO Automaton with the following components (the generated specifications are all shown below the main specification).

The state of the automaton is of type "DMSQauxState". It consists of
- a global state "gs" of type DMSQauxGState, which is a tuple of all state variables,
- a program counter function "pcf". pcf(tid) returns for every thread tid the program line (of type DMSQauxPC). the thread will execute next. A thread which is outside of the two programs is at either pc = S, pc = C or pc = I. These three state correspond to "idle", "ready" and "crashed" in [FM19]. ENQ and DEQ can be invoked, when pc = I.
- a local state function "lsf". lsf(tid) returns the local state of thread tid (= all local variables and parameters) as a tuple of values. One component lsf(tid).tid of the local state is always the threadid of the running thread. The threadid cannot be assigned by construction, so all threads satisfy the invariant "lsf(tid).tid = tid".
The actions (type DMSQauxAction) consist of invoke and return steps for the program (again invRUN and retRUN will be discared in the augmented automaton), which are used for the calling/returning step. Invoke steps move the arguments of the action to local variables, return steps require that the arguments of the action coincide with the computed values in local state. All internal steps of the programs have action "τ", except nondeterministic steps. In our case there is one at line E2, which allocates (chooses) a node, ce and success. This step executes action "chooseE1(node, ce, success)" where the arguments are the chosen values (the third component will always be false, since the choice has "with success = false"). A similar choice to allocate a new return reference is at line D2. Having a non-empty (internal) action for nondeterministic steps allows to compute a deterministic final state after each step, depending on the initial state and the action. Conceptually the invoke and return steps are the external actions.
The step-relation step(A) \subseteq state \times action \times state (predicate DMAQauxstep) is defined by lifting local steps (predicate lstep) to the full state. Predicate lstep has the global state "gs", together with the localstate "ls" and the "pc"-value of a single thread as input. For a given action "a" it returns modified values of "gs", "ls" and "pc".
The lstep-relation is defined via a precondition (predicate pre), and three functions "gstepf", "lstepf" and "pcstepf". "pre(gs,ls,pc,a)" determines if a step with action "a" is possible for the thread, "gstepf(gs, ls, pc,a)", "lstepf(gs, ls, pc,a)" and "pcstepf(gs, ls, pc,a)" compute the global state, local state and pc-value after the step. Usually the precondition requires the action to be empty. Assignments to global resp. variables are reflected in the definitions of gstepf and lstepf for each pc-value. In these definitions update functions like ".next:=" correspond to selectors ".next". They are written infix: "ls.next:= r" is local state "ls", where the next-field has been updated to "r". For an if-then-else in the code "pcstepf" will use a predicate logic if-then-else for for the new pc-value, written "(test ⊃ thenpc; elsepc)"

Proof obligations for assertions in the programs

The lines of the programs are annotated with assertions. The assertions are combined with the invariants given below the code. Invariants without a list of pc's are added (by conjunction) to all assertions. They are global invariants that are proven to hold in every state reachable by the automaton (even when no thread is currently executing one of the programs), so they depend on global state only. They are collected in the definition of predicate "DMSQauxGInv". Assertions with a range of pc's are added to the assertion of each pc within the range. Note that assertions are checked to mention only those local variables, which are defined at the respective program line. The assertions are collected to define a (rather large) predicate "DMSQauxLinv(gs, ls, pc)". Its definition is a big conjunction of formulas of the form "pc = <pcvalue> -> <assertion at pcvalue>". The generated proof obligations guarantee that "DMSQauxLinv(gs, lsf(tid), pcf(tid))" holds for all threads in every state of the automaton (together the full invariant "DMSQauxInv(gs, lsf, pcf)" is implied to hold in every reachable state. Below the program and the invariants a rely condition is specified. From this formula the definition for predicate "DMSQauxRely(gs, tid, gs0)" is generated. Using a rely condition avoids the quadratic complexity inherent in Owicki-Gries style proofs. The following proof obligations are generated in the theorembase of specification DMSQaux for every step of the program, to ensure that DMSQauxInv holds:

"step-<pcval>-<pcval0>": for every step of the thread tid from pcvalue "pcval" to "pcval0": From the global invariant and the local invariant of tid (= the assertion that holds at "pcval") prove that after a step (i.e. under "pre" and result computed with "gstepf", "lstepf" and "pcf") the global invariant and the local invariant (= the assertion at "pcval0") hold again. Together these proof obligations imply step-DMSQauxLInv (This is generated as an axiom to be used by specifications above the current one. Note that the axiom is of course not available for proving the proof obligations).
"rely-<pcval>-<pcval0>": Every step of the program, that modifies global variables is a step that satisfies the rely of other threads. When tid0 is different from ls.tid, the step satisfies DMSQauxRely(gs, tid0, gstepf(gs, ls, <pcval>)). Together with a reflexivity-condition for the rely-predicate (which subsumes all steps that do not change global variables) the proof obligations implies rely-DMSQauxLInv
"stable-<pcval>: The assertions at each point "pcval" are stable with respect to rely steps. Note that if the assertions at several pc-values are equal, the proof obligation is generated only for the earliest pc-value. Together these proof obligations imply stable-DMSQauxLInv.
The two previous theorems combined imply other-DMSQauxLInv: steps of other threads preserve the local invariant (i.e. local assertions) of a thread.
Combining the previous results gives the main theorem DMSQauxInv-thm, that ensure that the full invariant DMSQauxInv is preserved by all steps.

For the concrete algorithm, the relevant predicates needed to define its invariants and rely condition are defined at the beginning of the specification. We give an informal description of the main "listp" -predicate first (for a pictorial description see also Fig. 2 in [FM19]), and then go through the other predicates.

listp(orl + qrl, ps, vs) is the main predicate of the global invariant. It specifies the possible queue structure in volatile and persistent memory. It specifies that orl + qrl is a non-empty list of allocated references (at least one dummy cell is always in orl). The references form a linked list in both volatile (vs) and persistent (ps) memory (each ps[r].nxt is equal to the next element in the list). They all have owns(r) = None. The cells stored under the references all have the same content in vs and ps, except for two possible exceptions
- A cell may be marked in volatile, but not yet in persistent memory (otherwise their marking is equal).
- The last cell r (axiom listp-one specifies this case), must have ps[r].nxt = null, but vs[r].nxt may point to a cell, that has been enqueued in volatile store. This cell then has a weak owner, is the same in vs and ps, and its next pointer is null.
.
equalowns(vs, ps, owns, tid, vs0, ps0, owns0) is used in the rely condition: Allocated references r strongly owned by tid (owns(r) = Some(tid)) cannot be changed by steps other threads, and their ownership will stay the same. Weakly owned cells will not change, if they are still weakly owned after the step. Otherwise the ownership has changed to None.
"rl from r" computes the references in reference list "rl" that come after "r" (including r). If r is not contained in rl, the result is the empty list. This predicate is used in the rely to ensure that references like head or tail move forward in the queue only.
markedp(orl, vs/ps) and unmarkedp(qrl, vs/ps): These predicates are used as part of the global invariant: they ensures that old cells in orl/ current queue cells in qrl are marked/unmarked, i.e. their threadid vs/ps[r].opttid is Some(tid)/None.
equalrl(orl + qrl, vs, ps, vs0, ps0) is used in the rely condition: All cells stored under the references in orl and qrl either change their marking from None to Some(tid), or otherwise leave it unchanged. The values and the nxt-references are unchanged, except for the nxt-reference of the last reference "rl.last". If this points to null, a new enqueue may have changed it. Otherwise, in the exceptional case of listp above, where vs[rl.last].nxt is not null, the reference may not change anymore, and the value stored in the cell vs[rl.last].nxt also must not be changed.
allocrv(vretRefs, vrvHp) and allocrv(pretRefs, prvHp) assert that return references are either null (their initial value) or allocated and non-null.
Invariant nonullRetRefs(orl + qrl, vs, vretRefs, pretRefs) asserts that before a reference (from orl or qrl) is dequeued by marking it with tid the refurn reference v/pretRefs(tid) has been allocated and can no longer be null.
Invariants rvownerok(vretRefs, vrvHp, rvowns) and rvownerok(pretRefs, prvHp, rvowns) asserts that v/pretRefs(tid) is always owned by tid.
Rely predicate rvownsrely(tid, rvowns, rvowns0, vrvHp) asserts that ownership of return values can only be changed from Pending(tid) to Claimed(tid,val).
rvrely(tid, rvowns, vretRefs, vrvHp, vretRefs0, vrvHp0) ensures that other threads than tid cannot change the value stored at vrvHp[vretRefs(tid)] when rvref is owned by tid, except they can help and set it to Val(val) when rvowns(vretRefs(tid)) = Claimed(tid, val).
prvHpRely(vrvHp, prvHp, prvHp0) ensures that prvHp[rvref] can only change by flushing vrvHp[rvref].
noid(noid(tid, orl + qrl, r, vs) is used in assertions troughout the code. It ensures that no reference after r can be marked with tid. In most places r = vhead, execpt when tid just dequeued a value (then r = next). This assertion is critical to ensure that a thread tid helping another dequeue does not load its own threadid as othertid at D25a. A corresponding rely noidRely(tid, orl + qrl, vs, orl0 + qrl0, vs0) asserts that no other thread will mark a queue cell with tid.
nextnottid(tid, vhead, vs) makes sure that the next reference after vhead is not marked with tid. This assertion and the corresponding nextnottidRely(tid, vhead, vs, vhead0, vs0) ensures that vhead never moves ahead violating the property
In addition to the predicates given above, the rely condition (which uses the state variables for values before the step, and variables with "0" at the end, to describe values after the step) specifies that orl + qrl (where "+" appends lists) move forward by rely steps: the new orl0 + qrl0 is equal to orl + qrl + rl with some new elements "rl" that have been enqueued. Similarly orl = rl0 + orl, where "rl0" contains the values that have been dequeued.

Augmenting the automaton with global steps

The automaton DMSQaux defined above has program steps only. It does not have crashes or flushing steps done by the operating system. The actions of DMSQaux also do not match the actions of the abstract specification. We therefore define a new automaton DMSQ with a step relation, that augments the one of DMSQaux with global steps. To enable an elegant refinement proof, where steps of the concrete automaton match abstract actions 1:1, so an elegant refinement proof becomes possible, we also change the actions to match the actions of the abstract specification. Note that the modifications affect internal actions only (which do not affect, whether refinement holds), the external invoke- and return-actions for ENQ and DEQ remain untouched. To describe DMSQ we first give a short description of the abstract specification DQ, which is also described as automaton IDQ in Fig. 3 of [FM19]. The specification defines a state with three components

The abstract queue "q" (a list of elements)
A local state function from tid's to local states of type DQlocalstate. For convenience, this function merges pc(t), in(t) and out(t). As an example a state invEnq(val) is equivalent to [FM19] pc(t) = invEnq, in(t) = val, state retDeq(optval) is equivalent to pc(t) = retDeq, out(t) = optval.
A function "obsEmptyf(tid)" that corresponds to "obsEmp(t)" in [FM19].

Its actions as specified in DQaction are as in [FM19], except for notation: "res_t(Deq,v)" is written retDeq(tid, Val(v))" for a nonempty value v here. The step relation "DQstep" defined in DQ consists of a global part for stuttering steps and crashes, and a local part for steps done by a single thread. Since the local step relation is very simple for each action, we directly specify a relation "DQlstep" without splitting into a precondition and an effect as in the paper.

The actions of DQ are also used in DMSQ. To this purpose we map the original actions "ca" of DMSQaux with "ca.toQ(gs,ps,ls)" to actions of DQ. invoke and return actions for ENQ and DEQ are mapped by identity. Actions "invRun" and "retRun" (and their corresponding steps) are deleted and replaced with the run(tid) and crash steps. For all the internal steps function the axiom for ".toQ" has to carefully specify the action needed at persistence points:

The steps at E11 is a persistence point for the thread, when the persistent next pointer "ps[ls.lst].nxt" of the cell pointed to by last ("lst" here) is still null: the volatile non-null pointer then has not been already flushed by another thread.
Again, "vs[ls.lst].nxt" is not yet flushed, then E14 is a persistence point for its (weak) owner, that is running ENQ.
When loading next = null at D8, then this is the point where the queue can be successfully checked to be empty.
At D12, the decistion to return Empty from Deq becomes persistent.
At D15, flushing vs[lst].nxt is a persistence point for an enqueue of the thread that (weakly) owns it (when the pointer has not yet been flushed).
At D21, flushing the threadid of the cell is a persistence point, when no other thread has done the flush already.
At D27, flushing the threadid of the cell is a persistence point of the thread that has marked the cell, when there was no flush previously.
All other steps implement the τ action.

The steps of DMSQ are

The global flushing steps. These have an empty action attached when flushing the values of cells, when flushing vretRefs(tid) to pretRefs(tid) and when flushing vrvHp(rvref) to prvHp(rvref) Flushing a next-pointer is a persistence point for an enqueue, when it is the next pointer at the end of a queue. Similarly, flushing a mark is a persistence point for a dequeue, if it flushes the threadid at the head of the queue.
A run step, that modifies the pc of some thread tid from S to I (corresponding to the abstract "idle" and "ready").
The crash+recovery step. This is specified as resetting volatile memory to the persistent one, setting the pc of all threads, which are not in idle state to crashed, resetting ownership to None, deleting local state (except that threads still know their tid), and to execute the recovery program. Running the recovery program is specified as formula a of Dynamic Logic: "⟨p⟩ phi" means: "p terminates, and phi holds afterwards". This program recovers the volatile head by setting it to the persistent one, and then by moving it forward until an unmarked cell or the end of the list is found. Similarly the tail is recomputed, by first setting it to the just computed vhead, and then traversing the queue references, until the end of the list is found.
All program steps of DMSQaux (except the ones for invRUN and retRUN) with the action lifted by ".toQ" as described above.

The initialisation predicate DMSQinit is the same as the one of DMSQaux except that all threads with pcf(tid) = S (aka "idle"). The main theorem proven is that like DMSQaux, DMSQ also satisfies the invariant DMSQauxINV.

Refinement proof

The refinement proof uses the forward simulation (the general definition allows one concrete step to match several abstract ones, as long as the external actions are the same; we do not need this generality here, since we have ensured internal actions match 1:1) defined in specification abs. The definition of predicate abs specifies four properties:

The sequence of values stored under the references of auxiliary variables qrl in persistent store (= content(qrl,ps)) is equal to the abstract queue
When the next pointer of the last queue cell is already non-null in volatile memory, then its (weak) owner is at E10 after a sucessful CAS (success= true), or at E11 before its flush.
When qrl is nonempty, and its head is volatilely marked, then the thread that has marked it is at D20 after a successful CAS, or at D21 before the flush.
All threads have a local state that abstract from the concrete state (specified with predicate absst. When the pc-value of a thread is S, I, C the abstract pc is idle, ready, crashed. If the pc is before its persistence the persistence point of ENQ then abstract thread is in local state invEnq(tid, value). If the pc-value is after the persistence point, then the abstract state in retEnq. Note that there is the critical region at E10, E11 again (as above), where before/after the persistence point depends on whether the volatile reference has been flushed. pc-values while executing deq instructions are similar.

The simulation proof in DMSQ-refines-DQ consists of proving the (trivial) initialization condition and the (complex) proof of a commuting diagram for each step. This proof is broken down into several subproofs:

fwsim-FlushVal, fwsim-FlushHead, fwsim-FlushrvHp, fwsim-FlushRetref prove global flushes which are never persistence points.
fwsim-FlushNxt, fwsim-FlushOpttid prove the global flushes, which are sometimes persistence points.
fwsim-Crash and fwsim-run prove the other two steps, which are not done by programs.
fwsim-lstep proves commuting diagrams for program steps. The proof is by case split over the actions. The steps with invoke and return actions are proven directly, the other steps are again done with the help of lemmas:
- fwsim-lstep-doEnq proves the enqueue step at E11
- fwsim-lstep-checkEmp proves the checkEmpty-step at D8
- fwsim-lstep-doEnqother proves the persistence points at E14 and D15
- fwsim-lstep-doDeq proves the persistence point at D21 and D12 (dequeuing empty)
- fwsim-lstep-doDeqother proves the persistence points at D27
- fwsim-lstep-tau proves all steps that are no persistence points. This proof is again broken down into lemmas fwsin-lstep-tau-<pcval> for every possible pc-value.

Back to the overview of all KIV projects.
For general informations on KIV and a download site for the whole system refer to this page.

Bibliography

[FM19]	J. Derrick, S. Doherty, B. Dongol, G. Schellhorn, H. Wehrheim. Verifying Correctness of Persistent Concurrent Data Structures (FM 2019, Springer LNCS)
[PPoPP18]	M. Friedman, M. Herlihy, V. J. Marathe, E. Petrank. A persistent lock-free queue for non-volatile memory. (PPoPP 2018, pp. 28--40, ACM)
[LV95]	N. Lynch, F. Vaandrager Forward and Backward Simulations -- Part I: Untimed systems (Information and Computation 1995, vol. 121(2), p. 214--233)