Verification of Durable Linearizability of a Persistent Queue
This Website documents KIV-proofs done for the paper [FM19].
The formal proofs make use of a library
that contains a formalization of much of the refinement theory of IO-Automata from [LV95].
The specifications imported from this library contain
.
The proofs benefit from an automated transformation of programs to a standard IO-Automaton,
that is programmed in KIV. Based on this transformation, the proofs consist of three major parts,
that are part of the main project.
- As a first step, the programs in specification DMSQaux
are annotated with assertions. When transforming to an IO-Automaton the assertions
generate proof obligations, which imply that the automaton satisfies a (quite complex) invariant.
Getting all these assertions right, was the most complex part of the verification.
- The second step, in specification DMSQ, augments the automaton
with the global transitions of flushing values from volatile to persistent memory
and by adding a step for crash and recovery. It is proved that these extra steps
still preserve the invariant from the first step.
- That the resulting automaton refines the abstract specification DQ (the automaton IDQ given in [FM19], Figure 3),
is proved in DMSQ-refines-DQ via a forward simulation. The forward simulation itself
is defined as predicate abs in specification abs
We describe the construction of the automaton, and the three proof steps in the following subsections.
Construction of a standard automaton from programs
The main part of the specification DMSQaux is the definition of two procedures
ENQ and DEQ that follow the description of Algorithm 2 in [FM19]. Please ignore the RUN routine. It is defined solely
to introduce the three extra program counter values S, I and C (the final automaton constructed will override the steps of RUN).
Note that the line numbers (= program counter values) are slightly different than the ones given in the paper. Assertions which hold at a specific line
number are given with /* and */ or below the code if they hold for ranges of program counter values.
The syntax is similar to the one of the paper, but there are some small discrepancies.
- All programs can refer to the global variables specified under "state variables".
- Accessing the value stored under a reference r in volatile/persistent store is explicitly given as vs[r]/ps[r].
vs and ps map a finite number of allocated references to cells,
that contain three elements. A value stored in the queue is stored under vs[r].val,
an optional threadid (either None or Some(tid)) under vs[r].opttid, the pointer to the next cell is vs[r].nxt.
- Queue-references are stored in vs and ps. All other references are stored separately.
The volatile/persistent head is stored in vhead/phead. Note that the tail of the queue is stored
in volatile memory only, and reconstructed by traversing the queue from phead on a crash.
- References to return values are allocated and stored in a separate part of the heap.
The volative and persistent versions of this heap are called vrvHp and prvHp.
It maps return value references to optional values (either Val(value) or Empty).
The return values array which stores references (rvref) to return values for each thread is modelled
as two functions vretRefs and pretRefs.
Splitting the heap into two parts and not storing the return values array or the head
in it avoids an untyped, flat heap. As a disadvantage of the typing scheme, flushing vhead,
vretRefs(tid), vrHp(rvref) as well flushing a next-pointer/a value/ a threadid of a queue-cell all look different.
- The atomic steps of the program are parallel assignments (the individual assignments are separated with a comma,
while statements are separated with a semicolon), the evaluation of tests, allocating a new cell (by choosing an
arbitrary one not yet allocated at line E2). Instead of a "while(true)"-loop with return statements in the middle of the code a
"while (not success)"-loop is used, where success is a boolean variable, that starts as false and is set to true by CAS instructions.
A "CAS(old, global, new)" is encoded as "if* old=global then success:=true, global := new else success := false".
The test of an "if*" is evaluated together with the following assignment in one atomic step.
- Variables "first" and "last" in the code of [FM19] are named "fst" and "lst" here, since "last" is a keyword in KIV.
- The programs use four auxiliary variables:
- qrl is the list of references to cells in the heap that store
elements of the abstract queue (the abstraction function will specify that ps[r].val for the references r in qrl
gives the abstract queue).
- orl contains all references that have alreadly been dequeued in persistent store. orl includes the initial
dummy cell. qrl and orl are updated at persistence points at E11, E14, D15, D21 and D27.
- owns maps references r, that have been allocated but have not been enqueued in persistent store
to the thread that allocated them. This ensures that they are different from each other
and from queue references, which have owns(r) = None. A freshly allocated cell has owns(r) = Some(tid).
When it is enqueued in volatile store (by the CAS at E9), its owner changes to Weak(r). When the reference
is flushed (either at E11 by the thead enqueuing it, or by another thread helping at E14 or D15), i.e.
when it is enqueued in persistent store, the owner becomes None.
- rvowns maps references rvref, that have been allocated by thread tid to either Pending(tid) or
Claimed(tid, val). The latter case is used, when a successful CAS at D19 has fixed the return
value val the dequeue will return and store in vrvHp[vretRefs(tid)]. One of the critical features
of the algorithm is that as soon as rvowns(rvref) = Claimed(tid,val), a helping
thread can overwrite vrvHp[rvref] with val. This is possible even if the thread tid has finished
the dequeue already and started a new one. In this case rvref will be no longer equal to retRefs(tid),
and the helping thread will not really help, but overwrite an obsolete old reference.
The declaration of the programs generates an IO Automaton with the following components (the generated
specifications are all shown below the main specification).
- The state of the automaton is of type "DMSQauxState". It consists of
- a global state "gs" of type DMSQauxGState, which is a tuple of all state variables,
- a program counter function "pcf". pcf(tid) returns for every thread tid the program line (of type DMSQauxPC).
the thread will execute next. A thread which is outside of the two programs is at either pc = S, pc = C or pc = I.
These three state correspond to "idle", "ready" and "crashed" in [FM19].
ENQ and DEQ can be invoked, when pc = I.
- a local state function "lsf". lsf(tid) returns the local state of thread tid (= all local variables and parameters)
as a tuple of values. One component lsf(tid).tid of the local state is always the threadid of the running thread.
The threadid cannot be assigned by construction, so all threads satisfy the invariant "lsf(tid).tid = tid".
- The actions (type DMSQauxAction) consist of invoke and return steps for the program (again invRUN and retRUN will be discared
in the augmented automaton), which are used for the calling/returning step. Invoke steps move the arguments of the action
to local variables, return steps require that the arguments of the action coincide with the computed values in local state.
All internal steps of the programs have action "τ", except nondeterministic steps. In our case there is one
at line E2, which allocates (chooses) a node, ce and success. This step executes action "chooseE1(node, ce, success)"
where the arguments are the chosen values (the third component will always be false, since the choice has "with success = false"). A similar choice to allocate a new return reference is at line D2.
Having a non-empty (internal) action for nondeterministic steps allows to compute a deterministic final state after each step,
depending on the initial state and the action. Conceptually the invoke and return steps are the external actions.
- The step-relation step(A) \subseteq state \times action \times state (predicate DMAQauxstep) is defined by lifting local
steps (predicate lstep) to the full state. Predicate lstep has the global state "gs", together with the localstate "ls" and the "pc"-value
of a single thread as input. For a given action "a" it returns modified values of "gs", "ls" and "pc".
- The lstep-relation is defined via a precondition (predicate pre), and three functions "gstepf", "lstepf" and "pcstepf".
"pre(gs,ls,pc,a)" determines if a step with action "a" is possible for the thread, "gstepf(gs, ls, pc,a)",
"lstepf(gs, ls, pc,a)" and "pcstepf(gs, ls, pc,a)" compute the global state, local state and pc-value after the step.
Usually the precondition requires the action to be empty. Assignments to global resp. variables are reflected in the definitions
of gstepf and lstepf for each pc-value. In these definitions
update functions like ".next:=" correspond to selectors ".next". They are written infix: "ls.next:= r" is local state "ls", where
the next-field has been updated to "r". For an if-then-else in the code "pcstepf" will use a predicate logic if-then-else for
for the new pc-value, written "(test ⊃ thenpc; elsepc)"
Proof obligations for assertions in the programs
The lines of the programs are annotated with assertions. The assertions are combined with the invariants given below the code.
Invariants without a list of pc's are added (by conjunction) to all assertions. They are global invariants that are proven to hold in every state
reachable by the automaton (even when no thread is currently executing one of the programs), so they depend on global state only. They are
collected in the definition of predicate "DMSQauxGInv". Assertions with a range of pc's are added to the assertion
of each pc within the range. Note that assertions are checked to mention only those local variables, which are defined at the
respective program line. The assertions are collected to define a (rather large) predicate "DMSQauxLinv(gs, ls, pc)".
Its definition is a big conjunction of formulas of the form "pc = <pcvalue> -> <assertion at pcvalue>". The generated proof obligations
guarantee that "DMSQauxLinv(gs, lsf(tid), pcf(tid))" holds for all threads in every state of the automaton (together the
full invariant "DMSQauxInv(gs, lsf, pcf)" is implied to hold in every reachable state.
Below the program and the invariants a rely condition is specified. From this formula the definition
for predicate "DMSQauxRely(gs, tid, gs0)" is generated. Using a rely condition avoids the quadratic complexity inherent
in Owicki-Gries style proofs.
The following proof obligations are generated in the theorembase of specification DMSQaux
for every step of the program, to ensure that DMSQauxInv holds:
- "step-<pcval>-<pcval0>": for every step of the thread tid from pcvalue "pcval" to "pcval0":
From the global invariant and the local invariant of tid (= the assertion that holds at "pcval") prove that
after a step (i.e. under "pre" and result computed with "gstepf", "lstepf" and "pcf")
the global invariant and the local invariant (= the assertion at "pcval0") hold again.
Together these proof obligations imply
step-DMSQauxLInv (This is generated as an axiom to be used by specifications above the current one.
Note that the axiom is of course not available for proving the proof obligations).
- "rely-<pcval>-<pcval0>": Every step of the program, that modifies global variables is a step that satisfies the rely of other threads.
When tid0 is different from ls.tid, the step satisfies DMSQauxRely(gs, tid0, gstepf(gs, ls, <pcval>)).
Together with a reflexivity-condition for the rely-predicate (which subsumes all steps that do not change global variables)
the proof obligations implies rely-DMSQauxLInv
- "stable-<pcval>: The assertions at each point "pcval" are stable with respect to rely steps. Note that if the assertions
at several pc-values are equal, the proof obligation is generated only for the earliest pc-value.
Together these proof obligations imply stable-DMSQauxLInv.
- The two previous theorems combined imply other-DMSQauxLInv:
steps of other threads preserve the local invariant (i.e. local assertions) of a thread.
- Combining the previous results gives the main theorem DMSQauxInv-thm, that ensure that the full invariant
DMSQauxInv is preserved by all steps.
For the concrete algorithm, the relevant predicates needed to define its invariants and rely condition are defined at the beginning of the
specification. We give an informal description of the main "listp" -predicate first (for a pictorial description see also Fig. 2 in [FM19]),
and then go through the other predicates.
- listp(orl + qrl, ps, vs) is the main predicate of the global invariant. It specifies the possible queue structure in volatile and persistent
memory. It specifies that orl + qrl is a non-empty list of allocated references (at least one dummy cell is always in orl).
The references form a linked list in both volatile (vs) and persistent (ps) memory (each ps[r].nxt is equal to the next element in the list).
They all have owns(r) = None. The cells stored under the references all have the same content in vs and ps, except for two possible exceptions
- A cell may be marked in volatile, but not yet in persistent memory (otherwise their marking is equal).
- The last cell r (axiom listp-one specifies this case), must have ps[r].nxt = null, but vs[r].nxt may point to a cell,
that has been enqueued in volatile store. This cell then has a weak owner, is the same in vs and ps, and its next pointer is null.
.
- equalowns(vs, ps, owns, tid, vs0, ps0, owns0) is used in the rely condition: Allocated references r strongly owned by tid
(owns(r) = Some(tid)) cannot be changed by steps other threads, and their ownership will stay the same. Weakly owned cells
will not change, if they are still weakly owned after the step. Otherwise the ownership has changed to None.
- "rl from r" computes the references in reference list "rl" that come after "r" (including r).
If r is not contained in rl, the result is the empty list. This predicate is used in the rely to ensure that references like head or tail
move forward in the queue only.
- markedp(orl, vs/ps) and unmarkedp(qrl, vs/ps): These predicates are used as part of the global invariant:
they ensures that old cells in orl/ current queue cells in qrl are marked/unmarked, i.e. their threadid vs/ps[r].opttid is Some(tid)/None.
- equalrl(orl + qrl, vs, ps, vs0, ps0) is used in the rely condition:
All cells stored under the references in orl and qrl either change their marking from None to Some(tid), or otherwise leave it unchanged.
The values and the nxt-references are unchanged, except for the nxt-reference of the last reference "rl.last".
If this points to null, a new enqueue may have changed it. Otherwise, in the exceptional case of listp above,
where vs[rl.last].nxt is not null, the reference may not change anymore, and the value stored in the cell vs[rl.last].nxt also must not be changed.
- allocrv(vretRefs, vrvHp) and allocrv(pretRefs, prvHp) assert that return references are either null (their initial value) or allocated and non-null.
- Invariant nonullRetRefs(orl + qrl, vs, vretRefs, pretRefs) asserts that before a reference (from orl or qrl)
is dequeued by marking it with tid the refurn reference v/pretRefs(tid) has been allocated and can no longer be null.
- Invariants rvownerok(vretRefs, vrvHp, rvowns) and rvownerok(pretRefs, prvHp, rvowns) asserts
that v/pretRefs(tid) is always owned by tid.
- Rely predicate rvownsrely(tid, rvowns, rvowns0, vrvHp) asserts that ownership of return values can only be changed from Pending(tid) to Claimed(tid,val).
- rvrely(tid, rvowns, vretRefs, vrvHp, vretRefs0, vrvHp0) ensures that other threads than tid cannot change the value
stored at vrvHp[vretRefs(tid)] when rvref is owned by tid, except they can help
and set it to Val(val) when rvowns(vretRefs(tid)) = Claimed(tid, val).
- prvHpRely(vrvHp, prvHp, prvHp0) ensures that prvHp[rvref] can only change by flushing vrvHp[rvref].
- noid(noid(tid, orl + qrl, r, vs) is used in assertions troughout the code. It ensures that
no reference after r can be marked with tid. In most places r = vhead, execpt when tid just
dequeued a value (then r = next). This assertion is critical to ensure that a thread tid helping
another dequeue does not load its own threadid as othertid at D25a. A corresponding rely noidRely(tid, orl + qrl, vs, orl0 + qrl0, vs0)
asserts that no other thread will mark a queue cell with tid.
- nextnottid(tid, vhead, vs) makes sure that the next reference after vhead is not marked with tid.
This assertion and the corresponding nextnottidRely(tid, vhead, vs, vhead0, vs0) ensures that
vhead never moves ahead violating the property
- In addition to the predicates given above, the rely condition (which uses the state variables for values before the step, and
variables with "0" at the end, to describe values after the step) specifies that orl + qrl (where "+" appends lists) move forward
by rely steps: the new orl0 + qrl0 is equal to orl + qrl + rl with some new elements "rl" that have been enqueued.
Similarly orl = rl0 + orl, where "rl0" contains the values that have been dequeued.
Augmenting the automaton with global steps
The automaton DMSQaux defined above has program steps only. It does not have crashes or flushing steps done by the operating system.
The actions of DMSQaux also do not match the actions of the abstract specification. We therefore
define a new automaton DMSQ with a step relation, that augments the one of DMSQaux with global steps.
To enable an elegant refinement proof, where steps of the concrete automaton match abstract actions 1:1,
so an elegant refinement proof becomes possible, we also change the actions to match the actions of the abstract specification.
Note that the modifications affect internal actions only (which do not affect, whether refinement holds),
the external invoke- and return-actions for ENQ and DEQ remain untouched.
To describe DMSQ we first give a short description of the abstract specification DQ, which is also described as
automaton IDQ in Fig. 3 of [FM19]. The specification defines a state with three components
- The abstract queue "q" (a list of elements)
- A local state function from tid's to local states of type DQlocalstate.
For convenience, this function merges pc(t), in(t) and out(t). As an example a state invEnq(val) is equivalent
to [FM19] pc(t) = invEnq, in(t) = val, state retDeq(optval) is equivalent to pc(t) = retDeq, out(t) = optval.
- A function "obsEmptyf(tid)" that corresponds to "obsEmp(t)" in [FM19].
Its actions as specified in DQaction are as in [FM19], except for notation: "res_t(Deq,v)" is written retDeq(tid, Val(v))"
for a nonempty value v here. The step relation "DQstep" defined in DQ consists of a global part for stuttering steps and crashes,
and a local part for steps done by a single thread. Since the local step relation is very simple for each action,
we directly specify a relation "DQlstep" without splitting into a precondition and an effect as in the paper.
The actions of DQ are also used in DMSQ. To this purpose we map the original actions "ca" of DMSQaux with
"ca.toQ(gs,ps,ls)" to actions of DQ. invoke and return actions for ENQ and DEQ are mapped by identity.
Actions "invRun" and "retRun" (and their corresponding steps) are deleted and replaced with the run(tid) and crash steps.
For all the internal steps function the axiom for ".toQ" has to
carefully specify the action needed at persistence points:
- The steps at E11 is a persistence point for the thread, when the persistent next pointer "ps[ls.lst].nxt" of the cell pointed to by last ("lst" here) is still
null: the volatile non-null pointer then has not been already flushed by another thread.
- Again, "vs[ls.lst].nxt" is not yet flushed, then E14 is a persistence point for its (weak) owner, that is running ENQ.
- When loading next = null at D8, then this is the point where the queue can be successfully checked to be empty.
- At D12, the decistion to return Empty from Deq becomes persistent.
- At D15, flushing vs[lst].nxt is a persistence point for an enqueue of the thread that (weakly) owns it (when the pointer has not yet been flushed).
- At D21, flushing the threadid of the cell is a persistence point, when no other thread has done the flush already.
- At D27, flushing the threadid of the cell is a persistence point of the thread that has marked the cell, when there was no flush previously.
- All other steps implement the τ action.
The steps of DMSQ are
- The global flushing steps. These have an empty action attached when flushing the values of cells,
when flushing vretRefs(tid) to pretRefs(tid) and when flushing vrvHp(rvref) to prvHp(rvref)
Flushing a next-pointer is a persistence point for an enqueue, when it is the
next pointer at the end of a queue.
Similarly, flushing a mark is a persistence point
for a dequeue, if it flushes the threadid at the head of the queue.
- A run step, that modifies the pc of some thread tid from S to I (corresponding to the abstract "idle" and "ready").
- The crash+recovery step.
This is specified as resetting volatile memory to the persistent one, setting the pc of all
threads, which are not in idle state to crashed, resetting ownership to None, deleting local state
(except that threads still know their tid), and to execute the recovery program.
Running the recovery program is specified as formula a of Dynamic Logic: "⟨p⟩ phi" means: "p terminates, and phi holds afterwards".
This program recovers the volatile head by setting it to the persistent one, and then by moving it forward
until an unmarked cell or the end of the list is found. Similarly the tail is recomputed, by first setting it to the just computed vhead,
and then traversing the queue references, until the end of the list is found.
- All program steps of DMSQaux (except the ones for invRUN and retRUN) with the action lifted by ".toQ" as described above.
The initialisation predicate DMSQinit is the same as the one of DMSQaux
except that all threads with pcf(tid) = S (aka "idle").
The main theorem proven is that like DMSQaux, DMSQ also satisfies the invariant DMSQauxINV.
Refinement proof
The refinement proof uses the forward simulation (the general definition allows one concrete step to match several abstract ones, as
long as the external actions are the same; we do not need this generality here, since we have ensured internal actions match 1:1)
defined in specification abs.
The definition of predicate abs specifies four properties:
- The sequence of values stored under the references of auxiliary variables qrl in persistent store (= content(qrl,ps)) is equal to the abstract queue
- When the next pointer of the last queue cell is already non-null in volatile memory, then its (weak) owner is at E10 after a sucessful CAS (success= true),
or at E11 before its flush.
- When qrl is nonempty, and its head is volatilely marked, then the thread that has marked it is at D20 after a successful CAS, or at D21 before the flush.
- All threads have a local state that abstract from the concrete state (specified with predicate absst.
When the pc-value of a thread is S, I, C the abstract pc is idle, ready, crashed. If the pc is before its persistence the persistence point of ENQ then
abstract thread is in local state invEnq(tid, value). If the pc-value is after the persistence point, then the abstract state in retEnq.
Note that there is the critical region at E10, E11 again (as above), where before/after the persistence point depends on whether
the volatile reference has been flushed. pc-values while executing deq instructions are similar.
The simulation proof in DMSQ-refines-DQ consists of proving the (trivial)
initialization condition and the (complex) proof of a
commuting diagram for each step.
This proof is broken down into several subproofs:
Back to the overview of all KIV projects.
For general informations on KIV and a download site for the whole system refer to this page.
Bibliography
[FM19] |
J. Derrick, S. Doherty, B. Dongol, G. Schellhorn, H. Wehrheim.
Verifying Correctness of Persistent Concurrent Data Structures
(FM 2019, Springer LNCS) |
[PPoPP18] |
M. Friedman, M. Herlihy, V. J. Marathe, E. Petrank.
A persistent lock-free queue for non-volatile memory.
(PPoPP 2018, pp. 28--40, ACM) |
[LV95] |
N. Lynch, F. Vaandrager
Forward and Backward Simulations -- Part I: Untimed systems
(Information and Computation 1995, vol. 121(2), p. 214--233) |