Well-founded recursion in Rocq
Gijs Pennings, May 2025
Rocq requires that all functions terminate. That is why, by default, only structurally recursive functions are allowed: functions that recur on structural subterms of the decreasing argument. However, not all functions are structurally recursive. A common solution is well-founded recursion, where arguments decrease with respect to a well-founded relation. Several plugins support well-founded recursion, and in this tutorial we compare different methods through concrete examples. This extends a paper by Leroy (2024).
Proving properties of functions defined using well-founded recursion requires well-founded induction. Hence, this technique is introduced in the first section. In the next section, we perform well-founded recursion using various methods on a simple example: Euclid's algorithm. Then, in the following section, we use our preferred method to implement and verify quicksort. In the last section, we consider an algorithm on trees that requires a more specialized well-founded relation.
This tutorial was built using Alectryon. To step through the proofs, use Ctrl and the arrow keys. For this purpose, the floating style (selectable from the top banner) offers the best readability. The source file is also available.
Induction
Before we delve into recursive functions, we will first introduce some advanced forms of induction. We illustrate this with a toy problem: the sum of the first \(k\) odd numbers is equal to \(k^2\). We compute this sum as follows.
Fixpoint sum_odd (k : nat) : nat :=
match k with
| 0 => 0
| S k' => (2*k' + 1) + sum_odd k'
end.
Refresher: (weak) induction
In Rocq, we carry out induction using the eponymous induction tactic. It behaves like destruct, which performs a case analysis by generating a subgoal for each constructor of the inductive type. In addition, induction also introduces induction hypotheses. Specifically, it applies an induction principle, by default the one that was automatically generated when the inductive type was declared. Recall that for nat, the default induction principle is:
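Check nat_ind.
(* nat_ind : forall P : nat -> Prop,
     P 0 ->
     (forall n : nat, P n -> P (S n)) ->
     forall n : nat, P n *)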
In plain language, it states that if some predicate \(P\) holds for \(0\), and if it holds for \(n+1\) given that it holds for \(n\), then \(P\) holds for all natural numbers.
Let's apply induction to a simple example.
Theorem sum_odd_eq_square : forall k : nat, sum_odd k = k * k.
Proof.
  induction k.
  - trivial.
  - simpl. rewrite IHk. ring.
Qed.
The two bullets correspond to the two premises of the induction principle: the base case and step case, respectively. At rewrite IHk, we use the induction hypothesis.
Instead of induction k, we could also write induction k using nat_ind, specifying the induction principle explicitly. Actually, we can even apply it manually:
Theorem sum_odd_eq_square' : forall k : nat, sum_odd k = k * k.
Proof.
  apply nat_ind.
  - trivial.
  - intros n H. simpl. rewrite H. ring.
Qed.
Strong induction
We can use other induction principles as well. One option is strong induction. Compared to 'weak' induction as used above, it employs a stronger hypothesis: we prove \(P\ n\) under the assumption that \(P\ m\) holds for all \(m < n\). By contrast, weak induction essentially only assumes \(P\ (n-1)\).
Strong induction on natural numbers is formalized by the following theorem.
Theorem strong_ind : forall P : nat -> Prop,
  (forall n : nat, (forall m : nat, m < n -> P m) -> P n) ->
  forall n : nat, P n.
Proof.
  intros P H n.
  (* We generalize the goal to strengthen the hypothesis! *)
  assert (H0 : forall a b : nat, b <= a -> P b).
  { intros a. induction a as [| a IHa]; intros b H0.
    - apply H. intros m H1. lia. (* contradiction *)
    - apply H. intros m H1. apply IHa. lia. }
  apply H0 with (a := n). reflexivity.
Qed.
Note that we can prove the strong induction principle using weak induction! (It is also possible and easy to prove the converse.) This means that strong induction is exactly as powerful as weak induction, even if their names suggest otherwise; 'strong' only refers to the induction hypothesis.
We can again prove sum_odd_eq_square, now using our strong induction principle. It is not necessary to use strong induction here, but it serves as an example of how to use an alternative induction principle.
Theorem sum_odd_eq_square'' : forall k : nat, sum_odd k = k * k.
Proof.
  induction k using strong_ind.
  destruct k.
  - trivial.
  - simpl. rewrite H; lia.
Qed.
Well-founded induction
Fundamentally, strong induction works because the natural numbers are well-ordered, meaning that every set of natural numbers has a least element. Hence, every 'chain of reasoning' reaches a base case, typically 0. By contrast, the integers (with the usual ordering) are not well-ordered: \(\mathbb Z\) has no least element, so there is no starting point for reasoning. In fact, there is an infinite descending chain, namely -1, -2, -3, etc.
The idea of strong induction can be generalized to any type equipped with a well-founded relation, i.e., a relation with no infinite descending chains. Equivalently, every element must be accessible: there is a finite path of smaller elements leading to a least element. This form of induction is known as well-founded induction.
We define the notions of accessibility and well-foundedness in line with the standard library:
Section WfInd.
  Variable A : Type.
  Variable R : A -> A -> Prop.

  Inductive Acc (x : A) : Prop :=
    Acc_intro : (forall y, R y x -> Acc y) -> Acc x.

  Definition well_founded : Prop := forall a, Acc a.
Now, we can state and prove the well-founded induction principle.
  Theorem well_founded_ind :
    well_founded ->
    forall P : A -> Prop,
      (forall x : A, (forall y : A, R y x -> P y) -> P x) ->
      forall a : A, P a.
  Proof.
    intros Hwf P H a.
    apply Acc_ind.
    - intros x H0 H1. apply H. assumption.
    - apply Hwf.
  Qed.
Specifically, the above theorem yields the induction principle if we provide a proof of the well-foundedness of \(R\). Instead of requiring that the whole relation is well-founded, we could also restrict our induction to the 'accessible part' of \(R\).
  Theorem well_founded_ind_acc :
    forall P : A -> Prop,
      (forall x : A, (forall y : A, R y x -> P y) -> P x) ->
      forall a : A, Acc a -> P a.
  Proof.
    intros P H a Ha.
    induction Ha.
    apply H. assumption.
  Qed.
End WfInd.
Unfortunately, the above theorem cannot be directly used with the induction tactic. Therefore, we will always use well_founded_ind, which requires a proof of well-foundedness.
Let's prove the well-foundedness of the most fundamental relation: the standard ordering of natural numbers.
Theorem lt_wf : well_founded lt.
Proof.
  intros a. induction a using strong_ind.
  constructor. assumption.
Qed.

Definition lt_wf_ind := well_founded_ind lt_wf.
We now have all the tools we need to apply well-founded induction on our toy problem.
Also note that lt_wf_ind, which is the well-founded induction principle specialized to natural numbers, is precisely equal to the strong induction principle as defined earlier.
Theorem sum_odd_eq_square''' : forall k : nat, sum_odd k = k * k.
Proof.
  induction k using lt_wf_ind.
  destruct k.
  - trivial.
  - simpl. rewrite H; lia.
Qed.
To conclude this section, we present a generalization of lt_wf: for any type A, given a function f that maps elements of A to natural numbers, the relation that compares the images under f is well-founded. As we will see later, this result is frequently useful in practice.
Section WfNat.
  Variable A : Type.
  Variable f : A -> nat.

  Definition ltof (a b : A) := f a < f b.

  Theorem well_founded_ltof : well_founded ltof.
  Proof.
    assert (H : forall (n : nat) (a : A), f a < n -> Acc ltof a).
    { intros n. induction n as [| n IHn]; intros a H.
      - lia. (* contradiction *)
      - constructor. intros y H0.
        unfold ltof in H0. apply IHn. lia. }
    intros a. apply H with (n := S (f a)). auto.
  Qed.
End WfNat.
All the above results are available in the standard library (Init.Wf and Arith.Wf_nat). In the remainder of this tutorial, we will use the declarations from the standard library.
Well-founded recursion
In Rocq, functions on inductive types are generally defined using Fixpoint. Such definitions must terminate. To guarantee this, the definition must satisfy (syntactic) constraints on one of the arguments, called the decreasing argument. Namely, recursive calls must operate on structural subterms of this argument. For example, a function on natural numbers may recur on the predecessor, and a function on lists may recur on the tail.
However, not all functions have this shape. A simple counterexample is Euclid's algorithm for computing the greatest common divisor. The following definition is not allowed in Rocq, since it is not structurally recursive. [†]
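Fail Fixpoint gcd (a b : nat) : nat :=
  match b with
  | 0 => a
  | _ => gcd b (a mod b)  (* the recursive call is not on a subterm of b *)
  end.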
This function does terminate, of course, since the second argument strictly decreases with each recursive call. Formally, < is a well-founded relation on nat (as proven in Well-founded induction), so any chain of recursive calls is finite. Because it relies on well-foundedness, we refer to this method of defining recursive functions as well-founded recursion.
Unfortunately, Rocq cannot automatically infer the above reasoning, so we must explicitly specify and prove why the function terminates. In this section we will compare different methods to perform well-founded recursion, using Euclid's algorithm as a running example. This is based on a paper by Leroy (2024), which uses the same example. Specifically, we will consider Function, Equations, Program Fixpoint, and lastly the method proposed by Leroy, which we refer to as Acc-recursion.
We will evaluate each method based on a number of criteria, similarly to Leroy:
1. Legibility of definitions.
2. Legibility of extracted code.
3. Ease of proving properties.
4. No axioms used.
† Actually, if we define the second case as S b' => gcd (S b') (a mod (S b')), the definition is accepted as structurally recursive! This is because a mod (S b') simplifies to a subtraction, which 'preserves being a subterm' according to the standard library. Hence, it is not necessary to use well-founded recursion to implement Euclid's algorithm, but it functions as an uncomplicated example for our purposes.
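For reference, this is the variant that is accepted (a sketch; the name gcd_struct is ours):

Fixpoint gcd_struct (a b : nat) : nat :=
  match b with
  | 0 => a
  | S b' => gcd_struct (S b') (a mod (S b'))
  end.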
Preparation
Before implementing and verifying Euclid's algorithm, we need a formal definition of the greatest common divisor.
We will also prove some properties. It is best to derive them from the definition, rather than from the various implementations. This enables reuse and avoids dependence on specialized concepts like Acc.
The greatest common divisor (GCD) of two integers a, b is the largest integer x that divides both a and b. Equivalently, x is a common divisor (CD) of a and b, and every common divisor of a and b also divides x. We use Nat.divide from the standard library, which already provides a range of helpful lemmas.
Definition is_cd (a b x : nat) : Prop :=
  Nat.divide x a /\ Nat.divide x b.

Definition is_gcd (a b x : nat) : Prop :=
  is_cd a b x /\ forall y, is_cd a b y -> Nat.divide y x.
First, let's prove some corner cases. These correspond to the base case of the algorithm.
Lemma is_cd_0 : forall x : nat, is_cd x 0 x.
Proof.
  intros x. split.
  - reflexivity.
  - apply Nat.divide_0_r.
Qed.

Lemma is_gcd_0 : forall x : nat, is_gcd x 0 x.
Proof.
  firstorder using is_cd_0.
Qed.
Next, we essentially prove the step case of Euclid's algorithm: gcd(a, b) is equal to gcd(b, a mod b). To do this, it is useful to prove some identities involving division and modulo.
Lemma divide_mod : forall a b x : nat,
  Nat.divide x a -> Nat.divide x b -> Nat.divide x (a mod b).
Proof.
  intros a b x [A HA] [B HB].
  exists (A - B * (a / b)).
  pose proof (Nat.div_mod_eq a b) as H.
  rewrite Nat.mul_sub_distr_r, Nat.mul_shuffle0, <- HB.
  lia.
Qed.

Lemma divide_mod_rev : forall a b x : nat,
  Nat.divide x (a mod b) -> Nat.divide x b -> Nat.divide x a.
Proof.
  intros a b x [A HA] [B HB].
  exists (A + B * (a / b)).
  pose proof (Nat.div_mod_eq a b) as H.
  rewrite Nat.mul_add_distr_r, Nat.mul_shuffle0, <- HB.
  lia.
Qed.

Lemma is_cd_mod : forall a b x : nat,
  is_cd a b x -> is_cd b (a mod b) x.
Proof.
  unfold is_cd. intros a b x [H1 H2]. split.
  - assumption.
  - apply divide_mod; assumption.
Qed.

Lemma is_cd_mod_rev : forall a b x : nat,
  is_cd b (a mod b) x -> is_cd a b x.
Proof.
  unfold is_cd. intros a b x [H1 H2]. split.
  - apply divide_mod_rev; assumption.
  - assumption.
Qed.

Lemma is_gcd_mod : forall a b x : nat,
  is_gcd b (a mod b) x -> is_gcd a b x.
Proof.
  (* exercise *)
Admitted.
Function
Now, we are ready to implement gcd. Our first method of choice is the functional induction plugin. Its main command, Function, which behaves like Fixpoint, allows us to perform well-founded recursion using the {wf R x} annotation. Here, x should be the decreasing argument, and R a well-founded ordering relation on the type of x, such that x decreases with every recursive call. In the case of gcd, the decreasing argument is b, and the relation is simply lt.
Require Import FunInd.
Require Import Recdef.

Function gcd_1 (a b : nat) {wf lt b} : nat :=
  match b with
  | 0 => a
  | _ => gcd_1 b (a mod b)
  end.
(* It remains to prove some obligations... *)
Proof.
  - auto using Nat.mod_upper_bound.
  - apply lt_wf.
Defined.
Unlike Fixpoint, after the definition, we are left with some proof obligations. We need to prove that:

1. b is decreasing in each recursive call;
2. the relation lt is well-founded.
In this case, instead of {wf lt b}, we could have used the annotation {measure id b}. This uses the idea of well_founded_ltof as discussed earlier: it asserts that id b decreases with each recursive call. For our purposes, id suffices, though any mapping from the type of b to nat is allowed. With measure, only the first obligation remains, since it is already known that lt is well-founded.

Note that we finish the obligations with Defined instead of Qed. This makes the proof term transparent, enabling computation. Try it yourself: with Qed instead of Defined, the following would not output 6.
Next, we check the extracted code (OCaml), which looks clean.
Lastly, we will prove the correctness of gcd_1. Importantly, if we were to use induction b, we would get stuck. Instead, we should use the custom tactic functional induction, which introduces a more relevant induction hypothesis.
Theorem gcd_1_ok : forall a b : nat, is_gcd a b (gcd_1 a b).
Proof.
  intros a b. functional induction (gcd_1 a b).
  - apply is_gcd_0.
  - apply is_gcd_mod. assumption.
Qed.
After our preparation in the previous section, the proof is simple.
Since is_gcd characterizes the GCD, any further properties we want to prove can start from is_gcd instead of the implementation. (For example, as an exercise, try to prove that gcd_1 is associative without functional induction.) This means that criterion 3 is satisfied. Looking at the definition and extracted code, criteria 1 and 2 are also met. Lastly, since we used no axioms, criterion 4 is also fulfilled.
There is one problem with Function: it is considered 'legacy functionality'. Users are recommended to use the Equations plugin instead. That is the method we will look at next.
Equations
The Equations plugin by Sozeau and Mangin (2019) boasts many features, including a different syntax for function definitions, but we will focus on well-founded recursion.
As with Function, an Equations definition supports a wf x R annotation to enable well-founded recursion. (Note that the arguments are swapped!)
From Equations Require Import Equations.

Equations gcd_2 (a b : nat) : nat by wf b lt :=
  gcd_2 x 0 := x;
  gcd_2 x y := gcd_2 y (x mod y).
Next Obligation.
  lia.
Qed.
Again, a proof obligation was generated (corresponding to the first obligation of gcd_1). The goal may look strange at first, but it is merely the (simplified) definition of Nat.modulo. In contrast to gcd_1, even if the obligation is opaque, computation is possible:
It has been smooth sailing so far, but the extracted code is unfortunately not very readable. The issue seems to be that the function is internally uncurried, and the destructuring carries over into the extracted code.
It remains to prove the correctness.
Similarly to the functional induction plugin, Equations offers a custom funelim tactic. It performs a case analysis according to the function's definition, abstracting away the underlying well-founded induction. This allows for succinct proofs.
Theorem gcd_2_ok : forall a b : nat, is_gcd a b (gcd_2 a b).
Proof.
  intros a b. funelim (gcd_2 a b).
  - apply is_gcd_0.
  - apply is_gcd_mod. assumption.
Qed.
In summary, apart from the legibility of the extracted code (#2), the Equations plugin fulfills all criteria.
Program Fixpoint
Fixpoint can be prefixed with Program, enabling 'program mode'. Then, as with Function, it supports well-founded recursion through {wf R x} annotations.
Require Import Program.

Program Fixpoint gcd_3 (a b : nat) {wf lt b} : nat :=
  match b with
  | 0 => a
  | _ => gcd_3 b (a mod b)
  end.
Next Obligation.
  auto using Nat.mod_upper_bound.
Qed.
The extracted code (omitted) is cluttered, since proof obligations and dependent wrappers bleed into it, but there is a more pressing issue. Whereas Equations automatically provides reduction lemmas (such as gcd_2_equation_1) and an elimination principle (gcd_2_elim), Program Fixpoint is much more bare-bones. To verify gcd_3, we need analogous reduction lemmas, but proving them is not easy. Since gcd_3 is defined using the Fix combinator, simpl is unable to unfold gcd_3 a 0. Apart from a Stack Overflow post, little documentation is available. The key is the fix_sub_eq lemma. It is important to note that it carries a (hidden) functional extensionality hypothesis, i.e., it is axiomatic!
Lemma gcd_3_equation_1 : forall a : nat, gcd_3 a 0 = a.
Proof.
  intros a. unfold gcd_3, gcd_3_func.
  rewrite fix_sub_eq; simpl.
  (* The intermediate goals, full of sigma types and projections, are too
     large to reproduce here. *)
  - reflexivity.
  - intros [x y] f g H. destruct y; simpl.
    + reflexivity.
    + apply H.
Qed.

Lemma gcd_3_equation_2 : forall a b : nat,
  gcd_3 a (S b) = gcd_3 (S b) (a mod S b).
Proof.
  (* identical to `gcd_3_equation_1`; omitted *)
Admitted.
Using these reduction lemmas, we can prove the correctness of gcd_3. Even now, the proof is less elegant than with Equations.
Theorem gcd_3_ok : forall b a : nat, is_gcd a b (gcd_3 a b).
Proof.
  (* Crucially, the IH is general over `a`. *)
  induction b using lt_wf_ind; intros a.
  destruct b.
  - rewrite gcd_3_equation_1. apply is_gcd_0.
  - rewrite gcd_3_equation_2. apply is_gcd_mod.
    apply H. auto using Nat.mod_upper_bound.
Qed.
In conclusion, while the definition is legible, Program Fixpoint fails to meet all other criteria. It is cumbersome to work with and hence not recommended.
Acc-recursion
Finally, we study the method proposed by Leroy, which we call Acc-recursion. They propose to 'go back to the basics' and perform structural recursion on the accessibility of the decreasing argument. To understand what this means, let's look at a concrete example:
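Here is the shape of that definition (a sketch; the hole _ still needs to be filled, so Rocq does not accept it yet):

Fixpoint gcd_rec (a b : nat) (ACC : Acc lt b) {struct ACC} : nat :=
  match b with
  | 0 => a
  | _ => gcd_rec b (a mod b) _  (* hole: a proof of Acc lt (a mod b) *)
  end.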
Compared to the naive definition, we added an auxiliary argument: the accessibility proof ACC of b. With the annotation {struct ACC}, we indicate that we want to perform structural recursion on ACC. In the recursive call, it remains to fill the hole _ with a (transparent) proof of accessibility of a mod b. This proof must be a structural subterm of ACC.
Leroy suggests using Acc_inv for this. This is the inversion principle for Acc, provided by the standard library. Since Acc has a single constructor, Acc lt b only holds if Acc lt x for all x smaller than b. This means we can extract an accessibility proof of a mod b from Acc lt b.
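Its type, with A, R, and x implicit, is roughly:

(* Acc_inv : Acc R x -> forall y : A, R y x -> Acc R y *)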
Now, the hole requires a proof of a mod b < b (and it can be opaque). You can provide an explicit proof term, but it is easier to generate an obligation using Program:
Program Fixpoint gcd_rec (a b : nat) (ACC : Acc lt b) {struct ACC} : nat :=
  match b with
  | 0 => a
  | _ => gcd_rec b (a mod b) (Acc_inv ACC _)
  end.
Next Obligation.
  auto using Nat.mod_upper_bound.
Qed.
Note that we use Program solely to generate obligations for the holes! This is entirely different from the previous section, where we used it to perform well-founded recursion. (Here, we simply perform structural recursion.) If we extract this definition, the resulting code is flawless, since ACC lives in Prop and is therefore discarded during extraction.
Next, we want to prove correctness. You might be inclined to perform induction on ACC, but Leroy warns this is not effective. Instead, we perform well-founded induction on b, the decreasing argument. Then, in order to unfold gcd_rec, we destruct ACC.
Theorem gcd_rec_ok : forall (b a : nat) (ACC : Acc lt b),
  is_gcd a b (gcd_rec a b ACC).
Proof.
  induction b using lt_wf_ind; intros a ACC.
  destruct ACC as [a0]. simpl. destruct b.
  - apply is_gcd_0.
  - apply is_gcd_mod. apply H.
    auto using Nat.mod_upper_bound.
Qed.
The order in which we quantify over the variables is important: the induction hypothesis should be general over both a and ACC. For the same reason, we should not intros before induction. Even for a simple function like Euclid's algorithm, the accessibility argument in the goal becomes distractingly verbose. A useful trick: in VsRocq for VS Code, you can Alt+Click to hide this argument in the goal. It is safe to ignore, since the induction hypothesis abstracts over ACC. Other editors might have similar functionality.
Optionally, we can define a neat 'wrapper' that provides the initial accessibility proof.
Definition gcd_4 (a b : nat) : nat := gcd_rec a b (lt_wf b).

Theorem gcd_4_ok : forall a b : nat, is_gcd a b (gcd_4 a b).
Proof.
  intros a b. apply gcd_rec_ok.
Qed.
To conclude, let's assess the criteria. While the extracted code is pristine (#2), our definition is less readable due to the auxiliary accessibility arguments (#1). Proving correctness is somewhat more involved than, for example, using Equations, but nevertheless every step is easy to follow (#3). Lastly, no axioms were required (#4).
Discussion
In the table below, we summarize our experience with the four methods discussed in the preceding sections. Here ✅, 🟡, and ❌ denote 'good', 'adequate', and 'bad', respectively. This mostly agrees with Leroy, except that Acc-recursion is graded 🟡 on the first criterion.
| Criteria | Function | Equations | Program Fixpoint | Acc-recursion |
|---|---|---|---|---|
| Legibility of definitions | ✅ | ✅ | ✅ | 🟡 |
| Legibility of extracted code | ✅ | ❌ | ❌ | ✅ |
| Ease of proving properties | ✅ | ✅ | ❌ | ✅ |
| No axioms used | ✅ | ✅ | ❌ | ✅ |
Leroy mentions two more approaches to perform well-founded recursion: tactics in proof mode and the Fix combinator. We did not study these since they are clearly inferior to the other four methods. Both approaches are lower-level than, for example, Equations, and hence generally harder to work with. Specifically, both methods suffer from poorly legible definitions, and Fix, like Program Fixpoint, requires axioms.
In addition to the four criteria in the table, we could consider a fifth: understandability. Function and Equations are powerful plugins with a user-friendly interface, but it can be difficult to grasp what happens inside the 'black box'. On the other hand, Acc-recursion relies on more basic principles, making it easy to understand. The latter is especially valuable in educational contexts.
Our recommendations are as follows. If code extraction is not a requirement, use Equations. It functions similarly to Function, but is more modern. Otherwise, Acc-recursion offers more transparency and control.
In the remainder of this tutorial, we will apply Acc-recursion to two other problems: quicksort (on lists) and a rebalancing algorithm (on trees). These will serve as examples of how the method performs on more complicated inductive data structures.
Quicksort
We have seen how Acc-recursion can be used to implement Euclid's algorithm. Now, we want to evaluate how well this method performs on more complex algorithms and data structures. To this end, we will study a widely used algorithm on lists: quicksort. For simplicity, we restrict ourselves to lists on natural numbers. Naively, we might implement quicksort as follows.
Definition geb (x y : nat) := Nat.leb y x.
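Fail Fixpoint quicksort (l : list nat) : list nat :=
  match l with
  | [] => []
  | x :: xs =>
      let L := filter (geb x) xs in
      let R := filter (Nat.ltb x) xs in
      quicksort L ++ x :: quicksort R
  end.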
However, L and R are not structural subterms of l, so this definition is rejected. Nevertheless, quicksort terminates, since L and R are shorter than l: they never contain the pivot x. We will formalize this using well-founded recursion.
In contrast to Euclid's algorithm, which we implemented using Acc-recursion without much groundwork, we will now systematically derive a definition of quicksort. To this end, the next section presents a practical, general guide on using Acc-recursion.
Guide to Acc-recursion
We have applied Acc-recursion to one concrete problem so far, but it is helpful to distill the method into a design pattern applicable to arbitrary algorithms. That is the goal of this section.
For gcd, we started with the naive definition and gradually added accessibility arguments. This is a practical approach in general. However, in hindsight, we can improve how we manage the obligations. You may have noticed that we often wrote auto using Nat.mod_upper_bound, not only to prove the obligations, but also after applying the induction hypothesis. This is no coincidence. Since these obligations are typically simple, we suggest proving them as separate lemmas and adding them to a hint database. Then, after applying the induction hypothesis, proofs can simply be finished with auto. Furthermore, obligations will be automatically solved by the default obligation-solving tactic.
Now, combining our approach for gcd and the above insight, we suggest the following steps to define a well-founded recursive function:

1. Write the naive Fixpoint definition.
2. Determine the decreasing argument, say x, and the relation, say R, such that x decreases according to R with each recursive call.
3. Add an auxiliary argument ACC : Acc R x to the function signature, and add an argument Acc_inv ACC _ to each recursive call. Also, change Fixpoint into Program Fixpoint.
4. Now, obligations should be generated. For each goal, create a new lemma. Prove these lemmas and add them to a hint database. It is easiest if you add them to core. Otherwise, you will have to customize the obligation-solving tactic.
5. Then, all obligations should be automatically solved. Check that your function definition is accepted.
In the end, we have a valid definition that is structurally recursive in the ACC argument. You can make this explicit using a {struct ACC} annotation, but this is not necessary, since it can be determined automatically.

Next, to prove properties about our new definition, we will perform well-founded induction on the decreasing argument x. This requires the well-founded induction principle for R, which in turn requires a proof that R is well-founded. If R is a relation that compares elements via some mapping f : A -> nat, a proof of well-foundedness is provided by the theorem well_founded_ltof. Otherwise, you need to prove it manually.
To prove a property, proceed as follows:

1. State the lemma as you would for the naive definition.
2. Add the quantifier forall ACC and add the ACC argument where necessary. Make sure ACC is the last variable in the quantifier list.
3. Start the proof with induction x using (well_founded_induction <proof>). That is, perform well-founded induction on the decreasing argument x using the well-foundedness of R.
4. Next, destruct ACC. Now, with simpl, you can unroll your function.
5. Proceed as normal. Since your function likely matches on the decreasing argument, the logical next step is destruct x.
At some point in the proof, you will apply the induction hypothesis. Then, you must show that the argument it is applied to is smaller than the original x under R. If your proof follows the structure of the definition, the proof goal will precisely match an obligation. Since we added these as hints, auto can solve them directly.
In the remainder of this section, we will carry out these steps to implement and prove the correctness of quicksort. Before reading on, it is highly instructive to fix the naive definition of quicksort yourself using this guide, and prove a simple property about it.
Implementation
Now, we will formulate a well-founded recursive definition of quicksort.
Since we already provided a naive definition, we move on to step 2. The list becomes shorter with every recursive call, so the relation should compare lists by length. It can be conveniently defined through ltof (see Well-founded induction).
Definition lt_len := ltof _ (@length nat).
Definition lt_len_wf := well_founded_ltof _ (@length nat).
Definition wf_ind_len := well_founded_induction lt_len_wf.
We also immediately derive the well-foundedness (lt_len_wf) and the induction principle (wf_ind_len) for later use. Note that @ makes all implicit arguments explicit, allowing us to specialize lt_len to lists of natural numbers.
Next, we skip to step 4. You are encouraged to carry out step 3 yourself, and check whether qs_oblig is sensible. The proof obligation generated by Program Fixpoint will actually be stated in terms of lt_len, but we state qs_oblig in its 'unfolded' form. This makes the hint applicable even if the goal is already (partially) unfolded. To ensure the obligations are still solved automatically, we allow lt_len and ltof to be unfolded by auto.
Lemma qs_oblig : forall (f : nat -> bool) (x : nat) (xs : list nat),
  length (filter f xs) < length (x :: xs).
Proof.
  intros f x xs. simpl.
  apply le_n_S, filter_length_le.
Qed.

Hint Unfold lt_len ltof : core.
Hint Resolve qs_oblig : core.
Then, combining steps 3 and 5, we see that all obligations of the definition with auxiliary arguments are solved automatically:
Program Fixpoint qs_rec (l : list nat) (ACC : Acc lt_len l) :=
  match l with
  | [] => []
  | x :: xs =>
      let L := filter (geb x) xs in
      let R := filter (Nat.ltb x) xs in
      qs_rec L (Acc_inv ACC _) ++ x :: qs_rec R (Acc_inv ACC _)
  end.
Now, we are ready to prove that qs_rec is correct. The correctness comprises two aspects: the resulting list is sorted, and it contains the same elements. It is interesting and non-trivial to formalize these notions, but we take a practical approach and use the definitions of sortedness and permutations from the standard library.
Permutation
First, we will show that a quicksorted list is a permutation of the original list. For the naive definition, the lemma would be stated as follows (step 1).
Require Import Permutation.
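It would read along these lines (a sketch only, since the naive quicksort was rejected):

Lemma quicksort_Permutation : forall l : list nat,
  Permutation l (quicksort l).
Abort.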
Of course, qs_rec takes an ACC argument, so we add it to the quantifier list (step 2). Now, try to prove this revised lemma. The setup for well-founded induction is already given. A helper lemma, partition_Permutation, may be useful.
Lemma partition_Permutation : forall (p : nat) (xs : list nat),
  Permutation xs (filter (geb p) xs ++ filter (Nat.ltb p) xs).
Proof.
  intros p xs. induction xs as [| x xs IH].
  - reflexivity.
  - unfold geb. simpl.
    destruct (x <=? p) eqn:H1; destruct (p <? x) eqn:H2; simpl.
    + (* impossible: x <= p and p < x *)
      apply Nat.leb_le in H1. apply Nat.ltb_lt in H2. lia.
    + apply perm_skip, IH.
    + apply Permutation_cons_app, IH.
    + (* impossible: p < x and x <= p *)
      apply Nat.leb_gt in H1. apply Nat.ltb_ge in H2. lia.
Qed.

Theorem qs_Permutation : forall (l : list nat) (ACC : Acc lt_len l),
  Permutation l (qs_rec l ACC).
Proof.
  induction l as [l IH] using wf_ind_len; intros ACC.
  destruct ACC as [a], l as [| n l].
  (* exercise *)
Admitted.
Sorted
Proving that a quicksorted list is indeed sorted is more involved. There are two notions of sortedness in the standard library: locally sorted and strongly sorted. Since <= is a transitive relation, they are actually equivalent, but we will use StronglySorted, since it allows convenient use of, e.g., Forall_app in Sorted_app below.
Require Import Sorted.

Definition Sorted := StronglySorted le.

Lemma Sorted_app : forall xs ys : list nat,
  Sorted xs -> Sorted ys ->
  (forall x y : nat, In x xs -> In y ys -> x <= y) ->
  Sorted (xs ++ ys).
Proof.
  intros xs. induction xs as [| x xs IH]; intros ys Hxs Hys H.
  - assumption.
  - apply StronglySorted_inv in Hxs as [Hxs Hx].
    simpl. constructor.
    + apply IH; firstorder.
    + apply Forall_app. split.
      * easy.
      * apply Forall_forall. intros y Hy.
        apply H; [ apply in_eq | assumption ].
Qed.
Next, with the goal of proving qs_Sorted, it is helpful to show that quicksort preserves membership: any element of the output is also in the input list. This is formalized by qs_In. It follows easily from qs_Permutation, but, as an exercise, you can also prove it directly using well-founded induction on l.
Lemma qs_In : forall (a : nat) (l : list nat) (ACC : Acc lt_len l),
  In a (qs_rec l ACC) -> In a l.
Proof.
  intros a l ACC H.
  apply Permutation_in with (l := qs_rec l ACC).
  - apply Permutation_sym. apply qs_Permutation.
  - assumption.
Qed.

Lemma qs_In_filter : forall (f : nat -> bool) (a : nat) (l : list nat)
    (ACC : Acc lt_len (filter f l)),
  In a (qs_rec (filter f l) ACC) -> f a = true.
Proof.
  intros f a l ACC H.
  apply qs_In in H.
  apply filter_In in H.
  easy.
Qed.

Theorem qs_Sorted : forall (l : list nat) (ACC : Acc lt_len l),
  Sorted (qs_rec l ACC).
Proof.
  induction l as [l IH] using wf_ind_len; intros ACC.
  destruct ACC as [a], l as [| p l]; simpl.
  - constructor.
  - apply Sorted_app.
    + apply IH. auto.
    + constructor.
      * apply IH. auto.
      * apply Forall_forall. intros x H.
        apply qs_In_filter in H. apply Nat.ltb_lt in H. lia.
    + intros x y HL HR.
      apply qs_In_filter in HL. unfold geb in HL. apply Nat.leb_le in HL.
      destruct HR as [Hp | HR].
      * lia.
      * apply qs_In_filter in HR. apply Nat.ltb_lt in HR. lia.
Qed.
Conclusion
We can define a wrapper function without the accessibility argument and gather our results in a single theorem.
Definition quicksort l := qs_rec l (lt_len_wf l).

Theorem quicksort_ok : forall l : list nat,
  let s := quicksort l in Permutation l s /\ Sorted s.
Proof.
  intros l. split.
  - apply qs_Permutation.
  - apply qs_Sorted.
Qed.
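As a quick sanity check (with an illustrative input):

Compute quicksort [3; 1; 2].
(* = [1; 2; 3] : list nat *)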
Let's briefly assess Leroy's criteria on our implementation and verification of quicksort. Apart from the auxiliary accessibility arguments, the definition is highly readable. Since these accessibility arguments disappear during extraction, the extracted code is arguably even easier to read. To prove properties, there is a standard design pattern, outlined in our guide. Specifically, by deriving well-foundedness using well_founded_ltof and adding the obligations to a hint database, we can focus on the main correctness arguments. Lastly, no axioms were used. In summary, Acc-recursion proves effective for implementing and verifying quicksort. In the next section, we test whether it extends to other algorithms.
Rebalancing binary search trees
Binary search trees (BSTs) are most efficient when their height is minimal. While self-balancing BSTs maintain a low height during updates, they are intricate. A simpler alternative is to periodically rebalance the entire tree. A time-optimal in-place algorithm for this was proposed by Stout and Warren (1986), building on earlier work by Day (1976). It is known as the Day–Stout–Warren (DSW) algorithm.
The DSW algorithm consists of two phases. First, the BST is flattened into a vine, where each parent node only has a right child. Then, the vine is transformed into a balanced tree. In this section, we will only examine the first phase, since the second phase is structurally recursive and hence not relevant to this tutorial.
It should be noted that, as a functional programming language, Rocq does not benefit from the algorithm's in-placeness. That is, in the worst case, our implementation requires more than \(O(1)\) additional space. Nevertheless, it is worthwhile to verify the correctness of the algorithm. Moreover, the first phase is particularly relevant for this tutorial, since it uses a specialized relation to prove termination.
In the next section, we will first define binary search trees. Then, we will implement the first phase of the DSW algorithm, appropriately called tree_to_vine. Lastly, we will prove that tree_to_vine is correct.
Binary search trees
A BST is a binary tree that satisfies the following invariant: for any node with key \(v\), all keys in the left subtree are less than \(v\), and all keys in the right subtree are greater than \(v\). First, we define binary trees:
Inductive tree : Type :=
| E
| T (l : tree) (v : nat) (r : tree).
Next, we formalize the invariant.
A definition is given in Software Foundations, but instead of an Inductive, we define it as a predicate. This definition enables more effective use of the powerful firstorder tactic, simplifying our proofs. This is aptly illustrated by the succinct proof of forall_tree_mono, a lemma we will need later.
Fixpoint forall_tree (P : nat -> Prop) (t : tree) : Prop :=
  match t with
  | E => True
  | T l v r => P v /\ forall_tree P l /\ forall_tree P r
  end.

Lemma forall_tree_mono : forall (P Q : nat -> Prop) (t : tree),
  (forall x : nat, P x -> Q x) -> forall_tree P t -> forall_tree Q t.
Proof.
  induction t; firstorder.
Qed.

Fixpoint bst (t : tree) : Prop :=
  match t with
  | E => True
  | T l v r => forall_tree (gt v) l /\ bst l /\
               forall_tree (lt v) r /\ bst r
  end.
For simplicity, as with quicksort, we assume that the key is a natural number (and there is no separate value). Furthermore, gt and lt are strict, so we disallow duplicate keys.
Implementation of tree_to_vine
Now, we implement the first phase of the DSW algorithm, tree_to_vine. The naive definition is given below.
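Fail Fixpoint tree_to_vine (t : tree) : tree :=
  match t with
  | E => E
  | T E v r => T E v (tree_to_vine r)
  | T (T t1 a t2) b t3 => tree_to_vine (T t1 a (T t2 b t3))  (* rotation *)
  end.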
Broadly speaking, we traverse down the right spine, and whenever the left subtree is not empty, we rotate one node onto the spine.
This is visualized in the following figure.
For full details, refer to the original paper.
As an aside, note how elegant the functional definition of a rotation is compared to the imperative implementation in the paper. Essentially, only the parentheses are rearranged, so we can see at a glance that the ordering is correct.
In the third case, we recur on a tree that is not a subterm of t, so Rocq rejects the definition. In fact, it is non-trivial to see why tree_to_vine terminates. The key observation is that the number of nodes in the left subtrees 'hanging' from the spine decreases with each recursive call. Here, we include leaves in the count. This is formalized by size_left.
Fixpoint size (t : tree) : nat :=
  1 + match t with
      | E => 0
      | T l _ r => size l + size r
      end.

Fixpoint size_left (t : tree) : nat :=
  match t with
  | E => 0
  | T l _ r => size l + size_left r
  end.
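For instance, the tree T (T E 1 E) 2 E has a single left subtree hanging from its spine, consisting of one node and two leaves, so size_left computes 3:

Example size_left_example : size_left (T (T E 1 E) 2 E) = 3.
Proof. reflexivity. Qed.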
In other words, size_left t decreases with each recursive call. Similarly to quicksort, we define a relation based on size_left (and derive its well-foundedness and induction principle). Then, following our guide, we add auxiliary arguments to tree_to_vine and prove the obligations as separate lemmas. There are two distinct obligations this time, but Hint Unfold lt_size_left is sufficient to automatically solve the first one.
Definition lt_size_left := ltof _ size_left.
Definition lt_size_left_wf := well_founded_ltof _ size_left.
Definition wf_ind_size_left := well_founded_induction lt_size_left_wf.

Lemma tree_to_vine_oblig2 :
  forall (t1 : tree) (a : nat) (t2 : tree) (b : nat) (t3 : tree),
    size_left (T t1 a (T t2 b t3)) < size_left (T (T t1 a t2) b t3).
Proof.
  intros. simpl. lia.
Qed.

Hint Unfold lt_size_left : core.
Hint Resolve tree_to_vine_oblig2 : core.

Program Fixpoint tree_to_vine (t : tree) (ACC : Acc lt_size_left t) :=
  match t with
  | E => E
  | T E v r => T E v (tree_to_vine r (Acc_inv ACC _))
  | T (T t1 a t2) b t3 => tree_to_vine (T t1 a (T t2 b t3)) (Acc_inv ACC _)
  end.
Previously, we used lt and lt_len as relations, which are natural choices for comparing natural numbers and lists, respectively. tree_to_vine, on the other hand, is an example of where we need a relation tailored to the algorithm.
Verification of tree_to_vine
Now, we will prove some properties of tree_to_vine, following our guide. We should prove that the resulting tree is a vine and contains the same elements, similarly to quicksort. First, we establish some supporting lemmas. The first one, forall_tree_to_vine, demonstrates the proof strategy that we will use repeatedly.
Lemma forall_tree_to_vine : forall (P : nat -> Prop) (t : tree)
    (ACC : Acc lt_size_left t),
  forall_tree P t -> forall_tree P (tree_to_vine t ACC).
Proof.
  intros P t. induction t as [t IH] using wf_ind_size_left; intros ACC H.
  destruct ACC as [a].
  (* Now, the three cases correspond to those in the definition! *)
  destruct t as [| [| t1 a' t2] v t3]; simpl; simpl in H.
  - trivial.
  - repeat split.
    + tauto.
    + apply IH; [ auto | tauto ].
  - apply IH; [ auto | firstorder ].
Qed.

Lemma forall_tree_lt : forall (a b : nat) (t : tree),
  b < a -> forall_tree (lt a) t -> forall_tree (lt b) t.
Proof.
  intros a b t H H0.
  apply forall_tree_mono with (P := lt a); [ | assumption ].
  intros x H1. apply Nat.lt_trans with (m := a).
  all: assumption.
Qed.

Lemma bst_rotate : forall (a b : nat) (t1 t2 t3 : tree),
  bst (T (T t1 a t2) b t3) -> bst (T t1 a (T t2 b t3)).
Proof.
  firstorder using forall_tree_lt.
Qed.
We define the vine predicate as a special case of bst where all left children are E. Note that a vine is isomorphic to a sorted list. Then, we prove that tree_to_vine outputs a vine.
Fixpoint vine (t : tree) : Prop :=
  match t with
  | E => True
  | T E v r => forall_tree (lt v) r /\ vine r
  | _ => False
  end.

Theorem tree_to_vine_shape : forall (t : tree) (ACC : Acc lt_size_left t),
  bst t -> vine (tree_to_vine t ACC).
Proof.
  intros t. induction t as [t IH] using wf_ind_size_left; intros ACC H.
  destruct ACC as [a].
  destruct t as [| [| t1 a' t2] v t3]; simpl.
  - trivial.
  - split.
    + apply forall_tree_to_vine. firstorder.
    + apply IH; [ auto | firstorder ].
  - apply IH; [ auto | apply bst_rotate; assumption ].
Qed.
Next, we want to prove that the output vine contains the same elements as the input tree. Since all keys are unique, it is sufficient to show shared membership through contains. The proof is left as an exercise to the reader. Note that the condition bst t is not necessary.
Fixpoint contains (x : nat) (t : tree) : Prop :=
  match t with
  | E => False
  | T l v r => x = v \/ contains x l \/ contains x r
  end.

Theorem tree_to_vine_contains : forall (x : nat) (t : tree)
    (ACC : Acc lt_size_left t),
  contains x t <-> contains x (tree_to_vine t ACC).
Proof.
  (* exercise *)
Admitted.
The combination of tree_to_vine_shape and tree_to_vine_contains verifies tree_to_vine. This is only a small part of the overall correctness proof of the DSW algorithm.
Conclusion
We introduced and proved the well-founded induction principle. Then, we compared Function, Equations, Program Fixpoint, and Acc-recursion as techniques to perform well-founded recursion. We found that Acc-recursion is best if code extraction is important, and otherwise Equations is a good option. We presented a practical guide on Acc-recursion and used it to verify a non-trivial algorithm: quicksort. We further experimented with Acc-recursion on a tree rebalancing algorithm that requires a specialized relation to prove termination.
Acknowledgments. This tutorial was developed as part of a research internship at Eindhoven University of Technology. Thanks to Herman Geuvers for his guidance in shaping the project and his enthusiastic supervision.
References
Xavier Leroy. "Well-founded recursion done right". In: The Tenth International Workshop on Coq for Programming Languages. Jan. 2024. URL: https://popl24.sigplan.org/details/CoqPL-2024-papers/2/Well-founded-recursion-done-right.
Matthieu Sozeau and Cyprien Mangin. "Equations reloaded: high-level dependently-typed functional programming and proving in Coq". In: Proceedings of the ACM on Programming Languages 3.ICFP (July 2019), pp. 1–29. DOI: 10.1145/3341690.
Quentin F. Stout and Bette L. Warren. "Tree rebalancing in optimal time and space". In: Communications of the ACM 29.9 (Sept. 1986), pp. 902–908. DOI: 10.1145/6592.6599.