23
$\begingroup$

Consider an inductive type which has some recursive occurrences in a nested, but strictly positive location. For example, trees with finite branching with nodes using a generic list data structure to store the children.

Inductive LTree : Set := Node : list LTree -> LTree.

The naive way of defining a recursive function over these trees by recursing over trees and lists of trees does not work. Here's an example with the size function that computes the number of nodes.

Fixpoint size (t : LTree) : nat := match t with Node l => 1 + (size_l l) end
with size_l (l : list LTree) : nat := match l with
    | nil => 0
    | cons h r => size h + size_l r
  end.

This definition is ill-formed (error message excerpted):

Error:
Recursive definition of size_l is ill-formed.
Recursive call to size has principal argument equal to
"h" instead of "r".

Why is the definition ill-formed, even though r is clearly a subterm of l? Is there a way to define recursive functions on such a data structure?


If you aren't fluent in Coq syntax: LTree is an inductive type corresponding to the following grammar.

$$\begin{align} \mathtt{LTree} ::= & \\ \vert & \mathtt{list}(\mathtt{LTree}) \\ \end{align}$$

We attempt to define the size function by induction over trees and lists. In OCaml, that would be:

type t = Node of t list
let rec size = function Node l -> 1 + size_l l
and size_l = function [] -> 0
                    | h::r -> size h + size_l r
$\endgroup$
3
  • $\begingroup$ Is this on-topic? I'm not sure; let's discuss this on Meta. $\endgroup$ Commented Mar 7, 2012 at 17:39
  • $\begingroup$ Can you add equivalent function definitions in something less Coqy and more mathy? I have trouble understanding the syntax. $\endgroup$
    – Raphael
    Commented Mar 7, 2012 at 22:14
  • 1
    $\begingroup$ @Raphael I've tried, is that better now? Honestly, this question is pretty specific to Coq. $\endgroup$ Commented Mar 8, 2012 at 2:39

2 Answers 2

16
$\begingroup$

What works

If you nest the definition of the fixpoint on lists inside the definition of the fixpoint on trees, the result is well-typed. This is a general principle when you have nested recursion in an inductive type, i.e. when the recursion goes through a constructor like list.

Fixpoint size (t : LTree) : nat :=
  let size_l := (fix size_l (l : list LTree) : nat :=
                  match l with
                    | nil => 0
                    | h::r => size h + size_l r
                  end) in
  match t with Node l =>
    1 + size_l l
  end.

Or if you prefer to write this more tersely:

Fixpoint size (t : LTree) : nat :=
  match t with Node l =>
    1 + (fix size_l (l : list LTree) : nat :=
          match l with
            | nil => 0
            | h::r => size h + size_l r
          end) l
  end.

(I have no idea who I heard it from first; this was certainly discovered independently many times.)

A general recursion predicate

More generally, you can define the “proper” induction principle on LTree manually. The automatically generated induction principle LTree_rect omits the hypothesis on the list, because the induction principle generator only understand non-nested strictly positive occurences of the inductive type.

LTree_rect = 
fun (P : LTree -> Type) (f : forall l : list LTree, P (Node l)) (l : LTree) =>
match l as l0 return (P l0) with
| Node x => f x
end
     : forall P : LTree -> Type,
       (forall l : list LTree, P (Node l)) -> forall l : LTree, P l

Let's add the induction hypothesis on lists. To fulfill it in the recursive call, we call the list induction principle and pass it the tree induction principle on the smaller tree inside the list.

Fixpoint LTree_rect_nest (P : LTree -> Type) (Q : list LTree -> Type)
                         (f : forall l, Q l -> P (Node l))
                         (g : Q nil) (h : forall t l, P t -> Q l -> Q (cons t l))
                         (t : LTree) :=
  match t as t0 return (P t0) with
    | Node l => f l (list_rect Q g (fun u r => h u r (LTree_rect_nest P Q f g h u)) l)
  end.

Why

The answer to why lies in the precise rules for accepting recursive functions. These rules are perforce subtle, because there is a delicate balance between allowing complex cases (such as this one, with nested recursion in the datatype) and unsoundness. The Coq reference manual introduces the language (the calculus of inductive constructions, which is the proof language of Coq), mostly with formally precise definitions, but if you want the exact rules regarding induction and coinduction you'll need go to the research papers, on this topic Eduardo Giménez's [1].

Starting with the Coq manual, in the notation of the Fix rule, we have the fixpoint definition of $\mathsf{Fix} f_i \, \{ f_1 : A_1 := t_1 \; ; \;\; f_2 : A_2 := t_2 \}$ where:

$$\begin{align*} \Gamma_1 &= (x : \mathtt{LTree}) & A_1 &= \mathrm{nat} & t_1 &= \mathsf{case}(x, \mathtt{LTree}, \lambda y. g_1 (f_2 y)) \\ \Gamma_2 &= (l : \mathtt{list}\;\mathtt{LTree}) & A_2 &= \mathrm{nat} & t_2 &= \mathsf{case}(l, \mathtt{list \: LTree}, \lambda h \: r. g_2 (f_1 h) (f_2 r)) \\ \end{align*}$$

In order for the fixpoint definition to be accepted, “if $f_j$ occurs [in $t_i$] then … the argument should be syntactically recognized as structurally smaller than [the argument passed to $f_i$]” (simplifying because the functions have a single argument). Here, we need to check that

  • $i=1$, $j=2$: l must be structurally smaller than t in size, ok.
  • $i=2$, $j=1$: h must be structurally smaller than l in size_l, looks ok but isn't!
  • $i=2$, $j=2$: r must be structurally smaller than l in size_l, ok.

The reason why h is not structurally smaller than l according to the Coq interpreter is not clear to me. As far as I understand from discussions on the Coq-club list [1] [2], this is a restriction in the interpreter, which could in principle be lifted, but very carefully to avoid introducing an inconsistency.

References

Cocorico, the nonterminating Coq wiki: Mutual Induction

Coq-Club mailing list:

The Coq Development Team. The Coq Proof Assistant: Reference Manual. Version 8.3 (2010). [web] ch. 4.

Eduardo Giménez. Codifying guarded definitions with recursive schemes. In Types'94: Types for Proofs and Programs, LNCS 996. Springer-Verlag, 1994. doi:10.1007/3-540-60579-7_3 [Springer]

Eduardo Giménez. Structural Recursive Definitions in Type Theory. In ICALP'98: Proceedings of the 25th International Colloquium on Automata, Languages and Programming. Springer-Verlag, 1998. [PDF]

$\endgroup$
7
$\begingroup$

This is obviously a problem specific to Coq since I believe there are nicer ways to get around it with some other proof assistants (I'm looking at Agda)

At first I thought r was not recognized as structurally smaller because the structure is only about the inductive definition currently handled by Fixpoint: so this is not a LTree subterm even if it is a list subterm.

But if you expand the processing of the list, then it works:

Fixpoint size t :=
  match t with
  | Node l => S
     ((fix size_l l :=
     match l with
     | nil => 0
     | cons t l => size_l l + size t
     end) l)
 end.

Or since the auxiliary function already exists in the standard library:

Require Import List.

Fixpoint size t :=
  match t with
  | Node l => S (fold_left (fun a t => a + size t) l 0)
  end.

To be honest I'm not sure why those are accepted by Coq, but I'm sure glad they are.

There is also a solution that works more often and not only for lists: defining a self-contained inductive type. In this case you can define your size function manually. (with two fixpoints)

Inductive LTree : Set :=
  | Node : list_LTree -> LTree
with list_LTree : Set :=
  | LTree_nil : list_LTree
  | LTree_cons : LTree -> list_LTree -> list_LTree.

Note that if you have problems for more complex inductive definitions, you can use a size-decreasing argument. That is possible but cumbersome for this problem (and I would dare to say for most problems)

$\endgroup$
1
  • $\begingroup$ What I still don't understand today is why the naive approach doesn't work. In principle, this should be a consequence of Eduardo Giménez's paper, but I don't see where the deduction breaks; this may be a limitation of the Coq engine rather than the underlying calculus. $\endgroup$ Commented Mar 8, 2012 at 2:30

Not the answer you're looking for? Browse other questions tagged or ask your own question.