Binding variables to terms involving later variables

Question

Consider the following pseudocode in a hypothetical proof assistant:

def f (n : ℕ) : P n
  := match n with
     | zero -> ?0
     | suc k -> ?1
     end

Under what seems to me to be the most natural approach to parsing this code, the variable n scopes over the rest of the definition after it appears, while the variable k scopes over its branch ?1. Since k is encountered later, it will be added to the context later, and so the context of the branch ?1 will then be something like

n : ℕ
k : ℕ

Except, in that branch the match variable n is supposed to be specialized to suc k. That certainly has to happen in the type: in that branch we have to give a term of type P (suc k). But it's hard to imagine forbidding the user from explicitly writing n inside that branch either; from the user's perspective that would amount to removing variables from the context, which is weird and unintuitive. So to allow the user to write n but ensure that it's equal to suc k, it seems that the context of the branch ?1 must be something like

n : ℕ := suc k
k : ℕ

Allowing some of the variables in a context to come with definitions (i.e. to be "let-bound") is fairly standard. But here we have a variable that occurs earlier in the context being let-bound to equal a term in which appears a variable that occurs later in the context. Mathematically, this is impossible under the usual inductive definition of contexts according to which we can extend a context with a variable belonging to a type in that context and possibly having a value in that context:

$$ \frac{\Gamma \vdash A\,\mathsf{type}}{(\Gamma, x:A)\,\mathsf{ctx}}\hspace{1cm} \frac{\Gamma \vdash A\,\mathsf{type}\quad \Gamma \vdash a:A}{(\Gamma, x:A := a)\,\mathsf{ctx}}$$

In an NbE-style implementation that separates De-Bruijn-index "terms" from De-Bruijn-level "values", and uses values for the types and values in a context, it is technically possible for a variable to have a value incorporating later variables. For example, one approach is for every (index) variable in a context to have a "value" which might be just "itself"; then our weird context would be

index1 : ℕ := suc level1
index0 : ℕ := level1

But this seems fraught with peril. In particular, when typechecking the branch, how do we define the transformation that takes the initial context such as n : ℕ, adds a variable to it, and then binds the existing variable n to a term involving the new later variable? In particular, there could be other variables in the initial context whose types and values involve n -- in its incarnation as a level variable in that context -- and which will need to be specialized to the new value of n, and part of the point of NbE is to avoid ever having to substitute for a level variable.

What is the best way to deal with this issue? How do existing proof assistants do it?

Edited to add: I don't think the solution can be quite as simple as just adding the new variable(s) in place of the match variable n or right before it. If the datatype is indexed, then the indices also have to be specialized depending on the branch, and so the new variables have to appear before all the indices (assuming they are all variables, as in the simplest sort of pattern-match on an indexed datatype without fancy unification). But there might be parameters on which the type of the new variables depend but which originally appear after these index variables in the context. For instance, consider:

data Vec (A : Type) : ℕ → Type
  nil : Vec A 0
  cons : (n : ℕ) → A → Vec A n → Vec A (suc n)

f (n : ℕ) (A : Type) (v : Vec A n) : P n A v
  := match v with
     | nil -> ?0
     | cons k a w -> ?1
     end

Now in the branch ?1, not only must we have v := cons k a w, we must specialize the index n := suc k. Thus, if we are simply inserting the new variables (k : ℕ) (a : A) (w : Vec A k) somewhere in the context (n : ℕ) (A : Type) (v : Vec A n), we have to insert them at the beginning so that the value n := suc k can depend on the variable k appearing before it:

k : ℕ
a : A
w : Vec A k
n : ℕ := suc k
A : Type
v : Vec A n := cons k a w

But now the variable a has a type A that depends on the parameter variable A that appears after it! So it seems that some more complicated shuffling or permuting of contexts is required, or else to represent contexts in some more nonlinear DAG-like way in the first place.

It sounds like maybe Agda is the only extant proof assistant that does anything like this, so I guess this is a question mostly about the implementation of Agda.

The general case here is that n is an expression, not a variable. Is there some reason you are specializing to a variable only? I do not know how existing proof assistants do it, but precisely for this sort of reason Andromeda (and finitary type theories that it's based on) allows judgemental equations in contexts. — Andrej Bauer, Commented Feb 29 at 8:32
The problem with judgmental equalities is that they are slightly too weak in this context, because they are a priori not injective: in many of these cases, you get equations like suc x = suc y, and you really want to be able to conclude x = y instead. This kind of injectivities is an important step of the work Equations or Program do in Coq. — Meven Lennon-Bertrand, Commented Feb 29 at 11:01
@AndrejBauer Restricting to n being a variable is what allows the motive to be refined in the branches. Otherwise you have only a recursion principle, not an induction principle. — Mike Shulman, Commented Feb 29 at 16:35
@MevenLennon-Bertrand Of course, but in that case my pseudo-syntax is not expressive enough, since you need to specify the motive by hand. As you pointed out in the last paragraph of your answer, you can't in general expect the proof assistant to guess it, because higher-order unification is undecidable. — Mike Shulman, Commented Feb 29 at 18:09
@MevenLennon-Bertrand: Your example shows that there is too little deductive power (can't deduce x = y from suc x = suc y), not that equalities themselves are too weak (you managed just fine using them to explain what you want). — Andrej Bauer, Commented Feb 29 at 21:30

James McKinna · Accepted Answer · 2024-03-11 01:29:48Z

7

Great question Mike!

Since Thierry Coquand's talk at the 1991 TYPES meeting in Edinburgh, and reading Martin Löf's "Mathematics of infinity" paper, and various things by Thierry subsequently, I think the 'right' (mathematical!) way to understand these 'let-bindings' (which I think is not the right way to see them, at least on a first cut) is via context morphisms, together with a suitable notion, for pattern matching, of when a collection of context morphisms 'covers' another context (in the Grothendieck topology/crible sense).

From this point of view, the contexts in which each RHS of the evolving case tree are well typed are distinct (they just look as though they 'share' bindings), but they are all suitably correlated by the morphisms between them which furnish the 'let'-like instantiations. The 'evolving' nature of the pattern analysis also fits well with contexts-and-morphisms as a mathematically satisfactory account of 'stages of knowledge' along more traditional Brouwerian/sheaf-theoretic lines.

Whether or not any of our implementations actually realises this conceptual analysis is another matter.

edited Mar 11 at 1:29

James McKinna

31 bronze badge

answered Mar 1 at 7:49

James McKinna

712 bronze badges

$\begingroup$ I have the impression that this point of view goes relatively close to strong extensionality principles for inductive types (as asked by @MikeShulman here) but that would be too strong for natural numbers (it is likely to make the conversion problem undecidable). I wonder whether there is an intermediate notion closer to this question (that does not give full extensionality). $\endgroup$
– kyo dralliam
Commented Mar 1 at 14:25
$\begingroup$ Copying James's comment on this from his answer below: "I really don't know what extensionality, one way or the other, has to do with the explanation of successive refinements of a case-tree by patterns in terms of context morphisms. But happy to be enlightened on the topic!" $\endgroup$
– Mike Shulman
Commented Mar 1 at 16:00
$\begingroup$ (I also don't see the connection.) $\endgroup$
– Mike Shulman
Commented Mar 1 at 16:00

Add a comment |

Meven Lennon-Bertrand · Accepted Answer · 2024-02-29 10:59:37Z

Coq (and probably Lean too):

The answer for (vanilla) Coq is simple: it does not. Nothing is done to n to remember its relation to the pattern in the branch. The only way to keep a relation is to channel it through P. The usual technique is what has been called the convoy pattern by Chlipala, which adds an extra equality in P to recover that missing relation. In practice, this is what the fancier tools (Program, Equations) give you.

So with a little bit of work, you get a propositional equality. This is fairly generic, but also somewhat weak compared to a definitional equality in context (be it as a definitional equality, as in Andromeda, or a defined variable, as you suggest). The advantage is that it is relatively simple (and thus can be automated), and elaborates back to the "simple" pattern-matching. And as Andrej remarks, it works equally well if n is not a variable.

Agda:

Agda does something smarter, the only source I know of is Jesper Cockx's talk at Types '23 (here, around 7:30). Basically, when case-splitting on x you insert the variables corresponding to the pattern in the context at the place where x was. In the rules given in the talk, you then simply erase x, and apply the relevant substitution to the rest of the context and motive. If you have definitions in contexts, then you could insert the new variables just before x, and make x a definition (this is really a different way to represent the substitution).

The main drawback is that this only works if the case-split is on a variable. And you cannot just get around by doing something like let x := t in match x …, because your variable need to not to have a body in the context, which you would get with a let-bound one. But given the pseudo-syntax you are giving, you might be content with this.

Slightly out of topic, but having only pattern matching on variables has the other advantage that it is easier to figure out what the motive of the pattern-matching (your P) is, because the unification problem ?P t = … is much nicer to solve when t is in fact a variable.

I suppose the last two paragraphs, together with the pseudo-syntax of the question, lead one towards having with-abstraction expressions (like the generalize tactic, IIRC, in Coq). — James Wood, Commented Feb 29 at 15:21
Thanks! Cockx's talk is close to answering what I want to know. If I understand correctly, he's trying to introduce a version of well-scoped syntax that permits (among other things) modifying the scope by replacing x with new variables, for the purpose of a correct-by-construction implementation of core Agda. What he comes up with is interesting but doesn't quite do what he wants. But I would be content to know what Agda currently does! What sort of variables and contexts does it use that allow replacing x with new variables? — Mike Shulman, Commented Feb 29 at 17:59
Heads up that I added a bit more to the question. In particular, it seems to me that in the presence of indexed datatypes, it isn't sufficient to just insert the all the new variables in any one place in the context. — Mike Shulman, Commented Mar 1 at 7:00
You are right, things are hairier with indices. I think there is no easy solution there, you need to manipulate whole contexts and rely on a form of unification. The relevant reference is probably Jesper Cockx's PhD thesis. Skimming through, on p. 44 he gives a high level condition for the context in which each branch lives, which cannot be simply described from the one of the whole pattern-matching. Then a large part of the thesis is concerned with how to construct contexts (and case-trees) satisfying the criterion and being well-behaved. — Meven Lennon-Bertrand, Commented Mar 1 at 12:29
Thanks. It looks to me like p44 is defining a substitution to be used when evaluating a pattern-matching definition. My question is about what the codomain context of this substitution is. This seems like it ought to be somewhere in section 2.2 about type-checking, but on a first skim I cannot find where he specifies how to reorder the context. — Mike Shulman, Commented Mar 1 at 18:05

Stack Exchange Network

Binding variables to terms involving later variables

2 Answers 2

Coq (and probably Lean too):

Agda:

Not the answer you're looking for? Browse other questions tagged
agda
implementation
pattern-matching
or ask your own question.

Linked

Hot Network Questions

Binding variables to terms involving later variables

2 Answers 2

Coq (and probably Lean too):

Agda:

Not the answer you're looking for? Browse other questions tagged agdaimplementationpattern-matching or ask your own question.

Linked

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
agda
implementation
pattern-matching
or ask your own question.