12
$\begingroup$

Suppose you were to add a non-constructive axiom which only applies to irrelevant types, such as the irrelevance axiom. To my understanding canonicity and strong normalization are defining features of constructivity. However:

  • irrelevant types don't compute by definition, so I don't think it would affect the computational properties of the type theory if the axiom doesn't compute; and
  • if my understanding of conservativity is correct, then it shouldn't matter that it's non-conservative because it's definitionally proof-irrelevant.

I would expect the axiom to still break canonicity, but only for irrelevant types. Why does it matter if canonicity holds for irrelevant types? Are there any other desirable properties that it would break?

$\endgroup$
5
  • 1
    $\begingroup$ However proof-irrelevant it may be, it still breaks canonicity, and if you value canonicity you have to reject it. Of course, you can propose things like weak canonicity, that only cares about relevant types. It's a matter of taste. $\endgroup$
    – Trebor
    Commented Mar 6, 2022 at 5:19
  • 1
    $\begingroup$ That makes sense. I certainly value weak canonicity in that sense, though it still leaves the question of what disadvantages weak canonicity has compared to strong canonicity, and how much you can gain from sacrificing strong canonicity. $\endgroup$ Commented Mar 6, 2022 at 5:52
  • $\begingroup$ @Trebor I think fundamentally, that describes the interesting bit of the question I was asking better than the question I actually asked. I edited down the question to reflect that. $\endgroup$ Commented Mar 6, 2022 at 6:26
  • 1
    $\begingroup$ Something that might interest you: HoTT (with the UA as an axiom) also satisfies weak canonicity, in the sense that every term is propositionally equal to a canonical term. $\endgroup$
    – Trebor
    Commented Mar 6, 2022 at 6:30
  • $\begingroup$ @Trebor Ooh, that's interesting. I don't think that property holds here (e.g. LEM for an unprovable proposition). $\endgroup$ Commented Mar 6, 2022 at 6:56

3 Answers 3

5
$\begingroup$

I will focus here on a universe of definitionally irrelevant types; let's call it SProp. In short, we can postulate any consistent SProp axiom, without breaking any constructive metatheoretic property. Formally, SProp terms are propositionally truncated, which means that no model can distinguish terms of the same SProp type. We could say that SProp terms trivially enjoy canonicity. For example, any (open) proof of t = t : SProp is definitionally equal to refl. But actually we don't really care if there is a "canonical" proof like refl, because all proofs are equal, so if any one of them is "canonical", then they all are.

Compare e.g. the treatment of negative types in canonicity proofs. We don't have to prove anything about terms of negative types; we already know that any $t : \top$ is $\mathsf{tt}$, any $t : \Pi\,A\,B$ is a $\lambda$, and any $t : \Sigma\,A\,B$ is a pairing.

Of course, if we postulate an inconsistent axiom in SProp, that breaks computational properties, since we can eliminate from $\bot$ to proof-relevant types, and define loops and closed non-canonical relevant values.

The SProp variant of the irrelevance axiom is also computationally adequate, since it's just an axiom in SProp:

data Squash (A : Set) : Prop where 
  squash : A -> Squash A

irrelevance : (A : Set) -> Squash (Squash A -> A)

This formulation seems to be equivalent to an SProp-indexed axiom of choice:

Irrelevance = {A : Set} → Squash (Squash A → A)
PropChoice  = {A : Prop}{B : Set} → (A → Squash B) → Squash (A → B)

to : Irrelevance → PropChoice
to irr {A} {B} f with irr {B}
... | squash g = squash λ a → g (f a)

from : PropChoice → Irrelevance
from pch {A} = pch λ sqa → sqa

In observational type theories, the ability to postulate SProp axioms without breaking canonicity becomes a bit more interesting, since in that setting we can eliminate from SProp propositional equality to any type.

$\endgroup$
3
  • 4
    $\begingroup$ The funny thing with SProp is that it gives a generalization of the result that consistent negative axioms can be postulated without breaking canonicity. All recent proof techniques relying on SProp (e.g. this paper) prove canonicity under consistency hypotheses. So SProp is a way to internalize "true" things (i.e. coming from the meta). $\endgroup$ Commented Mar 6, 2022 at 11:33
  • $\begingroup$ Are you sure that's an accurate translation of the irrelevance axiom? The irrelevance axiom is itself irrelevant, so wouldn't it be more like Squash (Squash A -> A)? $\endgroup$ Commented Mar 6, 2022 at 15:40
  • 1
    $\begingroup$ @JamesMartin you're right, I missed the irrelevance. I'll edit the answer. $\endgroup$ Commented Mar 6, 2022 at 16:18
6
$\begingroup$

Consider the following type theory:

\begin{gather} \frac{ }{\vdash G \; \mathsf{type}} \qquad \frac{ }{\vdash \Lambda \; \mathsf{type}} \\[2ex] \frac{\vdash g : G \qquad \vdash e_1 : \Lambda \qquad \vdash e_2 : \Lambda}{\vdash e_1 \, e_2 : \Lambda} \\[2ex] \frac{\vdash g : G \qquad x : \Lambda \vdash e : \Lambda}{\vdash \lambda x \,.\, e : \Lambda} \\[2ex] \frac{\vdash g : G \qquad x : \Lambda \vdash e'_1 : \Lambda \qquad \vdash e_2 : \Lambda}{\vdash (\lambda x \,.\, e'_1) \, e_2 \equiv e'_1[e_2/x] : \Lambda} \end{gather}

It is the untyped $\lambda$-calculus with a proof-irrelevant “guard” type $G$: in order to form any terms of type $\Lambda$ or to apply the $\beta$-rule, we must first exhibit a proof-irrelevant term of type $G$.

The theory has a model in which both $G$ and $\Lambda$ are interpreted as the empty set. Therefore, there are no closed terms of any type (the only types are $G$ and $\Lambda$ – we did not assume any extra structure). It is vacuously the case that every closed term is normalizing for any notion of normalization.

Observe also that in the non-empty context $z : G$ the untyped $\lambda$-calculus springs to life because $z$ can be used to justify the construction of terms and applications of the $\beta$-rule. Hence, as soon as we add a (proof-irrelevant!) constant

$$\frac{ }{\vdash g : G}$$

all hell breaks loose in the empty context (of course the untyped $\lambda$-calculus is not normalizing).

You may think the example a silly one, and I concur. However, it demonstrates a point well: the addition of a proof-irrelevant constant may change the behavior of proof-relevant closed terms.

If we wish to avoid the above travesty, we need to find a meta-theoretic property of type theories which prevents it. I would be quite impressed to see a reasonably general and non-artificial property that accomplishes the task.

It is interesting to ask whether we could modify the above example so that $G$ becomes proof-relevant. I think so, like this:

\begin{gather} \frac{ }{\vdash G \; \mathsf{type}} \qquad \frac{ }{\vdash \Lambda \; \mathsf{type}} \\[2ex] \frac{\vdash g : G \qquad \vdash e_1 : \Lambda \qquad \vdash e_2 : \Lambda}{\vdash \mathsf{app}(g, e_1, e_2) : \Lambda} \\[2ex] \frac{\vdash g : G \qquad x : \Lambda \vdash e : \Lambda}{\vdash \lambda(g, x \,.\, e) : \Lambda} \\[2ex] \frac{\vdash g_1 : G \qquad \vdash g_2 : G \qquad x : \Lambda \vdash e'_1 : \Lambda \qquad \vdash e_2 : \Lambda}{\vdash \mathsf{app}(g_1, \lambda(g_2, x \,.\, e'_1)) \, e_2 \equiv e'_1[e_2/x] : \Lambda} \end{gather}

$\endgroup$
5
  • $\begingroup$ András Kovács's answer gives consistency as a precondition for not changing the behavior of relevant terms with Prop, but on the other hand Dan Doel's answer notes that Prop is also computationally irrelevant, not just proof irrelevant. Is "consistent and computationally irrelevant" an example of the sort of meta-theoretic property you're talking about? $\endgroup$ Commented Mar 6, 2022 at 19:43
  • $\begingroup$ How do you define “consistent” and how do you define ”computationally irrelevant”? I am not even sure I have ever seen a definition of “proof relevant”. Well, perhaps I have. $\endgroup$ Commented Mar 6, 2022 at 21:55
  • $\begingroup$ I had SProp in mind. I thought it would generalize, but when I thought about it more, what I came up with was pretty contrived, and it's not obvious if it's inequivalent to SProp when you add the condition of proof-irrelevance. So I guess it'd be a property, but it's not "reasonably general and non-artificial", like you said. $\endgroup$ Commented Mar 6, 2022 at 23:39
  • $\begingroup$ I don't understand how this makes a difference, say if $G$ is the unit type? $\endgroup$
    – Couchy
    Commented Mar 7, 2022 at 16:41
  • $\begingroup$ We are discussing the following: adding a constant to a type may cause a normalizing theory become non-normalizing. If $G$ is the unit type then the theory is non-normalizing from the outset and cannot serve as an example of what is being discussed. $\endgroup$ Commented Mar 7, 2022 at 22:31
4
$\begingroup$

One answer is that (judgmental) proof irrelevance does not correspond to computational irrelevance. Proof irrelevance means that every proof is equal in some sense. But that doesn't mean the values can't be interesting.

For instance, in something like a setoid or realizability interpretation, values of propositions like $∃n:ℕ. ...$ actually have natural numbers witnessing them; potentially multiple natural numbers. Even in such a case, the proofs are considered equal by fiat. They don't need to be represented uniquely, the system just needs to ensure that equal results are produced from any input proposition. In particular this means that implication of propositions can be witnessed by any appropriate function (i.e. not necessarily constant), since the results will also all be considered equal.

This matters if you want to extract the computational content from the propositions at a level below the surface language. You can do the calculations underlying the proofs and get some kind of actual result. The result you get will depend on the details of the proof, but you might not care so long as you get a result. An example would be Andrej Bauer's and Paul Taylor's work on computing with Dedekind reals, where the proofs involved let you calculate rational approximations. You can't do this if you add an axiom that you don't know how to calculate with.


However, the above is actually unrelated to your example, because Agda's irrelevant arguments are about computational irrelevance and erasure. The idea there is that we intend to actually strip the annotated things out of the program. That can be related to the above, because stripping all the information out of a proof is a way to make it irrelevant in the above sense, but in general you can have either without the other, I believe.

Even computational irrelevance is subtle, though, because in type theories it matters when you plan to erase things. There are at least two stages of evaluation in dependent type theories. One is type checking, and the other is running the 'program' you have specified. The big difference is that the former operates on open terms with variables, while the latter generally happens on something corresponding to closed terms.

So, one issue is that things that are computationally irrelevant during closed term evaluation are not irrelevant during open term evaluation. The obvious traditional example is types. It's desirable to not be calculating the type of everything during program execution, but types matter while you're checking them. And there are many other examples of things that are computationally irrelevant during closed term evaluation, but might cause problems if they're erased during open term evaluation.

The justification behind irrelevant projections is also that there is only one 'stage' of irrelevance; once you've shifted to the erased part, all irrelevant things become relevant, and there are no further phase distinctions. But one might want to have multiple stages of computational (ir)relevance that don't collapse. That could make it matter whether you add other non-computational axioms as well.

In the Agda case, I'm not sure it really runs much deeper than the feature being a frequent source of bugs, though.

$\endgroup$
7
  • 1
    $\begingroup$ What do you mean by “computational irrelevance”, if not either judgemental irrelevance or erasure (and presumably not propositional irrelevance)? $\endgroup$
    – James Wood
    Commented Mar 6, 2022 at 9:08
  • $\begingroup$ The intended meaning of "irrelevant types" in my question is in the sense of "definitional proof irrelevance without K", e.g. members of Agda's Prop and Coq's SProp. Which category do they fall under? The Agda docs say you can simulate irrelevant arguments with a squash type in Prop, so I was under the (false?) impression that the two were interchangable. Is my question wrong? $\endgroup$ Commented Mar 6, 2022 at 9:28
  • 1
    $\begingroup$ In that paper, I think propositions are both proof irrelevant and computationally irrelevant. Unfortunately these two things are frequently conflated in the literature, but they needn't be the same. The paper starts by talking about the sort of proof irrelevance in the first half of my answer, but it also talks about some stuff in the second half. It's not unusual to want/have both for certain types, and certain choices (like wanting rules to hold judgmentally) might force them to occur together in certain ways. $\endgroup$
    – Dan Doel
    Commented Mar 6, 2022 at 18:55
  • 1
    $\begingroup$ Computational irrelevance is about tracking which things could cause different results to some designated computation of interest. It can also happen for multiple reasons. A value could be irrelevant because there is only one possible representation, so it just can't affect the result. But, it could also be irrelevant because it's never looked at (modulo some other erasing), and things can be irrelevant in this way without being proof irrelevant. You could also just say that you're going to throw out potential computations underlying propositions, and only non-prop computations matter. $\endgroup$
    – Dan Doel
    Commented Mar 6, 2022 at 19:08
  • $\begingroup$ Do you have an example of computational content being affected by a proof-irrelevant axiom? András Kovács's answer establishes that both kinds of irrelevance and consistency are sufficient, and Andrej Brauer gives a non-consistent counterexample, and I think a consistent, computationally-relevant counterexample would drive the distinction home. $\endgroup$ Commented Mar 6, 2022 at 20:01

Not the answer you're looking for? Browse other questions tagged or ask your own question.