
Off and on I have heard of the jargon "commuting conversion" but I don't really know what it means.

I've heard commuting conversions are problematic but I don't know why.

  • Can you give some examples of where you've heard the jargon? – Eric (Mar 31, 2022 at 7:46)
  • This is a very common term in structural proof theory / logic / sequent calculus --- and I almost answered this question, but then I realized that in spite of being able to recognize commuting conversions when I see them, I cannot figure out how to formally define them! I hope someone will provide an answer that objectively distinguishes commuting conversions from other kinds of equations. (Mar 31, 2022 at 10:19)
  • The term originates, to my knowledge, in the context of the Gentzen cut-elimination proof for sequent calculus; there are a lot of cases, which are divided into various subgroups. One of those subgroups is the "commutative cases" subgroup, which is when the cut formula is a "side formula". These cases of cut elimination inspired a name for a certain class of equations, in the same way that the cut-elimination process inspires the equational theory of various calculi in general. (Mar 31, 2022 at 10:22)
  • @Eric I was reminded of the topic by a comment on hereditary substitution (proofassistants.stackexchange.com/a/1195/153), i.e. moving a function application into an if-then-else expression. (Apr 1, 2022 at 19:02)

2 Answers

  1. Syntactically, commuting conversions are part of the $\eta$-rules for left-invertible types -- i.e., types which have pattern-matching eliminators. So if $e : A + B$, the $\eta$-equation for it looks like:

    $$ \newcommand{\c}[1]{\mathsf{#1}} C[e] \equiv \c{case}(e, \c{inl}\,x \to C[ \c{inl}\,x], \c{inr}\,y \to C[ \c{inr}\,y]) $$

    This rule says that if you have a term with an $e : A+B$ occurring in it somewhere (this is the $C[e]$), then it is the same as matching on $e$ and running $C[\c{inl}\, x]$ in the left branch and $C[\c{inr}\, y]$ in the right branch.

    The algorithmic difficulty this poses is that this rule is extremely nondeterministic and non-syntax-directed -- if you have a term $t$, you can break it into a context $C$ and expression $e$ such that $C[e] = t$ in potentially many different ways. Moreover, since the context $C$ is duplicated in each branch of the case statement, the copy in each branch can be independently transformed further, into terms which look radically different from one another.
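    For example, the classic commuting conversion that pushes a function application into the branches of a case is just an instance of this $\eta$-rule: instantiate the context as $C[-] = f\,(\c{case}(-,\ \c{inl}\,x \to e_1,\ \c{inr}\,y \to e_2))$ and then $\beta$-reduce each branch (assuming $x$ and $y$ are not free in $f$), which yields

    $$ \newcommand{\c}[1]{\mathsf{#1}} f\,(\c{case}(e,\ \c{inl}\,x \to e_1,\ \c{inr}\,y \to e_2)) \equiv \c{case}(e,\ \c{inl}\,x \to f\,e_1,\ \c{inr}\,y \to f\,e_2) $$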

  2. Semantically, commuting conversions arise from the fact that the sum type can be viewed as a coproduct. Since coproducts are a kind of colimit, they satisfy both an existence property, and a uniqueness property.

    The existence property of coproducts says that for each $X$ and $Y$, we have a type $X + Y$ and injection maps $\c{inl} : X \to X + Y$ and $\c{inr} : Y \to X + Y$ such that for each $f : X \to Z$ and $g : Y \to Z$, we have a map $[f, g] : X + Y \to Z$ such that $[f,g] \circ \c{inl} = f$ and $[f,g] \circ \c{inr} = g$.

    You can see that these two equations correspond to the $\beta$-rules for sums:

    $$ \c{case}(\c{inl}\,z, \c{inl}\,x \to e_1, \c{inr}\,y \to e_2) \equiv [z/x]e_1 $$ $$ \c{case}(\c{inr}\,z, \c{inl}\,x \to e_1, \c{inr}\,y \to e_2) \equiv [z/y]e_2 $$
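As a sanity check, these two $\beta$-equations hold definitionally in, e.g., Lean 3 with mathlib, where sum.elim plays the role of $[f, g]$ (a sketch using mathlib's data.sum.basic):

```lean
import data.sum.basic

-- the two β-rules: sum.elim applied to a constructor
-- reduces definitionally, so rfl closes both goals
example {X Y Z : Type*} (f : X → Z) (g : Y → Z) (z : X) :
  sum.elim f g (sum.inl z) = f z := rfl

example {X Y Z : Type*} (f : X → Z) (g : Y → Z) (z : Y) :
  sum.elim f g (sum.inr z) = g z := rfl
```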

    The uniqueness property tells us this map is unique. In particular, uniqueness means that if you have a map $h : X+Y \to Z$, then $h \equiv [h \circ \c{inl}, h \circ \c{inr}]$. (This is easy to see if you draw the coproduct diagram for this.)

    This is exactly the $\eta$-law, once you remember that composition of morphisms and substitution of terms are the same thing:

    $$ \newcommand{\c}[1]{\mathsf{#1}} C[e] \equiv \c{case}(e, \c{inl}\,x \to C[ \c{inl}\,x], \c{inr}\,y \to C[ \c{inr}\,y]) $$
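This $\eta$-law is precisely what fails to hold definitionally in a proof assistant. In Lean 3 with mathlib, for instance, any $h$ agrees with the copairing of $h \circ \mathsf{inl}$ and $h \circ \mathsf{inr}$, but only up to case analysis (a sketch; sum.elim is mathlib's copairing):

```lean
import data.sum.basic

example {X Y Z : Type*} (h : X ⊕ Y → Z) :
  h = sum.elim (h ∘ sum.inl) (h ∘ sum.inr) := rfl  -- fails

-- uniqueness is provable, but only by splitting into cases
example {X Y Z : Type*} (h : X ⊕ Y → Z) :
  h = sum.elim (h ∘ sum.inl) (h ∘ sum.inr) :=
by { ext x, cases x; refl }
```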

  3. Colimits without the uniqueness property are called "weak colimits", and come up fairly often in category theory. But it is still weird that $\eta$ for tuples and lambdas (the types eliminated by projection and application) is easy, while it's so hard for sum types.
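The contrast is easy to see in Lean 3, where $\eta$ for functions is definitional (this uses only core Lean, no mathlib):

```lean
-- η for functions holds definitionally, so rfl suffices
example {α β : Type} (f : α → β) : f = λ x, f x := rfl
```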

  • This is a great answer. I'm a little curious how it might apply to alternative calculi like System LK/Lambda-bar Mu Mu-tilde. I'm pretty sure you can have the problem in reverse for products with co-intuitionistic logic, but for mixed systems things get weirder. Also interested in how impredicative encodings deal with the issue. (Apr 19, 2022 at 17:48)

In the comments, it was elaborated that one possible interpretation of a "commuting conversion" refers to $F(G(x))$ vs $G(F(x))$ where one of $F$ or $G$ is a recursor.

The problem

Here's an example in Lean of where commuting function application with a recursor application causes two terms to not be definitionally equal:

import data.sum.basic

variables {α β : Type*} (f : α → β)

def foo_1 : α ⊕ α → β := sum.elim f f 
def foo_2 : α ⊕ α → β := λ x, f (sum.elim id id x)

example : foo_1 f = foo_2 f := rfl  -- fails
example : foo_1 f = foo_2 f := by { ext x, cases x; refl } -- ok

This causes problems when working with typeclass instances. Consider wanting to put an addition operator on the type sum.elim A B x which just inherits the addition of A l when x = inl l and of B r when x = inr r.

The naive approach looks something like:

import data.sum.basic
import algebra.group.basic

universes u
variables {α β : Type*} {A : α → Type u} {B : β → Type u}

instance sum.elim.has_add
  [ia : Π a, has_add (A a)] [ib : Π a, has_add (B a)] :
  Π x, has_add (sum.elim A B x) :=
@sum.rec _ _ (λ x, has_add (sum.elim A B x)) ia ib

instance sum.elim.add_semigroup
  [ia : Π a, add_semigroup (A a)] [ib : Π a, add_semigroup (B a)] :
  Π x, add_semigroup (sum.elim A B x) :=
@sum.rec _ _ (λ x, add_semigroup (sum.elim A B x)) ia ib

But this results in the two paths to the has_add instance not being definitionally equal:

example [Π a, add_semigroup (A a)] [Π a, add_semigroup (B a)] (x : α ⊕ β) :
  add_semigroup.to_has_add (sum.elim A B x) = sum.elim.has_add x := rfl  -- fails

which means lemmas about add_semigroup will not apply to sum.elim.has_add:

example [Π a, add_semigroup (A a)] [Π a, add_semigroup (B a)] (x : α ⊕ β)
  (a b c : sum.elim A B x) : a + b + c = a + (b + c) := add_assoc _ _ _  -- fails

The solution

The solution is to push the recursors inside the application of has_add.mk and add_semigroup.mk:

import data.sum.basic
import algebra.group.basic

universes u
variables {α β : Type*} {A : α → Type u} {B : β → Type u}

instance [ia : Π a, has_add (A a)] [ib : Π a, has_add (B a)] (x : α ⊕ β) :
  has_add (sum.elim A B x) :=
{ add := @sum.rec _ _ (λ x, sum.elim A B x → sum.elim A B x → sum.elim A B x)
           (λ l, (ia l).add) (λ r, (ib r).add) x }

instance
  [ia : Π a, add_semigroup (A a)] [ib : Π a, add_semigroup (B a)] (x : α ⊕ β) :
  add_semigroup (sum.elim A B x) :=
{ add := @sum.rec _ _ (λ x, sum.elim A B x → sum.elim A B x → sum.elim A B x)
        (λ l, (ia l).add) (λ r, (ib r).add) x,
  add_assoc := begin
    cases x,
    exact add_assoc,
    exact add_assoc,
  end }

which makes the problem go away:

example [Π a, add_semigroup (A a)] [Π a, add_semigroup (B a)] (x : α ⊕ β) :
  add_semigroup.to_has_add (sum.elim A B x) = sum.elim.has_add x := rfl  -- ok

example [Π a, add_semigroup (A a)] [Π a, add_semigroup (B a)] (x : α ⊕ β)
  (a b c : sum.elim A B x) : a + b + c = a + (b + c) := add_assoc _ _ _  -- ok
