49
$\begingroup$

It is often said that instead of proving a great theorem a mathematician's fondest dream is to prove a great lemma. Something like Kőnig's tree lemma, or Yoneda's lemma, or really anything from this list.

When I was first learning algebra, one of the key lemmas we were taught was Zorn's lemma. It was almost magical in its power and utility. However, I can't remember the last time Zorn's lemma appeared in one of my papers (even though I'm an algebraist). In pondering why this is, a few reasons occurred to me, which I'll list below. I don't want to lose my old friend Zorn, and so my question is:

What are some reasons to keep (or, perhaps in line with my thoughts, abandon) Zorn's lemma?

Edited to add: One purpose to this question is to know whether or not I should be rewriting my proofs to use Zorn's lemma, instead of my usual practice of using transfinite recursion, if there is a mathematical reason to prefer one over the other. Hopefully this clarifies the mathematical content of this question.


To motivate the discussion, let me give an example of how I would now teach ungraduates a result that was taught to me using Zorn's lemma.

Theorem: Every vector space $V$ has a basis.

Proof: First, fix a well-ordering for $V$. We will recursively work our way through the ordering, deciding whether to keep or discard elements of $V$. Suppose we have reached a vector $v$; we keep it if it is linearly independent from the previously kept vectors (equivalently, it is not in their span), otherwise we discard it. If $B$ is the set of kept vectors we see it is a basis as follows. Any vector $v\in V$ is in the span of $B$, because it is either in $B$ or in the span of the vectors previously kept. On the other hand, the elements of $B$ are linearly independent because a nontrivial combination $c_1 v_1 + \dotsb +c_k v_k=0$, where $v_1<v_2<\dotsb<v_k$ and $c_k\neq 0$ can be rearranged so $v_k$ is a linear combination of the previous vectors, so $v_k$ cannot belong to $B$ after all. $\quad\square$

Here are some of the benefits I see for this type of proof over the usual Zorn's lemma argument.

1. The use of choice is disentangled from the other parts of the proof.

When applying Zorn's lemma, it is difficult to see exactly how the axiom of choice is being used to reach the conclusion of a maximal element. One way to visualize its use is that Zorn's lemma lets us recursively build a maximal chain through the poset. This chain must have a greatest element. However, this construction is hidden behind the magic words "Abracadabra Zornify".

Is it a historical artifact that choice is hidden this way?

2. We can more easily see whether or not to use a choice principle.

In the proof above, if $V$ is already well-orderable (without AC), then we don't need to ever use the axiom of choice.

3. Zorn's lemma is no easier than transfinite recursion.

Each part of transfinite recursion already (implicitly) occurs in most Zorn's lemma arguments. The base case of the recursion corresponds, roughly, to showing that that the poset is nonempty (i.e., has some starting point). The successor ordinal step often occurs at the end; after asserting that some maximal element of the poset exists, we show that this maximal element has some claimed property by working by contradiction, and then passing to a slightly bigger element of the poset (i.e., the next successor). The limit ordinal step occurs when we show that chains have upper bounds.

4. Zorn's lemma often includes unnecessary complications.

In the proof I gave above, there is no need to define a complicated set, together with a poset relation. We can use strong induction, to avoid differentiating between the zero, successor, and nonzero limit steps. We don't need to combine the contradiction at the end with any successor step; they are entirely separated.

5. Transfinite recursion is a more fundamental principle.

As a matter of pedagogy, shouldn't we teach students about transfinite induction before we teach them a version of it that is also combined with AC, and that requires the construction of a complicated poset?

6. Transfinite recursion applies to situations where Zorn's lemma does not.

To give just one example: There are some recursions that continue along all of the ordinals (for a proper class amount of time). Zorn's requires, as a hypothesis, an end.

$\endgroup$
25
  • 27
    $\begingroup$ So, what is your question? $\endgroup$
    – Asaf Karagila
    Commented Dec 11, 2022 at 18:03
  • 17
    $\begingroup$ I would restate your question as: When presenting proofs to undergraduates, would you emphasize proofs by Zorn’s lemma or proofs by transfinite induction, and why? I think that is the clear question that fits the rest of your post, though it fits the MathEducators site better than this one. $\endgroup$
    – user44143
    Commented Dec 11, 2022 at 18:12
  • 13
    $\begingroup$ I don't think this is worth making an answer of, but it's worth noting that, in a constructive math setting, Zorn's lemma might hold even in settings where the Axiom of Choice and transfinite induction (in the guise of the Bourbaki-Witt principle) fail: see this other question and the paper of Bell it cites; see also this paper on the Bourbaki-Witt principle and its relation to the existence of large enough ordinals. $\endgroup$
    – Gro-Tsen
    Commented Dec 11, 2022 at 19:08
  • 15
    $\begingroup$ Can we please leave this question open because the discussion around it gives very interesting insights about the connection between AC and ZL? Thank you. And on a personal note, ZL is a good friend of mine, :-). $\endgroup$ Commented Dec 11, 2022 at 21:46
  • 15
    $\begingroup$ Yes, Zorn's Lemma is just a way of setting up a transfinite recursion, and any proof that invokes it could just as well inline its argument instead. But that's hardly specific to Zorn's Lemma. Every lemma is just a way of using that lemma's proof. Zorn's Lemma is a particular usage pattern for transfinite recursion which comes up over and over. Ok, let's give that a name to recognize and understand that pattern quickly each time it comes up again. That's what we usually do in math, we name the patterns we see a lot so we needn't think about them from scratch each time. No more to it than that. $\endgroup$ Commented Dec 12, 2022 at 6:00

7 Answers 7

58
$\begingroup$

I agree with almost everything in your post. But still, I believe I know why people use Zorn's lemma.

My answer. Zorn's lemma encapsulates succinctly many of the consequences of AC via transfinite recursion, but without requiring any involvement of the ordinals or knowledge of transfinite recursion to be used.

To those who are deeply familiar with transfinite recursion, of course, every use of Zorn's lemma can be seen as sumblimating the underlying construction, which achieves the maximal elements by a transfinite process that simultanesously explains why they exist. To appeal to Zorn seems to hide this essential explanatory underlying mechanism.

And yet, the alternative perspective is that Zorn's lemma abstracts away from the recursive process, producing in the end a simpler argument that relies only on the core consequences of the recursive process, which do not rely on any explicit engagement with ordinals or recursion. And precisely because of that feature, Zorn's lemma arguments can be undertaken and understood by mathematicians who are unfamiliar with the ordinals and transfinite recursion.

In the vector space example, to show every vector space has a basis, one can mount a transfinite recursive process: you pick an element, and then if it doesn't span, you pick another, and so on transfinitely until you have a basis. (My view of this example is a little different from how you described it, since I view the choice function as more primitive than the well order — I would build the basis by choosing amongst the element not yet in the span — indeed I prefer to view WOP itself as the outcome of recursively choosing elements.)

With Zorn's lemma, however, there is no need for ordinals or transfinite recursion, and the Zorn's lemma argument instead encapsulates abstractly their effects — the partial order consists in effect of partial undertakings of the recursive process. In this sense, the Zorn argument is simpler, abstracting away from the transfinite constructive "process".

I find the situation to be analogous to Martin's axiom and forcing. Martin's axiom is the poor mathematician's forcing, just as Zorn's lemma is the poor mathematician's choice+transfinite recursion.

My personal view is that the ordinals and transfinite recursion are one of the wonders of mathematics, a sublime achievement of the intellect resulting in many beautiful arguments and constructions. I tend to prefer the transfinite recursive arguments as providing a deeply explanatory account of the consequences of Zorn's lemma. (Even the well-order theorem seems fundamentally less mysterious when explained via the transfinite recursion — pick any element as the least element, and now pick a next element, and a next, and so on transfinitely.)

Further, although I recognize that many mathematicians have little involvement or experience with the ordinals and transfinite recursion, I also believe that their mathematical life would be improved by knowing more of them.

$\endgroup$
14
  • 24
    $\begingroup$ I agree. Well-ordered uncountable sets, ordinals, and transfinite induction are often regarded by "ordinary mathematicians" as unpleasant intrusions from the realm of logic and set theory. Zorn's lemma looks more like "ordinary mathematics" and hence is regarded as a minimally invasive surgical procedure. So it's a question of whether you want to cater to their prejudices, or instead teach them to stop worrying and learn to love the bomb. $\endgroup$ Commented Dec 11, 2022 at 18:36
  • 4
    $\begingroup$ While I follow your reasoning that "transfinite recursive arguments [provide] a deeply explanatory account of the consequences of Zorn's lemma", I find your claim that "Zorn's lemma is the poor mathematician's choice+transfinite recursion" somewhat bewildering. Obviously, when writing proofs (be it for a course or for a paper), it is necessary to make certain choices regarding which points to stress in the presentation. A deeper explanation of a topic also means that more resources need to be invested by the audience to understand this explanation. So when writing, say, [...] $\endgroup$ Commented Dec 11, 2022 at 21:24
  • 3
    $\begingroup$ @JochenGlueck: Does ZL require less overhead? Not really. Just like you don't prove things by (standard) induction by defining a set $S=\{n\in\Bbb N\mid n\text{ has some property}\}$ and then showing that $S$ is an inductive set; but rather you just write the inductive proof, this is not very different from transfinite recursion. Especially in ZL-type arguments that always have finite character (see the Teichmüller–Tukey Lemma), and so the limit steps tend to be kind of trivial. $\endgroup$
    – Asaf Karagila
    Commented Dec 12, 2022 at 0:27
  • 6
    $\begingroup$ @TimothyChow, I hope the joke in your analogy was intentional: Dr Strangelove was modeled in part on the very mathematician who gave the now-standard definition of von Neumann ordinals. $\endgroup$
    – user44143
    Commented Dec 12, 2022 at 3:40
  • 4
    $\begingroup$ @MattF. I often see that claim repeated on blogs, but at least Peter Sellers said quite definitively that "[Dr. Strangelove] was always Wernher von Braun." $\endgroup$ Commented Dec 12, 2022 at 20:01
41
$\begingroup$

I agree with the existing answers, but I personally like Zorn's lemma both pedagogically and mathematically for an additional reason: the "poset of partial solutions" that it introduces is a valuable new idea in its own right. Even when it fails to have a maximal element for some reason - either because we're working in a choiceless context or, more commonly, because the poset doesn't satisfy the Zornian hypotheses - it can still be a useful tool. In particular I think of forcing arguments as such an application.

So for me, Zorn's lemma feels central because of how it enriches the idea of partial orders rather than because of its role as a theorem-prover. This is of course subjective, but transfinite recursion/induction doesn't really have the same impact for me.


EDIT: In this context, as much as I love well-orderings I think the reaction to transfinite recursion/induction described in Timothy's comment to Joel's answer is understandable. If I want to use TR/I to prove that maximal ideals exist, I need choice to build me the relevant well-ordering (or choice function) but once I have it everything is choice-free. By contrast, the poset of partial solutions exists and is easy to describe, it's just that in the absence of choice it might not have the properties I need it to. In caricature, Zorn's lemma amounts to a wild analysis of a straightforward object while the TR/I approach is a straightforward analysis of a wild object.

$\endgroup$
23
  • 3
    $\begingroup$ This is in line with Matteo Viale's approach to Zorn's Lemma as a forcing axiom. $\endgroup$
    – Asaf Karagila
    Commented Dec 11, 2022 at 21:47
  • 4
    $\begingroup$ One good example to illustrate this is the lemma in field theory that if $L/K$ is algebraic and $M/K$ is algebraically closed, there is a $K$-homomorphism $L \to M$. The poset of all partial extensions is clearly interesting in its own right, and to be honest it would be very awkward to me to write this proof down with a well-ordering of $L$. $\endgroup$ Commented Dec 11, 2022 at 21:53
  • 3
    $\begingroup$ That "caricature" is spot on. $\endgroup$
    – Asaf Karagila
    Commented Dec 12, 2022 at 0:28
  • 4
    $\begingroup$ @OlegEroshkin That's a much harder question to answer, because if I had an example, one could always argue that the basis you get from the well-order is an explicit one. $\endgroup$ Commented Dec 12, 2022 at 21:33
  • 4
    $\begingroup$ I think this conversation should probably continue in chat rather than the comment thread (if only for the selfish reason that I keep getting pinged!). $\endgroup$ Commented Dec 12, 2022 at 21:40
19
$\begingroup$

Your choice (ha) to prove the existence of a basis by using a well-ordering in place of Zorn’s Lemma turns things around historically: the first proof of the existence of algebraic bases (using only finite linear combinations) in an infinite-dimensional example was given by Hamel in 1905 for $\mathbf R$ as a $\mathbf Q$-vector space using well-orderings (his interest in this was to show there are additive maps $\mathbf R \to \mathbf R$ that are not of the form $f(x)=cx$). That is why vector space bases in the algebraic sense are called Hamel bases.

Zorn’s paper proving Zorn’s lemma appeared in 1935. But many abstract existence theorems that are usually proved today with Zorn’s lemma were known before 1935 and they were first proved by well-ordering or transfinite induction. For example, Steinitz proved the existence of an algebraic closure of every field in 1910 using well-ordering and Hahn and Banach proved (independently) the existence of an extension of bounded linear functionals in the late 1920s used transfinite induction. Some originally controversial proofs based on the axiom of choice are listed in Section 6 here, and it is still common to use choice directly in some of those proofs today (like Vitali’s nonmeasurable subset of the real numbers).

While you may feel proofs making a direct use of a well-ordering or transfinite induction is more appealing than Zorn’s lemma, the formulation of these equivalent statements in a proof as Zorn’s lemma is more attractive to others, at least psychologically: it is regarded as more convenient. (Ultimately such proofs are nonconstructive no matter what method is used.) Your point $3$ (“Zorn’s lemma is no easier than transfinite induction”) is not borne out in practice: most mathematicians do think Zorn’s lemma is easier. You could argue that this is only true today because all the modern textbooks outside of set theory have been written to emphasize Zorn’s lemma over other approaches (to the extent that the other ones may not even be mentioned except in passing), but the reason they are written this way is because mathematicians in the era after 1935 also felt Zorn’s lemma is simpler. After all, many fundamental existence theorems we see today that are proved by Zorn’s lemma were first proved by well-ordering or transfinite recursion. Ask yourself why those original proofs were abandoned.

The purpose of Zorn’s paper was to offer a new principle that could replace the direct use of transfinite induction, especially in algebra, and it succeeded: outside of set theory, proofs that were originally done with well-orderings or transfinite induction are now essentially all done with Zorn’s lemma. As Joel indicates in his answer, proofs using Zorn’s lemma avoid needing to develop the background material about ordinal numbers and their well-ordering. That is arguably why proofs by Zorn’s lemma became the dominant approach in mathematics.

While you said that in contrast to your student days you can’t remember the last time you used Zorn’s lemma, perhaps you have more recently used results that are consequences of it (e.g., maximal ideals in a general nonzero commutative ring or the algebraic closure of a general field or the extension of a non-Archimedean absolute value from a field to a huge extension of that field). If so, then your mathematical work would still be relying on it. Also consider that Zorn’s lemma is a popular tool for foundational existence theorems in great generality throughout mathematics, and once you get past the foundations you are using those existence theorems all the time rather than the methods like Zorn’s lemma that proved those existence theorems. For example, the most fundamental foundational result in functional analysis is the Hahn-Banach theorem. It is proved by Zorn’s lemma in all functional analysis books today, so even if a functional analyst is not directly using Zorn’s lemma in their work, they are indirectly using it each time they appeal to the Hahn-Banach theorem.

$\endgroup$
22
  • 7
    $\begingroup$ I am curious about the point "most mathematicians do think Zorn’s lemma is easier." How do you know? At least for me, one thing reading MO teaches me is that most things I think I know about most mathematicians are wrong. $\endgroup$
    – LSpice
    Commented Dec 11, 2022 at 19:03
  • 5
    $\begingroup$ @LSpice concerning most mathematicians thinking Zorn’s lemma is easier to use, today I think this holds from the way papers and books have been written for many decades. (Shall we blame van der Waerden?) Zorn’s lemma takes getting used to, but it’s in books on group theory, field theory, commutative algebra, topology, functional analysis, etc. so you catch on to how it works. Proofs 100 years ago via transfinite recursion or well-ordering have largely disappeared from the literature that is not in set theory or areas of math near set theory. Do you often see proofs by transfinite induction? $\endgroup$
    – KConrad
    Commented Dec 11, 2022 at 19:41
  • 7
    $\begingroup$ @KConrad I have caught myself coming up with proofs by transfinite recursion and then rewriting them to cover my tracks. There might be others. $\endgroup$ Commented Dec 11, 2022 at 19:49
  • 6
    $\begingroup$ @LSpice: I agree with your point that it's difficult to know what "most mathematicians" think. Regarding your remark on reading MO though, I think it's fair to point out that the MO community is not representative of the mathematical research community in general. Arbitrary example: a search for the algebraic geometry tag on MO yields 20,059 questions, while a search for the PDE tag yields 3,826 questions. Comparing this to the arXiv (which yields, for 2022, twice as many results for PDEs than for algebraic geometry) indicates that algebraic geometry is extremely overrepresented on MO. $\endgroup$ Commented Dec 11, 2022 at 20:07
  • 3
    $\begingroup$ @KConrad I mostly agree with you, but as one example of a proof by transfinite induction in an algebra book, see Chapter 4, Exercise 17 in Atiyah and Macdonald. $\endgroup$ Commented Dec 11, 2022 at 20:45
13
$\begingroup$

The answers and comments so far contain a lot of abstract and philosophical discussion. Personally I think that, when comparing the advantages and disadvantages of two approaches, concrete examples are important. So I'll give one here (in response to the OP's suggestion from a comment).

Let us compare how Zorn's lemma and transfinite recursion work for the following theorem.

Theorem. Let $R$ be a non-zero ring with a multiplicatively neutral element $1$. Then there exists a maximal ideal in $R$.

Below, there are four proofs of the theorem - one by Zorn's lemma and three by transfinite recursion. The one by Zoern's lemma and the first two by transfinite recursion need the following observation, which I thus outsource into the subsequent lemma.

Note. In an earlier version of this answer I (that contained fewer proofs rather than four) I compared the proofs and stated my opinion about which of the proofs I find easier. Since then, the answer has been expanded, so I removed my personal evaluation of the "simplicity" of the proofs - but I think it is still worthwhile to have all those proofs explicitly written down here, so that one can easily compare them.

Lemma. If $\mathcal{I}$ is a non-empty set of proper ideals in $R$ which is totally ordered with respect to set inclusion, then $\bigcup \mathcal{I}$ is also a proper ideal in $R$.

Proof of the lemma. Let $a,b \in \bigcup \mathcal{I}$ and $r \in R$. Then there are $I_1,I_2 \in I$ such that $a \in I_1$ and $b \in I_2$. As $\mathcal{I}$ is totally ordered, one of the ideals $I_1,I_2$ - call it $I_k$ - contains the other. Hence, $a,b \in I_k$, and thus the elements $a+b$, $ab$, $ar$, $ra$ are also in $I_k$ and thus in $\bigcup \mathcal{I}$. So $\bigcup \mathcal{I}$ is indeed an ideal. Finally, this ideal is proper since none of the ideals in $\mathcal{I}$ contains $1$, and hence neither does $\bigcup \mathcal{I}$. $\square$

Proof of the theorem via Zorn's lemma: Let $\mathcal{J}$ denote the set of all proper ideals in $R$; this is a poset with respect to set inclusion. Note that $\mathcal{J}$ is non-empty since it contains $\{0\}$. If $\mathcal{I} \subseteq \mathcal{J}$ is a non-empty chain, then the lemma implies that $\bigcup \mathcal{I} \in \mathcal{J}$, and clearly $\bigcup \mathcal{I}$ is an upper bound of $\mathcal{I}$. So by Zorn's lemma $\mathcal{J}$ has a maximal element. $\square$

First proof of the theorem by transfinite recursion (taken from a comment by Joel David Hamkins). We recursively define an increasing family of proper ideals $I_\beta$, indexed over the ordinal numbers:

  • $I_0 := \{0\}$.

  • If $\beta$ has a predeccesor $\alpha$, see if $I_\alpha$ is maximal. If yes, the proof is complete; if not, choose $I_\beta$ to be a larger proper ideal than $I_\alpha$.

  • If $\beta$ is a limit point, define $I_\beta$ to be the union of the preceding ideals. Then $I_\beta$ is, according to the lemma, also a proper ideal.

This procedure has to terminate at some ordinal number; if it didn't, we would find an ideal in $R$ whose cardinality is strictly larger than the cardinality of $R$. $\square$

Second proof of the theorem by transfinite recursion. Endow $R$ with a well-ordering. We recursively define proper ideals $I_r$ for each $r \in R$: assume that $I_s$ has already been defined for all $s < r$. If the ideal generated by $\bigcup_{s < r} I_s \cup \{r\}$ is not equal to $R$, then we define $I_r$ to be this ideal; this is clearly proper. Otherwise we define $I_r := \{0\}$ if $r$ is the smallest element in $R$ (so $I_r$ is proper since $R \ne \{0\}$), and $I_r:= \bigcup_{s < r} I_s$ in any other case; then $I_r$ is a proper ideal due to the lemma.

The family $(I_r)_{r \in R}$ is increasing, so the lemma implies that $I := \bigcup_{r \in R} I_r$ is a proper ideal. If $t \in R$ is not in $I$, then $t \not\in I_t$, so the ideal generated by $\bigcup_{s < t} I_s \cup \{t\}$ is $R$, and hence the ideal generated by $I \cup \{t\}$ is $R$. Thus, $I$ is maximal. $\square$

Added by Pace Nielsen:

A different proof of the theorem by transfinite recursion, without using the lemma. Given a well-ordering $(r_{\alpha})$ for $R$, we recursively set $$s_{\alpha}:= \left\{ \begin{matrix} r_{\alpha} & \text{ if }1\notin \langle r_{\alpha},s_{\beta}\ :\ \beta<\alpha\rangle\\ 0 & \text{ if }1\in \langle r_{\alpha},s_{\beta}\ :\ \beta<\alpha\rangle.\end{matrix}\right.$$ The ideal $M$ generated by the $s$'s is not $R$, otherwise $1$ would be generated by a minimal finite number of them $s_{\alpha_1},\ldots,s_{\alpha_k}$, with $\alpha_1<\ldots< \alpha_k$. This contradicts the minimality of $k$ if $s_{\alpha_k}=0$ (noting that $k\neq 0$ since $R\neq \{0\}$), otherwise this directly contradicts the definition of $s_{\alpha_k}$. Moreover, no new element can be adjoined to $M$ without generating the trivial ideal, so $M$ is maximal. $\square$

(If you like, you could also explain why the ideal $M$ equals the set of $s$ elements.)

$\endgroup$
31
  • 6
    $\begingroup$ So, your argument is "I am much less familiar with TR/I, so it requires more careful proof writing and therefore significantly more overhead and more time in reading it as well". But that is not a good argument. Now, don't get me wrong. Some proof do lend themselves to ZL much more naturally than to TR/I, but the "overhead" is about how sloppy you are willing to be. In the case of ZL we all have so much experience, that we can be very sloppy. I have seen enough set theory papers where the proof literally read "By transfinite recursion." or something like that, which have no overheads. $\endgroup$
    – Asaf Karagila
    Commented Dec 12, 2022 at 11:53
  • 8
    $\begingroup$ @AsafKaragila: No, my argument is that both proofs in my answer are equally carefully written, and still the second one is much longer (in fact I'm not under the impression that I am less familiar with transfinite induction than with ZL; I use both of them now and then). You seem to argue that the second proof includes more details than the first one, though. Could you please elaborate which details you would add to the ZL proof such that you would consider it to have a similar level of detail as the second proof? $\endgroup$ Commented Dec 12, 2022 at 12:17
  • 4
    $\begingroup$ "Fix a well-ordering of $R$, and construct by recursion an increasing sequence of ideals, $I_r$, by adding the $\alpha$th element in the $\alpha$th stage, if possible; taking unions at limit stages. This has to terminate, and therefore the ideal has to be maximal." $\endgroup$
    – Asaf Karagila
    Commented Dec 12, 2022 at 13:19
  • 4
    $\begingroup$ My version of the proof by transfinite recursion: start with the trivial ideal, and at each stage, choose a proper extension if there is one. At limits take unions. Since this can't continue forever, one finds a maximal ideal. $\endgroup$ Commented Dec 12, 2022 at 13:19
  • 6
    $\begingroup$ @AsafKaragila: "Fix a well-ordering of R, and construct by recursion an increasing sequence of ideals, $I_r$, by adding the αth element in the αth stage" But this is not what actually happens: at some stages you do not add the $\alpha$th element, but continue without it. Apart from this, you are suggesting to shorten the recursive proof by leaving out a lot of technical details, whereas the ZL proof that I gave does not leave out any technical details at all. (If you disagree with this, it would be nice if you could point out which technical details you think are missing.) $\endgroup$ Commented Dec 12, 2022 at 14:07
9
$\begingroup$

Before transfinite induction or Zorn's lemma, there's already a choice with the integers. You can study elementary number theory using weak induction (WI)

$$(\forall P \subseteq \mathbb N)\bigl((P(0) \wedge \forall n (P(n) \implies P(n+1))) \implies \forall n P(n)\bigr) \tag{*}\label{WI}$$

or using the least-number principle (LNP):

$$(\forall P \subseteq \mathbb N)\bigl(P = \emptyset \vee \exists n (P(n) \wedge \forall m (P(m) \implies n \leq m))\bigr). \tag{**}\label{LNP}$$

Remarks:

  • I find WI closer in spirit to transfinite induction, and LNP closer in spirit to Zorn's lemma.

  • My sense is that WI is a more common pedagogical choice in this setting than LNP.

  • Nevertheless, it's possible to teach elementary number theory by invoking LNP repeatedly without ever discussing WI. For instance, the Ross Mathematics Program takes this approach systematically.

  • (It occurs to me that LNP only says that $\mathbb N$ is well-ordered, while WI also proves that $\mathbb N$ has order type $\omega$. The stronger claim follows from LNP plus the fact that $\mathbb N$ is an ordered commutative monoid with cancellative addition.)

Pros and cons of WI versus LNP

The philosophy at Ross is that students understand things more deeply by using LNP rather than WI.

  1. WI feels much more like a separate "law of reasoning", on a par with modus ponens, whereas LNP feels much more like an "axiom", more on a par with the associativity of multiplication in a group. I think that most mathematicians outside logic more-or-less learn the basic "laws of reasoning" once, and then set them in stone, while fluidly learning, inventing and considering new axioms for new mathematical objects, all the time. It’s convenient for them to keep the "laws of reasoning" to a minimum, packaging as many mathematical concepts as possible as "axioms", which are more flexible and more amenable to variations, generalizations, and modifications later.

  2. This is particularly convenient at Ross, where students are introduced to many things at once, e.g. the basic laws of reasoning at the same time as the abstract notion of a ring. For them the natural numbers $\mathbb N$ play a dual role — as a primitive domain for performing induction, and as a first example of the general notion of commutative semirings. So using LNP nicely identifies $\mathbb N$ as “the unique commutative semiring which is well-ordered”, while WI identifies $\mathbb N$ more awkwardly as “a commutative semiring with a separate logical role.”

  3. LNP encourages second-order reasoning, and picturing subsets geometrically, in a way that WI doesn't.

Pros and cons of transfinite induction versus Zorn's lemma

For transfinite induction and Zorn's lemma, maybe the shoe is on the other foot. After all, the ordinals are rarely thought of as a special instance of a more general class of mathematical objects. Perhaps it's more honest to consider the ordinals logically privileged. Maybe the image of building something step by step is more helpful than the image of teleporting to the maximal element.

$\endgroup$
15
  • 3
    $\begingroup$ For me, the reason the well-ordering principle is used at the Ross program in place of induction is that when using well-ordering there to prove there are no integers between 0 and 1 (sketch: if there is $0 < n < 1$ for a positive integer $n$, then $0 < n^2 < n$, so the nonempty set $\{n : 0 < n < 1\}$ has no least element, contradicting well-ordering), it feels like a proof. But proving that by induction ($1 \geq 1$, and if $n \geq 1$ then $n+1 \geq 1$) feels circular because the only reason we accept induction is that we know intuitively that the positive integers start at 1! $\endgroup$
    – KConrad
    Commented Dec 11, 2022 at 20:49
  • 6
    $\begingroup$ Slightly off topic, but people might find this article interesting: Are induction and well-ordering equivalent? by Lars–Daniel Öhman. $\endgroup$ Commented Dec 11, 2022 at 20:49
  • 2
    $\begingroup$ To my way of thinking, ($\ast$) and ($\ast\ast$) are essentially identical, proved equivalent by a very easy argument, and it seems unimportant to distinguish greatly between them. A deeper understanding of induction emphasizes this flexibility. In contrast, the main distinction to make wrt the question is not involving transfinite induction at all but rather transfinite recursion, that is, the distinction between a method of proof and a method of construction. Applications of Zorn are often viewed as constructions, and the alternative in question is a construction by transfinite recursion. $\endgroup$ Commented Dec 11, 2022 at 21:52
  • 5
    $\begingroup$ The distinction described in this answer (and in the nice paper of Öhman that @TimothyChow links) is quite different from the Zorn’s Lemma / transfinite induction distinction, and also from other distinctions sometimes called “well-ordering” vs “induction”. The distinction here is specific to the naturals: it’s the difference between inductiveness/well-foundedness of the “sucessor” relation, and of the “less-than” relation. So it’s a nice point to discuss, and pedagogically important, but quite orthogonal to questions about more general well-orders. $\endgroup$ Commented Dec 12, 2022 at 16:18
  • 3
    $\begingroup$ @DavidESpeyer, you can also say “minimal criminal” instead of “minimal counterexample” for LNP. Maybe undergrads would enjoy the rhyme — I don’t know that I’ve heard that phrase more than once, but I remember it from my undergrad combinatorics class with Katalin Vesztergombi. $\endgroup$
    – user44143
    Commented Dec 12, 2022 at 20:52
6
$\begingroup$

First, I like this organization because it gives a better intuition for what we're likely to lose if we would prefer to trade Choice for Countable Choice, say, since all the work of AoC is done in "Fix a well-ordering."

To answer your question, I'd say your presentation suggests that we should be exactly as comfortable with Zorn as we are with the Well-Ordering Principle, and allows the reader to follow that intuition where it leads -- if a reader is fine with asserting that we can fix a well-ordering, the rest follows; if a reader is going to choke on that then the rest is irrelevant, to some extent.

$\endgroup$
6
$\begingroup$

Here's another example, from a rather worm's-eye view. (Apologies for too many edits.)

Theorem: Let $R$ be a non-zero commutative ring with $1$. Then there exists a minimal prime ideal in $R$, i.e., a prime ideal which is minimal among prime ideals.

Lemma: Let $C$ be a totally-ordered collection of prime ideals. The intersection $\bigcap C$ is a prime ideal.

The proof of the lemma is pretty straightforward algebra.

Proof of Lemma: Suppose $xy \in \bigcap C$, but $x \notin \bigcap C$. There is some $P \in C$, $x \notin P$. Let $Q \in C$. If $Q \subseteq P$, we have $x \notin Q$, but $xy \in Q$; since $Q$ is prime, $y \in Q$. In particular, $y \in P$. If $P \subseteq Q$ then $y \in Q$. So $y \in Q$ for all $Q \in C$, and $y \in \bigcap C$. $\square$

Now, the theorem.

Proof (Zorn's Lemma): Let $S$ be the set of prime ideals in $R$. $S$ is nonempty, e.g., $R$ has at least one maximal ideal and it is prime. If $C \subseteq S$ is a totally-ordered collection of prime ideals then the intersection $\bigcap C$ is a prime ideal by the lemma. The collection $S$ has chains bounded (below), so by Zorn's lemma it has a minimal element. $\square$

Proof (Transfinite recursion; following comment below): Start from a maximal ideal $P_0$. At each $\alpha$, if $P_\alpha$ is not minimal then let $P_{\alpha+1}$ be a smaller prime ideal. At limit steps, take intersections (this uses the algebraic lemma). The recursion ends on a minimal prime ideal. $\square$

Proof (Well ordering): Well-order the elements of $R$. Start from a maximal ideal $P_{\varnothing}$. At each element $r$, consider the intersection of the prime ideals at elements before $r$; if it has a prime subideal not containing $r$, choose one such---maybe the lexicographically first such?---to be $P_r$, otherwise set $P_r$ to be that intersection. This gives a nested sequence of prime ideals, whose intersection is a prime ideal, and it is minimal because if there were a prime subideal avoiding any particular element $r$ in the intersection, then we would have already passed to a subideal avoiding $r$ at the $r$'th step. $\square$

All three of these proofs are very similar. They produce the minimal prime ideal by "sculpting" rather than by "building", i.e., carving away bad parts of the ring instead of gathering together elements of the ideal. (The slightly embarrassing earlier edits of this answer reveal my failure to find a "building" proof.)

I can't argue that the last two proofs are any harder to find or less natural than the Zorn's lemma proof. (I needed a hint, but that's just me.) But at least for this example there is no way that the Zorn proof is longer or more complicated.

$\endgroup$
3
  • 4
    $\begingroup$ A transfinite recursion proof would proceed as follows: We construct a descending chain of prime ideals. To start, let $P_0$ be any maximal ideal (which requires AC to produce in the first place). At the successor steps, if $P_{\alpha}$ is not minimal, let $P_{\alpha+1}$ be any prime ideal below it (which is where we use choice). At the limit steps, take intersections. The recursion stops at a minimal prime ideal. $\endgroup$ Commented Dec 12, 2022 at 22:32
  • 1
    $\begingroup$ Thanks. That makes sense. ... Is that also a hint for well-ordering? Maybe I should have started with a maximal ideal, and for each $r$, if there's a prime subideal of the current ideal that avoids $r$, then move to that ideal. We get a descending chain of prime ideals that avoids "as much as possible". $\endgroup$ Commented Dec 12, 2022 at 22:36
  • 1
    $\begingroup$ That would work, although I should mention that my question isn't so much about well-ordering. Rather, it is about the difference between recursively constructing objects (while using choice as needed, be it via well-ordering the back ring, or not) vs. using Zorn's lemma to assert the existence of the object you want. $\endgroup$ Commented Dec 12, 2022 at 23:07

Not the answer you're looking for? Browse other questions tagged or ask your own question.