Harald Hanche-Olsen posted some interesting code in answer to this question.

Could someone explain how it works?

I understand (I think) the futurelet and the expansions. I don't understand:

  1. Where does the <null> character come from in \ifcat\next9?
  2. How does the expanded text actually get broken into the \hbox's? (Pages 98-99 of the TeXbook didn't really help me understand the trace output in this case, so I'm hoping someone can clarify.)


Here is the code reproduced from the referenced answer:


\def\foo#1#2{\vtop{\hsize=#1\rightskip=0pt plus 1fil \leftskip=0pt\noindent\breaknumberanywhere#2}}

Here is a long number:

Short answer to your question number 2: 178239… becomes

\hskip0pt 1\hskip0pt 7\hskip0pt 8\hskip0pt 2\hskip0pt 3\hskip0pt 9\hskip0pt …

and that gets broken into lines by TeX's built-in algorithm for breaking paragraphs into lines. That is why I set \rightskip to be stretchable; otherwise I get overfull \hboxes, unless I very carefully set \hsize to an exact multiple of the width of a digit (in most fonts, it appears, all digits have the same width, which is very handy in number tables).
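To see the effect of the stretchable \rightskip in isolation, here is a minimal plain TeX sketch with the \hskip0pt glue typed in by hand. The width 3.2em is an assumed value, picked only so that it is not a multiple of the digit width. Going by the explanation above, removing the `plus 1fil` should bring the overfull \hbox warnings back.

```tex
% Plain TeX sketch; run with: tex file.tex
% 3.2em is an assumed width, deliberately not a multiple of the digit width.
\vtop{\hsize=3.2em
  \rightskip=0pt plus 1fil % lets each line end short of \hsize
  \leftskip=0pt
  \noindent
  \hskip0pt 1\hskip0pt 7\hskip0pt 8\hskip0pt 2\hskip0pt 3\hskip0pt 9%
  \hskip0pt 1\hskip0pt 7\hskip0pt 8\hskip0pt 2\hskip0pt 3\hskip0pt 9}
\bye
```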

Regarding the first question, I can at least explain that \ifcat\next9 checks for equality of category codes, or catcodes for short. The 9 here is just a digit token, not category code 9. The number is terminated by a closing brace, which has a different catcode from the digits, so I just compare the catcode of the token that \next has been \futurelet to with that of an arbitrarily chosen digit, to see whether \next really is a digit.
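The catcode comparison can be tried on its own. This is a small plain TeX sketch that uses \let instead of \futurelet to set \next, then writes the outcome of the same \ifcat\next9 test to the terminal. It should report "same" for the digit, "different" for the closing brace, and "same" again for the period, since \ifcat only compares category codes and punctuation shares catcode 12 with the digits.

```tex
% Plain TeX sketch; the results are written to the terminal and the log.
\let\next=1 % \next now stands for a digit (catcode 12, "other")
\message{digit: \ifcat\next9 same\else different\fi}
\let\next=} % \next now stands for a closing brace (catcode 2)
\message{brace: \ifcat\next9 same\else different\fi}
\let\next=. % punctuation is also catcode 12, so the test cannot tell it from a digit
\message{period: \ifcat\next9 same\else different\fi}
\bye
```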

\breaknumberanywhere works by assigning the next token to \next without removing it from the input stream, then inserting \breaknumberi at the front of the input stream. So that macro, when run, has the next token available as \next for testing purposes, and if it is a digit, inserts \breaknumberii at the front of the input stream. The \expandafter trick is there to get rid of the \fi token (while terminating the \if), so \breaknumberii will see the next digit as the first available token (where it will become #1). Then it restarts \breaknumberanywhere to process the next digit.
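The \expandafter trick can be seen in isolation with a small plain TeX sketch. \grab is a hypothetical one-argument macro invented for this illustration; it is not part of the original code. The first message should show [x] and the second []x, because without \expandafter the macro picks up \fi as its argument instead of the token that follows.

```tex
% Plain TeX sketch; \grab is a hypothetical one-argument macro.
\def\grab#1{[#1]}
% With \expandafter, \fi is expanded (and so removed) first; \grab then takes x.
\message{\iftrue\expandafter\grab\fi x}
% Without it, \grab takes \fi itself as its argument and x is left outside.
\message{\iftrue\grab\fi x}
\bye
```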

And here endeth today's sermon.

I figured this out before I noticed Harald's answer, which is probably better, but here is my stab at it. Say the TeX file has

\breaknumberanywhere 178239

in it.

  1. \breaknumberanywhere is replaced with \hskip0pt\futurelet\next\breaknumberi. So the line becomes

    \hskip0pt\futurelet\next\breaknumberi 178239
  2. \futurelet\a\b\c is equivalent to \let\a\c\b\c, so you have something like

    \hskip0pt\let\next1\breaknumberi 178239
  3. Now \breaknumberi is expanded, resulting in

    \hskip0pt\let\next1\ifcat\next9\expandafter\breaknumberii\fi 178239
  4. Now \ifcat compares the category codes of the next two tokens, which are \next (that is, 1) and 9. (It does not check whether the category code of \next is 9, the ignored or <null> character, which is what you might be thinking based on your first question.) These do have the same category code, so the true branch is taken. The \expandafter there expands the \fi first, so the conditional is finished before \breaknumberii runs. The result is

    \hskip0pt\breaknumberii 178239
  5. \breaknumberii takes the next token (1) as its argument and replaces it, resulting in

    \hskip0pt 1\breaknumberanywhere 78239
  6. Now we go back to step 1!

The result is an \hskip0pt between every pair of adjacent digits of the long number, which allows TeX to break the line between any two digits. The token <tok> coming after the 9 will not have the same category code as a digit, so \breaknumberi <tok> will just result in <tok> by itself instead of \breaknumberii <tok>.
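Putting the steps together, here is a minimal plain TeX sketch of the whole thing. The three \breaknumber... definitions are reconstructed from the expansions described in the steps above, so treat them as an assumption rather than a verbatim copy of Harald's code. \foo is the macro quoted in the question, and the width 3.2em and the sample number are arbitrary values chosen for illustration.

```tex
% Plain TeX sketch. The three \breaknumber... definitions are reconstructed
% from the expansion steps described above, so treat them as an assumption
% rather than a verbatim copy of the original code.
\def\breaknumberanywhere{\hskip0pt\futurelet\next\breaknumberi}
\def\breaknumberi{\ifcat\next9\expandafter\breaknumberii\fi}
\def\breaknumberii#1{#1\breaknumberanywhere}

% \foo as quoted in the question: #1 is the box width, #2 the number.
\def\foo#1#2{\vtop{\hsize=#1\rightskip=0pt plus 1fil \leftskip=0pt\noindent\breaknumberanywhere#2}}

% 3.2em is an arbitrary width chosen for illustration.
\foo{3.2em}{178239178239178239}
\bye
```

Here the token after the last digit is the closing brace of the \vtop, so the recursion stops there and the brace ends the box as usual.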

