Do I need to add a space before the character when changing its category code?

Question

At the bottom of page 307 of The TeXbook, it says

TeX always reads the token following a constant before evaluating that constant.

So,

{\catcode`\>=2 >

is different from

{\catcode`>=2>

It is true, as {\catcode`\>=2>hello\par outputs

But if there is

\catcode`\<=1

before that, the result is changed. Such as

{\catcode`\>=2>hello\par
\catcode`\<=1
{\catcode`\>=2>hello\par
\bye

outputs

Why does

\catcode`\<=1

affect

{\catcode`\>=2>hello\par

It doesn't. If you comment out \catcode`\<=1 you get the same output. — cfr, Commented Nov 23, 2023 at 4:15
I think it is because TeX finds the closing group in the third line because of what is on the first line. It's the fact you've changed the catcode already which makes the difference. The third line is equivalent to {\catcode`\>=2}hello\par because > is end group when TeX looks for the matching end group. — cfr, Commented Nov 23, 2023 at 4:18
@cfr But why does the third line work well? Shouldn't it be the same as the first line? — Y. zeng, Commented Nov 23, 2023 at 4:20
No because you've changed the meaning. > is already end group when that line is read. On the first line, > still has its original meaning. — cfr, Commented Nov 23, 2023 at 4:21

egreg · Accepted Answer · 2023-11-23 10:43:00Z

Let's see: if you try

\tt
{\catcode`\>=2 >
\bye

you get No pages of output.

On the other hand, with

\tt
{\catcode`\>=2>
\bye

you get output, precisely “>” and

(\end occurred inside a group at level 1)

### simple group (level 1) entered at line 3 ({)
### bottom level

The explanation is easy: in the second case, the character > is tokenized before the category code assignment is finalized, hence it has category code 12. The use of \tt is to avoid getting “¿” in output.
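A minimal plain TeX sketch that puts the two cases side by side (the comments are illustrative additions, not part of the original examples). Because the first group is closed by its own >, the category code change is undone before the second line is read, and the second group is closed with an ordinary }:

```tex
\tt
{\catcode`\>=2 >% the space ends the constant, so > is read with catcode 2 and closes the group
{\catcode`\>=2>}% > is read while the constant is still being scanned: catcode 12, typeset
\bye
```

This should give a page containing just “>”, with no warning about an unclosed group.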

Actually \catcode`\<=1 affects nothing and you get the same output (and the same warning) without it.

I'll just use h and remove \par, which serves no purpose and just complicates the analysis:

{\catcode`\>=2>h{\catcode`\>=2>h

You have the following token list:

{₁ |catcode| `₁₂ |>| =₁₂ 2₁₂ >₁₂ h₁₁ {₁ |catcode| `₁₂ |>| =₁₂ 2₁₂ >₂ h₁₁

Can you see the difference? The first > has catcode 12 for the reasons explained above, but the second one has category code 2, because the assignment has been performed. So the second assignment is actually performed in a group and has essentially no effect. But you also see that adding \catcode`\<=1 has no role whatsoever.

The output is “>hh” and the warning about the unclosed group is the same, because the initial {₁ isn't balanced, while the second one is, by >₂.
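One way to check the category codes instead of guessing them is to print them with \message at each stage. This is only a sketch: the labels "before", "inside" and "after" are made up for illustration, and the group is closed explicitly so that no warning appears.

```tex
\message{before: \the\catcode`\>}% expected to report 12
{\catcode`\>=2>h% this > was read during the scan: catcode 12, typeset
\message{inside: \the\catcode`\>}% expected to report 2
>% read after the assignment: catcode 2, closes the group
\message{after: \the\catcode`\>}% expected to report 12
\bye
```

Writing `\> inside \message is safe at every stage, because \> is a control symbol and never depends on the current category code of >.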

Side note

The codes

\catcode`\>=2
\catcode`>=2

are completely equivalent when > has its standard category code 12. Escaping the character is a safety measure that's actually only needed when the character has category code 0, 5, 9, 14 or 15. For instance, you cannot assign active category code to ^^@ (byte 0) with

\catcode`^^@=13

because normally ^^@ has category code 9 (ignored). By contrast,

\catcode`\^^@=13

would do. In the \obeylines macro you'll find

\catcode`\^^M\active

for the same reason.
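The same point can be sketched with %, which normally has category code 14 (comment). The change is kept inside a group so that comments work again afterwards, and the message text is an illustrative choice:

```tex
% Unescaped, % would start a comment and swallow the rest of the line,
% so something like  \catcode`%=12  cannot work as written.
{\catcode`\%=12 \message{percent: \the\catcode`\%}}
\bye
```

The message should report 12. Note the space after 12: it terminates the constant, so the assignment is done before anything else on the line is tokenized.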

Thanks. 1. Why does \tt avoid getting "¿" in the output? 2. How did you get {1 |catcode| and so on from the log file? 3. Why should it be \catcode`\>=2 and not \catcode`>=2, that is, why use \> rather than just >? 4. In your last case, the initial {1 is balanced by >2, so there shouldn't be a warning. Why is it not balanced? — Y. zeng, Commented Nov 23, 2023 at 10:21
@Y.zeng The original Computer Modern Roman fonts are 7 bit and have ¿ in place of >. Not Computer Modern Typewriter, though. Anyway, I didn't look in the log file: I know how to guess the category codes. About the warning, you apparently forgot that you have another { in the code, didn't you? — egreg, Commented Nov 23, 2023 at 10:26
Yes. You are right. I understand now. Thanks. But what about the third question? — Y. zeng, Commented Nov 23, 2023 at 10:27
@Y.zeng \catcode`\>=2 and \catcode`>=2 are completely equivalent. — egreg, Commented Nov 23, 2023 at 10:34

cfr · Accepted Answer · 2023-11-23 05:03:07Z

{\catcode`\>=2>hello\par
\catcode`\<=1
{\catcode`\>=2>hello\par
\bye

\catcode`\<=1 is not relevant here. MWE:

{\catcode`\>=2>hello\par
{\catcode`\>=2>hello\par
\bye

When TeX reads the first line, > has its standard meaning, which is to produce a punctuation symbol. So because TeX reads the > before evaluating the 2 and changing the meaning of >, the punctuation is produced in the output stream.

TeX then evaluates the 2 and makes the category code change. That is, it changes the meaning of >. Then it continues with the line.

When TeX reads the next line, > no longer has that meaning. It now has category code 2, so it is an end-of-group marker, like }. So when TeX reads it before evaluating the 2 this time, it reads it as the end of the group rather than an instruction to typeset something.

TeX then evaluates the 2 and makes the category change (which doesn't actually change anything). Then it continues with the line.

The second line is thus equivalent to

{\catcode`\>=2}hello\par

> produces ¿ by design, just as Q produces Q. That's just how you input that punctuation mark. In LaTeX, you get the same symbol with the default font encoding, which is no surprise as it is just the default TeX encoding (OT1). If you're using a later encoding, such as T1, you can produce the same symbol with \textquestionmarkdown.

That is, > produces that particular symbol for the same reason \textquestionmarkdown does: that's how TeX has been told to handle those tokens (> in the one case and a macro in the other).

To see this, try

>hello\par
{\catcode`\>=2>hello\par
{\catcode`\>=2>hello\par
\bye

It might be easier to rewrite Knuth's example to use something more familiar. Consider

{\catcode`\Q=2Qhello\par
{\catcode`\Q=2Qhello\par
\bye
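If the goal is for Q to close the group on the very line where its category code changes, the constant has to be terminated before Q is read. A small sketch of two common ways to do that (a space, or \relax), shown as a variant of the example above rather than as part of Knuth's text:

```tex
{\catcode`\Q=2 Qhello\par% the space ends the constant; Q is then read with catcode 2
{\catcode`\Q=2\relax Qhello\par% \relax also ends the constant before Q is tokenized
\bye
```

Each Q here closes its own group, so this should typeset hello twice without any warning at \bye.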

There is another question. On the third line, when \catcode`\>= is run, why does the second > on that line still have the end-group meaning? In this situation, shouldn't it wait for the new definition? — Y. zeng, Commented Nov 23, 2023 at 4:46
@Y.zeng It can't, can it, if TeX reads it before evaluating the 2? That's the point Knuth is making, isn't it? [Also, the definition isn't actually any different, but I don't know that's relevant.] — cfr, Commented Nov 23, 2023 at 4:49
@Y.zeng Try running Knuth's example with Q rather than > (or see edit above). It's exactly the same principle but without the > business, though note > does the same in LaTeX by default e.g. \documentclass{article}\begin{document}>hello\end{document} compiled with pdfLaTeX will produce the same symbol. If you switch to the T1 encoding, you get > instead. — cfr, Commented Nov 23, 2023 at 4:57
