143

A friend and I were joking about alephs. When I tried to type א0 (with those two characters switched), they switched themselves! No sequence of symbols placed between them stops this effect. Why is this?

Try to type these with the 0 and א reversed (c&p for א):

א0

א - 0

א \\\ 0

א -./ 0

Words, however, do separate them:

א foobar 0

I'm on Arch Linux and have not tested this on any other OS yet.

EDIT: The number does not have to be zero. This happens with any digits, but not with letters.

8
  • 14
    At first glance I thought you were crazy. Turns out it's simply an artefact of how different language directions are used. Great question!
    – user366447
    Commented Jul 31, 2017 at 13:09
  • 6
    For writing Hebrew text, this order makes sense. Otherwise it would be most annoying to type stuff like ב-5 דקות (in 5 minutes).
    – ugoren
    Commented Jul 31, 2017 at 14:50
  • 7
    @jamesqf: And there is one, see IllidanS4's post. Commented Jul 31, 2017 at 18:32
  • 28
    @jamesqf, Hebrew letters exist in Unicode for writing Hebrew. And I dare say there are more of us who write Hebrew (7 million or so) than those who write about set cardinalities.
    – ugoren
    Commented Jul 31, 2017 at 20:30
  • 29
    @ugoren by some counts there are 130k mathematicians, but in truth, mathematics is the universal language, so there's really ℵ₀.
    – Nick T
    Commented Jul 31, 2017 at 22:34

8 Answers

110

'א', 'HEBREW LETTER ALEF' (U+05D0) has the BIDI (bidirectional) class "Right-to-Left [R]", because Hebrew is traditionally written right-to-left. Digits, on the other hand, have only weak directionality, so the whole chunk of aleph and zero is interpreted as being right-to-left. In that case, a character that follows another in the text is not necessarily displayed to its right, as Unicode's rather complex bidirectional rules dictate.

You have several options to work around this issue.

  1. You can use 'ℵ', 'ALEF SYMBOL' (U+2135). It's a symbol and has the left-to-right property: ℵ0.

  2. Instead of the usual digit 0, you can use a zero-like character with left-to-right directionality, such as '〇', 'IDEOGRAPHIC NUMBER ZERO' (U+3007).

  3. The cleanest way is to use the 'LEFT-TO-RIGHT MARK' (U+200E) character (Wikipedia) after the aleph: "א‎0". This is an invisible zero-width character that is defined to have left-to-right directionality. Thus, it has the same effect on the bidirectional text layout algorithm as inserting, say, a left-to-right Latin letter after the א, except that no visible letter will appear there.
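To make options 1 and 3 concrete, here is a minimal Python sketch that builds both strings from escapes (so nothing depends on how your editor displays them) and checks the bidi classes with the standard `unicodedata` module. The variable names are only illustrative.

```python
import unicodedata

HEBREW_ALEF = "\u05d0"  # HEBREW LETTER ALEF, bidi class R
ALEF_SYMBOL = "\u2135"  # ALEF SYMBOL, bidi class L
LRM = "\u200e"          # LEFT-TO-RIGHT MARK, invisible, bidi class L

# Option 1: use the mathematical symbol instead of the Hebrew letter.
with_symbol = ALEF_SYMBOL + "0"

# Option 3: keep the Hebrew letter, but put an LRM between it and the digit.
with_mark = HEBREW_ALEF + LRM + "0"

for label, text in (("symbol", with_symbol), ("mark", with_mark)):
    # Print code points and bidi classes in memory (logical) order.
    points = [f"U+{ord(ch):04X}" for ch in text]
    classes = [unicodedata.bidirectional(ch) for ch in text]
    print(label, points, classes)
```

In both strings a strong left-to-right character sits immediately before the digit, which is what keeps the zero on the right when the text is displayed.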

9
  • 74
    In a mathematical context (which I expect this is), U+2135 is the correct character to use.
    – cmbuckley
    Commented Jul 31, 2017 at 13:46
  • 10
    You have to be careful with overrides: wherever you place them in text, it is important to remove them (using the "pop directional formatting" character U+202C) when the context you wish them to operate on completes.
    – J...
    Commented Jul 31, 2017 at 14:01
  • 4
    Also, the "override" characters are kind of overkill, "embedding" is sufficient for this use case. There's also a new class called "isolate", not sure what the difference is in this situation.
    – Random832
    Commented Jul 31, 2017 at 14:48
  • 4
    I'd recommend swapping 2 and 3.
    – wizzwizz4
    Commented Jul 31, 2017 at 15:19
  • 10
    @Random832 All of those are overkill. All you really need is a left-to-right mark (U+200E) between the alef and the zero. That way you don't need any extra "pop" characters, either. Commented Jul 31, 2017 at 17:31
197

Aleph (U+05D0) is a Hebrew letter, and Hebrew is written right-to-left, so Unicode assigns it the "Right-to-Left" bidirectional class. (See Unicode TR9: Bidirectional Algorithm for more details.)

Latin letters are of course "Left-to-Right". However, zero (U+0030) is in the "European Number" bidirectional class, which is a weak class – while LtR by default, it can switch to RtL if there's a "strong" Right-to-Left character before it. (See Bidirectional Character Types and Resolving Weak Types in TR9.)

As a result, the directions of before and after are swapped for the entire word – if you put the zero 'before', it will show up to the right; if you write the zero 'after' aleph, it will show up on the left.
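If you want to check these classes yourself, the following Python sketch looks them up with the standard `unicodedata` module. The helper name `bidi_classes` is made up for this example; only `unicodedata.name` and `unicodedata.bidirectional` come from the library.

```python
import unicodedata

ALEF = "\u05d0"  # HEBREW LETTER ALEF, written as an escape to avoid display confusion

def bidi_classes(text):
    """Return (character name, bidi class) pairs in memory (logical) order."""
    return [
        (unicodedata.name(ch, "<unnamed>"), unicodedata.bidirectional(ch))
        for ch in text
    ]

# Aleph is strong right-to-left (R), a Latin letter is strong left-to-right (L),
# and the digit and the hyphen only have weak classes (EN and ES).
for name, cls in bidi_classes(ALEF + "0a-"):
    print(f"{cls:>3}  {name}")
```

The weak classes are the ones whose display position depends on the strong characters around them.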

9
  • 14
    This is an extremely common problem in a number of text editors and websites when typing in Hebrew - I imagine it's true of other right-to-left languages as well. It's certainly gotten better over time, but imagine trying to write a word problem - switching back and forth between Hebrew words (like the aleph character) and numbers (like the 0 character) repeatedly...
    – Jake
    Commented Jul 31, 2017 at 18:09
  • 3
    @Walt Most textbooks I've seen are the "immersion" type, which use extremely simple Hebrew but pretty much entirely Hebrew. It may seem counter-intuitive to use a language to teach the language, but it allows for a more organic buildup of language skills. You might see a transliteration or translation inline (something like lh4.ggpht.com/-_Vc8TUDwznQ/UlhaLFjnrGI/AAAAAAAAzQk/_zm4BMC0aLw/… - "Shalom Kita Aleph" = "Hello First Grade")
    – Jake
    Commented Jul 31, 2017 at 20:50
  • 1
    @Jake Ah, that makes sense. The only foreign language I really took was Latin; our textbooks tended to be mostly English with a single chunk of Latin text to decipher each chapter, right up until the whole class format switched from "learn Latin" to "translate this whole epic poem, a bit at a time, through the course of the school year".
    – Tin Wizard
    Commented Jul 31, 2017 at 20:55
  • 5
    @Walt: I think there may be a misunderstanding. If I type a Latin (LTR) word, then a Hebrew (RTL) word, then another Latin word, I can freely have them all in a sentence, and only the Hebrew word renders RTL. It's all designed to easily fit in the same sentence. The problem is that the number 0 is used by both LTR and RTL languages, and so the software just makes it the same direction as the previous letter. If it follows LTR characters, it's LTR. If it follows RTL letters, it's RTL. There are also overrides to swap it. fileformat.info/info/unicode/char/202d/index.htm Commented Jul 31, 2017 at 23:19
  • 4
    The zero isn't becoming RTL - it's still LTR, and a sequence of digits will show up left-to-right even with Hebrew around it, but the embedding levels interact in such a way that the zero shows up on the left of the Hebrew character preceding it in memory order. (Unicode bidirectionality is complicated.) Commented Aug 1, 2017 at 7:48
22

Perhaps a better way to achieve this would be:

echo -e "\u200F0א"

And the mandatory xkcd reference https://xkcd.com/1137/

‮LTR

14

It's perfectly possible to have a zero in front, as shown in the following example, which was made in Notepad++.

Alef with 0

What you're seeing, and what also becomes apparent if you try to select the characters in your question, is that Hebrew is written right to left and, because the 0 is directly attached, the text is handled right to left instead of left to right.

See the second example for the trouble Firefox (on my end) has with a clear selection.

Firefox selecting a right to left text

2
  • 17
    This is terrible advice, because it plays games with the actual character ordering in order to get a particular visual ordering. The other answers explain why this occurs and some include the right way to deal with it (the override and explicit direction marks).
    – Dranon
    Commented Jul 31, 2017 at 14:01
  • 8
    Could you point out where I'm including some form of advice? I'm merely showing an example of what happens, that it is indeed possible to have a suffix numeral, and providing information about why it happens like it does.
    – Seth
    Commented Aug 1, 2017 at 5:51
13

Hebrew is written right to left, so the aleph character carries the information that the next character should be printed to the left of it.

If you hex-check your document (or move the cursor through your text with the arrow keys in a suitable editor), you will notice that you get to the aleph first, then to the digit.

I.e.: The assumption "next character == character to the right" does not hold.
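A quick way to do that hex check without a hex editor is a few lines of Python. This is only a sketch; the helper name `code_points` is made up for the example, and the string is built from an escape so the display order cannot mislead you.

```python
text = "\u05d0" + "0"  # aleph first, then the digit, in memory order

def code_points(s):
    """List the code points of s in the order they are stored."""
    return [f"U+{ord(ch):04X}" for ch in s]

print(code_points(text))             # storage order, independent of rendering
print(text.encode("utf-8").hex(" "))  # the same order at the byte level (UTF-8)
```

The aleph (U+05D0, bytes `d7 90` in UTF-8) comes before the zero (U+0030, byte `30`), even though a bidi-aware display draws the zero on the left.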

3
א0 0א 0-א א-0

The issue is where you do this, and the implementation. To get Hebrew-number behavior all the characters must be in right-to-left directionality. In HTML/CSS that is:

<p style="direction:rtl"> א0 0א 0-א א-0 </p>

In the Operating System, Hebrew and bi-directionality must be enabled.

The workarounds that suggest substituting other characters defeat the purpose of Unicode. The aleph as a mathematical operator may look the same in some character sets, but it is an entirely different character from the Hebrew aleph, both in context and in how it will be parsed. For example, a native Hebrew speaker or a Hebrew-aware computer will not process it correctly if it is used in conjunction with a Hebrew word. Numbers and non-alphabetic characters are a problem when they are not given the same directional encoding as the alphabetic characters. Thus, ironically, numbers, which seemingly should be independent of character set and directionality, take on the Unicode directionality of the preceding letter. In a Hebrew document the numbers therefore become 'Hebraicised', i.e. directionally like Hebrew, whereas in an English (Latin-script) document the Hebrew letters can get mixed up and messed up because no directionality is attributed to the paragraph.

2
  • The OP was trying to use aleph as the numeric operator, no? Commented Aug 7, 2017 at 7:34
  • Well, it was not clear in the post at all. In any case, directionality shouldn't be relevant. Aleph is used in set notation and has an infinite, infinite series designation. It should be directionally left to right, since all maths is left to right regardless of what language you're using. However, aleph used as a character in Hebrew is directionally set to right to left.
    – Danny F
    Commented Aug 8, 2017 at 14:11
1

It's possible:

‭א0

‭א - 0

‭א \\ 0

‭א -./ 0

‭א foobar 0

(This answer doesn't address "why is this", as that is already answered by others. But it does answer the question in the title, "impossible to...?")
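As for how to produce strings like these: one option is to wrap the text in directional formatting characters. The Python sketch below assumes U+202D LEFT-TO-RIGHT OVERRIDE as the prefix (a guess at what the lines above rely on, not something stated in this answer) and closes it with U+202C POP DIRECTIONAL FORMATTING so the override does not leak into the following text. The function name `force_ltr` is hypothetical.

```python
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING, ends the override

def force_ltr(text):
    """Wrap text so a bidi-aware renderer lays it out strictly left to right."""
    return f"{LRO}{text}{PDF}"

forced = force_ltr("\u05d0" + "0")  # aleph, then zero, in memory order
print([f"U+{ord(ch):04X}" for ch in forced])
```

Whether the result is actually displayed left to right still depends on the terminal or editor implementing the Unicode bidirectional algorithm.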

1
  • 8
    But it also doesn't answer HOW, and so is almost useless.
    – NH.
    Commented Aug 3, 2017 at 17:43
0

It is actually more obvious if you copy א into an editor that supports it correctly, like (yes, hilariously) the Chrome URL omnibox: the WHOLE style of it will change. Also, as you may be interested to know, just like with Ohm vs Omega (commonly confused in TrueType and OpenType font files: the .Omega glyph is U+03A9 (Omega), not U+2126 (Ohm)), for math you should use ℵ, not א.
