2
\$\begingroup\$

Is there such a thing as a bit-level ASCII anagram: a bit string that forms a natural-language word when the bits are interpreted as ASCII from left to right, but a different natural-language word when interpreted in reverse?

For instance, a test message "HELLO_WORLD" (interpreted as Unicode or ASCII) is processed and shows up with inadvertently reversed bits. Now imagine that, when decoded or displayed in a memory viewer, it showed up as "SWEETPOTATO" (it doesn't; this is just a hypothetical illustration). Does such a word list exist? It doesn't have to be strict English, I imagine, and it could use extended Unicode.

I could imagine that "license plate" style alphabets would be acceptable, e.g. GR8T would be a great word, allowing numbers, punctuation and symbols as letter substitutes, and getting creative with word spelling. This is where Unicode has much more to offer.

I often need to generate random bits to test a system, and the data is entered and read through text editors, memory viewers, etc. The initial stumbling block when testing system data at the bit/byte level is that the bits may be in reversed order, swapped between big and little endian, inverted, missing an LSB or MSB, etc. For those of you coding low-level software, firmware, or RTL for FPGAs and ASICs, I am sure you know the pain and this is no foreign concept.
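
To make those failure modes concrete, here is a rough Python sketch of what a test string looks like after a few of them. The helper names are hypothetical, the byte swap assumes 16-bit words, and "missing an LSB" is modeled as a plain right shift of each byte, which is only one of several ways that fault can show up.

```python
def reverse_bits8(value: int) -> int:
    """Mirror the 8 bits of one byte (bit 0 <-> bit 7, and so on)."""
    return int(f"{value:08b}"[::-1], 2)


def corruptions(data: bytes) -> dict[str, bytes]:
    """Return a few common bit/byte-level manglings of `data` (illustrative only)."""
    return {
        "original": data,
        # Each byte mirrored individually; byte order is left alone.
        "bits reversed": bytes(reverse_bits8(b) for b in data),
        # Every bit flipped (ones' complement).
        "inverted": bytes(b ^ 0xFF for b in data),
        # Assumption: 16-bit words; a trailing odd byte stays in place.
        "16-bit byte swap": b"".join(
            data[i:i + 2][::-1] for i in range(0, len(data), 2)
        ),
        # Assumption: a lost LSB shows up as each byte shifted right by one.
        "LSB dropped": bytes(b >> 1 for b in data),
    }


for label, mangled in corruptions(b"HELLO_WORLD").items():
    print(f"{label:<18}{mangled.hex(' ')}")
```

Printing hex rather than text sidesteps the fact that most of the mangled bytes are not printable ASCII.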

Yes, I know, we can use common hexadecimal test patterns for this kind of work. Hexadecimal "0F" and "A5" come to mind. And we can test systems with BERT and self-synchronizing sequences. The world would be fine without an ASCII anagram.

But these hex patterns are not as much fun as testing with patterns that stand out more creatively. It's for entertainment, to add some spice when we're down the debug rabbit hole at 2am.

\$\endgroup\$
  • 1
    \$\begingroup\$ But is the existence of such patterns an electronics question? And it would limit you to every other letter, as ASCII has the MSB zero, so the LSB has to be zero too in each byte. And all letters have bit 6 set high, so your choices are limited even further, as bit 1 must also be set high. And no bit pattern may map to a non-printable character. Have you examined an ASCII table yourself at all, or just asked? \$\endgroup\$
    – Justme
    Commented Apr 29, 2022 at 19:39
  • \$\begingroup\$ @Justme 1) Electronics? Not sure if SO or EE has more binary puzzlers, but bit bashers and H/W debuggers also reside here. 2) ASCII examined, yes. Unicode no. Why not submit the rest of your comment as an answer? \$\endgroup\$
    – P2000
    Commented Apr 29, 2022 at 20:27
  • \$\begingroup\$ Because there already is an answer that performed the steps I described for analyzing the ASCII table, so there is no need for another answer. And obviously, how it all works in Unicode will depend on how you encode the Unicode, and basically all Latin alphabets reside in the ASCII group, so even if you used direct 16-bit encoding most standard letters will have high byte 0x00 and thus you cannot have any palindromes if the low byte must always be 0x00. Other encodings, such as UTF-8, encode characters as sequences of up to four bytes. Besides, why would you be debugging at 2AM? \$\endgroup\$
    – Justme
    Commented Apr 29, 2022 at 21:57
  • \$\begingroup\$ @justme we're not restricted to standard letters, for instance U+1E84 is Ẅ, "Latin Capital Letter W with diaeresis", bit reversed it's U+2178, ⅸ, "Small Roman Numeral Nine Unicode Character", and both are legible \$\endgroup\$
    – P2000
    Commented Apr 29, 2022 at 23:22
  • 1
    \$\begingroup\$ The normal definition of "anagram" (for natural-language words made of letters) is that the letters can be in any order, not specifically reversed. In a cryptic crossword clue, a word like "mixed" could indicate an anagram of a nearby word, while a word like "back" would only let you reverse it, not anagram it. At the bit level, an anagram would allow any message of the same length and Hamming weight (popcount = number of set bits). I'd suggest rewording your question to whichever one you actually wanted to ask; I assume bit-reversal, not anagrams. \$\endgroup\$
    Commented Apr 30, 2022 at 5:07

1 Answer

10
\$\begingroup\$

What an interesting question!

Unfortunately, at least for standard ASCII, there are no character sequences that are English words both as written and when bitwise reversed. This is easy to demonstrate: treating each character as an 8-bit byte, I used a simple program to find all uppercase and lowercase letters whose bitwise reverse is also an uppercase or lowercase letter. I found the following:

B <-> B
F <-> b
J <-> R
N <-> r
V <-> j
Z <-> Z
f <-> f
n <-> v

As you can see, there are no vowels in this list, nor Y/y. There are some (highly debatable) English words with neither vowels nor Y, but none of them fits this limited set.
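
For anyone who wants to reproduce the search, here is a minimal Python sketch of that kind of program (not the original one, which isn't shown). It assumes each character is a single 8-bit byte with all eight bits mirrored; the names `reverse_bits8` and `reversible_pairs` are made up for illustration.

```python
import string


def reverse_bits8(value: int) -> int:
    """Reverse the order of the 8 bits in a byte (bit 0 <-> bit 7, etc.)."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (value & 1)  # move the current LSB into the result
        value >>= 1
    return result


def reversible_pairs(alphabet: str) -> list[tuple[str, str]]:
    """List (char, mirrored char) pairs where both characters are in `alphabet`."""
    pairs = []
    for ch in sorted(alphabet):
        mirrored = chr(reverse_bits8(ord(ch)))
        # `ch <= mirrored` reports each pair once instead of in both directions.
        if mirrored in alphabet and ch <= mirrored:
            pairs.append((ch, mirrored))
    return pairs


for a, b in reversible_pairs(string.ascii_letters):
    print(f"{a} <-> {b}")
```

Passing a wider alphabet, such as `string.ascii_letters + string.digits`, is an easy way to look for license-plate style pairs as well.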

I can't speak for languages other than English, or for Unicode characters in any particular encoding such as UTF-8. My gut feeling is that it's unlikely, but not at all impossible, that such a pair exists. If it does exist, it's very likely to be short (2 or 3 characters in some language's alphabet) and not very much fun. Sorry!!

\$\endgroup\$
  • \$\begingroup\$ Nice approach! And thanks for appreciating the fun of it. I could imagine that licence plate style alphabets would be acceptable, e.g. GR8T would be a great word, allowing numbers, punctuation and symbols as letter substitutes, and getting creative with word spelling. This is where Unicode has much more to offer. \$\endgroup\$
    – P2000
    Commented Apr 29, 2022 at 20:23
  • 1
    \$\begingroup\$ @P2000 2 <-> L and 6 <-> l (lowercase ell, not number 1) if you want to have a go at a license plate pair! Keep up the creative spirit! \$\endgroup\$
    – TypeIA
    Commented Apr 29, 2022 at 20:26
  • \$\begingroup\$ Perhaps you’d get a better hit rate if you reverse the bits and invert them (1s complement)? Just guessing; it’s a bit early for me to be doing that in my head. \$\endgroup\$
    – Frog
    Commented Apr 29, 2022 at 20:42
  • \$\begingroup\$ @Frog yeah possibly, and unicode also has potential, I just tried U+1E84 which is Ẅ, "Latin Capital Letter W with diaeresis", bit reversed it's U+2178, ⅸ, "Small Roman Numeral Nine Unicode Character", and both are legible \$\endgroup\$
    – P2000
    Commented Apr 29, 2022 at 23:24
  • 1
    \$\begingroup\$ Is UTF-8 really as much fun though? The characters won't be legible in any debugging tool, and I thought the idea was to have something quirky you can see (like the famous 0xDEADBEEF, 0xBAADF00D, etc. patterns). \$\endgroup\$
    – TypeIA
    Commented Apr 30, 2022 at 5:10
