How do you echo a 4-digit Unicode character in Bash?

Question

I'd like to add the Unicode skull and crossbones to my shell prompt (specifically 'SKULL AND CROSSBONES', U+2620), but I can't figure out the magic incantation to make echo spit it out, or any other 4-digit Unicode character. Two-digit ones are easy, for example echo -e "\x55".

In addition to the answers below it should be noted that, obviously, your terminal needs to support Unicode for the output to be what you expect. gnome-terminal does a good job of this, but it isn't necessarily turned on by default.

In macOS's Terminal app, go to Preferences -> Encodings and choose Unicode (UTF-8).

Note that your "2-digit ones are easy (to echo)" comment is only valid for values up to "\x7F" in a UTF-8 locale (which the bash tag suggests yours is). Characters represented by a single byte are never in the range \x80-\xFF; that range is illegal for single-byte UTF-8 characters. For example, the Unicode code point U+0080 (i.e. \x80) is actually 2 bytes in UTF-8: \xC2\x80. – Peter.O, Commented Dec 2, 2011 at 5:51
NB: for me in gnome-terminal, echo -e '\ufc' does not produce a ü, even with character encoding set to UTF-8. However, eg urxvt does print eg printf "\\ub07C\\ub01C" as expected (not with a � or box). — isomorphismes, Commented Mar 11, 2017 at 19:51
@Peter.O Why is the bash tag such a useful hint? Are different terminals common in CJK or … ? — isomorphismes, Commented Mar 11, 2017 at 19:54
@Peter.O zsh, fish, scsh, elvish, etc... there are many different shells, each can handle unicode characters however they want (or not). "bash" makes it clear this question isn't about some weird shell that does things differently. — masukomi, Commented Jul 22, 2017 at 17:47
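
The single-byte limit Peter.O describes follows from the standard UTF-8 range boundaries. As a rough sketch (the function name utf8_byte_count is hypothetical and not from this thread), the number of bytes a code point needs can be worked out like this:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): number of bytes UTF-8 needs
# for a given code point, based on the standard UTF-8 range boundaries.
utf8_byte_count() {
    local cp=$(( $1 ))          # accepts decimal or 0x-prefixed hex
    if   (( cp < 0x80 ));    then echo 1   # U+0000..U+007F
    elif (( cp < 0x800 ));   then echo 2   # U+0080..U+07FF
    elif (( cp < 0x10000 )); then echo 3   # U+0800..U+FFFF
    else                          echo 4   # U+10000..U+10FFFF
    fi
}

utf8_byte_count 0x55      # 1
utf8_byte_count 0x80      # 2
utf8_byte_count 0x2620    # 3
utf8_byte_count 0x1F602   # 4
```

So "\x55" happens to work as a single byte only because U+0055 is below U+0080.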

7 revs, 5 users 64% · Accepted Answer · 2018-01-04 10:59:59Z

289

In UTF-8 it's actually 6 hexadecimal digits (3 bytes).

$ printf '\xE2\x98\xA0'
☠

To check how it's encoded by the console, use hexdump:

$ printf ☠ | hexdump
0000000 98e2 00a0                              
0000003
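
The 98e2 00a0 grouping comes from hexdump's default display in 16-bit words. As a small sketch of an alternative, od with the standard -An and -tx1 options prints one byte at a time, so the bytes appear in stream order:

```shell
#!/usr/bin/env bash
# Dump the UTF-8 bytes of U+2620 one byte at a time.
# -An suppresses the offset column, -tx1 selects single-byte hex units.
printf '\xE2\x98\xA0' | od -An -tx1
```

This should list e2, 98 and a0 in that order, which matches the escapes used in the printf command above.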

edited Jan 4, 2018 at 10:59

community wiki

7 revs, 5 users 64%
vartec

7

Mine outputs "��" instead of ☠... Why is that?
– trusktr
Commented Sep 29, 2012 at 4:14
9

That's true. I discovered i was using LANG=C instead of LANG=en_US.UTF-8. Now my terminals in Gnome show the symbols properly... The real terminals (tty1-6) still don't though.
– trusktr
Commented Oct 3, 2012 at 0:09
7

For those people trying a hexdump: 0000000 f0 9f 8d ba translates to \xf0\x9f\x8d\xba. Example echo: echo -e "\xf0\x9f\x8d\xba".
– Blaise
Commented May 28, 2015 at 14:25
12

You can also use the $'...' syntax to get the encoded character in to a variable without using a $(...) capturing subshell, for use in contexts that don't themselves interpret the escape sequences: skull=$'\xE2\x98\xA0'
– Andrew Janke
Commented Jul 5, 2015 at 5:14
7

Another thing about hexdump: on my machine, the second command in the answer outputs 0000000 98e2 00a0. Of course the 0000000 is just an unimportant offset, but the bytes after it translate to \xe2\x98\xa0, because the machine uses the little endian byte order.
– sigalor
Commented May 15, 2016 at 18:07


5 revs, 3 users 46% · Accepted Answer · 2022-02-14 11:36:49Z

182

% echo -e '\u2620'     # \u takes four hexadecimal digits
☠
% echo -e '\U0001f602' # \U takes eight hexadecimal digits
😂

This works in Zsh (I've checked version 4.3) and in Bash 4.2 or newer.
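
Since older Bash releases print the escape literally, a script that has to run on both can check the version first. This is only a sketch: supports_unicode_escapes is a hypothetical helper name, and the fallback simply reuses the raw UTF-8 bytes for U+2620.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given (or running) Bash version is 4.2+,
# the first release that understands \u and \U escapes.
supports_unicode_escapes() {
    local major=${1:-${BASH_VERSINFO[0]}} minor=${2:-${BASH_VERSINFO[1]}}
    (( major > 4 || (major == 4 && minor >= 2) ))
}

if supports_unicode_escapes; then
    printf '\u2620\n'            # needs Bash 4.2+ and a UTF-8 locale
else
    printf '\xE2\x98\xA0\n'      # raw UTF-8 bytes; works on older Bash too
fi
```

The optional major and minor arguments exist only so the check can be tried against versions other than the running one.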

edited Feb 14, 2022 at 11:36

community wiki

5 revs, 3 users 46%
Juliano

23

that just spits out \u2620 when I do it.
– masukomi
Commented Mar 2, 2009 at 16:37
2

Sorry, forgot to say that I use zsh.
– Juliano
Commented Mar 2, 2009 at 16:51
40

Support for \u was added in Bash 4.2.
– Lri
Commented Dec 31, 2012 at 12:52
5

There is a version of this using ANSI-strings echo $'\U1f602'
– memoselyk
Commented Sep 25, 2018 at 20:33
6

does NOT work for me, Mac OS 10.14.2, bash (GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin18)). It merely prints out the input - $ echo -e '\u2620' <enter> simply prints out: \u2620
– Motti Shneor
Commented Mar 26, 2019 at 8:36


8 revs, 4 users 58% · Accepted Answer · 2017-04-13 12:51:57Z

75

So long as your text-editors can cope with Unicode (presumably encoded in UTF-8) you can enter the Unicode code-point directly.

For instance, in the Vim text-editor you would enter insert mode and press Ctrl + V + U and then the code-point number as a 4-digit hexadecimal number (pad with zeros if necessary). So you would type Ctrl + V + U 2 6 2 0. See: What is the easiest way to insert Unicode characters into a document?

At a terminal running Bash you would type CTRL+SHIFT+U and then the hexadecimal code point of the character you want. During input your cursor should show an underlined u. The first non-digit you type ends input and renders the character. So you should be able to print U+2620 in Bash using the following:

echo CTRL+SHIFT+U2620ENTERENTER

(The first enter ends Unicode input, and the second runs the echo command.)

Credit: Ask Ubuntu SE

edited Apr 13, 2017 at 12:51

community wiki

8 revs, 4 users 58%
RobM

2

A good source for the hexadecimal code points is unicodelookup.com/#0x2620/1
– RobM
Commented Aug 25, 2012 at 12:10
1

The version of vim I'm using (7.2.411 on RHEL 6.3) doesn't respond as desired when there's a dot between the ctrl-v and the u, but works fine when that dot is omitted.
– Chris Johnson
Commented Feb 15, 2013 at 20:28
@ChrisJohnson: I've removed the period from the instructions, it was not intended to be a key press (which is why it didn't appear with the keyboard effect). Sorry for the confusion.
– RobM
Commented Jul 27, 2013 at 10:45
6

Beware: this works in a terminal running Bash only if you're running it under GTK+ environment, as Gnome.
– n.r.
Commented Feb 25, 2014 at 21:37
3

The ability to C-S-u 2 6 2 0 is a feature of your terminal emulator, X Input Method (XIM), or similar. AFAIK, you will be unable to send both SHIFT and CTRL to the terminal layer. The terminal only speaks in characters, rather than in keysyms and keycodes like your X server (also, it is 7-bit for all intents and purposes). In this world, CTRL masks the 4 most significant bits (& 0b00001111) which results in
– nabin-info
Commented Jun 4, 2017 at 2:51


4 revs, 3 users 79% · Accepted Answer · 2015-09-13 10:32:23Z

Here's a fully internal Bash implementation, no forking, unlimited size of Unicode characters.

fast_chr() {
    local __octal
    local __char
    printf -v __octal '%03o' $1
    printf -v __char \\$__octal
    REPLY=$__char
}

function unichr {
    local c=$1    # Ordinal of char
    local l=0    # Byte ctr
    local o=63    # Ceiling
    local p=128    # Accum. bits
    local s=''    # Output string

    (( c < 0x80 )) && { fast_chr "$c"; echo -n "$REPLY"; return; }

    while (( c > o )); do
        fast_chr $(( t = 0x80 | c & 0x3f ))
        s="$REPLY$s"
        (( c >>= 6, l++, p += o+1, o>>=1 ))
    done

    fast_chr $(( t = p | c ))
    echo -n "$REPLY$s"
}

## test harness
for (( i=0x2500; i<0x2600; i++ )); do
    unichr $i
done

Output was:

─━│┃┄┅┆┇┈┉┊┋┌┍┎┏
┐┑┒┓└┕┖┗┘┙┚┛├┝┞┟
┠┡┢┣┤┥┦┧┨┩┪┫┬┭┮┯
┰┱┲┳┴┵┶┷┸┹┺┻┼┽┾┿
╀╁╂╃╄╅╆╇╈╉╊╋╌╍╎╏
═║╒╓╔╕╖╗╘╙╚╛╜╝╞╟
╠╡╢╣╤╥╦╧╨╩╪╫╬╭╮╯
╰╱╲╳╴╵╶╷╸╹╺╻╼╽╾╿
▀▁▂▃▄▅▆▇█▉▊▋▌▍▎▏
▐░▒▓▔▕▖▗▘▙▚▛▜▝▞▟
■□▢▣▤▥▦▧▨▩▪▫▬▭▮▯
▰▱▲△▴▵▶▷▸▹►▻▼▽▾▿
◀◁◂◃◄◅◆◇◈◉◊○◌◍◎●
◐◑◒◓◔◕◖◗◘◙◚◛◜◝◞◟
◠◡◢◣◤◥◦◧◨◩◪◫◬◭◮◯
◰◱◲◳◴◵◶◷◸◹◺◻◼◽◾◿
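
For a single three-byte code point, the bit shuffling that the loop performs can be written out by hand. This is a worked sketch for U+2620 only; the variable names cp, b1, b2 and b3 are illustrative and not part of the code above.

```shell
#!/usr/bin/env bash
cp=0x2620                                # code point of SKULL AND CROSSBONES
b1=$(( 0xE0 | (cp >> 12) ))              # lead byte: 1110xxxx, top 4 bits
b2=$(( 0x80 | ((cp >> 6) & 0x3F) ))      # continuation: 10xxxxxx, middle 6 bits
b3=$(( 0x80 | (cp & 0x3F) ))             # continuation: 10xxxxxx, low 6 bits
printf '%02X %02X %02X\n' "$b1" "$b2" "$b3"   # E2 98 A0
```

The result agrees with the byte sequence E2 98 A0 quoted in the other answers.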

I'm very curious about the reasoning behind the roundabout method and the specific use of the REPLY variable. I am assuming you inspected the Bash source or ran through something to optimize; I can see how your choices could be optimizations (albeit highly dependent on the interpreter). – nabin-info, Commented Jun 1, 2017 at 17:05
@nabin-info $REPLY is a wrong choice in this case. Using lowercase for variable names could be preferred in order to avoid misinterpretation. You could test by replacing it with sed 's/REPLY/anyVarname/g'; the script will work fine anyway. – F. Hauri - Give Up GitHub, Commented Jun 4, 2023 at 10:35
$REPLY is the default output variable for the inbuilt read command. This solution doesn't use the read command, but $REPLY is as good as any name. It is my observation that all-caps variable names in bash scripts often indicate 'external' thing, whether they be configuration options/globals, environment variables, or returns from other commands or functions. Lowercase variables tend to be used locally. This is purely my opinion, and you can use any variable name you like. — Orwellophile, Commented Jun 7, 2023 at 10:30

6 revs, 6 users 47% · Accepted Answer · 2021-05-14 09:01:30Z

27

Quick one-liner to convert a UTF-8 character into its hex byte sequence (3 bytes for ☠):

var="$(echo -n '☠' | od -An -tx1)"; printf '\\x%s' ${var^^}; echo

or

echo -n '☠' | od -An -tx1 | sed 's/ /\\x/g'

The output of both is \xE2\x98\xA0 (the sed variant prints the hex digits in lowercase), so you can go the other way:

echo $'\xe2\x98\xa0'   # ☠
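
The same idea can be wrapped in a function that works for characters of any byte length. This is a sketch under the assumption that od is available; to_hex_escapes is a hypothetical name, not something from this answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every byte of the argument as a \xHH escape.
to_hex_escapes() {
    local byte
    # od emits space-separated hex bytes; word splitting turns them into a list.
    for byte in $(printf '%s' "$1" | od -An -v -tx1); do
        printf '\\x%s' "$byte"
    done
    printf '\n'
}

to_hex_escapes '☠'     # \xe2\x98\xa0
to_hex_escapes '🍺'    # \xf0\x9f\x8d\xba
```

The output can be pasted straight into echo -e or $'...'.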

edited May 14, 2021 at 9:01

community wiki

6 revs, 6 users 47%
xerostomus

6

I wouldn't call the above example quick (with 11 commands and their params)... Also it only handles 3 byte UTF-8 chars (UTF-8 chars can be 1, 2, or 3 bytes)... This is a bit shorter and works for 1-3++++ bytes: printf "\\\x%s" $(printf '☠'|xxd -p -c1 -u) .... xxd is shipped as part of the 'vim-common' package
– Peter.O
Commented Dec 2, 2011 at 17:01
PS: I just noticed that the above hexdump/awk example is switching the sequence of bytes in a byte pair. This does not apply to a UTF-8 dump. It would be relevant if it were a dump of UTF-16LE and you wanted to output Unicode code points, but it doesn't make sense here, as the input is UTF-8 and the output is exactly as input (plus the \x before each pair of hex digits)
– Peter.O
Commented Dec 2, 2011 at 17:35
8

UTF-8 characters can be 1 - 4 bytes sequences
– cms
Commented Apr 12, 2013 at 19:33
1

based on the comment of @Peter.O, I find the following, while bigger, pretty handy: hexFromGlyph(){ if [ "$1" == "-n" ]; then outputSeparator=' '; shift; else outputSeparator='\n'; fi for glyph in "$@"; do printf "\\\x%s" $(printf "$glyph"|xxd -p -c1 -u); echo -n -e "$outputSeparator"; done } # usage: $ hexFromGlyph ☠ ✿ \xE2\x98\xA0 \xE2\x9C\xBF $ hexFromGlyph -n ☠ ✿ \xE2\x98\xA0 \xE2\x9C\xBF
– StephaneAG
Commented Oct 10, 2015 at 0:04
6

Good god man. Consider: codepoints () { printf 'U+%04x\n' ${@/#/\'} ; } ; codepoints A Ｒ ☯ 🕉 z ... enjoy 👍
– nabin-info
Commented Jun 1, 2017 at 17:40


3 revs, 3 users 92% · Accepted Answer · 2018-01-04 11:04:12Z

14

Just put "☠" in your shell script. In the correct locale and on a Unicode-enabled console it'll print just fine:

$ echo ☠
☠
$

An ugly "workaround" would be to output the UTF-8 sequence, but that also depends on the encoding used:

$ echo -e '\xE2\x98\xA0'
☠
$
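
Because both variants depend on the locale, a script can check it before printing. A minimal sketch, assuming the standard locale utility is installed; is_utf8_locale is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the current locale's character map is UTF-8.
# Assumes the standard locale(1) utility is available.
is_utf8_locale() {
    case "$(locale charmap 2>/dev/null)" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *)                     return 1 ;;
    esac
}

if is_utf8_locale; then
    echo "UTF-8 locale: the glyph should display"
else
    echo "non-UTF-8 locale (for example LANG=C): expect garbled output"
fi
```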

edited Jan 4, 2018 at 11:04

community wiki

3 revs, 3 users 92%
Joachim Sauer


Matheus · Accepted Answer · 2019-04-11 18:49:28Z

14

Here is a list of all available Unicode emoji:

https://en.wikipedia.org/wiki/Emoji#Unicode_blocks

Example:

echo -e "\U1F304"
🌄

To get the UTF-8 byte values of this character, use hexdump:

echo -e "🌄" | hexdump -C

00000000  f0 9f 8c 84 0a                                    |.....|
00000005

Then use the reported values as hex escapes:

echo -e "\xF0\x9F\x8C\x84\x0A"
🌄
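
The trailing 0a in the dump is the newline that echo appends, not part of the character. To count only the character's own bytes, one option is the sketch below. It relies on ${#var} counting bytes when LC_ALL is C, and byte_length is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: length of a string in bytes rather than characters.
byte_length() {
    local LC_ALL=C        # in the C locale, ${#var} counts bytes
    printf '%s\n' "${#1}"
}

byte_length '🌄'    # 4  (f0 9f 8c 84)
byte_length '☠'     # 3  (e2 98 a0)
```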

answered Apr 11, 2019 at 18:49

community wiki

Matheus

1

echoing the \U<hex> string doesn't work on OSX it just outputs exactly what's in the quotes.
– masukomi
Commented Apr 20, 2019 at 22:07
The default bash version on macos (3.2.57 for me) predates the unicode feature. Update bash or use zsh.
– Quantum7
Commented Dec 6, 2022 at 14:35
The correct way of adding other normal chars would be with null char? echo -e '\U2192\0abc'?
– Pablo Bianchi
Commented Jul 22, 2023 at 2:47


9 revs · Accepted Answer · 2024-03-22 07:46:08Z

Playing with UTF8 in `bash`

Update 2023...

Print UTF8

For some time now, Bash's printf has supported %b:

printf %b\\n \\U1F600
😀

Store UTF8 into a variable

So you can assign to a variable using the -v flag of Bash's printf builtin:

printf -v smiley \\U1F600
echo $smiley 
😀

Strictly answering SO question:

user@host:~$ printf -v skull %b '\U2620'
user@host:~$ PS1=${PS1/%\\$ /$skull\\$ }
user@host:~☠$

This could do the job. (Note that %b is nearly useless here.)

Showing part of table

Then, to quickly show part of the Unicode table:

printf %b\\n \\U1F6{{0..9},{A..F}}{{0..9},{a..f}}|paste -d\  -{,,,}{,,,}
😀 😁 😂 😃 😄 😅 😆 😇 😈 😉 😊 😋 😌 😍 😎 😏
😐 😑 😒 😓 😔 😕 😖 😗 😘 😙 😚 😛 😜 😝 😞 😟
😠 😡 😢 😣 😤 😥 😦 😧 😨 😩 😪 😫 😬 😭 😮 😯
😰 😱 😲 😳 😴 😵 😶 😷 😸 😹 😺 😻 😼 😽 😾 😿
🙀 🙁 🙂 🙃 🙄 🙅 🙆 🙇 🙈 🙉 🙊 🙋 🙌 🙍 🙎 🙏
🙐 🙑 🙒 🙓 🙔 🙕 🙖 🙗 🙘 🙙 🙚 🙛 🙜 🙝 🙞 🙟
🙠 🙡 🙢 🙣 🙤 🙥 🙦 🙧 🙨 🙩 🙪 🙫 🙬 🙭 🙮 🙯
🙰 🙱 🙲 🙳 🙴 🙵 🙶 🙷 🙸 🙹 🙺 🙻 🙼 🙽 🙾 🙿
🚀 🚁 🚂 🚃 🚄 🚅 🚆 🚇 🚈 🚉 🚊 🚋 🚌 🚍 🚎 🚏
🚐 🚑 🚒 🚓 🚔 🚕 🚖 🚗 🚘 🚙 🚚 🚛 🚜 🚝 🚞 🚟
🚠 🚡 🚢 🚣 🚤 🚥 🚦 🚧 🚨 🚩 🚪 🚫 🚬 🚭 🚮 🚯
🚰 🚱 🚲 🚳 🚴 🚵 🚶 🚷 🚸 🚹 🚺 🚻 🚼 🚽 🚾 🚿
🛀 🛁 🛂 🛃 🛄 🛅 🛆 🛇 🛈 🛉 🛊 🛋 🛌 🛍 🛎 🛏
🛐 🛑 🛒 🛓 🛔 🛕 🛖 🛗 🛘 🛙 🛚 🛛 🛜 🛝 🛞 🛟
🛠 🛡 🛢 🛣 🛤 🛥 🛦 🛧 🛨 🛩 🛪 🛫 🛬 🛭 🛮 🛯
🛰 🛱 🛲 🛳 🛴 🛵 🛶 🛷 🛸 🛹 🛺 🛻 🛼 🛽 🛾 🛿

Showing the Braille part:

printf %b\\n \\U28{{0..9},{A..F}}{{0..9},{a..f}}|paste -d\  -{,,,}{,,,}
⠀ ⠁ ⠂ ⠃ ⠄ ⠅ ⠆ ⠇ ⠈ ⠉ ⠊ ⠋ ⠌ ⠍ ⠎ ⠏
⠐ ⠑ ⠒ ⠓ ⠔ ⠕ ⠖ ⠗ ⠘ ⠙ ⠚ ⠛ ⠜ ⠝ ⠞ ⠟
⠠ ⠡ ⠢ ⠣ ⠤ ⠥ ⠦ ⠧ ⠨ ⠩ ⠪ ⠫ ⠬ ⠭ ⠮ ⠯
⠰ ⠱ ⠲ ⠳ ⠴ ⠵ ⠶ ⠷ ⠸ ⠹ ⠺ ⠻ ⠼ ⠽ ⠾ ⠿
⡀ ⡁ ⡂ ⡃ ⡄ ⡅ ⡆ ⡇ ⡈ ⡉ ⡊ ⡋ ⡌ ⡍ ⡎ ⡏
⡐ ⡑ ⡒ ⡓ ⡔ ⡕ ⡖ ⡗ ⡘ ⡙ ⡚ ⡛ ⡜ ⡝ ⡞ ⡟
⡠ ⡡ ⡢ ⡣ ⡤ ⡥ ⡦ ⡧ ⡨ ⡩ ⡪ ⡫ ⡬ ⡭ ⡮ ⡯
⡰ ⡱ ⡲ ⡳ ⡴ ⡵ ⡶ ⡷ ⡸ ⡹ ⡺ ⡻ ⡼ ⡽ ⡾ ⡿
⢀ ⢁ ⢂ ⢃ ⢄ ⢅ ⢆ ⢇ ⢈ ⢉ ⢊ ⢋ ⢌ ⢍ ⢎ ⢏
⢐ ⢑ ⢒ ⢓ ⢔ ⢕ ⢖ ⢗ ⢘ ⢙ ⢚ ⢛ ⢜ ⢝ ⢞ ⢟
⢠ ⢡ ⢢ ⢣ ⢤ ⢥ ⢦ ⢧ ⢨ ⢩ ⢪ ⢫ ⢬ ⢭ ⢮ ⢯
⢰ ⢱ ⢲ ⢳ ⢴ ⢵ ⢶ ⢷ ⢸ ⢹ ⢺ ⢻ ⢼ ⢽ ⢾ ⢿
⣀ ⣁ ⣂ ⣃ ⣄ ⣅ ⣆ ⣇ ⣈ ⣉ ⣊ ⣋ ⣌ ⣍ ⣎ ⣏
⣐ ⣑ ⣒ ⣓ ⣔ ⣕ ⣖ ⣗ ⣘ ⣙ ⣚ ⣛ ⣜ ⣝ ⣞ ⣟
⣠ ⣡ ⣢ ⣣ ⣤ ⣥ ⣦ ⣧ ⣨ ⣩ ⣪ ⣫ ⣬ ⣭ ⣮ ⣯
⣰ ⣱ ⣲ ⣳ ⣴ ⣵ ⣶ ⣷ ⣸ ⣹ ⣺ ⣻ ⣼ ⣽ ⣾ ⣿

Better into a little function

showU8_256() { 
    local i a
    for a ;do
        for i in {0..9} {A..F}; do
            printf '\\U%05Xx: %b %b %b %b %b %b %b %b %b %b %b %b %b %b %b %b\n' \
                0x$a$i \\U$a${i}{{0..9},{A..F}}
        done
    done
}

Then

showU8_256 1f{3,4}
\U01F30x: 🌀 🌁 🌂 🌃 🌄 🌅 🌆 🌇 🌈 🌉 🌊 🌋 🌌 🌍 🌎 🌏
\U01F31x: 🌐 🌑 🌒 🌓 🌔 🌕 🌖 🌗 🌘 🌙 🌚 🌛 🌜 🌝 🌞 🌟
\U01F32x: 🌠 🌡 🌢 🌣 🌤 🌥 🌦 🌧 🌨 🌩 🌪 🌫 🌬 🌭 🌮 🌯
\U01F33x: 🌰 🌱 🌲 🌳 🌴 🌵 🌶 🌷 🌸 🌹 🌺 🌻 🌼 🌽 🌾 🌿
\U01F34x: 🍀 🍁 🍂 🍃 🍄 🍅 🍆 🍇 🍈 🍉 🍊 🍋 🍌 🍍 🍎 🍏
\U01F35x: 🍐 🍑 🍒 🍓 🍔 🍕 🍖 🍗 🍘 🍙 🍚 🍛 🍜 🍝 🍞 🍟
\U01F36x: 🍠 🍡 🍢 🍣 🍤 🍥 🍦 🍧 🍨 🍩 🍪 🍫 🍬 🍭 🍮 🍯
\U01F37x: 🍰 🍱 🍲 🍳 🍴 🍵 🍶 🍷 🍸 🍹 🍺 🍻 🍼 🍽 🍾 🍿
\U01F38x: 🎀 🎁 🎂 🎃 🎄 🎅 🎆 🎇 🎈 🎉 🎊 🎋 🎌 🎍 🎎 🎏
\U01F39x: 🎐 🎑 🎒 🎓 🎔 🎕 🎖 🎗 🎘 🎙 🎚 🎛 🎜 🎝 🎞 🎟
\U01F3Ax: 🎠 🎡 🎢 🎣 🎤 🎥 🎦 🎧 🎨 🎩 🎪 🎫 🎬 🎭 🎮 🎯
\U01F3Bx: 🎰 🎱 🎲 🎳 🎴 🎵 🎶 🎷 🎸 🎹 🎺 🎻 🎼 🎽 🎾 🎿
\U01F3Cx: 🏀 🏁 🏂 🏃 🏄 🏅 🏆 🏇 🏈 🏉 🏊 🏋 🏌 🏍 🏎 🏏
\U01F3Dx: 🏐 🏑 🏒 🏓 🏔 🏕 🏖 🏗 🏘 🏙 🏚 🏛 🏜 🏝 🏞 🏟
\U01F3Ex: 🏠 🏡 🏢 🏣 🏤 🏥 🏦 🏧 🏨 🏩 🏪 🏫 🏬 🏭 🏮 🏯
\U01F3Fx: 🏰 🏱 🏲 🏳 🏴 🏵 🏶 🏷 🏸 🏹 🏺 🏻 🏼 🏽 🏾 🏿
\U01F40x: 🐀 🐁 🐂 🐃 🐄 🐅 🐆 🐇 🐈 🐉 🐊 🐋 🐌 🐍 🐎 🐏
\U01F41x: 🐐 🐑 🐒 🐓 🐔 🐕 🐖 🐗 🐘 🐙 🐚 🐛 🐜 🐝 🐞 🐟
\U01F42x: 🐠 🐡 🐢 🐣 🐤 🐥 🐦 🐧 🐨 🐩 🐪 🐫 🐬 🐭 🐮 🐯
\U01F43x: 🐰 🐱 🐲 🐳 🐴 🐵 🐶 🐷 🐸 🐹 🐺 🐻 🐼 🐽 🐾 🐿
\U01F44x: 👀 👁 👂 👃 👄 👅 👆 👇 👈 👉 👊 👋 👌 👍 👎 👏
\U01F45x: 👐 👑 👒 👓 👔 👕 👖 👗 👘 👙 👚 👛 👜 👝 👞 👟
\U01F46x: 👠 👡 👢 👣 👤 👥 👦 👧 👨 👩 👪 👫 👬 👭 👮 👯
\U01F47x: 👰 👱 👲 👳 👴 👵 👶 👷 👸 👹 👺 👻 👼 👽 👾 👿
\U01F48x: 💀 💁 💂 💃 💄 💅 💆 💇 💈 💉 💊 💋 💌 💍 💎 💏
\U01F49x: 💐 💑 💒 💓 💔 💕 💖 💗 💘 💙 💚 💛 💜 💝 💞 💟
\U01F4Ax: 💠 💡 💢 💣 💤 💥 💦 💧 💨 💩 💪 💫 💬 💭 💮 💯
\U01F4Bx: 💰 💱 💲 💳 💴 💵 💶 💷 💸 💹 💺 💻 💼 💽 💾 💿
\U01F4Cx: 📀 📁 📂 📃 📄 📅 📆 📇 📈 📉 📊 📋 📌 📍 📎 📏
\U01F4Dx: 📐 📑 📒 📓 📔 📕 📖 📗 📘 📙 📚 📛 📜 📝 📞 📟
\U01F4Ex: 📠 📡 📢 📣 📤 📥 📦 📧 📨 📩 📪 📫 📬 📭 📮 📯
\U01F4Fx: 📰 📱 📲 📳 📴 📵 📶 📷 📸 📹 📺 📻 📼 📽 📾 📿

Browsing unicode table

For this, after searching for a reliable way, I finally posted my Python dumpUnicode script on SuperUser, under Dumping / browsing full unicode table.

In short:

dumpUnicode() {
  python3 -c $'from unicodedata import name\nfor i in range(0x10FFFF):\n  try:
    var = name(chr(i))\n  except:\n    var = None\n  finally:\n    if var:
      print("\\\\U%06X: \47%s\47 %s" % (i,chr(i),var))'; }

dumpUnicode | grep SMIL.*SUNGLAS\\\|FONDUE
\U01F60E: '😎' SMILING FACE WITH SUNGLASSES
\U01FAD5: '🫕' FONDUE

Or, strictly answering the SO request:

dumpUnicode |grep "' SKULL AND CROSSBONES"
\U002620: '☠' SKULL AND CROSSBONES

Converting to ASCII values

It is not 4 digits, but a variable number of bytes:

printf -v skull '%b' \\U2620
LANG=C printf -v skull %q $skull
IFS=\' read -r _ skull _ <<<"$skull"
echo ${skull//\\/\\0}

\0342\0230\0240

echo -e ${skull//\\/\\0}

☠

As a function:

u8toBytes() { 
    local char
    printf -v char %b "$1"
    LANG=C printf -v char %q "$char"
    IFS=\' read -r _ char _ <<< "$char"
    echo ${char//\\/\\0}
    echo -e ${char//\\/\\0}
}

u8toBytes \\U2620
\0342\0230\0240
☠
u8toBytes \\UA0
\0302\0240
 
u8toBytes 😎
\0360\0237\0230\0216
😎

Further

Have a look at Using Unicode specific character in bash

Your answer deserves more love from the community
– Vinicius
Commented 1 hour ago

user2622016 · Accepted Answer · 2018-06-08 15:05:29Z

10

In Bash, to print a Unicode character use \x, \u or \U: \x takes up to 2 hex digits (a single byte), \u takes up to 4 hex digits, and \U takes up to 8 hex digits.

echo -e '\U1f602'

If you want to assign it to a variable, use the $'...' syntax:

x=$'\U1f602'
echo $x

answered Jun 8, 2018 at 15:05

community wiki

user2622016


4 revs, 3 users 81% user2350426 · Accepted Answer · 2019-12-11 03:59:54Z

Any of these three commands will print the character you want in a console, provided the console accepts UTF-8 characters (most current ones do):

echo -e "SKULL AND CROSSBONES (U+2620) \U02620"
echo $'SKULL AND CROSSBONES (U+2620) \U02620'
printf "%b" "SKULL AND CROSSBONES (U+2620) \U02620\n"

SKULL AND CROSSBONES (U+2620) ☠

Afterwards, you can copy and paste the actual glyph (image, character) into any UTF-8-enabled text editor.

If you need to see how such Unicode Code Point is encoded in UTF-8, use xxd (much better hex viewer than od):

echo $'(U+2620) \U02620' | xxd
0000000: 2855 2b32 3632 3029 20e2 98a0 0a         (U+2620) ....

That means that the UTF8 encoding is: e2 98 a0

Or, in HEX to avoid errors: 0xE2 0x98 0xA0. That is, the values between the space (HEX 20) and the Line-Feed (Hex 0A).

If you want a deep dive into converting numbers to chars: look here to see an article from Greg's wiki (BashFAQ) about ASCII encoding in Bash!

re:"Or, in HEX to avoid errors..." I hardly think that converting a unicode char to some binary encoding that you express in hex chars, helps avoid errors. Using the unicode notation in "bash" would better avoid errors i.e.: " \uHHHH---the Unicode (ISO/IEC 10646) character whose value is the ----hexadecimal value HHHH (one to four hex digits); \UHHHHHHHH ----the Unicode (ISO/IEC 10646) character whose value is the ----hexadecimal value HHHHHHHH (one to eight hex digits) — Astara, Commented Feb 4, 2016 at 3:56

3 revs, 3 users 80% · Accepted Answer · 2018-01-04 11:04:47Z

7

I'm using this:

$ echo -e '\u2620'
☠

This is much easier than looking up a hex representation... I'm using this in my shell scripts. It works in gnome-terminal and urxvt, AFAIK.

edited Jan 4, 2018 at 11:04

community wiki

3 revs, 3 users 80%
Metal3d

2

@masukomi if you know how to use brew you can install a more recent bash and use that. The above works fine on my mac terminal when using the upgraded bash.
– mcheema
Commented Jan 11, 2014 at 12:12
Yes, that's fine with newer versions of bash. However, prompt strings, e.g. $PS1, don't use echo escape formats
– cms
Commented Oct 28, 2014 at 18:03


2 revs, 2 users 76% · Accepted Answer · 2015-09-13 10:31:17Z

6

You may need to encode the code point as octal in order for prompt expansion to correctly decode it.

U+2620 encoded as UTF-8 is E2 98 A0.

So in Bash,

export PS1="\342\230\240"

will turn your shell prompt into a skull and crossbones.
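
The hex-to-octal step can be left to printf instead of a calculator. A small sketch, where hex_to_octal_escapes is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn hex byte values into the \NNN octal escapes
# that Bash prompt strings understand.
hex_to_octal_escapes() {
    local byte
    for byte in "$@"; do
        printf '\\%03o' "0x$byte"
    done
    printf '\n'
}

hex_to_octal_escapes E2 98 A0    # \342\230\240
hex_to_octal_escapes e0 b6 85    # \340\266\205
```

The printed string can then be placed inside PS1.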

edited Sep 13, 2015 at 10:31

community wiki

2 revs, 2 users 76%
cms

hi, what is the code that I should enter for "e0 b6 85"? how can I find it?
– Udy Warnasuriya
Commented Apr 12, 2013 at 13:18
just convert the hexadecimal ( base 16 ) numbers e0 b6 85 into octal (base 8 ) - use a calculator is probably the easiest way to do this
– cms
Commented Apr 12, 2013 at 19:26
e0 b6 85 hex is 340 266 205 octal
– cms
Commented Apr 12, 2013 at 19:30
This worked, thanks a lot! And btw, you can find the octal version on pages like this: graphemica.com/%E2%9B%B5
– Perlnika
Commented Sep 7, 2013 at 13:46


Flimm · Accepted Answer · 2016-10-17 16:23:55Z

6

If you don't mind a Perl one-liner:

$ perl -CS -E 'say "\x{2620}"'
☠

-CS enables UTF-8 decoding on input and UTF-8 encoding on output. -E evaluates the next argument as Perl, with modern features like say enabled. If you don't want a newline at the end, use print instead of say.

answered Oct 17, 2016 at 16:23

community wiki

Flimm


Tino · Accepted Answer · 2018-03-17 10:04:43Z

Sorry for reviving this old question. But when using Bash there is a very easy approach to create Unicode code points from plain ASCII input, which does not even fork:

unicode() { local -n a="$1"; local c; printf -vc '\\U%08x' "$2"; printf -va "$c"; }
unicodes() { local a c; for a; do printf -vc '\\U%08x' "$a"; printf "$c"; done; };

Use it as follows to define certain codepoints

unicode crossbones 0x2620
echo "$crossbones"

or to dump the first 65536 Unicode code points to stdout (takes less than 2s on my machine; the additional space prevents certain characters from flowing into each other in the terminal's monospace font):

for a in {0..65535}; do unicodes "$a"; printf ' '; done

or to tell a little very typical parent's story (this needs Unicode 2010):

unicodes 0x1F6BC 32 43 32 0x1F62D 32 32 43 32 0x1F37C 32 61 32 0x263A 32 32 43 32 0x1F4A9 10

Explanation:

printf '\UXXXXXXXX' prints out any Unicode character
printf '\\U%08x' number prints \UXXXXXXXX with the number converted to Hex, this then is fed to another printf to actually print out the Unicode character
printf recognizes octal (0oct), hex (0xHEX) and decimal (0 or numbers starting with 1 to 9) as numbers, so you can choose whichever representation fits best
printf -v var .. gathers the output of printf into a variable, without fork (which tremendously speeds up things)
local variable is there to not pollute the global namespace
local -n var=other aliases var to other, such that assignment to var alters other. One interesting part here is, that var is part of the local namespace, while other is part of the global namespace.
- Please note that there is no such thing as local or global namespace in bash. Variables are kept in the environment, and such are always global. Local just puts away the current value and restores it when the function is left again. Other functions called from within the function with local will still see the "local" value. This is a fundamentally different concept than all the normal scoping rules found in other languages (and what bash does is very powerful but can lead to errors if you are a programmer who is not aware of that).
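
The point about local variables being visible to called functions can be seen in a few lines. This is only an illustration; the names show_value and outer are made up and are not part of the functions above.

```shell
#!/usr/bin/env bash
# Illustrative names only: show_value and outer are not from the answer.
show_value() { printf '%s\n' "$value"; }

outer() {
    local value="set inside outer"
    show_value            # the callee sees outer's "local" value
}

value="global"
outer        # prints: set inside outer
show_value   # prints: global
```

The global value is back in effect as soon as outer returns.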

well -- doesn't work at all for me. any attempt to use any of your functions, emits: line 6: local: -n: invalid option local: usage: local name[=value] ... I'm using latest (10.14.2) MacOS and bash (GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin18)) — Motti Shneor, Commented Mar 26, 2019 at 8:42

2 revs · Accepted Answer · 2019-11-25 21:33:05Z

In Bash:

UnicodePointToUtf8()
{
    local x="$1"               # ok if '0x2620'
    x=${x/\\u/0x}              # '\u2620' -> '0x2620'
    x=${x/U+/0x}; x=${x/u+/0x} # 'U+2620' -> '0x2620'
    x=$((x)) # from hex to decimal
    local y=$x n=0
    [ $x -ge 0 ] || return 1
    while [ $y -gt 0 ]; do y=$((y>>1)); n=$((n+1)); done
    if [ $n -le 7 ]; then       # 7
        y=$x
    elif [ $n -le 11 ]; then    # 5+6
        y=" $(( ((x>> 6)&0x1F)+0xC0 )) \
            $(( (x&0x3F)+0x80 ))" 
    elif [ $n -le 16 ]; then    # 4+6+6
        y=" $(( ((x>>12)&0x0F)+0xE0 )) \
            $(( ((x>> 6)&0x3F)+0x80 )) \
            $(( (x&0x3F)+0x80 ))"
    else                        # 3+6+6+6
        y=" $(( ((x>>18)&0x07)+0xF0 )) \
            $(( ((x>>12)&0x3F)+0x80 )) \
            $(( ((x>> 6)&0x3F)+0x80 )) \
            $(( (x&0x3F)+0x80 ))"
    fi
    printf -v y '\\x%x' $y
    echo -n -e $y
}

# test
for (( i=0x2500; i<0x2600; i++ )); do
    UnicodePointToUtf8 $i
    [ "$(( i+1 & 0x1f ))" != 0 ] || echo ""
done
x='U+2620'
echo "$x -> $(UnicodePointToUtf8 $x)"

Output:

─━│┃┄┅┆┇┈┉┊┋┌┍┎┏┐┑┒┓└┕┖┗┘┙┚┛├┝┞┟
┠┡┢┣┤┥┦┧┨┩┪┫┬┭┮┯┰┱┲┳┴┵┶┷┸┹┺┻┼┽┾┿
╀╁╂╃╄╅╆╇╈╉╊╋╌╍╎╏═║╒╓╔╕╖╗╘╙╚╛╜╝╞╟
╠╡╢╣╤╥╦╧╨╩╪╫╬╭╮╯╰╱╲╳╴╵╶╷╸╹╺╻╼╽╾╿
▀▁▂▃▄▅▆▇█▉▊▋▌▍▎▏▐░▒▓▔▕▖▗▘▙▚▛▜▝▞▟
■□▢▣▤▥▦▧▨▩▪▫▬▭▮▯▰▱▲△▴▵▶▷▸▹►▻▼▽▾▿
◀◁◂◃◄◅◆◇◈◉◊○◌◍◎●◐◑◒◓◔◕◖◗◘◙◚◛◜◝◞◟
◠◡◢◣◤◥◦◧◨◩◪◫◬◭◮◯◰◱◲◳◴◵◶◷◸◹◺◻◼◽◾◿
U+2620 -> ☠

2 revs, 2 users 83% · Accepted Answer · 2015-09-13 10:38:52Z

4

The printf builtin (just as the coreutils' printf) knows the \u escape sequence which accepts 4-digit Unicode characters:

   \uHHHH Unicode (ISO/IEC 10646) character with hex value HHHH (4 digits)

Test with Bash 4.2.37(1):

$ printf '\u2620\n'
☠

edited Sep 13, 2015 at 10:38

community wiki

2 revs, 2 users 83%
Michael Jaros

printf is also a shell built-in. You're probably using the default macOS bash (v3). Try with \printf to use the standalone executable, or try with upgraded bash
– mcint
Commented Aug 29, 2018 at 20:21


3 revs, 2 users 85% · Accepted Answer · 2017-05-23 12:02:47Z

3

Based on Stack Overflow questions Unix cut, remove first token and https://stackoverflow.com/a/15903654/781312:

(octal=$(echo -n ☠ | od -t o1 | head -1 | cut -d' ' -f2- | sed -e 's#\([0-9]\+\) *#\\0\1#g')
echo Octal representation is following $octal
echo -e "$octal")

Output is the following.

Octal representation is following \0342\0230\0240
☠

edited May 23, 2017 at 12:02

community wiki

3 revs, 2 users 85%
test30


2 revs, 2 users 80% · Accepted Answer · 2018-11-14 14:11:44Z

2

Easy with a Python2/3 one-liner:

$ python -c 'print u"\u2620"'    # python2
$ python3 -c 'print(u"\u2620")'  # python3

Results in:

☠

edited Nov 14, 2018 at 14:11

community wiki

2 revs, 2 users 80%
Chris Johnson


philcolbourn · Accepted Answer · 2017-07-20 11:26:40Z

1

If hex value of unicode character is known

H="2620"
printf "%b" "\u$H"

If the decimal value of a unicode character is known

declare -i U=2*4096+6*256+2*16
printf -vH "%x" $U              # convert to hex
printf "%b" "\u$H"
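
The conversion step also accepts hex and octal input, since printf parses all three number formats. A sketch with a hypothetical to_u_escape helper that only builds the \uXXXX text (for code points up to U+FFFF) and does not print the character itself:

```shell
#!/usr/bin/env bash
# Hypothetical helper: format a code point number as the text \uXXXX.
to_u_escape() {
    printf '\\u%04x\n' "$1"
}

to_u_escape 9760      # \u2620  (decimal input)
to_u_escape 0x2620    # \u2620  (hex input)
to_u_escape 023040    # \u2620  (octal input)
```

The resulting text can be handed to printf "%b" as in the commands above.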

answered Jul 20, 2017 at 11:26

community wiki

philcolbourn


ycMia · Accepted Answer · 2023-12-04 19:33:30Z

0

If you mean Bash-like apps on Windows, such as MSYS2, you can use:

[Alt ( Hold ) ] + [decimal Numbers] + [Alt (Release)]

to enter a character by its decimal code, as used in HTML numeric character references, for example 🥰. It will be echoed and displayed properly.

Similarly, you can use that method in productivity software too.

answered Dec 4, 2023 at 19:33

community wiki

ycMia
