Converting string of unicode emoji

Question

I have a list of strings that basically represent unicode emojis, e.g.:

emoji[0] = 'U+270DU+1F3FF'

I would like to convert this "almost" Unicode representation into the actual emoji characters so that I can search through text documents that contain these emojis. My attempt fails:

emoji[0] = emoji[0].replace('U+', '\U000')
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-4: truncated \UXXXXXXXX escape

How can I accomplish that?

L3viathan · Accepted Answer · 2017-12-14 14:04:47Z

3

A solution that works no matter how many hex digits each code point has:

>>> import re
>>> e = 'U+270DU+1F3FF'
>>> def emojize(match):
...     return chr(int(match.group(0)[2:], 16))
...
>>> re.sub(r"U\+[0-9A-F]+", emojize, e)
'✍🏿'
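
To get from there to the stated goal of searching documents, here is a minimal sketch for a Python 3 script. The names `to_emoji`, `CODEPOINT`, the sample list and the sample text are made up for illustration, and it assumes every entry is a run of `U+` followed by hex digits (upper or lower case):

```python
import re

# "U+" followed by one or more hex digits; the digits are captured in group 1.
CODEPOINT = re.compile(r"U\+([0-9A-Fa-f]+)")

def to_emoji(notation):
    """Turn e.g. 'U+270DU+1F3FF' into the characters it names."""
    return CODEPOINT.sub(lambda m: chr(int(m.group(1), 16)), notation)

# Hypothetical input list and document text.
emoji = ["U+270DU+1F3FF", "U+1F600"]
converted = [to_emoji(e) for e in emoji]

text = "Signed \u270d\U0001f3ff and sent"

# Plain substring search works once both sides are real characters.
found = [e for e in converted if e in text]
print(found)
```

Note that `chr` raises `ValueError` for values above 0x10FFFF, so malformed entries would need handling of their own.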

Ivan · Accepted Answer · 2017-12-14 13:55:48Z

2

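This is because a \U escape must be followed by exactly eight hex digits, and you have 4 digits in 270D and 5 in 1F3FF, so each needs a different amount of zero padding (Python 2 shown):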

>>> e = 'U+270D'
>>> print e.replace('U+', '\U0000').decode('unicode-escape')
✍
>>> e = 'U+1F3FF'
>>> print e.replace('U+', '\U000').decode('unicode-escape')
🏿
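
The same padding idea can probably be carried over to Python 3, where `str` has no `.decode()`. A rough sketch, assuming the input is pure ASCII of the form `U+XXXX...`; the helper name `pad_and_decode` is made up here:

```python
import re

def pad_and_decode(notation):
    # Rewrite each "U+XXXX" as a literal backslash-U escape,
    # zero-padded to the eight hex digits that \U requires.
    escaped = re.sub(
        r"U\+([0-9A-Fa-f]+)",
        lambda m: "\\U" + m.group(1).zfill(8),
        notation,
    )
    # Go through bytes so the unicode-escape codec can interpret the escapes.
    return escaped.encode("ascii").decode("unicode-escape")

print(pad_and_decode("U+270DU+1F3FF"))
```

Padding with `zfill(8)` handles 4-digit and 5-digit code points in one pass, so the mixed string from the question needs no special casing.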
