How to encode Cyrillic characters in JSON

Question

I want to read a JSON file containing Cyrillic symbols.

The Cyrillic symbols are stored as Unicode escapes of the form \uXXXX.

Python gives me '\\uXXXX' (with a doubled backslash) instead of the Cyrillic symbol.

For example, the string "\u0420\u0435\u0433\u0438\u043e\u043d" should become "Регион", but becomes "\\u0420\\u0435\\u0433\\u0438\\u043e\\u043d".

Calling encode() only makes the string look like u"..." or adds another backslash.

How do I convert "\u0420\u0435\u0433\u0438\u043e\u043d" to "Регион"?

JSON string: "\u0420\u0435\u0433\u0438\u043e\u043d"; desired: "Регион"; actual: "\\u0420\\u0435\\u0433\\u0438\\u043e\\u043d" - Влад Кныш, commented Oct 15, 2016 at 3:44

Ignacio Vazquez-Abrams · Accepted Answer · 2016-10-15 21:03:01Z

7

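If you want the json module to output a string that contains non-ASCII characters, pass ensure_ascii=False to json.dump() or json.dumps(), then encode the result yourself afterward (for example, by writing it to a file opened with a UTF-8 encoding).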
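A minimal sketch of that suggestion, assuming Python 3. The file name region.json and the use of the system temp directory are illustrative choices, not something from the question.

```python
import json
import os
import tempfile

s = "Регион"

# Default behaviour (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(s)

# ensure_ascii=False leaves the Cyrillic characters untouched in the JSON text.
readable = json.dumps(s, ensure_ascii=False)

# "Encode manually afterward": pick the byte encoding when writing the text out.
path = os.path.join(tempfile.gettempdir(), "region.json")  # hypothetical file name
with open(path, "w", encoding="utf-8") as f:
    f.write(readable)

# Reading it back with the same encoding restores the original string.
with open(path, encoding="utf-8") as f:
    restored = json.load(f)

print(escaped)   # "\u0420\u0435\u0433\u0438\u043e\u043d"
print(readable)  # "Регион"
print(restored)  # Регион
```

Both forms are valid JSON and parse to the same string; ensure_ascii=False mainly matters when a person needs to read the file.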

answered Oct 15, 2016 at 21:03

Ignacio Vazquez-Abrams


Mark Tolonen · Accepted Answer · 2016-10-15 19:55:10Z

0

Just use the json module: json.load() decodes the \uXXXX escapes for you.

import json

s = "\u0420\u0435\u0433\u0438\u043e\u043d"

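# Generate a JSON file. json.dump escapes non-ASCII characters by default.
with open('test.json', 'w', encoding='ascii') as f:
    json.dump(s, f)

# Read it directly: the raw text still contains the \uXXXX escapes.
with open('test.json', encoding='ascii') as f:
    print(f.read())

# Read it with the json module: the escapes are decoded.
with open('test.json', encoding='ascii') as f:
    data = json.load(f)
print(data)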

Output:

"\u0420\u0435\u0433\u0438\u043e\u043d"
Регион
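The asker's code is not shown, so the following is only a guess at where the doubled backslashes come from: looking at the unparsed file text (or its repr) instead of the parsed value. A small sketch under that assumption, using json.loads on an in-memory string:

```python
import json

# Raw JSON text as it would sit in the file: a quoted string whose
# content is literal backslash-u escapes (note the raw string prefix).
raw = r'"\u0420\u0435\u0433\u0438\u043e\u043d"'

# repr() of the unparsed text shows every backslash doubled.
shown = repr(raw)

# json.loads parses the JSON text and decodes the escapes.
decoded = json.loads(raw)

print(shown)
print(decoded)  # Регион
```

If the value still contains literal \uXXXX sequences after json.load, the data was probably escaped twice when it was written, which is a separate problem from reading it.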

answered Oct 15, 2016 at 19:55

Mark Tolonen

