7

I have the following variable:

var string="Mazatl%E1n";

The string is returned like that by the server. All I want to do is decode that into: Mazatlán. I've tried the following:

var string="Mazatl%E1n";

alert(unescape(string));
alert(decodeURI(string));

unescape works fine, but I don't want to use it because I understand it is deprecated. Instead I tried decodeURI, which fails with the following error:

Uncaught URIError: URI malformed

Why? Any help is appreciated.

3
  • Look into decodeURIComponent... Commented Jan 12, 2016 at 22:27
  • 1
    escape() and unescape() are defined for ISO-8859-1 (Latin-1) strings. decodeURI() and encodeURI() are defined for UTF-8 strings. Commented Jan 12, 2016 at 22:37
  • 3
    @JohanKarlsson is heading in the right direction. %E1 is the Unicode code point of á (U+00E1) written as a single escape, but URIs use UTF-8, so the correct encoding is %C3%A1. You can see this by running encodeURIComponent("Mazatlán"). Commented Jan 12, 2016 at 22:39

3 Answers

7

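You get the error because %E1 is the raw code point of á (U+00E1, the same single byte it has in ISO-8859-1), whereas decodeURI() expects percent-encoded UTF-8, where á is %C3%A1. On its own, the byte E1 is an incomplete UTF-8 sequence, hence the URIError.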
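The difference between the two encodings can be checked directly. This is a minimal sketch, not part of the original answer: the variable and helper names are made up for illustration, and it assumes an environment where TextEncoder is available as a global (current browsers and recent Node.js versions).

```javascript
// Illustrative only: these names are not from the original post.
const accented = "\u00e1"; // "á"

// Code point as a single hex value: what escape()-style encoding uses.
const codePointHex = accented.charCodeAt(0).toString(16).toUpperCase();

// UTF-8 bytes, percent-encoded one by one: what decodeURI() expects.
const utf8Percent = Array.from(
  new TextEncoder().encode(accented),
  (byte) => "%" + byte.toString(16).toUpperCase().padStart(2, "0")
).join("");

// Returns the error name if decodeURI() throws, otherwise null.
function decodeErrorName(input) {
  try {
    decodeURI(input);
    return null;
  } catch (err) {
    return err.name;
  }
}

console.log(codePointHex); // E1
console.log(utf8Percent); // %C3%A1
console.log(encodeURIComponent("Mazatl" + accented + "n")); // Mazatl%C3%A1n
console.log(decodeErrorName("Mazatl%E1n")); // URIError
console.log(decodeErrorName("Mazatl%C3%A1n")); // null
```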

You'll either have to create your own unescape function, for example:

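// Decodes escape()-style sequences: %uXXXX (UTF-16 code unit) or %XX (Latin-1 code point).
function unicodeUnEscape(string) {
  // [\dA-F] with the i flag accepts hex digits in either case, e.g. %e1 and %E1.
  return string.replace(/%u([\dA-F]{4})|%([\dA-F]{2})/gi, function(_, m1, m2) {
    // Exactly one of the two groups matched; parse its hex digits as a character code.
    return String.fromCharCode(parseInt(m1 || m2, 16));
  });
}

var string = "Mazatl%E1n";
document.body.innerHTML = unicodeUnEscape(string);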

Or you could change the server to send the string percent-encoded as UTF-8 instead, in which case you can use decodeURI():

var string = "Mazatl%C3%A1n";
document.body.innerHTML = decodeURI(string);
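If the server cannot be changed right away, the two options can be combined. The following is only a sketch under assumptions: decodePercent is a hypothetical helper name, and the fallback treats every %XX as a Latin-1 code point whenever UTF-8 decoding fails.

```javascript
// Hypothetical helper, not from the original answer.
// Tries UTF-8 percent-decoding first and falls back to Latin-1 %XX decoding.
function decodePercent(input) {
  try {
    return decodeURIComponent(input);
  } catch (err) {
    // Only swallow the "URI malformed" case; rethrow anything else.
    if (!(err instanceof URIError)) throw err;
    return input.replace(/%([0-9a-f]{2})/gi, (_, hex) =>
      String.fromCharCode(parseInt(hex, 16))
    );
  }
}

console.log(decodePercent("Mazatl%C3%A1n")); // Mazatlán
console.log(decodePercent("Mazatl%E1n")); // Mazatlán
```

The fallback is a guess about the sender's encoding. A Latin-1 string whose bytes happen to form valid UTF-8 would be decoded as UTF-8, so fixing the encoding at the source remains the more reliable option.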

3
  • Could you document the regex? Commented Jan 12, 2016 at 23:33
  • 1
    It matches either %u followed by 4 hexadecimal digits, or % followed by 2 hexadecimal digits: regex101.com/r/cC4tN4/1 Commented Jan 12, 2016 at 23:38
  • 1
    I would recommend, @JohanKarlsson, that you add an i (case-insensitive) flag after the g of the regex, as %2f and %2F are both valid encodings. Commented Jul 24, 2019 at 8:56
2

URIs support only the ASCII character set, so non-ASCII characters are percent-encoded from their UTF-8 bytes. The correct encoding for á is therefore %C3%A1.



escape and unescape use a different hexadecimal escape format (%XX for code points up to 0xFF, %uXXXX above that), so the value you're getting from the server was presumably encoded using escape(string) or something equivalent.
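To make the two escape() formats concrete, here is a small sketch. The deprecated escape() is called only to show its output format, and the variable names are illustrative.

```javascript
// escape() is deprecated; it is used here only to inspect the format it produces.
const latin1Range = escape("Mazatl\u00e1n"); // á is U+00E1, within 0x00-0xFF
const beyondLatin1 = escape("\u0151"); // ő is U+0151, above 0xFF
const asUtf8 = encodeURIComponent("\u0151"); // same character, UTF-8 percent-encoded

console.log(latin1Range); // Mazatl%E1n
console.log(beyondLatin1); // %u0151
console.log(asUtf8); // %C5%91
```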

0
1

The decodeURI() function expects a valid URI as its parameter. If you are only trying to decode a string instead of a full URI, use decodeURIComponent().
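A short sketch of how the two functions differ, using a made-up sample string: decodeURI() leaves escapes of reserved characters such as /, ? and = untouched, while decodeURIComponent() decodes them too. Both expect UTF-8, so as far as I can tell either one still throws a URIError for the lone %E1 in the question.

```javascript
// Sample input is an assumption for illustration, not taken from the question.
const encoded = "a%2Fb%3Fq%3DMazatl%C3%A1n";

const viaDecodeURI = decodeURI(encoded);
const viaComponent = decodeURIComponent(encoded);

// Returns true if the given decoder rejects the input with a URIError.
function throwsURIError(decoder, input) {
  try {
    decoder(input);
    return false;
  } catch (err) {
    return err instanceof URIError;
  }
}

console.log(viaDecodeURI); // a%2Fb%3Fq%3DMazatlán
console.log(viaComponent); // a/b?q=Mazatlán
console.log(throwsURIError(decodeURI, "Mazatl%E1n")); // true
console.log(throwsURIError(decodeURIComponent, "Mazatl%E1n")); // true
```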

0
