How can I convert extended ascii to a System.String?

Question

For example: "½" or ASCII DEC 189. When I read the bytes from a text file the byte[] contains the valid value, in this case 189.

Converting to Unicode results in the Unicode replacement character 65533.

    UnicodeEncoding.Unicode.GetString(b);

Converting to ASCII results in 63 or "?".

    ASCIIEncoding.ASCII.GetString(b);

If this isn't possible what is the best way to handle this data? I'd like to be able to perform string functions like Replace().

Richard · Accepted Answer · 2009-03-20 14:50:07Z

Byte 189 represents a "½" in iso-8859-1 (aka "Latin-1"), so the following may be what you want:

    var e = Encoding.GetEncoding("iso-8859-1");
    var s = e.GetString(new byte[] { 189 });

All strings and chars in .NET are UTF-16 encoded, so you need an encoder/decoder to convert anything else. Sometimes the encoding is defaulted (e.g. UTF-8 for StreamReader instances), but good practice is to always specify it.

You will need some form of implicit or (better) explicit metadata to supply you with the information about which encoding.
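
To connect this back to the Replace() requirement in the question, here is a minimal sketch that decodes Latin-1 bytes once and then uses ordinary string methods. It assumes the data really is ISO-8859-1; the class and method names (Latin1Example, DecodeAndReplace, Encode) are hypothetical and exist only for illustration.

```csharp
using System;
using System.Text;

static class Latin1Example
{
    // ISO-8859-1 is one of the encodings built into .NET, so no extra
    // provider registration is needed. The instance can be reused.
    static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    // Hypothetical helper: decode the bytes, then work on a normal string.
    public static string DecodeAndReplace(byte[] data)
    {
        string text = Latin1.GetString(data);
        // "\u00BD" is the one-half sign; written as an escape so the
        // example does not depend on the source file's own encoding.
        return text.Replace("\u00BD", "1/2");
    }

    // Going back to bytes uses the same encoding instance.
    public static byte[] Encode(string text) => Latin1.GetBytes(text);
}
```

When the bytes come from a file, File.ReadAllText(path, encoding) accepts the same Encoding instance and does the read and the decode in one step.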

This encoding stuff has been driving me mad... but your answer did the trick for me!!! It took me a while to figure out what to search for but finally I figured out what the proper search terms should be. :) Thanks for providing me with a simple solution to my problem. :)
– Dave
Commented Jan 12, 2011 at 1:02
But we can't add this code in every read, there should be some other better way to do this.
– RJN
Commented Mar 14, 2018 at 16:34
@Rajan365: What do you mean by "every read"? (And likely you should be asking a new question.)
– Richard
Commented Mar 14, 2018 at 17:17
@Richard I mean, instead of explicitly specifying the code page like "iso-8859-1", can I use Encoding.Default, which would give the same code page?
– RJN
Commented Mar 19, 2018 at 13:28
@Rajan365 If the default is always the right encoding, then sure. But if the locale of the user is changed then maybe the default encoding will as well. Also, you can of course keep the Encoding instance around, you do not need to get a new instance for each string.
– Richard
Commented Mar 19, 2018 at 13:36

Tom Wilson · Accepted Answer · 2012-02-28 22:59:39Z

The old PC-8 or Extended ASCII character set was around before IBM and Microsoft introduced the idea of code pages to the PC world. This WAS Extended ASCII - in 1982. In fact, it was the ONLY character set available on PCs at the time, until the EGA card allowed you to load other fonts into VRAM.

This was also the default standard for ANSI terminals, and nearly every BBS I dialed up to in the '80s and early '90s used this character set for displaying menus and boxes.

Here's the code to turn 8-bit Extended ASCII into Unicode text. Note the key bit of code: the GetEncoding("437") call. That uses Code Page 437 to translate the 8-bit ASCII text to the Unicode equivalent.

    string ASCII8ToString(byte[] ASCIIData)
    {
        var e = Encoding.GetEncoding("437");
        return e.GetString(ASCIIData);
    }
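
Two caveats are worth sketching here. On .NET Core and .NET 5 or later, code page 437 is only available after the code pages provider has been registered. Also, in code page 437 the byte 189 from the question is a box-drawing character, and "½" sits at byte 171. The following is a minimal sketch assuming a runtime where CodePagesEncodingProvider is available; the names Cp437Example and GetCp437 are hypothetical.

```csharp
using System;
using System.Text;

static class Cp437Example
{
    public static Encoding GetCp437()
    {
        // Needed on .NET Core / .NET 5+; registering more than once is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(437);
    }

    public static void Demo()
    {
        Encoding cp437 = GetCp437();

        // Byte 189 (0xBD) is a box-drawing corner (U+255C) in code page 437.
        string box = cp437.GetString(new byte[] { 189 });

        // Byte 171 (0xAB) is the one-half sign (U+00BD) in code page 437.
        string half = cp437.GetString(new byte[] { 171 });

        Console.WriteLine($"189 -> U+{(int)box[0]:X4}, 171 -> U+{(int)half[0]:X4}");
    }
}
```

So if byte 189 is expected to come out as "½", the data is more likely Latin-1 or Windows-1252 than code page 437.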

Wow! Thank you! As a side-note, your answer is also a really good solution for how to turn a byte array into a string and back.
– mike
Commented Oct 26, 2016 at 5:32

Jon Skeet · Accepted Answer · 2009-03-20 14:32:19Z

It depends on exactly what the encoding is.

There's no such thing as "ASCII 189" - ASCII only goes up to 127. There are many encodings which are 8-bit encodings using ASCII for the first 128 values.

You may want Encoding.Default (which is the default encoding for your particular system), but it's hard to know for sure. Where did your data come from?
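
To see how much the result depends on the encoding, here is a small sketch that decodes the same byte with several encodings that are built into .NET. The helper name Describe is hypothetical. One thing to keep in mind: on .NET Core and .NET 5 or later, Encoding.Default is always UTF-8, so it only reflects the system's ANSI code page on .NET Framework.

```csharp
using System;
using System.Text;

static class DecodeComparison
{
    // Hypothetical helper: decode one byte and report the resulting code point.
    public static string Describe(Encoding encoding, byte value)
    {
        string decoded = encoding.GetString(new[] { value });
        return $"{encoding.WebName}: U+{(int)decoded[0]:X4}";
    }

    public static void Demo()
    {
        byte b = 189;
        // Latin-1 maps the byte straight to U+00BD (one half).
        Console.WriteLine(Describe(Encoding.GetEncoding("iso-8859-1"), b));
        // ASCII has no value 189, so the decoder substitutes '?' (U+003F).
        Console.WriteLine(Describe(Encoding.ASCII, b));
        // A lone 0xBD is not valid UTF-8, so it becomes U+FFFD.
        Console.WriteLine(Describe(Encoding.UTF8, b));
    }
}
```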

What I'm reading into the byte[] lines up with 188 - 190 in this extended ascii chart: charlie.balch.org/asp/ascii.asp. Encoding.Default did the trick. Thanks a bunch!
– rtremaine
Commented Mar 20, 2009 at 14:47
Glad it worked - just be aware that anyone who talks about "extended ASCII" as if that means one particular encoding doesn't know what they're talking about. It's like talking about "one dollar" - one US dollar, Australian dollar, Canadian dollar, what? It may make sense in a particular context
– Jon Skeet
Commented Mar 20, 2009 at 14:54
but it isn't a definitive and unique idea. So I dare say Charlie's idea of "extended ASCII" is appropriate for his culture - but it wouldn't match what happens on some other people's computers.
– Jon Skeet
Commented Mar 20, 2009 at 14:54

Community · Accepted Answer · 2017-05-23 11:53:55Z

ASCII itself has no characters above 127. If you are trying to work with extended ASCII characters such as œ ¢ ½ ¾, here is the method to convert them into their binary and decimal equivalents:
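
The method this answer refers to did not survive in this copy. As a stand-in, and not the author's original code, here is a minimal sketch of one way to print the decimal and binary value of each character. It reports the UTF-16 code unit that .NET stores, which equals the Latin-1 byte value only for characters up to 255. The names CharValues and Describe are hypothetical.

```csharp
using System;

static class CharValues
{
    // Hypothetical helper: show a char with its decimal and binary value.
    public static string Describe(char c)
    {
        int code = c; // UTF-16 code unit
        // Pad to at least 8 digits; values above 255 produce more digits.
        string binary = Convert.ToString(code, 2).PadLeft(8, '0');
        return $"{c} = {code} = {binary}";
    }

    public static void Demo()
    {
        // U+0153 (oe ligature), U+00A2 (cent), U+00BD (half), U+00BE (three quarters)
        foreach (char c in "\u0153\u00A2\u00BD\u00BE")
        {
            Console.WriteLine(Describe(c));
        }
    }
}
```

Note that œ is U+0153 (339), so it does not fit in one Latin-1 byte, which is another sign that "extended ASCII" means different things in different code pages.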

answered Jul 18, 2014 at 14:41 by Ritwik (edited May 23, 2017 at 11:53 by Community Bot)
