
I have a telnet server that allows the user to specify the encoding used for communication. The server then uses that encoding, provided it is supported.

I tried the following encodings, all of which turned Unicode characters into garbage on the target:

  • CP-1252
  • UTF-8
  • Latin1

The server also allows me to list all encodings, each followed by the characters ěščřžýáíé, to test which one works. This is the output in Windows Telnet:

UTF-8: ─Ť┼í─Ź┼Ö┼ż├Ż├í├ş├ę
ISO-8859-1: ?????řßÝÚ
latin1: ?????řßÝÚ
CP819: ?????řßÝÚ
IBM819: ?????řßÝÚ
iso-ir-100: ?????řßÝÚ
csISOLatin1: ?????řßÝÚ
ISO-8859-15: ?Ę??ŞřßÝÚ
latin9: ?Ę??ŞřßÝÚ
☺Y☺~☺řßÝÚ a☺
☺Y☺~řßÝÚE: ☺☺a☺
☺Y☺~☺řßÝÚ☺
☺Y☺~☺řßÝÚa☺
☺Y☺~řßÝÚE: ☺a☺
☺Y☺~☺řßÝÚ
System:
Big5-HKSCS:  n????? h q m
Big5: ?????????
Big5-ETen: ?????????
CP950: ?????????
windows-949: ?????????
CP949: ?????????
EUC-KR: ?????????
Shift_JIS: ?????????
SJIS: ?????????
MS_Kanji: ?????????
ISO-2022-JP: ?????????
JIS7: ?????????
EUC-JP: ?????????
GB2312:   ?????
GBK:
CP936:
MS936:
windows-936:
GB18030:    0 8 0 0 0 0 0 6 0 5
hp-roman8: ? ???
roman8: ? ???
csHPRoman8: ? ???
TIS-620: ?????????
ISO 8859-11: ?????????
WINSAMI2: ?  ?
WS2: ?  ?
macintosh: ??????
Apple Roman: ??????
MacRoman: ??????
windows-1258: ??????
CP1258: ??????
windows-1257: ?  ? ???
CP1257: ?  ? ???
windows-1256: ????????
CP1256: ????????
windows-1255: ?????????
CP1255: ?????????
windows-1254: ? ????
CP1254: ? ????
windows-1253: ?????????
CP1253: ?????????
windows-1252: ? ??
CP1252: ? ??
windows-1251: ?????????
CP1251: ?????????
windows-1250:
CP1250:
IBM866: ?????????
CP866: ?????????
csIBM866: ?????????
IBM874: ?????????
CP874: ?????????
IBM850: ?????
CP850: ?????
csPC850Multilingual: ?????
ISO-8859-16: ?  ? ?
iso-ir-226: ?  ? ?
latin10: ?  ? ?
ISO-8859-14: ?????
iso-ir-199: ?????
latin8: ?????
iso-celtic: ?????
ISO-8859-13: ?  ? ???
ISO-8859-10: ?  ?
iso-ir-157: ?  ?
latin6: ?  ?
ISO-8859-10:1992: ?  ?
csISOLatin6: ?  ?
ISO-8859-9: ??????
iso-ir-148: ??????
latin5: ??????
csISOLatin5: ??????
ISO-8859-8: ?????????
ISO 8859-8-I: ?????????
iso-ir-138: ?????????
hebrew: ?????????
csISOLatinHebrew: ?????????
ISO-8859-7: ?????????
ECMA-118: ?????????
greek: ?????????
iso-ir-126: ?????????
csISOLatinGreek: ?????????
ISO-8859-6: ?????????
ISO-8859-6-I: ?????????
ECMA-114: ?????????
ASMO-708: ?????????
arabic: ?????????
iso-ir-127: ?????????
csISOLatinArabic: ?????????
ISO-8859-5: ?????????
cyrillic: ?????????
iso-ir-144: ?????????
csISOLatinCyrillic: ?????????
ISO-8859-4: ?  ? ?
latin4: ?  ? ?
iso-ir-110: ?╣Ŕ?ż?ßÝÚ
csISOLatin4: ?╣Ŕ?ż?ßÝÚ
ISO-8859-3: ??????ßÝÚ
latin3: ??????ßÝÚ
iso-ir-109: ??????ßÝÚ
csISOLatin3: ??????ßÝÚ
ISO-8859-2: ý╣Ŕ°żřßÝÚ
latin2: ý╣Ŕ°żřßÝÚ
iso-ir-101: ý╣Ŕ°żřßÝÚ
csISOLatin2: ý╣Ŕ°żřßÝÚ

With PuTTY, it clearly works, and the correct encoding is UTF-8. This is what I get when using PuTTY (I cut the rest of the long list):

UTF-8: ěščřžýáíé
ISO-8859-1: ?????▒▒▒▒
latin1: ?????▒▒▒▒
CP819: ?????▒▒▒▒
IBM819: ?????▒▒▒▒
iso-ir-100: ?????▒▒▒▒
csISOLatin1: ?????▒▒▒▒
ISO-8859-15: ?▒??▒▒▒▒▒
latin9: ?▒??▒▒▒▒▒
Y~▒▒▒▒LE:
Y~▒▒▒▒2BE:

There may be a problem on the server, but before I consider that possibility, I need to know which encoding the Microsoft Telnet Client actually uses. Which encoding is it? Is it stored in some system variable?
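One way to narrow this down offline is to reproduce the garbage: encode the test string the way the server would, then decode the bytes with a candidate client code page and compare the result with the listing. The sketch below assumes the client simply renders incoming bytes in the console's OEM code page. CP852 (the Central European DOS code page) is a guess made for illustration, not something confirmed in this post, and `as_seen_by` is a hypothetical helper name.

```python
# Hypothesis check: does "server bytes rendered in an OEM code page" explain the garbage?
# ASSUMPTION: cp852 is only a candidate client code page; swap in others to compare.
SAMPLE = "ěščřžýáíé"


def as_seen_by(client_codepage: str, server_encoding: str, text: str = SAMPLE) -> str:
    """Encode text as the server would, then decode it as the client would."""
    raw = text.encode(server_encoding)
    # errors="replace" keeps going if a byte has no mapping in the client code page
    return raw.decode(client_codepage, errors="replace")


def compare(client_codepage: str, server_encodings: list[str]) -> dict[str, str]:
    """Map each server encoding to what the client would display for SAMPLE."""
    return {enc: as_seen_by(client_codepage, enc) for enc in server_encodings}
```

Under that assumption, `compare("cp852", ["utf-8", "iso-8859-2"])` yields `─Ť┼í─Ź┼Ö┼ż├Ż├í├ş├ę` and `ý╣Ŕ°żřßÝÚ`, which are the strings shown on the UTF-8 and ISO-8859-2 lines of the Windows Telnet listing. A match of this kind suggests which code page the console is using, but it does not prove how the client is implemented.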

  • If I remember correctly, it's probably 7-bit ASCII. If you see garbage in your remote session, it's probably because you didn't choose a setup compatible with the remote server. Commented Sep 5, 2016 at 20:10
  • @JuliePelletier I need to know which encoding name I should pass to the server. If it were just 7-bit ASCII, it wouldn't allow diacritics, I think. I have no problems sending and receiving ASCII symbols; only the Unicode ones produce garbage. Commented Sep 5, 2016 at 20:13
  • At first, I thought all you were getting was garbage (unwanted/nonsensical results), and suspected communication setup errors. But then I saw the PuTTY output and figured it might not be random. Is the PuTTY UTF-8 output exactly what you were expecting to see? Note that prior to Win10, Windows Telnet was known to provide very poor terminal support except for raw text containing no escape codes. I read that Microsoft planned to make changes to the console in Win10; I haven't yet determined whether those changes make Telnet work better than in prior versions.
    – TOOGAM
    Commented Sep 5, 2016 at 20:48
  • Per RFC 854, the Network Virtual Terminal character device uses 7-bit USASCII characters in an 8-bit field. tools.ietf.org/html/rfc854 See the section entitled "THE NVT PRINTER AND KEYBOARD" on page 10. Commented Sep 5, 2016 at 20:58
  • As @FrankThomas points out, RFC 854 also says in THE NETWORK VIRTUAL TERMINAL: "The NVT is intended to strike a balance between being overly restricted (not providing hosts a rich enough vocabulary for mapping into their local character sets), and being overly inclusive (penalizing users with modest terminals)." It also states: "The code set is seven-bit USASCII in an eight-bit field, except as modified herein. Any code conversion and timing considerations are local problems and do not affect the NVT." Commented Sep 5, 2016 at 22:15

1 Answer


Telnet is encoded in 7-bit ASCII by default. There is an eight-bit (binary) mode that can be negotiated, which is usually used for data transfers.
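As a rough illustration of that negotiation, the sketch below builds the option bytes for TRANSMIT-BINARY (RFC 856) using the command codes from RFC 854, and escapes data bytes for the wire. The function names are hypothetical, and whether a given client (Microsoft's included) accepts the option is not established here. Binary mode only makes the channel 8-bit clean; it does not tell the client which character set to render, which RFC 854 treats as a local matter.

```python
# Telnet command codes (RFC 854) and the TRANSMIT-BINARY option number (RFC 856).
IAC, DONT, DO, WONT, WILL = 255, 254, 253, 252, 251
TRANSMIT_BINARY = 0


def request_binary_both_ways() -> bytes:
    """Bytes one side might send to propose 8-bit transmission in both directions."""
    # WILL = "I want to send binary", DO = "please send binary to me"
    return bytes([IAC, WILL, TRANSMIT_BINARY, IAC, DO, TRANSMIT_BINARY])


def escape_iac(payload: bytes) -> bytes:
    """Double every 0xFF data byte so it is not mistaken for a Telnet command."""
    return payload.replace(bytes([IAC]), bytes([IAC, IAC]))


def encode_for_wire(text: str, encoding: str) -> bytes:
    """Encode text in the chosen character set, then apply IAC escaping."""
    return escape_iac(text.encode(encoding))
```

For example, `encode_for_wire("ěščřžýáíé", "utf-8")` leaves the bytes unchanged because UTF-8 never produces 0xFF, whereas single-byte encodings can produce it (ÿ is 0xFF in ISO-8859-1) and then need the doubling.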

  • Could you elaborate, please? Is there any documentation describing how to use the eight-bit mode to send Unicode characters? Commented Sep 5, 2016 at 20:32
