Timeline for Convert UTF-16 LE to UTF-8 in windows via command line
Current License: CC BY-SA 4.0
20 events
when | what | by | license | comment | |
---|---|---|---|---|---|
Nov 15, 2023 at 3:05 | answer | added | dave_thompson_085 | timeline score: 0 | |
Nov 14, 2023 at 21:18 | answer | added | Charles Miller | timeline score: 2 | |
Jun 19, 2023 at 13:16 | audit | Suggested edits | |||
Jun 6, 2023 at 22:54 | vote | accept | bfh47 | ||
Jun 6, 2023 at 22:54 | answer | added | bfh47 | timeline score: 2 | |
Jun 6, 2023 at 22:47 | history | edited | bfh47 | CC BY-SA 4.0 | I tried to remove unnecessary detail from the post, and make it easier to read/understand. It still accurately describes the original issue I had, without compromising |
May 30, 2023 at 7:59 | answer | added | harrymc | timeline score: 2 | |
May 29, 2023 at 22:19 | comment | added | bfh47 | @harrymc Hey, I just wanted you to know: weirdly enough, the chcp article you linked on ss64 happens to mention that "type" is one of the only commands that "allows reading and writing (UTF-16LE / BOM) files". If I click through to the page dedicated to "type", at the bottom there is a demo batch file that converts "an ASCII (Windows1252) file into a Unicode (UCS-2 le) text file". So TYPE and cmd trickery can be used to convert the encoding of a text file! I have no idea if their example can be reverse engineered; my batch knowledge hardly extends that far... | |
May 29, 2023 at 22:08 | comment | added | bfh47 | @harrymc Hey again! Annoyingly, the resultant file was in the same encoding as the original and performed equally poorly in cmd... I am wondering: the UTF-16 LE files created by recycle.exe can be displayed in cmd via type (albeit with weird spacing, but there is no real corruption), and they can even be "type"d to a brand new text document (of the same encoding). My only issue is that storing the txt file contents in a variable doesn't work. Could I find a way to parse the document to return just the valid, allowed characters? Would that work? Maybe that would then be saveable in a normal encoding | |
May 29, 2023 at 19:38 | comment | added | harrymc | This code might work to convert the UTF16 file: `type utf16.txt >ansi.txt`. If it works, I'll put up an answer. | |
May 29, 2023 at 18:32 | comment | added | bfh47 | @harrymc I am almost relieved that the encoding is set by the program. I just tried "netstat > text.txt" and it produced something n++ considered to be "ansi", so luckily (or unluckily) it's just this particular program I need to worry about. But as said earlier, I believe I have no method for altering its output encoding, only for dealing with it "after the fact". I have learned something, so thanks for that. | |
May 29, 2023 at 18:28 | comment | added | bfh47 | @harrymc Ok, I am using Frank Westlake's "recycle.exe" (web.archive.org/web/20160814031010/http://ss64.net/westlake/xp/…). I think development of this program is already finished, and I am not sure that its output can be changed in such a specific manner. I can run the program without piping, and it outputs things fine in CMD. I just thought that if we knew the txt format that is outputted, surely there is a method for converting it. notepad.exe can do this, so I would have thought there'd be a CLI program that can also do it. | |
May 29, 2023 at 18:21 | comment | added | harrymc | No, evidently this is not a matter of the code-page - it's the program that generates the output in UTF16. Do you control the program, or does it have any settings for that? | |
May 29, 2023 at 18:17 | comment | added | bfh47 | @harrymc, Hi again - if I put aside batch files for now and type "chcp 437" in cmd to change my active code page, then, within the same cmd window, redirect the program's output to a text file, it still produces a UTF-16 LE result. If I "type" that txt file within the same window, it still comes out with weird spacing. I assumed that might have worked, to be honest! I can run chcp after that in the same window and it assures me it's using 437. Do I maybe need to do everything in one line of input? | |
May 29, 2023 at 18:09 | comment | added | harrymc | Code page 866 is "DOS Cyrillic Russian", so there is no reason that it will generate UTF16. However, try putting the line `chcp 437` before the command. | |
May 29, 2023 at 18:06 | comment | added | bfh47 | @harrymc I did briefly come across some solutions which used "chcp", but I didn't have luck using them. From the sounds of it, though, maybe it deserves revisiting. Could you potentially provide a working example, or link one? I will do some research later today when I have time. If I run "chcp", it tells me I am using code page 866. | |
May 29, 2023 at 18:01 | comment | added | bfh47 | @DavidPostill That approach produces a UTF-8 BOM file, which is not displayed properly and gives garbled output in cmd. But thanks for the reply. "Set-Content" certainly looked different to "Out-File", which I demonstrated here, but it seems it does the same thing | |
May 29, 2023 at 17:49 | comment | added | harrymc | Are you using the chcp command? Try `chcp 437` (United States) to see if with it the program generates an ANSI file. | |
May 29, 2023 at 17:49 | comment | added | DavidPostill♦ | Does Converting text file to UTF-8 on Windows command prompt - Super User answer your question? | |
May 29, 2023 at 17:19 | history | asked | bfh47 | CC BY-SA 4.0 |
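
The conversion discussed in the comments above (UTF-16 LE output, as produced by recycle.exe, rewritten as UTF-8 without a BOM) can be sketched outside of cmd. The following is only an illustrative sketch in Python, which is not a tool anyone in this thread used; the function name `utf16le_to_utf8` and the file names in the usage note are hypothetical. It assumes the whole file fits in memory and that the input really is UTF-16 LE, with or without a leading BOM.

```python
import codecs
from pathlib import Path


def utf16le_to_utf8(src: Path, dst: Path) -> None:
    """Rewrite a UTF-16 LE text file as UTF-8 without a BOM.

    Hypothetical helper for illustration; assumes the input is valid
    UTF-16 LE and small enough to read into memory at once.
    """
    raw = src.read_bytes()

    # Strip the UTF-16 LE byte order mark (FF FE) if the file starts with one,
    # so it is not carried over into the output as U+FEFF.
    if raw.startswith(codecs.BOM_UTF16_LE):
        raw = raw[len(codecs.BOM_UTF16_LE):]

    # Decode the 16-bit code units, then encode as plain UTF-8.
    # Working on bytes keeps the original line endings (e.g. CRLF) untouched.
    text = raw.decode("utf-16-le")
    dst.write_bytes(text.encode("utf-8"))
```

With hypothetical file names, a call might look like `utf16le_to_utf8(Path("recycle_output.txt"), Path("recycle_output_utf8.txt"))`. Writing without a BOM is a deliberate choice here, since one comment above reports that a UTF-8 BOM file displayed as garbled text in cmd.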