(question re-written to be more useful)
I am writing a batch script which will interact with command line programs, take their output, and then make decisions based on that output.
One of the programs I need to interact with is a fairly old one, so I am stuck with its quirks. When I pipe its output to a text file, the text file is in UTF-16 LE encoding. The contents of that text file are later meant to be read back into the script as a variable, which will be used for further decision making.
Here's how I do that:
program -parameter > resultat.txt
The resultant file is "UTF-16 Little Endian" according to Notepad++; according to Notepad, it is "Unicode".
The problem:
Under Windows 7, this encoding seems to be troublesome for cmd/batch work, because you cannot read the contents of such a text file into a variable.
Here is an example (this only uses the first line of the text file):
set /p Var=<resultat.txt
echo %Var%
cmd /k
(The 3-line script above is my "test" script, which I used later to evaluate the success of different conversion attempts.) Also, if you use "type" to print the contents of the text file, there is weird spacing, with extra spaces between every letter of every word, suggesting it's not being processed properly.
Attempted solution [1] - PowerShell
After some research, I found that PowerShell can convert text file encodings, using the following method:
Get-Content -Path "path\file.txt" | Out-File -FilePath "path\new_file.txt" -Encoding <encoding>
Using Notepad++, I did some research: what encoding do I need to attain?
UTF-8 (no BOM), which is what Notepad++ reports for a file saved as "ANSI" in Notepad, is the encoding I need. Loading text files into variables and the "type" command both work flawlessly when this encoding is used. How do I know? If I open the piped text file in Notepad and resave it with "ANSI" encoding, everything works flawlessly.
-Encoding ascii
...Is the option which should have worked, as this produces a result in UTF-8 (no BOM), but it seems to be unable to handle the UTF-16 LE source encoding, and does not produce usable output. When I opened the resultant file in Notepad++, it identified it as UTF-16 LE "Unix", which was odd.
Funnily enough, if I resave the piped txt file as "Unicode" in Notepad, this produces a UTF-16 LE BOM file, which works with the above conversion parameter to produce a perfect UTF-8 file. At this point, I extended my research to also ask the question "How can I add a BOM to UTF-16 LE encoding?", as I could combine such knowledge with the PowerShell knowledge. However, spoiler alert: I was unsuccessful in finding a decent answer.
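A possible explanation for the behaviour above, offered as an assumption rather than a confirmed diagnosis: in Windows PowerShell, Get-Content without -Encoding appears to rely on the BOM to recognise UTF-16, so a BOM-less file may be decoded with the wrong encoding. The sketch below names the source encoding explicitly with -Encoding Unicode (PowerShell's name for UTF-16 LE). The file names and the sample text are placeholders, not the real program output.

```powershell
# Hypothetical paths in the temp folder; substitute the real piped file.
$tmp = [System.IO.Path]::GetTempPath()
$src = Join-Path $tmp 'resultat.txt'
$dst = Join-Path $tmp 'resultat_ascii.txt'

# Stand-in for the program's output: UTF-16 LE bytes with no BOM.
$noBom = New-Object System.Text.UnicodeEncoding($false, $false)
[System.IO.File]::WriteAllBytes($src, $noBom.GetBytes("first line`r`nsecond line`r`n"))

# Tell Get-Content the source encoding instead of relying on BOM detection,
# then write the lines back out as plain ASCII.
Get-Content -Path $src -Encoding Unicode |
    Out-File -FilePath $dst -Encoding ascii
```

If this works on the real file, the converted file should pass the same "set /p" test script shown earlier.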
-Encoding utf8
...Is another similar option, but it produces a UTF-8 BOM file (the equivalent of saving as "UTF-8" in Notepad), and this produces corrupted output.
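To check whether a file really starts with a BOM, independent of what an editor reports, the first few bytes can be dumped as hex. This is a small sketch; Get-LeadingBytes and the sample file name are hypothetical, introduced only for illustration. A UTF-8 BOM is the byte sequence EF BB BF, and a UTF-16 LE BOM is FF FE.

```powershell
# Hypothetical helper: returns the first bytes of a file as a hex string.
function Get-LeadingBytes([string]$Path, [int]$Count = 4) {
    $bytes = [System.IO.File]::ReadAllBytes($Path)
    $take = [Math]::Min($Count, $bytes.Length)
    if ($take -eq 0) { return '' }
    ($bytes[0..($take - 1)] | ForEach-Object { $_.ToString('X2') }) -join ' '
}

# Sample file written as UTF-8 with a BOM (Encoding.UTF8 emits one here).
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'bom_sample.txt'
[System.IO.File]::WriteAllText($sample, 'abc', [System.Text.Encoding]::UTF8)

Get-LeadingBytes $sample   # EF BB BF 61
```

Those three leading bytes would be one plausible source of the garbage characters seen in cmd in front of otherwise correct output.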
So to sum up, here is what I have tested (the results refer to the test script output, and the brackets show what Notepad++ identified the encoding as):
- by itself, unchanged: fails the "type" and "store contents as var" tests (no var is stored)
- when resaved as "Unicode" in Notepad (UTF-16 LE BOM): garbled cmd output
- when resaved as "ANSI" in Notepad (UTF-8 - Windows (CR LF)): works flawlessly
- when resaved as "UTF-8" in Notepad (UTF-8 BOM - Windows (CR LF)): garble + correct output
- piped txt conv. to "ascii" using PS (UTF-16 LE "Unix"): no var is stored in cmd
- piped txt conv. to "utf8" using PS (UTF-8 BOM): corrupted output
- Notepad "Unicode" resaved txt conv. to "ascii" using PS (UTF-8 - Windows (CR LF)): works flawlessly
- Notepad "Unicode" resaved txt conv. to "utf8" using PS (UTF-8 BOM - Windows (CR LF)): garble + correct output
I am not an expert at how text file encoding works, but from the results of the above testing, it seems to me that I am looking for a command line tool/method (open or proprietary, 1st or 3rd party) to achieve one of the following conversions:
- UTF-16 LE - Windows (CR LF) straight to UTF-8 - Windows (CR LF)
- UTF-16 LE - Windows (CR LF) to UTF-16 LE BOM - Windows (CR LF)
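For reference, both conversions listed above could be sketched in PowerShell by calling the .NET file APIs directly. This is a minimal sketch under assumptions: the file names are placeholders, the sample text stands in for the real program output, and it has not been verified against the actual program on Windows 7.

```powershell
# Hypothetical paths in the temp folder; substitute the real piped file.
$tmp  = [System.IO.Path]::GetTempPath()
$src  = Join-Path $tmp 'resultat.txt'
$utf8 = Join-Path $tmp 'resultat_utf8.txt'
$bom  = Join-Path $tmp 'resultat_utf16bom.txt'

# Stand-in for the program's output: UTF-16 LE bytes with no BOM.
$noBom = New-Object System.Text.UnicodeEncoding($false, $false)
[System.IO.File]::WriteAllBytes($src, $noBom.GetBytes("first line`r`nsecond line`r`n"))

# Decode the source as UTF-16 LE (Encoding.Unicode is .NET's name for it).
$text = [System.IO.File]::ReadAllText($src, [System.Text.Encoding]::Unicode)

# Conversion 1: UTF-16 LE to UTF-8 without a BOM.
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllText($utf8, $text, $utf8NoBom)

# Conversion 2: UTF-16 LE to UTF-16 LE with a BOM (FF FE is prepended).
[System.IO.File]::WriteAllText($bom, $text, [System.Text.Encoding]::Unicode)
```

Saved as a script (say convert.ps1, a hypothetical name), this could be called from the batch file with "powershell -NoProfile -ExecutionPolicy Bypass -File convert.ps1" before the "set /p" line. Line endings are carried over unchanged, since the text is only decoded and re-encoded.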
Is there any way to do this using a command line or PowerShell tool? I don't mind too much if it's 3rd party. I did evaluate some other solutions, but of course I had to avoid anything that either recommended a GUI application or wasn't specific to Windows.
...Or is there a chance I am going about this the wrong way? I tried to provide some context for what I am doing, so maybe there is a smarter solution.
Any help whatsoever or criticism is much appreciated.
Thanks in advance!