I want to concatenate multiple text files that are encoded as UTF-8 with a BOM, using a Windows 10 batch file. Between each pair of files, I want to add a blank line. I used the TYPE command as shown below, but since each file starts with a UTF-8 BOM, the resulting output file has BOMs in the middle of it. I also tried the COPY command, and it did the same thing.

Example 1

ECHO -- File start >OUTPUT.TXT
TYPE file1 >>OUTPUT.TXT
ECHO( >>OUTPUT.TXT
TYPE file2 >>OUTPUT.TXT
ECHO( >>OUTPUT.TXT
.
.
.

Example 2

COPY header+file1+blankline+file2+blankline+... OUTPUT.TXT

I expected that TYPE would not echo the literal characters, but would instead use the BOM to determine the file encoding so that it could display the file correctly. Apparently not. :-( Does the TYPE command not understand Unicode at all? If it were a UTF-16 file, would it really output NUL characters between the letters?

What is an alternative? Do I need to use PowerShell?

  • Well, the type command is from the Stone Age, when no UTF formats existed, so how do you expect it to deal well with this? You simply can't copy files with different encodings into one file without unifying them first.
    – LotPings
    Commented May 17, 2018 at 17:42
  • There is a command called iconv, or perhaps another *nix command that has been ported to Windows, that can convert a file from UTF-16 with BOM to UTF-8 without BOM. You should probably use the command chcp 65001, as that's UTF-8 without BOM. Then you could try copy and see how that goes.
    – barlop
    Commented May 17, 2018 at 21:14
  • @LotPings Type might work fine with UTF-8 without BOM (chcp 65001). Remember that most commands are from the Stone Age, but they do get updated a little bit from time to time!
    – barlop
    Commented May 17, 2018 at 21:14
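
One possible alternative, sketched in Python rather than batch (this assumes Python 3 is installed, which the question does not mention): read each file as raw bytes, drop a leading UTF-8 BOM if one is present, and append a CRLF after each file in the same way the ECHO( lines in Example 1 do. The function name concat_utf8_files, its parameters, and the write_bom option are hypothetical names chosen for illustration. The sketch does not handle UTF-16 input.

```python
import codecs


def concat_utf8_files(inputs, output, header=b"-- File start", write_bom=False):
    """Concatenate UTF-8 files, stripping each file's leading BOM.

    A CRLF is written after every input file, mirroring the ECHO( lines
    in Example 1. All names here are illustrative assumptions.
    """
    bom = codecs.BOM_UTF8  # the three bytes EF BB BF
    with open(output, "wb") as out:
        if write_bom:
            # Optionally emit a single BOM at the very start of the output.
            out.write(bom)
        out.write(header + b"\r\n")
        for name in inputs:
            with open(name, "rb") as src:
                data = src.read()
            # Remove the BOM only when it is actually at the start of the file.
            if data.startswith(bom):
                data = data[len(bom):]
            out.write(data)
            out.write(b"\r\n")
```

It could be called as concat_utf8_files(["file1", "file2"], "OUTPUT.TXT"). Working on bytes instead of decoded text leaves the existing line endings and all other content untouched, so only the BOMs are removed.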
