I am working with batch files in Windows, using both Notepad and Notepad++. The batch files all start with @echo off, but when I run them (on two separate machines) the first line is echoed as ´╗┐@echo off, and all the REM lines below it are echoed as well.

I have tried changing the encoding in Notepad++, but it claims they are already at UTF-8 encoding, which appears to be correct.

What do I need to do to get these files to run properly?

  @luu My question is about Notepad++ specifically.
  Note that regular Notepad, when saving as UTF-8, does not allow saving without a BOM, and will add those characters.
    – dmcontador
    

It looks like the DOS ASCII encoding of the Byte Order Mark for UTF-8 (0xEF 0xBB 0xBF): http://en.wikipedia.org/wiki/Byte_order_mark

In Notepad++, try encoding it as "UTF-8 Without BOM" or as plain ASCII. I think the use of a BOM for UTF-8 is discouraged for this reason: it's not exactly backwards compatible with ASCII.
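
If a file already carries the BOM, the three bytes can also be removed outside the editor. Here is a minimal Python 3 sketch of that idea. The helper name strip_utf8_bom is hypothetical, and code page 850 is an assumption about the console; the active code page varies from system to system.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Decoding the BOM bytes with code page 850 (assumed console code page)
# gives the three stray characters from the question.
shown = codecs.BOM_UTF8.decode("cp850")  # shown == "´╗┐"

# A batch file saved as UTF-8 with BOM, before and after stripping.
sample = codecs.BOM_UTF8 + b"@echo off\r\n"
print(strip_utf8_bom(sample))
```

Applied to the raw bytes of a batch file, this leaves @echo off as the very first thing on the first line, which is what the command interpreter needs in order to recognize it.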

    Absolutely right, except the 'DOS ASCII' is DOS code page 850, as shown by experimentation in Python: >>> print u'\ufeff'.encode('utf8').decode('cp850') ´╗┐
    – deltab
    Commented Jun 17, 2014 at 6:00
  @deltab Ah, good find. I wasn't sure what the encoding was specifically called, just that I hadn't seen the line-art characters ╗┐ since the days of MS-DOS 5/Windows 3.11. Modern Windows must run batch files with that encoding for compatibility?
    – baochan
    Commented Jun 17, 2014 at 13:34
    I ran into this when using Visual Studio to create a new text file.

Turns out it needs to be set to ANSI encoding to work properly. To set this, I chose Encoding->Encode in ANSI.

To figure this out, I tried to create a batch file from the command line.

echo @echo off > batch.bat
echo REM Some comment... >> batch.bat
echo echo Hello world! >> batch.bat

I then opened this file in Notepad++ and checked the encoding in the lower right corner, which read "ANSI as UTF-8". I don't know why it adds that last bit, but it seems to work now.
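
The same check can be done without an editor by looking at the first three bytes of the file. This is a rough Python 3 sketch, not a definitive tool: has_utf8_bom and write_batch_without_bom are hypothetical helper names, and the sketch assumes the batch file contains only ASCII text.

```python
import codecs
from pathlib import Path

def has_utf8_bom(path) -> bool:
    """Report whether the file at path starts with the UTF-8 BOM bytes."""
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8

def write_batch_without_bom(path, lines) -> None:
    """Write lines with CRLF endings as plain ASCII bytes, so no BOM is added."""
    text = "\r\n".join(lines) + "\r\n"
    # Assumption: ASCII-only content; encode() raises on anything else.
    Path(path).write_bytes(text.encode("ascii"))
```

For example, write_batch_without_bom("batch.bat", ["@echo off", "REM Some comment...", "echo Hello world!"]) recreates the file from the commands above, and has_utf8_bom("batch.bat") should then return False.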

  ANSI is not really an encoding. Presumably it refers to your Windows system's default code page. That will vary from one system to another, depending on configuration.
  This is not correct. The BOM is a character set encoding artifact.
  @ThorbjørnRavnAndersen Who's incorrect, me or Cody?

