
(Note that I posted a previous iteration of my question on Stack Overflow.)

Hello there, I am trying to change multiple files from UTF-8 to ANSI in Notepad. My goal is to change them to ANSI, as my course has specified, so that I can import these files into MySQL.


Specifically, I am wondering how to convert files from UTF-8 to ANSI, as what I have picked up here is that I need to convert, not just change the encoding. Or, I might need to know how to overwrite a file.

So far, I use the "Open as..." option to open a csv file in Notepad, select ANSI encoding, and click Save. When the warning pops up that some of the Unicode characters will be lost in ANSI, I clicked Ok. However, when I go back to the "Save as" page to check if I changed all the files, some have changed back to UTF-8 encoding. My instructor has said that the encoding should change if I overwrite the file...I haven't hit an "Overwrite" option yet, so how will I know if I am overwriting it?


If your solution involves inserting code in the text, please clarify how this text-converting code will not be a part of the code in MySQL, as I am confused about how inserting code in the text file will change the code and the data within once it's in SQL. That's where I'm at in my journey to learn these applications...

I have read about the different encoding types and that they support different characters. The files that won't change encoding do contain non-Latin characters. Although the topic of encoding makes sense, I am getting the message from my course that I should be able to save all the files with ANSI encoding and then import them into MySQL. I have read here about importing all files as UTF-8 into MySQL, but it seems like there's a lot I could mess up in my file. At this stage, I would be more confident finding a way to convert the file to ANSI, if possible, and importing from there.

I feel like I've dumped a lot of information here, as I have tried a few different approaches to this. Anyone have ideas for what I'm missing here, or a workaround?

  • "So how will I know if I am overwriting it?" - This happens when you save the file under the name of the existing file. Why are you not using a more appropriate text editor? It's not clear how you have determined that you still have a problem, or even what your problem is exactly.
    – Ramhound
    Commented Apr 26 at 16:32
  • notepad-plus-plus.org
    – Gantendo
    Commented Apr 26 at 16:35
  • Thanks, everyone. I got some ideas from your comments.
    – Ruth-Anne
    Commented Apr 26 at 16:56
  • If you have non-ANSI characters in your files, they will be lost or changed to something else. Why can't you input UTF-8 files to MySQL, and how are you inputting them?
    – harrymc
    Commented Apr 26 at 18:09

2 Answers

1

My goal is to change them to ANSI like my course has specified, so that I can import these files in MySQL

That doesn't follow. MySQL has supported loading UTF-8 CSV files for almost 20 years now.

So far, I use the "Open as..." option to open a csv file in Notepad, select ANSI encoding, and click Save. When the warning pops up that some of the Unicode characters will be lost in ANSI, I clicked Ok

That's the wrong way to convert a file. Specifying an incorrect charset at open time will not convert anything – it will only misinterpret the bytes as if they had already been converted, which (depending on what 'ANSI' means for your region) can end up being mapped to completely wrong characters.
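To make the difference concrete, here is a minimal Python sketch (not something the question requires, just an illustration). It assumes 'ANSI' means Windows-1252 on your system, which may not be the case, and the sample word is made up.

```python
# Demo: reading UTF-8 bytes *as if* they were Windows-1252.
# Assumption: "ANSI" = cp1252 here; on your machine it may be another code page.
original = "café"
utf8_bytes = original.encode("utf-8")   # the bytes stored in a UTF-8 file

# Wrong: pretend the UTF-8 bytes are already cp1252. Nothing is converted;
# the two bytes that make up "é" are shown as two unrelated characters.
misread = utf8_bytes.decode("cp1252")

# Right: decode with the charset the bytes are really in, then encode to the target.
converted = utf8_bytes.decode("utf-8").encode("cp1252")

print(misread)     # cafÃ©
print(converted)   # b'caf\xe9'
```

The first result is the garbled text you would then save; the second is the single byte that cp1252 actually uses for "é".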

Instead, you need to open the file in the charset that it is currently in, allowing the editor to understand what characters the bytes decode to, and then save it as the new charset you want.

It's like converting an image: if you have a PNG, you don't open it as JPEG – you open it as the PNG that it is and save it as JPEG.
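The same "open as what it is, save as what you want" idea, sketched in Python under some assumptions: the function name `convert_file` is made up for illustration, the source is assumed to be UTF-8 (with or without a BOM), and cp1252 stands in for whatever 'ANSI' means on your machine.

```python
from pathlib import Path


def convert_file(src, dst, source_charset="utf-8-sig", target_charset="cp1252"):
    """Decode src using its real charset, then re-encode the text into dst.

    "utf-8-sig" reads UTF-8 and drops a leading byte-order mark if present.
    Working on raw bytes keeps the original line endings untouched.
    """
    text = Path(src).read_bytes().decode(source_charset)
    # errors="replace" turns characters the target charset lacks into "?",
    # similar in spirit to the data-loss warning an editor shows.
    Path(dst).write_bytes(text.encode(target_charset, errors="replace"))
```

Calling `convert_file("people.csv", "people-ansi.csv")` (hypothetical file names) leaves the original file alone and writes the converted copy next to it, so nothing is lost if the result turns out to be wrong.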

(Do keep in mind that there really is no single charset named 'ANSI' – it is region-dependent, so what counts as 'ANSI' in your OS might be Windows-1251, it might be cp1252, cp1257, … and even assuming you had to use it for loading data into some 30-year-old database, your better option would be to use an editor or a converter that takes the actual charset name explicitly.)
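A small Python illustration of why the explicit name matters: the same single byte is a different character under two of the code pages mentioned above. The byte value is just an example.

```python
# One byte, two "ANSI" code pages, two different characters.
raw = b"\xe9"

western = raw.decode("cp1252")    # Western European code page
cyrillic = raw.decode("cp1251")   # Cyrillic code page

print(western)    # é
print(cyrillic)   # й
```

A file saved as 'ANSI' on one machine can therefore read differently on another, which is avoided by naming the charset (for example cp1252) outright.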

0

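A good way to do this is with PowerShell (the ansi value for -Encoding needs PowerShell 7.4 or later; in Windows PowerShell 5.1, -Encoding Default selects the system's ANSI code page instead):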

Get-Content .\test.txt | Set-Content -Encoding ansi test-ansi.txt

You can easily apply loops to this if, for example, you need to change the encoding of all files in a directory. See Get-Help foreach
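For comparison, the same kind of loop sketched in Python rather than PowerShell. Everything here is an assumption for illustration: the function name `convert_folder`, the `*.csv` pattern, UTF-8 as the source charset, and cp1252 as the 'ANSI' target. It writes the converted copies to a separate folder and reports which files contained characters the target charset cannot hold.

```python
from pathlib import Path


def convert_folder(src_dir, dst_dir, target_charset="cp1252"):
    """Convert every *.csv in src_dir from UTF-8 to target_charset.

    Returns the names of files that contained characters the target
    charset cannot represent (those characters become "?").
    """
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    lossy = []
    for path in sorted(Path(src_dir).glob("*.csv")):
        # Decode from the charset the file is really in (BOM is dropped if present).
        text = path.read_bytes().decode("utf-8-sig")
        try:
            data = text.encode(target_charset)
        except UnicodeEncodeError:
            # At least one character has no equivalent in the target charset.
            data = text.encode(target_charset, errors="replace")
            lossy.append(path.name)
        (out / path.name).write_bytes(data)
    return lossy
```

The returned list is the interesting part for the files with non-Latin characters: any file named there cannot be stored in that charset without losing data, whichever tool does the conversion.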

References: Converting text file to UTF-8 on Windows command prompt
