144
votes
Accepted
How do I change the character encoding for a webpage in Chrome?
Unfortunately… a Christmas-box from Google Chrome: Chrome encoding options gone?:
…
Chrome 55 has removed the Encoding menu and Chrome will do
auto-encoding detection now:
https://bugs....
40
votes
How do I change the character encoding for a webpage in Chrome?
You can now use extensions. Here is an example:
https://chrome.google.com/webstore/detail/set-character-encoding/bpojelgakakmcfmjfilgdlmhefphglae
36
votes
Accepted
Why does Unicode have big or little endian but UTF-8 doesn't?
Note: Windows uses the term "Unicode" for UCS-2 due to historical reasons – originally that was the only way to encode Unicode codepoints into bytes, so the distinction didn't matter. But in ...
34
votes
Accepted
remove <200b> character from text file
<200b> is the Unicode code point U+200B, "Zero Width Space". It is an invisible character, so you won't find it by searching for the literal string. You can pipe the character into sed like this to remove it:
sed -i "s/$(echo -ne '\u200b')//g" file
sed -i will modify ...
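For cases where sed is not available, the same removal can be sketched in Python. This is a minimal illustration, not part of the original answer; the function name `strip_zero_width_spaces` is made up for the example.

```python
# Hypothetical helper: removes every U+200B ZERO WIDTH SPACE from a string.
ZERO_WIDTH_SPACE = "\u200b"


def strip_zero_width_spaces(text: str) -> str:
    """Return text with all U+200B characters deleted."""
    return text.replace(ZERO_WIDTH_SPACE, "")


# The sample contains two invisible characters.
sample = "foo\u200bbar\u200b"
cleaned = strip_zero_width_spaces(sample)
print(len(sample), len(cleaned))
```

To apply it to a file, read the file as UTF-8 text, pass the contents through the function, and write the result back.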
32
votes
remove <200b> character from text file
You can also get rid of this in VIM.
%s/\%u200b// - entire file
%s/\%u200b//g - entire file, more than one occurrence on a line
27
votes
Why does Unicode have big or little endian but UTF-8 doesn't?
Exactly the same reason why an array of bytes (char[] in C or byte[] in many other languages) doesn't have any associated endianness but arrays of other types larger than byte do. It's because ...
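The byte-array argument can be checked with a short Python sketch. It encodes one character three ways and prints the bytes; the helper name `hex_bytes` and the choice of the euro sign are assumptions made for illustration.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Encode text and return its bytes as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in text.encode(encoding))


euro = "\u20ac"  # EURO SIGN, U+20AC

# UTF-8 is defined as a sequence of single bytes, so there is only one order.
print("utf-8    ", hex_bytes(euro, "utf-8"))
# UTF-16 uses 16-bit units, so the two bytes of each unit can be stored either way.
print("utf-16-le", hex_bytes(euro, "utf-16-le"))
print("utf-16-be", hex_bytes(euro, "utf-16-be"))
```

The two UTF-16 results contain the same two bytes in opposite order, while UTF-8 has a single fixed byte sequence.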
24
votes
Accepted
Excel: Change default encoding (file origin) of Text Import Wizard to UTF-8 (65001 : Unicode)
I answered a similar question at Default character encoding for Excel Text Wizard?.
I found my answer at Changing default text import origin type in Excel.
Close Excel, if it is open.
Open the ...
10
votes
Why when copying a non-English URL from address bar it's URL encoded and not as the text I see (decoded)?
The URL-encoded format is the form actually used by applications that communicate on the web. Firefox and Chrome copy it this way by default, most likely to guarantee that the copied URL remains usable.
...
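As a rough illustration of the two forms, Python's standard `urllib.parse` module converts between the readable text and the percent-encoded form. The wrapper names and the sample path are assumptions for this sketch, and browsers may apply additional rules.

```python
from urllib.parse import quote, unquote


def to_percent_encoded(path: str) -> str:
    """Percent-encode non-ASCII characters as UTF-8 bytes, keeping '/' as is."""
    return quote(path, safe="/")


def to_readable(path: str) -> str:
    """Turn percent-encoded UTF-8 sequences back into characters."""
    return unquote(path)


readable = "/wiki/\u00e9"  # a hypothetical path ending in "é"
encoded = to_percent_encoded(readable)
print(encoded)
print(to_readable(encoded))
```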
9
votes
Excel: Change default encoding (file origin) of Text Import Wizard to UTF-8 (65001 : Unicode)
It seems that Microsoft Office software requires a byte order mark (BOM).
Using Notepad++, convert the CSV using menu: Encoding -> Convert to UTF8-BOM.
Using the sed Unix utility, available in cmder or ...
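The same conversion can be sketched in Python by prepending the three UTF-8 BOM bytes to data that is already UTF-8. This is an illustrative sketch only; the function name `add_utf8_bom` is an assumption and not taken from the answer.

```python
import codecs


def add_utf8_bom(data: bytes) -> bytes:
    """Prepend the UTF-8 BOM (EF BB BF) unless the data already starts with it."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


csv_bytes = "name,city\nZo\u00eb,K\u00f6ln\n".encode("utf-8")
with_bom = add_utf8_bom(csv_bytes)
print(with_bom[:3])
```

When writing text rather than raw bytes, opening the output file with `encoding="utf-8-sig"` makes Python emit the BOM itself.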
9
votes
Accepted
My text file is riddled with question marks. How can I make it readable?
How can I make all these letters appear correctly?
I can think of two options:
Convert the file to UTF-8. This is what I recommend.
Configure VS Code to auto-detect the most appropriate encoding.
The ...
9
votes
Accepted
Grep search for text in an ISO-8859-1 encoded file
How can I prevent the grep output from stripping the accented characters?
grep itself does not strip accented characters, it outputs matching lines as they are in the input file. It's your terminal (...
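A small Python sketch shows why unchanged ISO-8859-1 bytes can look broken on screen. It assumes the display expects UTF-8, which the excerpt above does not state; the function name is hypothetical.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 so a UTF-8 display can show them."""
    return data.decode("iso-8859-1").encode("utf-8")


line = b"caf\xe9"  # "café" stored as ISO-8859-1

# Interpreted as UTF-8, the lone byte E9 is invalid and becomes U+FFFD.
as_seen_by_utf8_display = line.decode("utf-8", errors="replace")
# Interpreted with the right encoding, the accent is intact.
as_intended = line.decode("iso-8859-1")

print(repr(as_seen_by_utf8_display), repr(as_intended))
print(latin1_to_utf8(line))
```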
5
votes
How to set character encoding when opening a CSV file in Excel?
On Excel 2016 for Mac:
create blank worksheet,
in main menu go to Data -> Get External Data -> Import Text File,
follow the steps in the wizard: change the encoding until you see the correct preview ...
5
votes
Accepted
How to decode this seemingly GBK-encoded string?
These –
=?GBK?B?1cK5scf4s8e53L7WudjT2s34wufT38fp0MXPoteo?=
=?GBK?B?sai1xLTwuLQoz8K6vszBMbrFKS5kb2M=?=
– are MIME Encoded-Words. The general form is:
=?<Charset>?<TransportEncoding>?<...
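Encoded-words like the two above can be decoded with Python's standard `email.header.decode_header`. The sketch below is an illustration under assumptions: `decode_mime_words` and `encode_b_word` are hypothetical helper names, and the round-trip sample text is invented for the demo.

```python
import base64
from email.header import decode_header


def decode_mime_words(value: str) -> str:
    """Decode any MIME encoded-words in value and return plain text."""
    pieces = []
    for payload, charset in decode_header(value):
        if isinstance(payload, bytes):
            # Encoded parts come back as bytes plus the declared charset.
            pieces.append(payload.decode(charset or "ascii", errors="replace"))
        else:
            # Strings without encoded-words are returned unchanged as str.
            pieces.append(payload)
    return "".join(pieces)


def encode_b_word(text: str, charset: str = "GBK") -> str:
    """Build a Base64 ("B") encoded-word, used here only for a round-trip demo."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{payload}?="


word = encode_b_word("\u6d4b\u8bd5.doc")
print(word)
print(decode_mime_words(word))
```

Passing either of the two strings from the question to `decode_mime_words` should yield the decoded text in the same way.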
5
votes
Determine and change filename encoding on Windows
I can reproduce your problem using the following simple PowerShell script:
$RatedName = "šöü" # set sample string
$FormDName = $RatedName.Normalize("FormD") # its Canonical ...
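The effect of canonical decomposition can also be observed in Python with `unicodedata.normalize`, where "NFD" corresponds to .NET's FormD. This is a sketch for comparison, not a translation of the truncated script; the helper names are assumptions.

```python
import unicodedata


def decompose(text: str) -> str:
    """Canonical decomposition (NFD): base letters followed by combining marks."""
    return unicodedata.normalize("NFD", text)


def compose(text: str) -> str:
    """Canonical composition (NFC): precomposed characters where they exist."""
    return unicodedata.normalize("NFC", text)


name = "\u0161\u00f6\u00fc"  # "šöü" written with precomposed characters
decomposed = decompose(name)

# The two strings look identical on screen but differ in code points.
print(len(name), len(decomposed))
print(name == decomposed, compose(decomposed) == name)
```

Because the strings compare as different, a file name stored in one form will not match a lookup that uses the other form unless both are normalized first.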
5
votes
Accepted
Windows 10 All alt codes and accentuated characters replaced by �
I found the solution by wandering around the system settings. x)
I'll explain the procedure here (my computer is in French but I'll translate, and you can rely on the icons too).
Requirements:
...
5
votes
Stop VS Code from auto guessing encoding
How can I stop the auto-change encoding?
According to your own comment, Auto Guess Encoding is already off.
The fact that VS Code encodes your file as Windows-1252 (code page 1252 or CP1252)
...
4
votes
Cannot copy non-latin characters from PDF document
In my case, Polish characters like ś, ć, ł, ę were broken when copying from a PDF.
Tested a lot of options. The only one that worked really well was https://online2pdf.com/convert-pdf-to-rtf# .
So ...
4
votes
How do I find the encoding of the current buffer in vim?
I found this: https://vim.fandom.com/wiki/Reloading_a_file_using_a_different_encoding
You can reload a file using a different encoding if Vim was not able to detect the correct encoding
:e ++enc=<...
4
votes
Can't see the Chinese characters in VIM
Open the VIM configuration file
$ sudo -H gedit /etc/vim/vimrc
Add the following lines:
set fileencodings=utf-8,ucs-bom,gb18030,gbk,gb2312,cp936
set termencoding=utf-8
set encoding=utf-8
Save and ...
4
votes
OneNote backslash character codes?
How do I get the rest?
OneNote 2013 uses the same equation editor as Word 2007/2010.
There are many more in the linked pdf.
Source: The Word 2007/2010 Equation Editor
4
votes
Accepted
What is the difference between Windows-1252 and ANSI encoding?
See below. In practice it probably won't make much difference to your conversion.
If you keep a copy of the original file then you can ...
4
votes
Why when copying a non-English URL from address bar it's URL encoded and not as the text I see (decoded)?
You may use an extension like https://chrome.google.com/webstore/detail/copy-unicode-urls/fnbbfiapefhkicjhecnoepbijhanpkjp that allows you to copy the URL without encoding.
4
votes
Accepted
How does notepad.exe determine character encoding?
According to Raymond Chen:
Some files come up strange in Notepad
[...] When faced with a file that lacks a special prefix, Notepad is forced to guess which of those two encodings the file actually ...
3
votes
Type Mayan numerals?
Unicode 11.0.0, released June 5, 2018, includes Mayan numerals in the block U+1D2E0 - U+1D2F3 (Mayan Numerals).
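Since the block holds 20 consecutive code points, a numeral can be produced from its value with simple arithmetic. The sketch below assumes the characters are ordered from zero to nineteen starting at U+1D2E0; `mayan_digit` is a hypothetical helper, and displaying the result requires a font that covers this block.

```python
MAYAN_ZERO = 0x1D2E0  # first code point of the Mayan Numerals block


def mayan_digit(value: int) -> str:
    """Return the Mayan numeral character for a value from 0 to 19."""
    if not 0 <= value <= 19:
        raise ValueError("a single Mayan numeral covers 0 through 19")
    return chr(MAYAN_ZERO + value)


for value in (0, 5, 19):
    ch = mayan_digit(value)
    print(value, f"U+{ord(ch):X}", ch)
```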
3
votes
Finding out the default character encoding in Windows
In .NET Core and .NET 5+, System.Text.Encoding.Default returns a UTF-8 encoding instance, while .NET 4 and earlier versions return the encoding of the active Windows code page. You can find more details in the following:
Microsoft ...
3
votes
FileZilla: preserve UTF-8 encoding when transferring to Linux
Can't you just transfer them in binary mode?
Transfer > Transfer type > Binary.
3
votes
Accepted
Control I characters in my text file
^I (Ctrl-I) is a representation of the tab character (9 in ASCII). Usually,
Vim displays tab characters by the number of space characters as specified in
the tabstop option. However, setting the list ...
3
votes
Accepted
Resetting Unicode mappings in PDF text
The example PDF is encoded correctly: It includes font-to-unicode tables, and if I try copy-and-paste with mupdf, the hyphen in Хлебникова in the second paragraph becomes U+00AD SOFT HYPHEN. So it ...
3
votes
Accepted
How to convert this string to Japanese using GNU/Linux tools?
Pipes are an OS feature which works with byte buffers and does not interpret their contents in any way. So piped text doesn't go through to bash and especially never through 'readline'. Text pasted as ...