All Questions
Tagged with character-encoding encoding
66
questions
0
votes
1
answer
236
views
UTF-8 Decoders fail to decode the encoded strings
I have some encoded values which I believe are UTF-8. I don't really know whether they are UTF-8, because other online tools and steps to decode UTF-8 are not working, but an open source tool ...
2
votes
1
answer
144
views
VIM uses wrong encoding - but only in status messages
I ran into a strange issue with my Arch Linux setup. Vim uses the correct encoding for reading and displaying files, but the status messages (which display the current mode or report back when the buffer is ...
4
votes
1
answer
1k
views
How to identify a file encoding?
I'm trying to figure out the encoding of a text file. I did try a lot of the common ones (with Notepad++), but I've failed so far.
A few hints: The file was originally a Eudora mbx file, with mostly ...
0
votes
1
answer
1k
views
Wrong character encoding in ssh session – but not for all connections
I have an odd issue when connecting to my (Ubuntu) server via SSH.
If I connect from my Gentoo box, all is fine. All Umlauts etc. work, I can type "ÄÖÜ" and so on.
If I do the same from my ...
0
votes
1
answer
931
views
How to read Linux text files in Windows system?
For example, I run the top command in Linux and store its output in a file. When I then open that file in Windows, it contains some gibberish. Here is the file viewed in Notepad++:
The option to convert to ...
1
vote
0
answers
283
views
How to use ISO8859-9 encoding in terminal?
I made a file containing "ırmak" in a text editor, using the ISO8859-9 encoding.
Then I tried to print the content with the "cat" command in the terminal, but I could not.
I use the ...
1
vote
1
answer
805
views
Convert Korean files that are showing up incorrectly to utf-8 - character shows Çѱ¹Ÿî
I was just about to ask this after a long time of searching so decided to answer my own question...
I downloaded Korean subtitles in an .smi file that was in a zip archive. When I extracted it, the ...
0
votes
2
answers
688
views
Restoring corrupted UTF-8 files
After my PC broke down I managed to make a backup of the relevant files before reinstalling Windows.
Now that I'm restoring those files and setting the system up I noticed that some of the files got ...
0
votes
1
answer
915
views
Printf in gawk with the correct encoding?
I'm wondering: can gawk printf in any format besides ASCII?
Currently, I'm using gawk match() to search through some UTF-8 text. When I go ahead and print out the matches gawk finds, it ends up like ...
1
vote
0
answers
93
views
How to fix accentuation encoding with cmd.exe running inside bash?
I installed https://www.msys2.org/ and set up an ssh server for it. With this I can connect to my machine and work remotely. The problem is that some applications, such as Visual Studio tools or Windows ...
0
votes
1
answer
783
views
Why does copying text between Notepad++ files create files with different bytes?
I've created a simple pdf [hi.pdf] with the word hi and when I open it in Notepad++, its encoding is ANSI, which I assume is Notepad++'s best guess, with it opening successfully when I Save as ...
0
votes
0
answers
847
views
What kind of file name encoding is this, and how can I fix it?
On my Linux machine I found old files (at least from 2004 if not older), so possibly Win9x days. Maybe they came over some old FAT drive on my disk or some old Samba share.
Umlauts are very weirdly ...
0
votes
2
answers
2k
views
Finding the encoding of a text file containing weird characters
I recently received a file of Turkish origin that has some English words, which I can easily read, and some weird characters. I wonder if this file is encoded, encrypted, or something else. I ...
0
votes
1
answer
184
views
BER decode SubjectAltName and CHOICE?
I'm having trouble working out the syntax when decoding a SubjectAltName in a TLS self-signed certificate. I believe the certificate is well formed. The trouble is, I don't understand how to decode ...
5
votes
0
answers
384
views
Can I tinker with the encoding when using pdftotext to convert PDF to text?
Sometimes when I run pdftotext, it results in perfect text. I assume this is because the actual Unicode text data is embedded directly in the PDF itself and simply read out.
But other times (around ...
2
votes
1
answer
299
views
How to send an e‑mail to an address with Latin9/iso‑8859‑15 characters inside the username part of the address?
As part of finding a job, I need to send an e‑mail to an address which contains latin letters with accents inside the username.
I know this is not standard, but they did it and there’s less than 1000 ...
0
votes
0
answers
1k
views
How do you create custom zalgo text?
I know what zalgo text is and that there are a few websites that can make it for you. But I'm looking for how I can make it however I want. "HͥAͣQͫ" is an example; how can I make it so I can choose ...
0
votes
1
answer
1k
views
Which type of character encoding consumes the most memory?
I will store a long (45132 character) string in a Postgres database whilst preserving every character (including really rare ones).
Postgres can store strings up to 1GB (see here).
In terms of the ...
2
votes
1
answer
3k
views
What text encoding is used in "Zip archive data, at least v?[0x314] to extract"?
I tried multiple search terms on the usual suspect search engines as well as on Stack Overflow and Super User, but can't find anything on this. Normally a Zip archive magic pattern is "at least v1.0" ...
0
votes
2
answers
5k
views
Exporting from Excel to CSV replaces Japanese characters with ??? even though Windows, Office locale is Japan/Japanese
I am exporting an Excel file (Excel 2016) containing Japanese characters to CSV. (Note: I am not exporting to the provided CSV UTF-8 format.) In the process, all Japanese characters are replaced with '?'.
My ...
0
votes
0
answers
1k
views
Interpret text file with some hex codes?
I have a file with contents looking like PK\u0003\u0004\u0014\u0000\u0006\u0000\b\u0000\u0000\u0000!\u0000À¸<91><91>¢\u0001.
However, I have a different version of the same file looking ...
2
votes
0
answers
446
views
Execute sh file with Russian or Chinese chars - saved as UTF-8 or Unicode
I have a file which has some Russian chars in it. Below is the content of my sh file.
#!/bin/sh
sed -i "s/\bVAR1\b/Привет, как ты/g" file1.txt
When I save this file, I have to save it as UTF-8 or ...
1
vote
2
answers
48k
views
¢tRÂà³Ab.Ÿân TXT files: how to switch from weird characters back to normal?
So, I have on a flash drive a txt file generated in Cyrillic (my own work, own pen drive), a few years old. Now I needed to open it, only to see this kind of mess:
I wonder why this is happening and ...
0
votes
2
answers
1k
views
What is causing these two apparently identical files to have different hashes?
I am unable to figure out why the following two files are yielding different hashes (SHA1, CRC32, SHA384, whatever):
https://cdn.jsdelivr.net/npm/[email protected]/dist/jsonify-error.js
https://...
0
votes
0
answers
681
views
What encoding is needed to read this HTML file containing Spanish characters?
I downloaded this HTML file from archive.org, which is a digitised book written in Spanish. As such, it has plenty of accents and other unique symbols (e.g. ñ).
Now, when I open the html file in a ...
0
votes
1
answer
943
views
Symbols are lost after conversion from UTF-8 to ISO8859-1 and back to UTF-8
I have a file with properties in French.
I would like to convert it to ISO8859-1.
But after conversion, some symbols are lost.
What is wrong?
> cat fr.properties
VAR2="élément n’a"
> cat fr....
2
votes
2
answers
2k
views
How to fix encoding - curly apostrophe appears as ‰Ûª
I have a text-file in which all the ASCII characters appear correctly but some others do not. In particular there is this word:
don‰Ûªt
In hex the bytes are 64 6f 6e 89 db aa 74. Obviously, it is ...
0
votes
2
answers
2k
views
Binary encoding formats
echo random text > text_file
Saves text_file in text format with ASCII encoding. To check the encoding, I do
chardetect text_file
which tells me that the file is ASCII encoded.
Now I have a jpg ...
-1
votes
1
answer
449
views
Opening a 7z file on mac shows strange encodings
I am opening this file:
https://dataverse.harvard.edu/file.xhtml?fileId=2901965&version=1.0
On TextEdit. When I do, the file looks like:
7zºØ'· t(ù‚]%N
ò„,¢‡ò]¬CRJ∫√ıìaIÄÅ≤íhYTÔ1/ıG3À=ôN’(...
0
votes
0
answers
701
views
How can I make `file -i` command return ISO-8859-9?
Let's say I have a text file with content
şşş
and in vi/vim I have set the file encoding with :set fileencoding=ISO-8859-9.
Later when I check the encoding with
file -i <myFile>
it gives text/...