All Questions

0 votes
1 answer
236 views

UTF-8 Decoders fail to decode the encoded strings

I have some encoded values which I believe are UTF-8. I don't really know whether they are UTF-8 or not, because other online tools and steps to decode UTF-8 are not working, BUT an open source tool ...
Solo's user avatar
  • 3
2 votes
1 answer
144 views

VIM uses wrong encoding - but only in status messages

I ran into a strange issue with my ArchLinux setup. Vim uses the correct encoding for reading/displaying files, but these status messages (which display the current mode or report back when the buffer is ...
Gabor Garami's user avatar
1 vote
1 answer
805 views

Convert Korean files that are showing up incorrectly to utf-8 - character shows Çѱ¹Ÿî

I was just about to ask this after a long time of searching, so I decided to answer my own question... I downloaded Korean subtitles in an .smi file that was in a zip archive. When I extracted it, the ...
iateadonut's user avatar
0 votes
1 answer
915 views

Printf in gawk with the correct encoding?

I'm wondering: can gawk printf in any format besides ASCII? Currently, I'm using gawk match() to search through some UTF-8 text. When I go ahead and print out the matches gawk finds, it ends up like ...
ixns's user avatar
  • 1
5 votes
0 answers
384 views

Can I tinker with the encoding when using pdftotext to convert PDF to text?

Sometimes when I run pdftotext it results in perfect text. I assume this is because the actual Unicode text data is embedded directly in the PDF itself, and simply read out. But other times (around ...
Lance's user avatar
  • 387
0 votes
1 answer
1k views

Which type of character encoding consumes the most memory?

I will store a long (45132-character) string in a Postgres database whilst preserving every character (including really rare ones). Postgres can store strings up to 1GB (see here). In terms of the ...
stevec's user avatar
  • 839
-1 votes
1 answer
449 views

Opening a 7z file on mac shows strange encodings

I am opening this file in TextEdit: https://dataverse.harvard.edu/file.xhtml?fileId=2901965&version=1.0 When I do, the file looks like: 7zºØ'· t(ù‚]%N ò„,¢‡ò] ¬CRJ∫√ıìaIÄÅ≤íhYTÔ1/ıG3À=ôN’(...
Dhruv Ghulati's user avatar
0 votes
2 answers
3k views

Determine The Encoding of a specific character in Word 2010

I'm having a problem with a generated document (coming from the Crystal Reports engine). Initially hyphens are visible, but if the text is copied and pasted with the "keep text only" option or "remove ...
bumble_bee_tuna's user avatar
3 votes
2 answers
2k views

What are these strange characters? [duplicate]

I couldn't put them in the title because they actually seem to have many bytes. I took some screenshots:
Rafael's user avatar
  • 165
1 vote
3 answers
7k views

Displaying Korean characters properly on a computer with English Windows XP

I have Windows XP Professional Version 2002 SP 3, English version. I have installed the required input methods in Regional and Language Settings, and as far as I know all the other options provided there ...
user13267's user avatar
  • 1,721
4 votes
1 answer
898 views

Why does this PDF appear to encode parentheses correctly but doesn't when using pdftotext or copying and pasting?

Here are links to some journal articles: https://doi.org/10.1149/1.2183927 https://doi.org/10.1149/1.2988135 https://doi.org/10.1149/1.3021012 https://doi.org/10.1149/1.2159298 They all encode ...
Nathaniel M. Beaver's user avatar