10

The following should look good in your browser, AND compile in LaTeX looking beautiful, right?

\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
UTF-8 test:
\begin{verbatim}
logic:  not:¬
        and: ∧  nand:⊼
        or: ∨   nor:⊽
        inequal to, xor:≢,≠,⊻,⊕     equal to: ≡,=
Set logic:  ∪,∩,⊊,⊋,⊆,⊇,⊈,⊉,≡ 
    avoid usage of non-specific: ⊂,⊃,⊄,⊅
Set Membership:∈,∉,∌,∋,∅
assignment ≔≕
predicate logic:∃,∄,∀,∴,∵
regex: /|\.⋆^?+(){,}[]
Arrows: ↖↑↗⇖⇑⇗
        ←↔→⇐⇔⇒⇕↕
        ↙↓↘⇙⇓⇘
        ↚↮↛⇍⇎⇏
numbers:ℕatural={0,1,2,3, … ∞}, ℤintegers,
        ℚ=rational, ¬ℚ=irrational,
        ℝeal, ±∞, ℂomplex
size: ≤,≥,<,>,≮,≯,≰,≱,≪,≫,≢,≠,≡,=
Greek Alphabet:
ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣ ΤΥΦΧΨΩ
αβγδεζηθικλμνξοπρσςτυφχψω
Game:
Chess symbols:♚♛♜♝♞♟♔♕♖♗♘♙
Playing card symbols:♠♣♥♦ 23456789JQKA
miscellaneous:°∙■□▪▫○●☇
\end{verbatim}
\end{document}

Currently (2010 Dec 12, on the latest Ubuntu with all packages updated), I get the following error:

! Package inputenc Error: Unicode char \u8:**"insert random char here"** not set up for use with LaTeX.

9
  • 3
    @GlassGhost: If you rewrote your question without such an accusative tone (as seen especially in your last comment), people would probably be more willing to help out. TeX can certainly be frustrating, but there's no reason to take that out on the community. Commented Dec 12, 2010 at 20:08
  • 6
    @GlassGhost: I believe I speak for most of us when I say that we do not appreciate the thinly veiled sarcasm and a certain note of entitlement in your comments. The community is happy to help anybody who comes around with a question, even those of the LMGTFY variety. But you need to show a certain degree of respect for people who spend their (free) time answering your questions. Commented Dec 12, 2010 at 22:28
  • 9
    @Martin: (slightly off topic), I'm now glad that I googled LMGTFY myself, instead of asking you what that stood for. Commented Dec 13, 2010 at 18:14
  • 8
    I don’t see any veiled sarcasm or accusative tone. And yes, it sucks that TeX is oblivious of Unicode, twenty years after the fact. So just to offer a different opinion, I’m completely fine with the question as-is. Commented Dec 13, 2010 at 20:20
  • 2
    @GlassGhost: A small, but important, point: LuaTeX and XeTeX aren't dialects, they're implementations. It's more like someone suggesting Chrome instead of Firefox, or GCC instead of Microsoft's C compiler. You're still writing in (La)TeX, HTML, or C, respectively, even if there are extensions. LaTeX itself is arguably a dialect of TeX, though perhaps descendant would be a better term. Commented Dec 13, 2010 at 20:35

5 Answers

25

The original TeX engine wasn't designed to handle multi-byte encodings. So any package that provides Unicode functionality on top of that has to be imperfect. There are however two new engines that do provide full Unicode support: LuaTeX and XeTeX. There is also a package that makes it easy to select any OpenType or TrueType font for use with these engines: fontspec. (There is also unicode-math which provides support for Unicode-aware math fonts like STIX (or XITS) and Cambria Math, but it is still in heavy development and not quite complete yet.)

Be aware that the versions of these engines and packages shipped with Ubuntu 10.10 (and earlier) are very outdated. If you want to use them, you have to install TeX Live 2010 manually (which is easy).

After you have updated, you can use lualatex or xelatex to compile the following code:

\documentclass{article}
\usepackage{fontspec}
\setmainfont{FreeSerif}
\setmonofont{FreeMono}

\begin{document}
\begin{verbatim}
% your example here
\end{verbatim}
\end{document}

This sets FreeSerif as the main font and FreeMono as the monospace font. Free Serif isn't the most beautiful font, but I do not know of any other free font that rivals it in Unicode coverage; it comes preinstalled with Ubuntu (in fact, your browser probably uses it to display the symbols). Free Mono lacks the chess symbols and the diagonal implication arrows, but provides everything else in your example; Free Serif would provide everything but isn't monospaced. Another free font family with good Unicode coverage is DejaVu, which is based on Bitstream Vera. The compiled document looks as follows:

example

You could of course replace \setmonofont{FreeMono} by \setmonofont{FreeSerif}. Then your verbatim text wouldn't be monospaced anymore, but would contain all symbols. See the fontspec manual for other fancy things that are possible.

If you are writing a text that mixes several languages and scripts then a single font will probably not cover all possible symbols. In this case the polyglossia package will come in handy (it currently only works with XeLaTeX).
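As a rough illustration of the multi-script case, here is a minimal sketch that assigns a separate font to one language with polyglossia. The language pair (English and Greek) and the choice of DejaVu Serif for the Greek text are assumptions made for the example, and both fonts must be installed on your system:

```latex
% Compile with xelatex.
\documentclass{article}
\usepackage{fontspec}
\usepackage{polyglossia}
\setmainlanguage{english}
\setotherlanguage{greek}
\setmainfont{FreeSerif}
% polyglossia uses \greekfont, if defined, for text marked as Greek.
% DejaVu Serif is only an example choice here.
\newfontfamily\greekfont{DejaVu Serif}
\begin{document}
English text followed by a Greek phrase: \textgreek{αβγδε}.
\end{document}
```

The same pattern extends to other scripts: declare the language with \setotherlanguage and define a matching font family for it.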

6
  • Your answer seems to be the best; if you remove that giant image from it, I'll mark it as accepted. Also, I thought that UTF-8 or Unicode supports most languages. Is this not true?
    – GlassGhost
    Commented Dec 12, 2010 at 22:00
  • 4
    @GlassGhost: It does, but most fonts only implement a tiny subset. So if you want to mix say Chinese and Arabic text, you will need a font that has Chinese characters and a font that has Arabic characters. It is rather unlikely that you will find a good font that has both of them.
    – Caramdir
    Commented Dec 12, 2010 at 22:18
  • 3
    @GlassGhost: If you rewrite your question to be more concise and less accusing, I will remove the picture.
    – Caramdir
    Commented Dec 12, 2010 at 22:19
  • 2
    @Caramdir: I think the image proves the concept and should stay. Don't sacrifice useful information for 15 rep points. Commented Dec 13, 2010 at 1:23
  • 1
    @Caramdir: Should the verbatim be taken away? Oh, and I marked your answer as accepted anyway. I shouldn't have argued; rather, I should have suggested it after accepting your answer, which is what I am now doing.
    – GlassGhost
    Commented Dec 13, 2010 at 18:19
9

inputenc with the utf8 option knows definitions for only a subset of Unicode, and by default it loads only a small part of that subset. If you want more definitions, you should load the corresponding font encodings. If you want full Unicode support, use a Unicode-enabled engine like XeTeX or LuaTeX (but I have some doubt that you will find many fonts with chess support).
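If you have to stay with pdfLaTeX, individual characters can also be set up by hand. The sketch below uses \DeclareUnicodeCharacter, which is not mentioned in the answer above; the selection of characters and the math commands they are mapped to are my own choices for illustration. Each declaration takes the hexadecimal code point and the LaTeX code to typeset in its place:

```latex
% Compile with pdflatex.
\documentclass{article}
\usepackage[utf8]{inputenc}
% Map single code points (hexadecimal) to existing LaTeX commands.
\DeclareUnicodeCharacter{2227}{\ensuremath{\wedge}}  % U+2227, logical and
\DeclareUnicodeCharacter{2228}{\ensuremath{\vee}}    % U+2228, logical or
\DeclareUnicodeCharacter{2200}{\ensuremath{\forall}} % U+2200, for all
\DeclareUnicodeCharacter{2203}{\ensuremath{\exists}} % U+2203, there exists
\DeclareUnicodeCharacter{2208}{\ensuremath{\in}}     % U+2208, element of
\begin{document}
Logic: ∧, ∨. Predicate logic: ∀, ∃. Set membership: ∈.
\end{document}
```

This only scales to a handful of symbols, and every character still needs a glyph in some loaded font, which is why a Unicode-enabled engine is the more practical route for a list as long as the one in the question.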

5

For Unicode support, you need to use XeTeX. It supports using UTF-8 encoded files by default. The Wikipedia page on XeTeX is also useful.

1
  • 4
    Or LuaTeX, see my answer. It supports UTF-8 encoded documents also by default.
    – topskip
    Commented Dec 12, 2010 at 20:12
4

Here is your example, which can be compiled using XeTeX or LuaTeX. DejaVu Sans Mono contains all occurring characters except for nand, nor and xor:

\documentclass{article}
\usepackage{fontspec}
\setmonofont{DejaVu Sans Mono}
\begin{document}
\begin{verbatim}
°±∞∙■□▪▫○●
Set Membership:∈,∉,∌,∋,∅
Set logic:
    ∪,∩,⊊,⊋,⊆,⊇,⊈,⊉,≡ 
    avoid but don't be frightened of usage of non-specific: ⊂,⊃,⊄,⊅
assignment≔≕
logic:
    and: ∧
    nand:⊼
    or: ∨
    nor:⊽
    not:¬
    inequal to, xor:≢,≠,⊻,⊕
    equal to: ≡,=
size: ≤,≥,<,>,≮,≯,≰,≱,≪,≫
regex: ⋆
predicate logic:∃,∄,∀,∴,∵
numbers:
    ℂomplex
    ℝeal
    ℚ=rational
    ¬ℚ=irrational
    ℤintegers
    ℕatural={0,1,2,3,
Arrows:
↖↑↗⇖⇑⇗
←↔→⇐⇔⇒⇕↕
↙↓↘⇙⇓⇘
↚↮↛⇍⇎⇏

Greek Alphabet:
ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣ ΤΥΦΧΨΩ
αβγδεζηθικλμνξοπρσςτυφχψω

Game:
Chess symbols:♚♛♜♝♞♟♔♕♖♗♘♙
Playing card symbols:♠♣♥♦ 23456789JQKA
\end{verbatim}
\end{document}
3

Use this:

\documentclass{article}
\usepackage{fontspec}
\setmainfont{Arial Unicode MS}
\begin{document}
 ... (your font list, but not verbatim, because Arial is not a typewriter font) ...
\end{document}

with lualatex. There you go.

You will need a proper Unicode font though.
