Subsetting (Chinese) Fonts


There are loads of really delightful Simplified and Traditional Chinese TrueType fonts available on the web. There's only one issue: the file sizes are really large. In many cases, they are too large to use effectively as a web font.

For example, this calligraphy style font is 3.4MB.
[Image: Richwin Xing Kai Jian Fan font sample]

The beautiful Paper Cut Font weighs in at 14MB!
[Image: Paper Cut Chinese font sample]

That file size is far too heavy to embed on a web page.

Subsetting

Generally speaking, font files like .ttf contain a representation of every single character the font supports: 0-9, a-z, A-Z, all the punctuation, non-English characters, and so on.

That's really useful if the font is installed on your computer and you want to write a document which could contain every character. It's less helpful if you want to use a fancy font on your website's headers.

Subsetting is the act of creating a subset of a font. That is, a font file which only contains specific characters.

Let's suppose that we only want a specific phrase rendered in this font.

<h1>我很丢脸。我没有吃Fruity Oaty Bar</h1>

We only need 18 unique characters, counting the space, so we can get rid of any character which doesn't appear in that heading.
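As a quick check on which characters the heading actually needs, here is a minimal Python sketch that strips the repeats. The helper name `unique_chars` is my own invention for illustration; it is not part of any font tool.

```python
def unique_chars(text: str) -> str:
    """Return each distinct character of text once, in first-seen order."""
    # dict.fromkeys keeps insertion order and silently drops repeats
    return "".join(dict.fromkeys(text))


heading = "我很丢脸。我没有吃Fruity Oaty Bar"
needed = unique_chars(heading)

print(needed)       # the characters the subset font must contain
print(len(needed))  # how many there are, space included
```

The result can be pasted straight into a subsetting tool as the list of characters to keep.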

There are several font manipulation tools available. I've chosen Font Optimizer which has an excellent live demo page. The source code is on BitBucket or on GitHub.

The command line syntax is really simple:

./subset.pl --chars="我很丢脸。我没有吃Fruity Oaty Bar" input.ttf output.ttf

The file size reduction is impressive. My original font was over 14MB. The optimized one is 32KB:

14,066,456 input.ttf
    32,084 output.ttf
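To put those two byte counts in proportion, a few lines of Python will do the arithmetic. The variable names are mine; the numbers are the ones listed above.

```python
original_bytes = 14_066_456  # input.ttf
subset_bytes = 32_084        # output.ttf

# How many times smaller the subset is, and the fraction of bytes saved
ratio = original_bytes / subset_bytes
saving = 1 - subset_bytes / original_bytes

print(f"{ratio:.0f}x smaller, {saving:.2%} saved")
```

That works out to a file roughly 438 times smaller, a saving of about 99.77%.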

The process runs instantly, fast enough to run as a web service to generate these fonts dynamically, I would think.
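If I were wiring that up on a server, the core might look something like this rough Python sketch. It assumes `subset.pl` sits in the working directory and takes the same `--chars` argument as the command above; the function names `build_subset_command` and `subset_font` are hypothetical, and I'm assuming the script is just as happy with a de-duplicated character list.

```python
import subprocess
from pathlib import Path


def build_subset_command(text: str, src: Path, dst: Path) -> list[str]:
    """Build the argument list for subset.pl (hypothetical helper)."""
    # Drop repeated characters so the argument stays short for long texts
    chars = "".join(dict.fromkeys(text))
    return ["./subset.pl", f"--chars={chars}", str(src), str(dst)]


def subset_font(text: str, src: Path, dst: Path) -> None:
    """Run subset.pl; raises CalledProcessError if the script fails."""
    # Passing a list (no shell) avoids quoting problems with arbitrary text
    subprocess.run(build_subset_command(text, src, dst), check=True)
```

A call such as `subset_font("我很丢脸", Path("input.ttf"), Path("output.ttf"))` would then write the subset font, provided the script is present and executable.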

One could quite easily create a scrap of JavaScript which read the contents of a block of text and then requested a font which contained only the necessary characters.
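The idea is the same in any language, so here it is sketched in Python rather than JavaScript: take a block of text, work out the distinct characters, and build the URL to request. The endpoint `fonts.example.com/subset` and its `font` and `chars` parameters are entirely made up for illustration; no such service is described here.

```python
from urllib.parse import urlencode

# Hypothetical subsetting service; not a real endpoint
FONT_SERVICE = "https://fonts.example.com/subset"


def subset_font_url(font_name: str, text: str) -> str:
    """Build a request URL for a font containing only the characters in text."""
    # Sorting means the same set of characters always gives the same URL,
    # which helps if responses are cached
    chars = "".join(sorted(set(text)))
    query = urlencode({"font": font_name, "chars": chars})
    return f"{FONT_SERVICE}?{query}"


print(subset_font_url("papercut", "我很丢脸。我没有吃Fruity Oaty Bar"))
```

Non-ASCII characters are percent-encoded as UTF-8 by `urlencode`, so the Chinese text survives the trip in a query string.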

Apparently, Monotype have a proprietary and patent-pending solution to this rather trivial application.

Uses

Being able to subset fonts to reduce file size is incredibly useful. Suppose you want a different font for body text, headers, and navigation. Rather than having to load three large font files containing every character in the known universe, you could subset each one down to exactly the characters it needs.

This also has an interesting DRM-like effect. Some people don't want their shiny web fonts to be downloaded and used as a regular font. With subsetting, the font only contains the specific characters used on the page, so it is of little use as a general-purpose font.

