PDF has been a major pain as an ebook format because it isn't reflowable. If a PDF was designed for an 8.5x11 page, with the fonts sized for that format, then reading it at 70% scale on an ereader can be nearly impossible if 1) the device's resolution isn't good enough or 2) your eyes are old enough. And if your reading device doesn't have the same aspect ratio as 8.5x11, or the PDF was designed for a different page layout (A6?), it gets even harder. While PDF can do practically everything these days, reformatting it to fit different display sizes is hard enough that people don't want to use it for ebooks.
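To put rough numbers on that scaling problem, here's a quick sketch. The 6x8 inch screen and the 11pt body font are assumptions picked for illustration, not measurements of any particular device or document; they happen to land close to the 70% figure above.

```python
def fit_scale(page_w, page_h, screen_w, screen_h):
    """Largest uniform scale at which the whole page still fits on the screen."""
    return min(screen_w / page_w, screen_h / page_h)


def effective_font_pt(font_pt, scale):
    """Apparent font size once the fixed-layout page has been shrunk."""
    return font_pt * scale


# US Letter page (8.5 x 11 in) on a hypothetical 6 x 8 in display.
scale = fit_scale(8.5, 11.0, 6.0, 8.0)
print(f"scale: {scale:.0%}")
print(f"11pt body text appears as {effective_font_pt(11, scale):.1f}pt")
```

Since the page can't reflow, the tighter of the two dimensions sets the scale, and every font on the page shrinks by that same factor.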
We've been using HTML for decades now, and know how to work with it. Fitting text to the display is a solved problem there. HTML fits any display size, and was designed from the ground up to be reflowable and to have easy meta-data management.
So why has ePub dominated anyway? A few reasons:
- Desire to wrap DRM around ebooks means that straight-up HTML isn't workable; it needs to be in a container of some kind. Over the last 15 years, each new proprietary format has had its own way of handling this.
- Desire to get out of the many-formats trap, and yet still support DRM.
- Budding standardization in the ebook market regarding the kinds of meta-data that people like to have as well as notation formats. Since we have to have a container anyway (see the first point), we may as well do the meta-data handling in the container rather than in the content. That way meta-data changes don't change the content, which could otherwise muck up the DRM.
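The container-versus-content split in that last point can be sketched with nothing but Python's standard library. This is a simplified illustration, not a validating ePub: a real package needs more (a navigation document, for one), and the titles, the placeholder identifier, and the helper names `build_epub` and `read_member` are all made up for the example. What it does show is the general shape: a ZIP archive with a `mimetype` entry first, a `META-INF/container.xml` pointing at a package file, the meta-data in that package file, and the XHTML content sitting separately.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><p>The actual book text lives here, as plain XHTML.</p></body>
</html>
"""


def content_opf(title, author):
    """Package file: the meta-data lives here, outside the content documents."""
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>{escape(title)}</dc:title>
    <dc:creator>{escape(author)}</dc:creator>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
"""


def build_epub(title, author):
    """Return the bytes of a minimal (simplified) ePub-style container."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry goes first and is stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("OEBPS/content.opf", content_opf(title, author),
                    compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("OEBPS/chapter1.xhtml", CHAPTER_XHTML,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def read_member(epub_bytes, name):
    """Read one file back out of the container."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        return zf.read(name)


# Change only the meta-data and compare what moved.
first = build_epub("Working Title", "A. Author")
second = build_epub("Final Title", "A. Author")
print("content unchanged:",
      read_member(first, "OEBPS/chapter1.xhtml") == read_member(second, "OEBPS/chapter1.xhtml"))
print("package file changed:",
      read_member(first, "OEBPS/content.opf") != read_member(second, "OEBPS/content.opf"))
```

Retitling the book only touches the package file; the XHTML bytes stay identical, which is the property the last bullet is after.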
HTML is kind of the perfect format, so long as you don't need DRM. But so far we still do.
Who knows, in 10 years maybe everything will be XHTML. For now, there is ePub.