E-Books and iPads and PDFs: Some Thoughts

by John Holbo on December 21, 2011

I’d like to survey the CT commentariat about their ebook reading habits, and toss out a few ideas. I’ve made the shift this year. I now read more new books on my iPad than on paper. I also read a lot of comics on the iPad, mostly courtesy of the Comixology app. But let’s start with plain old mostly word productions.

At the present time Epub (or EPUB or EPub, or however you capitalize it) and Kindle (mobipocket) are notably sucky formats. Epub 3 is rolling out, and I’m sure the future will get better and better. But for now we have these beautiful devices; yet the books I’m reading on them look plug-ugly. Terrible layout. Limited fonts. In their guts, these HTML-based ebook formats are websites pretending to be books. They don’t have pages, strictly. They jostle images in thoughtless ways. The gestalt is very web circa 1997. This is by design (in a negative sort of way). You can’t be sure what size screen you are dealing with, so every appearance of every bit of every ebook on every device is its own custom-poured page, courtesy of these flow-y formats. But the results are, to repeat, bad. Suppose you had a choice between getting a basically quite nice ‘standard’ garment off the rack, or having a brain-damaged, blind tailor make you a suit – just for you! cut by the poor, mad fellow, just to your measure! on the spot! There’s a lot to be said for not taking the ‘bespoke’ option, in this case.

So how about: just make a PDF so it looks good on the iPad. (Until 2014, when Epub finally catches up.) Why the iPad? Because I’ve got one, so I can see what I’m doing. No, seriously: how will it look on other devices? Unless you are trying to read it on your phone – which, admittedly, some people want to do – it will look fine. Why PDF? Everything can read PDF, and will continue to be able to do so. If the screen is fatter or thinner on some Nook or Kindle or whatever next year’s flavor may be, there will either be a slightly fatter top or side margin. But slightly fat margins are minor sins compared to the barbarities routinely perpetrated, in passing, by ePub and Kindle. PDF can look great. You pick the font! The pictures are in the right place!

And will Epub catch up? Technically, I’m very ignorant. I don’t code. I sort of know HTML and barely grasp CSS. I make books with InDesign. Maybe that makes me biased in favor of (relatively) old-fashioned laying out of pages. But I have nagging doubts as to whether this whole websites-pretending-to-be-books, custom-poured page business really is the future of the book. I’m concerned it is, to some degree, a solution in search of a problem. [UPDATE: gross overstatement. Obviously cross-platform compatibility is a real problem, but I wonder whether the problem isn’t being over-solved, with perfect flexibility becoming an ideal to which some good design values are being sacrificed, when modest flexibility might be better.] Consider this very admirable effort, Bibliotype, by Craig Mod, who is always worth reading on these subjects. By all means, let this sort of thing go forward. We’ll see. But consider: if you want to play Angry Birds, you orient your iPad to landscape, for maximum width (the action is left-right). If you want to play Tetris, portrait is better. Up to you, of course, how you want to play at breakfast or in bed or wherever. But the point is this: no one would say game designers should work to design games that are omni equi-playable in portrait or landscape mode, at arm’s length, one inch from your nose, and so forth. Likewise, I don’t see why eBook designers should necessarily be bending over every which way to ensure that, no matter what device, and how you are holding your device, you are getting as good a reading experience as you are getting any other way you hold it. Flexibility is a virtue. But there are others. Maybe it would be better to design something that looks great one standard way, even if that means it doesn’t look so good some other way. We still have pages. They aren’t inherent in the e-nature of the eBook beast. But they are inherent in the readers we read on. The iPad is a page, even if the things on it don’t have pages.
Maybe the way to go, ultimately, is back to deliberate page layout. Maybe there is no other way to get the best results.

But that’s a big maybe, and I don’t want to stand in the way of folks like Craig Mod trying whatever stuff they think might be great.

In the meantime, as things stand, PDF’s are almost never formatted for an iPad screen, so it doesn’t readily occur to us that we might look in this direction for the optimal solution. I read a lot of PDFs, academic stuff. It’s all formatted for my printer, not my iPad, so the pages are mostly too big, and if you shrink them to fit, the print is cramped. I would suggest that, going forward, folks might start sizing and shaping PDF pages for these devices. The rules are a bit different. But that’s enough for one post. I’m curious to hear what your recent ebook experiences have been. Do ugly ebooks bother you? Do you crave the ebook analog of the Protean easy chair, from Melville’s Confidence-Man?

“My Protean easy-chair is a chair so all over bejointed, behinged, and bepadded, everyway so elastic, springy, and docile to the airiest touch, that in some one of its endlessly-changeable accommodations of back, seat, footboard, and arms, the most restless body, the body most racked, nay, I had almost added the most tormented conscience must, somehow and somewhere, find rest. Believing that I owed it to suffering humanity to make known such a chair to the utmost, I scraped together my little means and off to the World’s Fair with it.”

In a follow-up post [UPDATE: now posted] I’ll give out some freebie PDF eBooks I’ve optimized for the iPad, and note how slightly different rules apply. Nothing fancy. (How fancy could PDF be, after all?) But nice, I hope.

{ 80 comments }

1

dsquared 12.21.11 at 8:59 am

Kindle owners who want to replicate the experience of reading pdfs on an iPad can do so by shining a small LED flashlight into their face while reading the Kindle.

2

CSM 12.21.11 at 9:00 am

Have you looked at the presentation of writing in the McSweeney’s app? They present their web content as-is, but the long form pieces, usually chapters from books or collections, are all in an iPad-specific presentation that is absolutely a joy to read. I’ll be curious how their approach compares to yours. Whatever the case, something needs to be done to get the state of ebooks to something other than absolutely hobbled.

3

JulesLt 12.21.11 at 9:04 am

The war is really between whether books should be designed (by book designers, making judgements about fonts and typography) OR whether the important thing is the text, and to keep trying to improve the software so that it makes ever better guesses at a ‘good’ layout (i.e. anything in the typographer’s job that is a rule, which can be automated).

Most of the ePub and HTML technologists are on the latter side – they understand the value in separating the content from presentation (and having tried to explain to my boss why we can’t just search and replace text inside a ‘template PDF’ like we can with an RTF or HTML document I do too).

But as a reader, my preference is for something that looks good.

4

SSK 12.21.11 at 9:13 am

While PDF documents look pretty, PDF is not a suitable alternative in the multi-device future present for reasons you’ve mentioned. I hate having to constantly manipulate PDF files to get any reading done.

The problem is not flowable text but ugly flowable text. Flowable text can be made pleasing if proper layout and typesetting principles are followed. You know this to be the case if you’ve seen LaTeX-typeset documents, or even Safari Reader.

I’m still looking for an eReader app that looks as good as documents typeset with LaTeX. The one I find the most bearable is Aldiko, however, it still leaves a lot to be desired.

5

conchis 12.21.11 at 9:18 am

FWIW, I don’t think the format is ugly (I have a Kindle, not an iPad – not sure if that makes a difference); and in any event, I value flexibility more than good looks (e.g. if I’m walking while reading, I need a bigger font size to maintain focus than I do if I’m stationary, and I value the ability to change this on the go).

I imagine this could change if I were reading things that were more picture heavy, but I’m not.

6

Ray 12.21.11 at 9:28 am

and this is why I prefer paper.
Yes, ebooks being portable, you can read them in the supermarket checkout line etc – but that doesn’t really apply to iPads. (Yes, you can bring your iPad to the supermarket. No, it is not more portable than a book)
Yes, ebooks are just as easy to read as paper – as long as you are reading on something with a reasonably high spec and screen size (ie, not your phone)
Yes, ebooks are cheaper, often free. And yes, the layout and proofreading will reflect that price.
Yes, if you travel cross-country, you can bring your whole library with you. But if you change _device_, you might lose your whole library to DRM.

7

Latro 12.21.11 at 10:56 am

Epub and similar formats are much more flexible, and flexible is important.

Of course, then we get flexibility screwed up by DRM and region-locking. I got my ebook reader and I’m using it and enjoying it, but not as much as I could. Just because I’m not in the USA or UK. Which means a lot of the time the online bookstore refuses to deliver a bunch of bytes my way, while the physical bookstore would have no problems sending the dead trees on a CO2 generating flying machine :-/

8

Eimear Ní Mhéalóid 12.21.11 at 11:03 am

Well, I read a fair few ebooks on my phone so ePub is my preferred format. I much prefer Stanza to the Kindle app, too. Hate hate PDFs.

9

George Iordanou 12.21.11 at 11:19 am

My main problem with ebooks is that I cannot use my usual note-taking technique. For example, when I read a book, I underline some parts and then type them on my laptop. When I need to revisit the arguments of the author, I only read my notes (the typed highlighted parts), which have page references. Kindle highlighting is not that practical, and going over pages quickly... well, it’s not quick. This is why I am considering switching to an iPad, but I need first to find an app that shows in one place (and gives you the ability to extract) the parts of the book/article that are highlighted.

Therefore, I am not that bothered by the crappy format that some ebooks have, but rather with the limited ways that I can interact with the text. It is unfortunate, because it takes more time to actually copy the highlighted parts than read the bloody thing.

10

liberal japonicus 12.21.11 at 11:21 am

I’m not yet a big consumer of e-books, and have been turned off by the first couple I’ve gotten, so I don’t have a lot of experience to be able to compare, but I like the Atavist. Two colleagues and I wanted to use it for an essentially not for profit publication, but the licensing fee was way way out of our ballparks, unfortunately.

11

Tony Sidaway 12.21.11 at 11:40 am

This obsession with appearance is what is wrong with print books. I don’t give a rat’s ass what a book looks like, what I want is the content. Just give me the words one after another in a plain text file and let me take care of the rest.

12

Tom Elrod 12.21.11 at 12:30 pm

As others have mentioned, the problem is not with Epub per se but with how it’s used. Websites can look pretty shitty, too, and did back in 1997. And you have to worry about how a site looks on different browsers versus mobile phones, etc. This is where Epub is right now. Now that Epub 3 supports HTML 5, and once designers start working more with it and understanding it, and consideration is given to what an Epub should look like on an iPad versus (for example) a Nook Touch… soon there’s no reason Epub books will have to look terrible.

13

faustusnotes 12.21.11 at 12:47 pm

I recently put up a review on my blog of a book (a world setting for an RPG) that makes maximum use of the benefits of pdf for eBook readers – it’s multiply bookmarked so you can move from place to place in the book, cross-referenced up the wazoo and linked to a map. It’s an amateur publication, essentially, so some parts are missing but I think that textbooks as well should be able to make maximum use of the combination of pdf + iPad to give a really improved experience.

14

William Timberman 12.21.11 at 1:17 pm

Patience, I say. I’ve read several book-length PDFs on my iPad, including Balkin’s Cultural Software, and it’s do-able, if not pleasant. I eventually got used to using my thumbs to stretch the text to fit after turning each page. Not ideal, but eventually things’ll get sorted out, format-wise. A far worse situation, in my opinion, is that so many of the books I’m dying to get my hands on are either not available in e-form, or are delayed for weeks or months after the hard-cover comes out. That too will change over time, I think. I’m just glad that I no longer have to figure out how to fit all those books into my rather small house, and that I can carry all of them with me in a single briefcase wherever I go.

I also prefer enlarged print to wearing reading glasses, and I’m crazy about integrated dictionaries. The latter may be tools of the consumerist devil, as some of my friends call them, but they’re very handy. If only I could find one that handled multiple languages as seamlessly as my current one does English.

The bottom line is that I’ve prayed for a functional e-reader and widely available e-books for over thirty years, and am not inclined to carp now that they’ve finally arrived.

15

faustusnotes 12.21.11 at 1:24 pm

Another way that eBook readers would be awesome is if you could tap on a word and bring up the translation – like the amazing rikaichan, but for pdf/epub. I’m reading manga on my iPad now but they’re rendered as simple picture files, so you can’t interact with them. That’s a shame because it means I can’t check kanji; I don’t yet know if this works better with actual novels (which are too hard yet for me to read). If they had that functionality, they would be truly Teh Awesome.

16

Trey 12.21.11 at 1:40 pm

I prefer doing long reading on my (hardware) Kindle, for the reason that dsquared provides in the first comment. I do most of my article reading on my iPad, though, as the screen is perfect for PDFs. I use GoodReader for PDFs, Stanza for ePubs (which are dreadful), and calibre for managing them all. I bought an $8 stylus so I can underline, write (though poorly), and otherwise mark up my PDFs even if it makes me look painfully uncool. It saves a lot of paper, but I must admit that I ultimately prefer paper.

17

Skapusniak 12.21.11 at 1:48 pm

Plain Text. 80 characters per line. Unix Line Endings. UTF-8 encoded. Markdown or similar non-intrusive markup if you absolutely must.

18

Ted Lemon 12.21.11 at 1:49 pm

In other news, “get off my lawn, kid.” PDF is _horrible_ to read on tablets. It’s laid out for an 8 1/2 x 11 sheet! Or an A4, if it’s a European PDF. So the letters are tiny on an iPad, and you wind up having to zoom and scroll. This is not a good user experience.

Sure, it will be great when they figure out how to annotate books so that the rendering software can choose between a variety of layouts depending on UI orientation and screen size, and these layouts can be hand-tuned by wizened elves from the publishing industry. But realistically what is going to happen is that the software is going to get smarter, and very few books, if any, will be tuned this way.

I love a well-laid-out book, but let’s face it: most books aren’t. No, my main complaint about ebooks is that the book-sharing paradigm is completely broken, but the price hasn’t been adjusted to account for it, so I wind up buying paper books still just so I can lend them to my father.

19

Donald A. Coffin 12.21.11 at 2:09 pm

We have Kindles and, so long as (a) the material is text-only and (b) you never want to go back (or forward) to find something, the experience is mostly OK. Unfortunately, both (a) and (b) rule it out for professional reading.

20

ezra abrams 12.21.11 at 2:14 pm

The issue is, how do we transfer stored information to our brains?
Framed thusly, the choice between electrons and cellulose depends on what you need to do with the information.
If you need, roughly, a paragraph-sized chunk of information, or a fact, electrons are convenient, esp. with searching.
If you need something larger, I think cellulose is better – the amount of information/sec that you can access by flipping pages is highly superior to the amount of information you can access by flipping screens.
There is also a psychological thing about how active you are; isn’t one more active with cellulose?
Also, I personally am not gonna buy an ebook until I know that I have *control* over the content – surely CT readers remember a year or so ago, one of the ebook vendors took back 1984 (irony cubed)

21

Watson Ladd 12.21.11 at 2:39 pm

Simple solution: Make ereaders A5, so scaling A4 down by a linear factor of the square root of 2 produces the right output. Or even make them A4.

22

John Holbo 12.21.11 at 2:46 pm

“PDF is horrible to read on tablets. It’s laid out for an 8 1/2×11 sheet! Or an A4, if it’s a European PDF. So the letters are tiny on an iPad, and you wind up having to zoom and scroll. This is not a good user experience.”

This is one of my points. People always make their PDF’s this size, but they don’t need to. People should change their ways. I’ve posted my follow-up post now, so you can check out my freebies if you like.

“My main problem with ebooks is that I cannot use my usual note-taking technique.”

I have a couple different PDF readers on my iPad. They support note-taking to variable degrees, so you’ve got to try a couple out and see what you like. I’m not totally happy myself, but I’m happier with notetaking on iPad than I am with old fashioned paper. I was always a bad notetaker, and I’ve gotten a bit better at it.

23

Bruce Baugh 12.21.11 at 2:50 pm

The sudden success of the iPad is itself a good argument in favor of not relying on PDF over more adjustable formats – neat innovations will continue to come along, and why make it that much harder to use them? Meanwhile, here in existing 2011, a lot of people do read on Kindles and for that matter on smart phones, and I’m (to put it mildly) not sure that a subtext of “ha ha go get a real device you hoser” is what I personally would wish to send to prospective readers with the bad taste to favor devices I don’t, or misfortune to be stuck with them. “Yup, here you go” is much better for my own authorial stance.

24

John Holbo 12.21.11 at 2:55 pm

“The sudden success of the iPad is itself a good argument in favor of not relying on PDF over more adjustable formats – neat innovations will continue to come along, and why make it that much harder to use them?”

Yeah, I don’t mean to be an extremist about it. I really do regard reliance on PDF as temporary, until ePub grows up. But, then again, I expect academics will be making PDF’s for years to come, so maybe they should start making them a bit differently. Also, I’m interested in comics and graphic material. So that’s kind of a different issue. Also, I totally admit that we need flexible eBooks, but I wonder whether maybe there are costs to making them too flexible that mean ultimately we should settle for a bit of stiffness. This is really several different points and I’m sort of mixing them up.

25

Jim Henley 12.21.11 at 3:00 pm

So this is, like, totally ableist, man.

It really is, actually, though I don’t think you should be clapped into a reeducation camp or anything. But one key benefit of .mobi and .epub is that they scale for people with vision issues.

As it happens, I played a kibitzing role in a pretty monumental ebook conversion earlier this year, so I totally feel the pain of the chaotic epub “standard.” (Liz Castro aptly likened it to the state of CSS2 implementation c. 1999.) Did you, John, know that the iBooks implementation of ePub has an “!important” tag? (Really, a class, I think.) It means, “Please, iBooks, don’t ignore THIS tag the way you do the others.” And then mobi is a whole separate beast. (In many ways less capable, but given the state of compliance it might as well be.)

But I was amazed – meaning, I learned something – how many people wanted to read the equivalent of a 300-page roleplaying game ruleset on their mobile phone. In fact, for one player in my campaign, the iPhone is her primary interface with the text (and art, and tables). And my wife loves reading books on her iPhone, which has a screen at best half the size of a Kindle 1.

My conclusion: on the reader’s side, there is no hunger for a standardized document “page.”

26

cian 12.21.11 at 3:01 pm

Yes you can make a PDF for one particular sized device and it will look lovely. It will look awful on every other device.

epub is the future (oh for the love of god let Amazon dump mobi, which is the Internet Explorer of book formats), for the same reason that HTML won over PDF/JPEG designed web sites in the early days of the internet. It’s flexible, it will work on anything. Over time it will be good enough for most people’s formats, where PDF will also be a god awful format on any other size, other than the particular one it was designed for.

Yes epub has a number of problems at the moment, but then I remember designing HTML in the late 90s. It will improve – PDFs will not.

27

cian 12.21.11 at 3:02 pm

Oh, and Apple’s implementation of the epub standard is particularly bad. Which may be part of the problem here.

28

Jim Henley 12.21.11 at 3:08 pm

“Oh, and Apple’s implementation of the epub standard is particularly bad. Which may be part of the problem here.”

QFT!

29

Luis 12.21.11 at 3:13 pm

“Yes you can make a PDF for one particular sized device and it will look lovely. It will look awful on every other device.”

That’s exactly, exactly it, in one short, brief sentence. Glad I read everything before writing my own comment. :)

(And I’d note that, while I’m no typography/layout snob, I’ve had vastly more problems from bad conversions (e.g., frequent missing spaces between words in the book I’m currently e-reading) than from bad typography/layout (which has only really afflicted my e-copy of Leaves of Grass).)

30

K Zhang 12.21.11 at 3:18 pm

Other than the constraint of different screen size, people also seem to have different preferences for font size and demand to have the ability to change font size. This creates too many variants for the same book and raises production costs quite a bit. I doubt the publishers will make this investment. My guess is that the future of ebooks will still be with the HTML-like formats, as people will get better at making them. Remember the early days of WWW when all websites were ugly? With the evolution of HTML and better understanding of website design principles, people are better nowadays at making good looking websites for all browsers. The same will happen for ebooks.

31

Charles 12.21.11 at 3:19 pm

Reading any PDF on my Kindle is painful. I no longer bother. Maybe it’s better on the DX, but the small Kindles just can’t handle it. If I can only get something in PDF, I turn on my printer.

32

John Holbo 12.21.11 at 3:22 pm

“Yes you can make a PDF for one particular sized device and it will look lovely. It will look awful on every other device.”

I totally see the validity of a lot of these points, and cop to leaning somewhat exaggeratedly in the ‘maybe we need fixed pages’ direction – for thought provocation purposes! – but is it really true that PDF’s made for one particular-sized device look awful on every other? I admit that PDF’s made for the iPad look a bit weird (not unreadable, but a bit weird) on a regular old computer screen. Just as PDF’s for print look bad on the iPad. But I would expect a PDF made for iPad to look pretty ok on a Kindle or a Nook. Just with slightly different margins, right?

33

John Holbo 12.21.11 at 3:23 pm

I admit to finding it strange that people want to read books on their phones.

34

Charles 12.21.11 at 3:29 pm

“But I would expect a PDF made for iPad to look pretty ok on a Kindle or a Nook. Just with slightly different margins, right?”

No. The difference in screen size throws everything off. PDFs that look beautiful on my wife’s iPad are unreadable on my Kindle. The Kindle’s braindead PDF reader is part of the problem, but only a small part.

35

Andrew Fisher 12.21.11 at 3:31 pm

John@33
My wife does most of her reading at the moment whilst nursing our son. Phones are ideal for reading one-handed, and still work well in the dark when feeding him to sleep.

36

Lee 12.21.11 at 3:32 pm

I have five ways of (e-)reading nonfiction, depending on the complexity and formats available:

(1) For difficult/slow or illustrated reading, I read with my eyes. PDF would be okay if the iPad had a higher resolution screen. Too much zooming in and out, as it is. I returned my iPad 2 and am waiting for the Retina Display (TM) rumored for the iPad 3. Kindle/epub almost always spoils books with diagrams, illustrations, or art.

(2) For medium-difficult reading that is available on Kindle and Audible, I listen to it at 3x speed while following along with my eyes on my iPhone’s Kindle app. Yes, this is peculiar for an adult. But I find I retain the material better, keep focused, and move through books faster this way. During easy-to-follow passages I just close my eyes or put the iPhone in my pocket. Examples: Pinker’s Better Angels, Szpiro’s Pricing the Future. On a long commute, this is completely absorbing.

(3) For medium-difficult books not on Audible, I get the Kindle version, import it into Calibre, and convert it to ePub. (I do this in international waters, let’s say.) The result is usually fine, but occasionally unreadably janky. Then I import it into the Kurzweil-invented ereader app Blio. This app has changed the way I read. It uses high quality text-to-speech voices to read the book aloud while highlighting the words and turning pages automatically. People on the subway and at the gym are often peeking over at my screen. It can read fast, much faster than the Kindle hardware can. Example: Bellah’s Religion and Human Evolution.

(4) Breezy books I don’t ever read with my eyes. Strictly listening via Audible or TTS. Example: Sookie Stackhouse series (Audible).

(5) I will only read one or two books a year now not fitting into the above categories. Unless it’s truly special, the reading/listening opportunity costs are just too high.

37

Gareth Rees 12.21.11 at 3:34 pm

People read books on their phones because they already carry their phones around with them—the phone fits in a pocket, unlike a tablet—and so it’s enormously convenient to use the phone for reading. However poor the reading experience, it beats not being able to read anything because your tablet is at home.

Also, I second Jim Henley’s ableism point: from seeing my parents take up reading on the iPad with enthusiasm, it’s clear that the ability to increase the type size is vital.

38

Katya 12.21.11 at 3:41 pm

I read on my Kindle, because reading for long periods of time on a backlit screen is unpleasant, to say the least. I also like that I can resize the font when reading under different conditions.

For most books, with the exception of graphic novels, I’d say that book design and layout isn’t really an issue–the words on the page are the words on the page, and changing the font size or whatever does not significantly affect the reading experience. I mean, we don’t think that hardcover v. trade paperback v. mass-market paperback really matters all that much (except that mass-market books have that crappy ink that smudges on your fingers). It’s the rare book where messing with the margins and the font actually matters all that much, and I frankly expect that the technology will continue to improve, a la web design, to deal with those books.

39

William Timberman 12.21.11 at 3:47 pm

JH @ 33

Reading on a phone is great for waiting rooms in doctor’s offices, car washes, airports, car dealers, in jury assembly rooms, food courts, or on car trips — anywhere where you might want to read, but wouldn’t want to drag an iPad or Kindle along with you. You do get used to reading a paragraph at a time with a flick of the thumb at the end. It doesn’t seem to interrupt the flow much once you’re used to it. Better still, with an iPhone/iPad combination, you can pick up either device and resume in exactly the place you left off on the other device. The combination is awfully convenient, although I don’t suppose it would work very well for an illustrated medical text, or for blueprints.

40

John Holbo 12.21.11 at 3:49 pm

I get the advantage of smallness, but it doesn’t fit my lifestyle. I don’t have a smart phone (not that I’m proud of that or anything. I just don’t have an iPhone.) Mostly I wouldn’t read books on my phone, had I such a smart device, because I listen to audiobooks in those moving-about situations in which others are reading “The Hunger Games” on their phone, apparently.

Re: the print size. Yeah, fair enough. But you can zoom the text on a pdf easily enough. Take the samples I put up. Turn ’em landscape-wise and zoom the text. You’ll be reading 24-point and a line still fits all on one screen, probably. I think old people could read my PDF’s. Actually, I shouldn’t just say that. I should ask my parents, who are visiting.

One reason I’m not too worried about these issues is that it’s quite clear that PDF isn’t going to kill ePub, so I’m not going to be the bastard who deprives the near-blind of their books, on account of the revolution this post causes. Obviously ePub (or whatever) will get better and better. PDF isn’t the future. My point is more about good, aesthetically-pleasing book design. I don’t want page layouts to get permanently worse. And in the near future, it might be worthwhile making some nice PDF’s, for people with iPads who like pages that look better than the ones I’ve been looking at lately.

41

mds 12.21.11 at 3:55 pm

I use Calibre to convert ebooks into eReader format, then sync them to my Handspring Visor Deluxe. I had a PDF reader on the Handspring for a while, but the program took up too much space, was a resource hog, and rarely provided a satisfactory reading experience. This anecdote serves the purpose of (1) underscoring all the previous points about PDF format; (2) supporting Mr. Baugh’s point @ 23 in particular, that one might not wish to presume what device a document will be read on; and (3) telling people who say “Get off my lawn, kid” to get off my lawn. Why, back in the day, our e-readers were flipbooks of IBM punchcards—and we liked it.

42

Bruce Baugh 12.21.11 at 4:27 pm

John, I think that a lot of the answer is simply accepting how different the kinds of things we call “book” are.

For comics and graphic novels, CBR/CBZ is a good solution, and one I expect to last a long time.

For stuff where fixed formatting is genuinely important, or for printing out, PDF is a good solution, and one I expect to last a long time.

For stuff where fixed formatting is not important, and where reader control of the display is important, there are a bunch of options and we’re not there yet. Jim’s citation of the comparison to CSS2 is very apt, I think.

The hard part for some kinds of creators is recognizing when control over formatting on the creative end isn’t important, or where it’d be desirable to offer separate versions with and without fixed formatting.

43

bemused 12.21.11 at 4:46 pm

I love epub format and prefer to get my books that way, which I read on my android phone. But for those who care about books looking like printed books, see this:

http://openlibrary.org/dev/docs/bookreader

and the books served by the Internet Archive book servers. Several libraries are working to lend in this format, and they incorporate an interface to the Adobe DRM scheme to satisfy the publishers and allow lending of books in copyright.

44

bemused 12.21.11 at 4:52 pm

Also, if you want to play with epub book design, try the open source “Sigil” software.

45

someBrad 12.21.11 at 5:10 pm

Why choose? Seems like publishers can demonstrate their relevance by releasing books in: (a) a version formatted to look great on full size tablet (e.g., iPad), (b) a version formatted to look good on a small tablet/ebook reader (e.g., Kindle), (c) a version formatted to look OK on a smartphone, and (d) a non-formatted version that can be resized per the user’s requirements.

46

Jim Henley 12.21.11 at 5:24 pm

Why choose? Seems like publishers can demonstrate their relevance by releasing books in: (a) a version formatted to look great on full size tablet (e.g., iPad), (b) a version formatted to look good on a small tablet/ebook reader (e.g., Kindle), (c) a version formatted to look OK on a smartphone, and (d) a non-formatted version that can be resized per the user’s requirements.

I saw what you did there! (And it was awesome.)

This approach can be done, kind of! It’s what Jenna Moran did with Nobilis 3e. It is, however, much more work than you might think. And it still, given the current disarray, won’t work for everything. We gave up entirely on the Kobo Reader app with Nobilis, and accepted substandard results in Stanza/iPad. And just couldn’t afford all the different hardware platforms for testing.

47

cian 12.21.11 at 6:31 pm

Re: the print size. Yeah, fair enough. But you can zoom the text on a pdf easily enough. Take the samples I put up. Turn ‘em landscape-wise and zoom the text. You’ll be reading 24-point and a line still fits all on one screen, probably.

And voila, you have suddenly created a bad ebook. The formatting no longer matters (because it’s zoomed in), it’s an inconvenient read (because now you have to move around the screen).

Incidentally, there was nothing in those books you’ve done that couldn’t be done with epub as it stands today. With decent vectorisation, the books would be a fraction of the size also (35MB for a short story?). However, I’d question whether fixed pages is really the way to go for ereaders. This seems like a form of technological nostalgia. And do most people want to read picture books?

I’d agree that there are problems with epub (footnotes – for the love of god, footnotes), but it is possible to create very nice epubs. Some of the good folks at mobileread have done this (there’s a lovely collection of Dickens there, for example), and some of the publishers produce nice books (Penguin classics can be lovely).

On the other hand, given that the paper books that most people read are text, usually typeset by an algorithm…

48

todd. 12.21.11 at 6:58 pm

I read on a Kindle, and I agree that it’s lame when you have half a screen of empty space, an image, and then the chapter continues. When you see the blank space you think the chapter is ending awfully suddenly, and then there’s a surprise image, it’s weird.

BUT, I also like being able to convert freely between mobi and epub, so that I can buy books anywhere (break the encryption, sssh), and then read them on whatever device I like. Several devices simultaneously, sometimes. I’m not convinced that the kind of Kindle I have would draw a PDF even passably well. But I’d be willing to try out the ‘Kerchief’ PDF sample and send in some images of how it looks. And my phone definitely does not play well with PDFs. It reads them, but only with a lot of tortured pinching and scrolling.

49

Emily 12.21.11 at 6:59 pm

There are other options out there aside from EPUB and PDF. The Inkling app for iPad does reflowable text the right way. The UI has been designed specifically for the iPad and includes things like footnotes, beautiful fonts, embedded videos, interactive figures, etc.

50

bemused 12.21.11 at 8:11 pm

Emily @49: From what is visible on the inkling website, they have a proprietary format and are going after the textbook market. They lock you into their ipad app for every book you purchase. There is a long history of individuals and enterprises losing their investment by jumping on the bandwagon of niche proprietary document formats. I still refer to my undergraduate physics and mathematics texts 40 years later…I doubt that inkling format will be readable on devices that are current in 40 years. The advantage of an open format is that it can be used or upgraded as technology changes.

Even DRM within a public format (epub+adobe DRM) locks you into a proprietary license server, at least in theory. I want my ebooks (both free as in beer and purchased) to be available to me on all the computing platforms I choose to use them on. I don’t want them to be “revocable” as Amazon has once done with their kindle+DRM format.

51

John Holbo 12.22.11 at 12:32 am

“The formatting no longer matters (because it’s zoomed in), it’s an inconvenient read (because now you have to move around the screen).”

Well, yes. But that’s actually true for any large-type eBook, PDF or Epub or otherwise. Namely, the type is so large that you need to mouse or scroll around.

“Incidentally, there was nothing in those books you’ve done that couldn’t be done with epub as it stands today. With decent vectorisation, the books would be a fraction of the size also (35MB for a short story?).”

I respectfully disagree. First, you most definitely couldn’t do it with ePub as it stands today, and I seriously doubt you will be able to do it with ePub 3 (which is not to deny that ePub 3 will be a big step forward.) Second, of course you could save space by vectorizing the images. But then they would look a lot worse. It’s not the case that reducing size is automatically the right answer. The question is: which do people prefer, small size and (frankly) crap image quality or good image quality but larger file size? Finally, “The Chimes” is usually classed as a short novel, not a short story. It’s 30,000 words or so. So it’s not like I’m turning tiny little works into bloated files. And again, it’s the images that are taking up the space, not the words. Eleven images at 600 dpi just plain take up space. Epub isn’t some magic method of compressing images without sacrificing quality. Nothing can do that.

“And do most people want to read picture books?”

The point isn’t that all books need pictures. The point is that if books have pictures, they should look good and be placed properly.

“I’d agree that there are problems with epub (footnotes – for the love of god, footnotes), but it is possible to create very nice epubs. Some of the good folks at mobileread have done this (there’s a lovely collection of Dickens there, for example), and some of the publishers produce nice books (Penguin classics can be lovely).”

I am working on mastering epub but it’s awful, although I’m sure it will get better and better. We’re just at an awkward stage. I don’t deny that nice epubs are possible and I will look at the mobileread Dickens offerings for comparison, if I can (if they are free! Not that I insist on free, but after making my own, I’m not going to buy others!) I read a lot of free Kindle books that are, I am sure, just Gutenberg scrapings, and they don’t look very good and have typos and so forth. I’m glad to have them at the price. What bothers me is when I pay good money for an ebook and it still looks like total crap. I would like some good-looking eBooks, for themselves, and pour encourager les autres.

52

cian 12.22.11 at 3:17 am

Well, yes. But that’s actually true for any large-type eBook, PDF or Epub or otherwise. Namely, the type is so large that you need to mouse or scroll around.

Not really. On an epub I just hit page turn. With a PDF I may have to scroll left/right if it’s still too small for my reader/eyes. Plus it’s a less ‘natural’ reading experience, once you’re no longer reading it as a page per screen.

I respectfully disagree. First, you most definitely couldn’t do it with ePub as it stands today

Sure you could. Number of ways, depending upon the effect you wanted. I’ve seen far more complex things done with epub, and in ways that worked on earlier readers (part of the problem with epub is that the readers are still quite primitive). There are some nice examples on mobileread. There’s a three men in a boat, for example. And somebody somewhere did a lovely illustrated Alice.

More complex layouts can be a problem, fine – but then more complex layouts don’t really scale on a variety of devices. What is needed is an approach that works out how to gracefully combine images and text, that reflows properly. It’s a different medium – and in part designers simply need to adapt to that, just as they did with the web.

I am working on mastering epub but it’s awful, although I’m sure it will get better and better.

Awful seems rather strong. There are definitely some things that need to be addressed, but if you’re complaining about the packaging requirements – well that’s intended to be done by machines. Would you handcode a PDF? It’s perfectly possible… Sigil and Calibre can both handle construction of epubs. Or if you have the budget for it, Adobe’s tools are apparently pretty good. A lot of things are more complex, because flexible layout is simply more complex than fixed layout. There’s not much you can do about that.

MobileRead have a huge library of free books produced by volunteers. Not all of them are good, but many of them are very good indeed. The best ones are created by HarryT and Jelleby in my experience. HarryT has done a complete Dickens, which included proofreading the Gutenberg versions. He’s also done Austen, and others. The more recent ones were produced as epubs (earlier ones predate the standard, and so don’t look as good) – but all are available as epubs.

Incidentally, depending upon the source vectorising images can actually improve them. And it works just as well for PDFs. But even allowing for that, the PDF you created was extremely large by the standard of illustrated PDFs. I have heavily illustrated technical colour PDFs that are far smaller. You might want to look into compressing those images.

53

John Holbo 12.22.11 at 5:44 am

“Incidentally, depending upon the source vectorising images can actually improve them.”

By smoothing the quality of the linework, yes, but in this case the point is to try to preserve it, so we don’t really want that, I think. But obviously a vectorized version of the images would be much much smaller. Yes.

“But even allowing for that, the PDF you created was extremely large by the standard of illustrated PDFs. I have heavily illustrated technical colour PDFs that are far smaller. You might want to look into compressing those images.”

Again, the issue isn’t whether images can be compressed, so long as you are willing to take the hit in terms of quality. There’s no mystery as to why your illustrated PDF’s are smaller. They don’t contain 600 dpi images. Probably more like 100 dpi. 150 tops. Which will look fine so long as you don’t try to zoom in. The question is whether it’s nice enough to have better quality, zoomable images that it’s worth the cost in file size. That’s why I included the 300 dpi alternative for comparison. I prefer the high quality and don’t mind the cost, because 30 megs isn’t all that much.

“Sure you could.”

I don’t really think so, but I do admit that Epub3 is sure to be a big step forward in a lot of ways. One issue that I have is how to get text to wrap images gracefully. This is not because it’s such a universal problem. But I happen to have the problem, with the books I’m trying to make, so I struggle with it. Really the only way to solve it is to lay out pages. To some degree, I just need to accept that I’m making books that aren’t a good fit for Epub. That’s not Epub’s fault because its virtues are flexibility, among other things. But I can’t help looking for alternatives.

54

Zora 12.22.11 at 6:57 am

I have been making ebooks for Project Gutenberg for eight years, for Distributed Proofreaders. Over those years, we have been tightening our standards and our books are now better than most of the commercial ebooks I’ve purchased. They are certainly better than the earliest Project Gutenberg books (which were done by individuals who worked on a whole book at a time, and are of variable quality). We do two to three rounds of proofreading, two rounds of formatting, one of post-processing, perhaps a round of smoothreading, and a last inspection before posting.

Don’t snark at our books.

55

John Holbo 12.22.11 at 7:34 am

Oh, I didn’t mean any disrespect to Project Gutenberg. Sorry if it came out that way. I was sort of crossing two thoughts in my mind. The disrespectful one was this: a lot of e-books for sale are actually just repackaged Gutenberg books, which you should be able to get for free, and which are sometimes a bit deceptive in how they are presented. This doesn’t reflect badly on Project Gutenberg, for which we are all grateful, just on folks trying to make a buck by scraping Gutenberg for stuff to resell, without actually contributing any value to it.

The worst offenders in this regard are POD re-publishers, of course. Here’s a great example. Someone is trying to resell those Archive.org Spenser scans I used to make my book for $28!

http://www.amazon.com/Spensers-Faerie-queene-poem-books/dp/B004SZ3T2Y/ref=sr_1_sc_1?ie=UTF8&qid=1324539147&sr=8-1-spell

56

Nicholas Wolverson 12.22.11 at 12:59 pm

My experience of reading on Kindle has been that the text generally flows well, and I’m immersed in the book until BLAM halfway through a line I’m hit with a hard hyphen/line break from the print version. As jarring as coming across an obvious spelling or grammatical error indicating insufficient proof-reading – but certainly nothing to do with the format.

In most cases I’ll admit chapter headings and section breaks are not formatted as well as they could be either. But 1Q84 was the first ebook which I actually noticed for its good layout – the chapter titles are very nicely done. Why do I see ASCII art breaks or low-res bitmaps as separators elsewhere?

Ultimately the primary limitation of ebook formatting right now is laziness, and PDF as a format can only resolve that if it means not bothering to reformat in any way. The laziness no doubt stems from the lack of market force to get these things right – hopefully that changes a bit as more books are sold electronically.

57

tychoish 12.22.11 at 1:02 pm

I find reading software for PDF to be less flexible and to provide a less enjoyable experience than ePub. In part because it’s never typeset for the device I’m using (and I don’t have/use iPads, iPhones, and MacBooks.)

The ability to customize the font size and flow of an ebook for any device, or desired reading environment (need bigger fonts? great! need white text on a black background? great!) has almost always been an advantage of eBooks, and I think a typographical snobbishness is a poor reason to take a step toward making books less accessible.

I think the answer here is:

– Develop better reading software that has better fonts, and better defaults. Thankfully FBReaderJ is open source and cross platform, so it’s not insurmountable.

– Learn how to, and improve the editing tools for ePubs (and other formats.) The truth is that the web hasn’t changed technologically very much in the last 15 years, but we’ve gotten a lot better at using it. The same thing is and will be true of ebooks, it just needs to happen.

– The killer experience of Kindle books is the fact that your position in the book is pretty good at syncing between your devices in a seamless way that we forget about. While this is not difficult to implement, it’s important to not forget about it.

58

John Holbo 12.22.11 at 2:35 pm

Well, I could get all Nixonian and pretend I’m sure the Silent Majority of PDF-loving decent folk are with me, but it looks like I’m more or less totally outvoted.

59

cian 12.22.11 at 6:40 pm

Given that they’re pre-print PDFs (at a guess), I’d assume they’re around 300-400 dpi. That’s about normal for printing on a page. PDF is a print medium really. Kind of a viewable PostScript file, that got extended into other things. A truly horrible format.

Given that what you have here are scans from an old book, and will be scanned at less than optimal conditions, 600 dpi seems high. Still I can’t really understand why your images are so big. Are you compressing them? Are they stored as colour, or B+W?

I don’t really think so, but I do admit that Epub3 is sure to be a big step forward in a lot of ways.

Well you can because I’ve done so. It took some advanced CSS, but it’s possible. Or, one could just use SVG and maintain the page layout for the images that you’re using.

One issue that I have is how to get text to wrap images gracefully. This is not because it’s such a universal problem.

Have you considered that this is because you’re not very technical, or that you’re using the wrong tools? I think you’re right that epub may not be the right solution, if what you’re trying to do is reproduce exactly the experience of the printed page on an iPad. Though I’m not sure PDF is really the solution for that either.

But I think you’re mistaking your own limitations as a coder, as limitations of the format. It has limitations, sure, but they’re not nearly as bad as you seem to think.

60

pjcamp 12.22.11 at 7:01 pm

I only buy books that are (a) mine, not my license to use (b) not DRMed and (c) not likely to be obsoleted. That means no ebooks of any sort at the moment. I use paper.

Even so, something formatted for an iPad would not be formatted for me since I refuse to be sucked into Apple’s closed system. See points a, b and c.

61

todd. 12.22.11 at 7:15 pm

I dropped a couple of your PDFs on my Kindle. The Dickens looks really good, and I like the font. Of course, it happens that I have my Kindle set on a tiny font of about the same size as the text in the PDF when “fit to screen.” If you have to zoom in any more, it is significantly more annoying to read — the Kindle’s interface for panning is much less refined than its interface for paging.

The screen size and eink’s lack of enormous contrast had a hard time with “Kerchief & Madness,” so I would say it looks “eh.”

If it will be at all illustrative, I took photos comparing each of those (and an epub on my Kindle) to a print-out that was lying near to hand, over here: https://picasaweb.google.com/108856674463006044561/Ebooks?authkey=Gv1sRgCOmu35mln5S_CQ#

62

Doug K 12.22.11 at 10:00 pm

my ebook reader is a computer, the books are from Project Gutenberg in HTML (thank you Zora) and they work beautifully.

When I read this on Epub 3,
“The biggest bang I see EPUB 3 bringing to the digital publishing world is undoubtedly the ease with which it will allow the creation of rich multimedia and interactive experiences. ”
I thought, “wow, Microsoft Encarta for the 21st century. Say, wait a minute..”
As Brent Simmons observes, most people who like to read find video way too slow.
Epub 3 did a lot of work on accessibility, though, so credit for that much.

PDF is painful to read in any format, because it’s not designed to be readable, it’s designed to be printed. I spend a considerable fraction of my working day reading software technical documentation. PDF and the internet allowed these publishers to shift the costs of printing to their customers, which is fine if you plan to print out the manuals and stick them in a three-ring binder; it’s horrible to try to actually read or search the PDFs, especially if subjecting oneself to the impervious horrors of Adobe Reader. Many software houses still use PDF. The worst ones use columns in the PDF too, all the better to sell their customers consulting services with, after the frustration of trying to read the doc overwhelms us. Here too HTML works so well that I don’t even notice it. Having finished that rant – thanks for the Faerie Queen PDF, I enjoyed the pictures..

63

David Littleboy 12.23.11 at 12:51 am

“the impervious horrors of Adobe Reader.”
Don’t you mean “imperious” horrors? Whatever, yes, it’s quite horrible. Slower than molasses on a winter’s morn. Foxit Reader is quite a bit better. I’m not fond of the aesthetics of the Foxit UI, but it works. My customers send me PDFs with annotations, and Adobe Reader’s highlighting completely obscures the highlighted text. Ouch.

64

John Holbo 12.23.11 at 3:10 am

“Given that what you have here are scans from an old book, and will be scanned at less than optimal conditions, 600 dpi seems high.”

If they are badly scanned, there’s no point trying to get blood from a stone, of course. Line art like this should be scanned at at least 600 dpi and, even just for print, 600 dpi for this sort of thing is recommended. (300 dpi for color illustrations is standard, but 600 for bw line art.) Really you should scan this sort of thing at at least 1200 dpi, for clean-up, although of course you aren’t going to print/display it at that resolution.

Sorry, maybe I should have just said it: I scanned the images myself, and then cleaned them up carefully.

“Still I can’t really understand why your images are so big. Are you compressing them? Are they stored as colour, or B+W?”

I think you may have somewhat unrealistic notions about digital file sizes for high quality images. Mine are grayscale Photoshop docs, one layer, transparent except for the line art, placed in InDesign, converted to PDF. They are 600 dpi so they are large, but, of course, vastly smaller than any color image would be at that resolution. They are not compressed but, obviously, could be. But then if you zoomed them to see the detail they wouldn’t be crisp images any more. They aren’t vectorized, because that would degrade the line quality. Out of curiosity, I just made a fresh “Chimes” edition, bicubically downsampling to 100 dpi from 600 and the whole thing came out as 2.7 megs, which is maybe more in line with your expectations. Those expectations are, to repeat, due to the fact that normally people go for 100 dpi or so. Geometrical growth of file size, due to high dpi, is a harsh mistress. Love her or leave her, man.
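
To put rough numbers on the dpi point: for an uncompressed raster, the pixel count, and so the byte count, scales with the square of the dpi. Here is a minimal sketch, assuming a hypothetical 4 x 6 inch grayscale plate at one byte per pixel. Those dimensions are made up for illustration and are not measured from the Chimes scans; `raw_size_bytes` is likewise just an illustrative helper.

```python
def raw_size_bytes(width_in, height_in, dpi, bytes_per_pixel=1):
    """Uncompressed size of a raster image scanned at the given dpi."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel

# Hypothetical 4 x 6 inch plate (an assumption for illustration only).
for dpi in (100, 300, 600, 1200):
    size_mb = raw_size_bytes(4, 6, dpi) / 1_000_000
    print(f"{dpi:>5} dpi: {size_mb:8.2f} MB")

# Going from 100 dpi to 600 dpi multiplies the pixel count by 6 * 6.
ratio = raw_size_bytes(4, 6, 600) / raw_size_bytes(4, 6, 100)
print(f"600 dpi vs 100 dpi: {ratio:.0f}x the pixels")
```

A real PDF will not track this exactly, since it also carries text, fonts, and whatever encoding the export step applies, but the square law is why high-dpi images dominate the file size.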

“PDF is a print medium really. Kind of a viewable PostScript file, that got extended into other things. A truly horrible format.”

Agreed! That’s why what I have been proposing is rather ironic. I am aware of all internet traditions in this regard! Championing PDF as the bleeding edge of 2012 is quixotic at best. (Sorry, maybe I should have made it clearer that I see the oddity of what I am proposing.) But, until Epub books look better, per my post and per comments, I do think people might consider making PDF’s that aren’t sized for printing but for the iPad. Lots of people think PDF is bad because it’s always 8 x 11, but obviously it doesn’t need to be.

“Have you considered that this is because you’re not very technical, or that you’re using the wrong tools? I think you’re right that epub may not be the right solution, if what you’re trying to do is reproduce exactly the experience of the printed page on an iPad. Though I’m not sure PDF is really the solution for that either.”

Per the post, there is a distinct possibility that my biases, in this regard, are skewing my judgment. Yes, I have considered this. Indeed, it seems that my hypothesis that a nice pdf would look basically ok on most any non-phone reader has been substantially falsified by responses to this thread. Live and learn.

I am obviously running together distinct issues. A lot of ebooks look like crap now, but that will improve as Epub gets better. I don’t mind plain, just ok ebooks but it annoys me when I pay good money for, say, an academic press title for Kindle and it still looks terrible. We should be doing better. I am obviously interested in producing ‘art books’ for the iPad. I want them all nice, but that means treating pages as images. And I just need to deal with the fact that books are going to be treated as flows.

Re: doing my Dickens books as Epub and having them be just as good. You need to make sure that the text that wraps the images is the right text, and occasionally this means jiggering the pages and so forth. Finicky stuff that is done by hand and not worth standing in front of the Epub train shouting ‘halt!’ for. (Which is why, per the post, I’m not standing in front of the train, shouting ‘halt!’) I’m sure you are right that Epub already allows better quality than people are realizing with it, and it will only get better as we learn, and as Epub progresses. Your emphasis is just different from mine, I think.

65

John Holbo 12.23.11 at 3:15 am

Thanks for the review, Todd. Glad the Dickens looks good enough on the Kindle. I should have said in my original post that I don’t expect things like “Squid and Owl” and “Kerchief” to look good on anything less than an iPad. I guess what I’ve learned from this post is that nothing I am making this way will really look good on anything but an iPad, which is sort of a small pool to be paddling in, ultimately. How about the Nook? Anyone here got one?

Glad you appreciated the Faerie Queene PDF, Doug. (It was more work than it was worth, so I’m glad it’s worth something to someone besides myself.)

66

Meredith 12.23.11 at 7:13 am

http://vanishingnewyork.blogspot.com/
Search especially St. Marks Bookstore.

67

Martin Bento 12.23.11 at 8:59 am

John, I haven’t looked at all into what you’re dealing with, so perhaps this is not applicable, but just FYI, as a general matter it is not necessarily true that compressed images are lower quality in any sense. There are “lossless” and “lossy” forms of compression. The latter offer more compression, but the former produce an identical result.
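
For anyone who wants to see the “identical result” property for themselves, here is a minimal sketch using Python’s standard `zlib` module. Note the assumptions: `zlib` (DEFLATE) is a general-purpose lossless codec standing in for an image-specific one, and the “line art” is synthetic stand-in data, not a real scan.

```python
import zlib

# Synthetic grayscale "line art": white rows with a few black strokes.
# This is stand-in data for illustration, not a real scan.
WIDTH, HEIGHT = 600, 600
rows = []
for y in range(HEIGHT):
    row = bytearray([255]) * WIDTH      # a white row
    if y % 50 == 0:
        row[:] = bytes(WIDTH)           # an occasional solid black rule
    row[y] = 0                          # one pixel of a diagonal stroke
    rows.append(bytes(row))
raw = b"".join(rows)

packed = zlib.compress(raw, 9)          # lossless compression
restored = zlib.decompress(packed)      # exact inverse

print(len(raw), "bytes raw")
print(len(packed), "bytes compressed")
print("identical after round trip:", restored == raw)
```

The round trip gives back every byte unchanged, which is what “lossless” means. How much space it saves depends heavily on the image and on the codec.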

68

John Holbo 12.23.11 at 11:17 am

Yes that’s right Martin. But in this case it doesn’t help. You can compress a Photoshop doc, losslessly, and save a few percent space, but you can’t do that to an image embedded in a PDF. The only way to get any savings there is to reduce the dpi, which is lossy.

69

John Holbo 12.23.11 at 2:05 pm

In defense of cian’s point of view, the much much smaller (2.7 meg vs. 34 meg) book file looks only a smidge less good, so long as you don’t zoom the images. And some people are saying their readers choke on large PDF’s, so if I’m going to produce these boutique behemoth files, I really ought always to make a svelte version as well.

I really only made the 600 dpi one because the line art for the Chimes in particular has all these tiny details that are actually impossible to appreciate without a magnifying glass, on the page. It’s kind of nice to build a magnifying glass into the page, in effect. But I hereby show that I have rather idiosyncratic interests, and I should be careful not to try to project a path for the eBook industry out of them.

70

Matt McGrattan 12.23.11 at 3:56 pm

There are quite sophisticated PDF compression tools that will compress binarised text with JBIG2, do page segmenting, and then compress any images with jpeg2000. There are lots of ways to decrease the size of a PDF (in bits) without significantly degrading image quality, and which don’t involve resampling. There are, of course, still limits to the final file size, and people/users often have crazily unrealistic assumptions for how small multi-page documents with images/line-art can be made without serious degradation in quality.

Also, no-one scans books at 1200dpi. Industry standard used to be 600dpi (this is for archival scans from special collections type material), but current working practice in many institutions involves scans at lower resolution than that. 1200dpi may indeed be optimal for certain kinds of material, but no-one really does it because it’s higher resolution than would fit in most digitisation workflows, even workflows using extremely high-end kit.

I work in this area, fwiw, so this isn’t ex recto.

71

John Holbo 12.23.11 at 4:33 pm

“There are quite sophisticated PDF compression tools that will compress binarised text with JBIG2, do page segmenting, and then compress any images with jpeg2000. There are lots of ways to decrease the size of a PDF (in bits) without significantly degrading image quality, and which don’t involve resampling.”

I did not know that. So this is a slightly lossy way to do it? Now that I think about it, there is a setting on my InDesign control panel that has always puzzled me. A ‘compress’ box you can check, alongside all the options for downsampling. I once tried making a file with that box checked, then unchecked, and the files were exactly the same size. So I proceeded to ignore it. Possibly that was a mistake and if I had made some other adjustments, I could have gotten at least marginally better results.

“Also, no-one scans books at 1200dpi.”

True. But I might – and do – scan individual images or plates from books at that resolution. Engravings with lots of fine linework. (I’ve been scanning some Gustave Doré stuff, for example.) For most images it’s unnecessary, as you say. And for text it would be silly to go above 300. I guess, having just futzed and somewhat failed with the Spenser stuff – which was just not good enough – I am in the mood to urge erring on the side of large images. Obviously this is time-consuming. In my case it works because I park myself by the scanner and read something. If the scanner takes 4 minutes per image, I actually get some reading done. Looking up and turning the page every 4 minutes while you sit and read for several hours is a good way to get some reading done and not have your high-res scanning be an absurd waste of your time. But maybe I’m insane. It’s been known to happen.

72

John Holbo 12.23.11 at 4:35 pm

I have also always assumed there wasn’t much point trying to compress images in PDF because when I just plain try to compress a PSD file – using StuffIt or whatever – the results are meagre. You save only a tiny amount of space. So I just assumed PSD’s were effectively incompressible without loss.

73

Matt McGrattan 12.23.11 at 5:23 pm

Both JBIG2 and jpeg2000 can be run losslessly, so it’s possible to compress significantly with no loss. With jpeg2000, using lossless compression, you can end up with a file about 40% of the size of the uncompressed original. You can use lossy settings that are, in ordinary use, indistinguishable from lossless and make even bigger gains again.

File compression, such as stuffit, that isn’t tailored to images isn’t very efficient. JBIG2 and jpeg2000, on the other hand, are image compression methods, and very efficient. The PDF standard supports both, so it’s possible to wrap JBIG2 and jpeg2000 images in a PDF wrapper, to create much smaller PDFs that are still viewable without special tools. CCITT group 4 compression is also pretty good for scanned text, although not as efficient as JBIG2.

I don’t know about InDesign. But there are specialist tools for image compression. You can use something like ImageMagick (with jasper as jpeg2000 library) or Kakadu, or Luratech to do the jpeg2000 conversion on non-text images — line-art, illustrations, photographs, and so on — and there are similar tools for jbig2 compression of bitonal images (of text). There are commercial packages (LuraTech, for example, make one) that’ll wrap both jbig2 and jpeg2000 compression together on a single file. Some testing we did at work recently using one of these tools (and somewhat lossy compression settings) gave us a PDF that was about 20 – 25% of the size of the original (which was itself already using compressed images).

74

Matt McGrattan 12.23.11 at 5:38 pm

Also, even if you were going to use PDF as your delivery mechanism, there’s no reason why the page size issue should be a problem. You can generate PDFs on the fly from text and images, and have them optimised for whatever screen resolution or quality level you like.

75

todd. 12.23.11 at 6:22 pm

I would say the Dickens looks better than “good enough;” I wish I could pick that font to use all of the time.

“You can generate PDFs on the fly from text and images, and have them optimised for whatever screen resolution or quality level you like.”

Serious question: Is this feasible on a Kindle’s CPU, without totally draining the battery? I guess it might not be any more computationally intensive than rendering webpages in the (very meh) browser. But TeX always seems like it’s working harder than a browser.

76

Matt McGrattan 12.23.11 at 6:38 pm

re: 75

I meant on the fly at the server end. You download a book, it pops up some checkboxes for screen-size and quality, and the server generates the PDF on the fly, for your device.

That’s how some institutions that provide, for example, downloads of archive material such as old newspapers do it. They don’t store PDFs for everything, as only a few things will ever get requested, and instead they generate them on the fly at request time. If you add a bit of caching, for the more heavily used stuff, it’s quite fast.

If you were e-publishing a book, and wanted to offer PDF delivery — either instead of, or in addition to epub and other formats — you could have the server render the PDF from text + markup [in some form or other] and image files for illustrations at whatever resolution and quality level you wanted.
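A minimal sketch of the generate-on-request-plus-caching idea, in Python. Everything here is hypothetical: `render_pdf` is a placeholder that returns dummy bytes where a real server would typeset text and images into an actual PDF, and the book identifier, parameter names, and cache size are invented for illustration. The only part meant to be taken seriously is the shape: key the cache on the book plus the requested screen size and quality, and only render on a miss.

```python
from functools import lru_cache

# Counts how often the (expensive) render step actually runs.
RENDER_CALLS = 0


def render_pdf(book_id: str, width_px: int, height_px: int, quality: int) -> bytes:
    """Hypothetical stand-in for real PDF rendering; returns placeholder bytes."""
    global RENDER_CALLS
    RENDER_CALLS += 1
    label = f"PDF placeholder: {book_id} {width_px}x{height_px} q={quality}"
    return label.encode("utf-8")


@lru_cache(maxsize=128)  # assumed cache size; keeps the most recently used results
def get_pdf(book_id: str, width_px: int, height_px: int, quality: int) -> bytes:
    """Return the PDF for this book and device profile, rendering only on a cache miss."""
    return render_pdf(book_id, width_px, height_px, quality)


# Two readers with the same device settings: the second request is served from cache.
first = get_pdf("example-book", 768, 1024, 80)
second = get_pdf("example-book", 768, 1024, 80)

# A different screen size is a different cache key, so it triggers a fresh render.
other = get_pdf("example-book", 600, 800, 80)

print("render calls:", RENDER_CALLS)
print("cache stats: ", get_pdf.cache_info())
```

An in-process cache like this disappears when the server restarts; a real service would more likely cache rendered files on disk or behind a CDN, but the keying logic would be much the same.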

77

todd. 12.23.11 at 6:51 pm

That makes a lot of sense. I guess you lose “on the fly” font size selection, but for the majority of ebook usage that is probably fine.

78

John Holbo 12.24.11 at 3:36 am

“I would say the Dickens looks better than “good enough;” I wish I could pick that font to use all of the time.”

Thanks! Stempel Garamond. Germans imitating a French font that isn’t really French. And the kerning is a bit questionable. As a result it looks very respectably English, in my amateur opinion.

And thanks for the information, Matt.

79

Andreas Moser 12.26.11 at 9:13 am

E-books are NOT the future.
You’ll have to update your e-reader every few years, maximum.

I stick to paper and I can still read the books that my grandparents bought.
Print is king: http://andreasmoser.wordpress.com/2011/03/27/print-is-king/

80

Anne Frances 12.26.11 at 11:06 pm

George Iordanou @9: Ah, because I am reading this thread days later, you will probably not see this, but: If you buy books at Amazon for reading through the Kindle software, then any words you highlight are stored through Amazon: You go to your account, and then to your Kindle account pages, and — voila — for better or worse, for paranoia or ease — there are all your highlights, which you can copy and paste elsewhere or read online.
