Thursday, October 29, 2009

Not what they get

Recently, I had a discussion with someone where I related how a single Word doc printed differently across different versions of Microsoft Word. The person I was talking to responded (quite seriously) that WYSIWYG means "What You See Is What You Get", not "What They Get", that Word actually renders the document on screen based on the capabilities of the default printer on that computer, so that you should expect the same document to print differently on different computer+printer configurations.

Whoa, it is really sad that it's not only accepted, but expected behavior for a word processing document to look vastly different (note my example: wrapping text around a table, page breaks, etc.) depending on the printer Windows was using. I might understand if the text were rendered a little differently due to fonts (installed on the printer) being slightly different from the fonts Windows is using. But I find it hard to believe that text flowing around a table should be any different on one computer+printer vs another computer+printer. If that's really how Windows works, I think I'm even less of a fan.

And yet, Microsoft makes a big deal that if you run Microsoft Office, you will be able to share your documents with others running Office. Apple makes a point of that too in some of their ads. The Microsoft ad copy on Apple's Online Store says:
The latest version of the industry standard for productivity software on the Macintosh platform. Microsoft® Office 2008 for Mac is more powerful and easier to use. Office 2008 combines Microsoft Word for Mac, Microsoft PowerPoint® for Mac, Microsoft Excel® for Mac, Microsoft Entourage® for Mac, and Microsoft Messenger for Mac and lets you easily create high-impact documents and seamlessly share your ideas with others, whether they are on the Mac or Windows® platform.
(Emphasis mine.)

And yet, if you cannot guarantee that your document on a Mac (in my example, at least one person printed their copy of the doc on a Mac) will look the same as on Windows, how is that seamless???

On the other hand, Linux/Unix systems expect the application to generate a PostScript document, which is then sent to the print driver, and it's the driver's job to turn that into a printed page on the specific printer. A document should look the same printed on a laser printer vs an inkjet printer. Isn't that the whole point of a printer driver? I think the Linux/Unix method makes more sense.


  1. "WYSIWYG" has always been an approximation on every platform regardless of the display and print rendering design. One possible exception is the original Macintosh and its ImageWriter printer.

    I encounter documents that do not print as they appear on screen whether it's on Macintosh, Linux, Solaris or Windows, and regardless of rendering language, including PostScript.

    When the document is a test print, I have to determine why it doesn't print correctly. It almost always comes down to truncation error and a boundary condition in either:
    a. the transformation of the document by the application into the print rendering language, by far the most common error, or
    b. the rendering of the image into the print bitmap, a problem that occurs either in the driver or in the printer.
    Often, just changing the margins or a cell dimension to a slightly different value is enough to fix the problem even for documents printed on different platforms.
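
    A minimal sketch of that kind of boundary condition, assuming purely for illustration that sizes are stored in thousandths of an inch, that text width is rounded up and cell width truncated when converted to device units. The function names and the numbers are hypothetical and not taken from any real application or driver.

```python
def to_device_floor(mils, dpi):
    """Convert thousandths of an inch to device units, truncating."""
    return (mils * dpi) // 1000


def to_device_ceil(mils, dpi):
    """Convert thousandths of an inch to device units, rounding up."""
    return -((-mils * dpi) // 1000)


def text_fits(text_mils, cell_mils, dpi):
    """True if the text still fits in the cell after conversion to device units."""
    return to_device_ceil(text_mils, dpi) <= to_device_floor(cell_mils, dpi)


def fit_report(text_mils, cell_mils, resolutions=(72, 96, 300, 600)):
    """Map each resolution to whether the text fits on one line in the cell."""
    return {dpi: text_fits(text_mils, cell_mils, dpi) for dpi in resolutions}


# Text 2.126 in wide in a cell 2.130 in wide: a near-boundary case.
tight = fit_report(2126, 2130)
# Same text after nudging the cell width to 2.150 in.
nudged = fit_report(2126, 2150)
print(tight)
print(nudged)
```

    With these numbers the text overflows the cell at 72 and 96 dpi but fits at 300 and 600 dpi, and widening the cell by two hundredths of an inch makes it fit at all four resolutions.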

    If rendering to the display and to the printed media are both driven by the same language, then one has the highest probability of WYSIWYG. That's why the document displayed by a PDF reader almost always matches the printed document. However, it's still possible for some PDF documents to be rendered poorly (downright fuzzily) on screen. I've yet to ascertain why that occurs but I do know that turning off anti-aliasing often resolves the issue for black and white documents.

    Compatibility of Office documents across the differing versions of Microsoft Office and platforms is a completely separate issue. My comments here apply only to differing versions of Microsoft Office.

    Saving an Office document to an older format does not make the document share the same print or screen appearance with the version saving the document. All that the save does is remove features that would otherwise prevent the older version from opening the document.

    To maximize backward compatibility with other Office versions, it is necessary to set the Compatibility setting in Microsoft Office to the values that correspond to the older versions, then create the document, and save it in the old format.

    Even then, print-time options that are controlled by the user and not by the document, i.e. not stored in the document and restored, can cause dramatically different prints. One of the more common and least expected settings is to print "hidden" text. Even if the document author has not marked any text as hidden, Microsoft Word does. Displaying the hidden text will alter how tables and text are presented.

    But explaining some of the reasons why Microsoft Office documents don't print the same across versions and platforms, and how to prevent it, doesn't excuse the problem. The moral of the story is that it doesn't matter on which platform you create a document: if you want print portability, you'll need to convert it to PDF, and you'll need to check the conversion either with a PDF reader or, better yet, by printing to a consumer printer.

  2. If that's really how Windows works, I think I'm even less of a fan.

    It's not Windows's fault; Word is doing extra work to be stupid. (I'm not sure if that's better or worse. :-))

    It's been many years since I've read about programming printing on Windows, so it's possible I'm flat out wrong about this or it's changed, but I'm 99% sure that at least one common way of doing printing is to use the exact same GDI calls programs use to draw to the screen. GDI is the graphics... device... interface? Whatever; it's the API that programs use to draw stuff to the screen. It's got about what you'd expect from any drawing API; you can draw lines given the coordinates of the endpoints, you can draw circles given a center and diameter (or maybe the bounding box), and you can draw text given the coordinate where you want it to start and information about the font.

    These exact same primitives are what you use to draw to the printer. All you need to do to print what you have drawn to the screen is obtain a GDI handle to an object representing a printer and pass it to your normal drawing function. (Maybe to be useful the drawing function has to be a little more aware of the coordinates it's using so it doesn't come out tiny or something like that, but setting up a coordinate transformation first probably takes care of most of that.)
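
    To make the idea concrete, here is a toy model in Python. It is not the real GDI API: `RecordingDevice` and `draw_page` are hypothetical names, and the "device" just records calls after scaling page coordinates (in points, 72 per inch) to its own resolution. The point is only that one drawing function can serve both the screen and the printer.

```python
class RecordingDevice:
    """Toy stand-in for a device context; records calls in device units."""

    def __init__(self, name, dpi):
        self.name = name
        self.dpi = dpi
        self.calls = []

    def _to_device(self, points):
        # Coordinate transformation: page points (72 per inch) to device units.
        return round(points * self.dpi / 72)

    def line(self, x1, y1, x2, y2):
        coords = tuple(self._to_device(v) for v in (x1, y1, x2, y2))
        self.calls.append(("line",) + coords)

    def text(self, x, y, string):
        self.calls.append(("text", self._to_device(x), self._to_device(y), string))


def draw_page(device):
    """The one drawing routine, written once in page coordinates (points)."""
    device.line(72, 72, 540, 72)   # a rule one inch from the top and left edges
    device.text(72, 90, "Hello")   # a line of text just below the rule


screen = RecordingDevice("screen", dpi=96)
printer = RecordingDevice("printer", dpi=600)
for device in (screen, printer):
    draw_page(device)
    print(device.name, device.calls)
```

    The same `draw_page` call yields the same picture on both devices, just at different scales, which is why nothing at this level forces the layout to change from one printer to another.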

    If you want to print what you have on the screen this is way easier than figuring out how to generate the corresponding PostScript; if you want to draw something substantially different, it's probably no harder in general.

    (Note that I'm not saying Linux doesn't work this way; for all I know Qt and GTK make printing stuff you draw to the screen as seamless as Windows does. I don't really care for purposes of this discussion; I'm just explaining what Windows does.)

    However, what GDI doesn't do is any of the fancy layout stuff that you need to actually lay out a document. I'm not even entirely sure it'll do word wrapping at a certain width, though it might. All that's done by Word.

    Basically what is going on is -- for some reason -- Word is specifically querying properties of the printer and changing its layout accordingly. As I said before, it's doing extra work to be dumb.
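
    A rough sketch of how printer-dependent layout could come about. This is an assumption for illustration, not a description of what Word actually does internally: suppose character widths are rounded to whole device units before line breaking. The names `to_device` and `wrap` and all the numbers below are hypothetical.

```python
def to_device(tenths_pt, dpi):
    """Round a length in tenths of a point to whole device units."""
    return (tenths_pt * dpi + 360) // 720


def wrap(words, line_tenths, char_tenths, dpi):
    """Greedy word wrap using widths rounded to the device's resolution."""
    char_w = to_device(char_tenths, dpi)
    line_w = to_device(line_tenths, dpi)
    lines, current, used = [], [], 0
    for word in words:
        word_w = len(word) * char_w
        # Width needed if this word joins the current line (one space between).
        needed = word_w if not current else used + char_w + word_w
        if current and needed > line_w:
            lines.append(" ".join(current))
            current, used = [word], word_w
        else:
            current.append(word)
            used = needed
    if current:
        lines.append(" ".join(current))
    return lines


SAMPLE_WORDS = ["lorem"] * 200
# 6.5 inch line (468 pt) and a fixed 5.3 pt character width, in tenths of a point.
for dpi in (72, 96, 300, 600):
    lines = wrap(SAMPLE_WORDS, 4680, 53, dpi)
    print(dpi, "dpi:", len(lines), "lines,", len(lines[0].split()), "words on line 1")
```

    In this toy model the 5.3 pt character rounds to 5 units at 72 dpi but to 22 units at 300 dpi (about 5.28 pt), so each line holds one word fewer and the paragraph grows by a line. Small metric differences like that are enough to move a page break.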

    (I've seen this topic come up a couple times before, e.g. on Slashdot. I don't recall seeing any particularly good explanations for why Word does what it does.)

  3. BTW JH, I pretty much agree with you (and disagree with BillR) about the reasonableness of Word doing this.

    As BillR admits, PDFs virtually always come out the same; I disagree that there's something about PDFs vs. a word processor document that means that significant stuff like where line and page breaks happen should vary depending on the printer.

    (I also kinda wonder how susceptible DOCX is to this sort of problem relative to DOC. I sort of hope that MS took the opportunity to start with a mostly-clean slate, but I've not heard of anything to indicate whether or not this is true.)

  4. Bah... and sorry to spam, but I should add a disclaimer that I don't have firsthand knowledge of what exactly Word is doing; I'm just trying to infer based on what I know about the API and Word's behavior. I could be wrong about just about anything I say there.

  5. evaned:
    > I also kinda wonder how susceptible DOCX is to this sort of problem relative to DOC. I sort of hope that MS took the opportunity to start with a mostly-clean slate

    It's not like Office is exactly standard, anyway. Even if MS did start with a clean slate with DOCX, they keep changing it so who knows how it's going to end up.

  6. But that's just the problem. MS Office is the de facto standard. So few people use other office software that it's assumed everyone has a copy of Microsoft Office. In high school, when I told a teacher I didn't have a copy of MS Office on my computer (I didn't know about OpenOffice at the time), he told me to pirate it.

  7. @some guy:

    That's... somewhat of a red herring. Word would be better software if it gave consistent layouts than if it didn't, all other things being equal. This is true whether or not the format is a moving target for others.

    I like better software.

  8. So someone was expecting a psychic version of MS Word which knows anything and everything about your printer and miraculously compensates? Ahahahaha!

    It's surprising that this is still an issue. I had that problem back in 1998 and it's one big reason why I dumped all MS software in favor of free software. The university's UNIX (Solaris) servers printed things just fine and even Linux printed things fine (once you got the printer set up), but the MS print routines screwed up *most* of the time. I couldn't afford a Sun server back then so I was stuck with Linux + other free software. (The Debian Linux repository just looked so much better than the BSD offerings.)

  9. I'm trying to remember back to my digital pre-press days here... I recall being surprised when opening a Word document on Windows and having the whole thing reflow when I chose a different printer. I recall that not happening in Word for Mac. I think the difference was in document defaults versus printer defaults, though it may have been more closely related to PostScript versus whatever that-crap-is-that-Windows-uses-to-print.

  10. Let's not even talk about Excel, where autosizing the cell to fit the text does no such thing, either for viewing on the screen or printing on the page.

  11. Try joint collaboration with an English colleague using Word (I'm American). As soon as your colleague pastes a portion of her document into yours, all is lost. The pasted portion carries over properties from the file, and these gradually metastasize throughout the recipient organization's entire hard drive. Pretty soon people find themselves opening documents and discovering that they're formatted for A4, or their spell checker mysteriously requests extra vowels in the middle of words. WYSIW (the last W stands for "weird").

