French punctuation in Microsoft Word: Rick Schaut chimes in

Posted by Pierre Igot in: Macintosh, Microsoft
June 25th, 2006 • 3:42 pm

This is really becoming quite extraordinary. Regarding the bug with Postscript fonts and the non-breaking space in Word 2004, MacBU developer Erik Schwiebert has now openly admitted that not fixing the bug was a mistake on their part:

The sad thing about this particular bug is that it falls into the category of ‘oops.’ The code was wrong in a very small way (an exception was made for the breaking-space character and accidentally forgotten for the non-breaking space character.) We actually found this bug in-house just a little while after Office 2004 shipped, and fixed it internally for the next major release of Office in August 2004. We even noted that it was important in the French market. Somehow, though, it was never flagged as a bug that should be migrated back to Office 2004. The tester for that area of Word (who found and opened the bug) left MacBU to go back to graduate school during that summer, and I think we probably just missed this one in the transition. A simple human mistake.

Regardless of whether you think that the excuse is valid here or not, you have to agree that it is a pretty clear admission that the bug was indeed identified and that Microsoft meant to fix it, but somehow managed to “forget” about it during the bug fixing process—which is why it’s still there in Word 2004 nearly two years later.

Unfortunately, fellow MacBU developer Rick Schaut, who appears to be on a mission to denigrate me personally (no matter how often he says he’d much rather spend time actually fixing bugs, he sure spends substantial time making sure that everyone knows how the MacBU feels about me and my reports on Microsoft products), decided that he could not leave the issue alone and had to chime in with his own comments.

And that’s where things really get out of hand. I will ignore the personal attacks against me, whether in Latin or in plain English, because I am simply not interested, and I think that ultimately they are a reflection of Rick Schaut’s true character more than anything else. But I cannot let this slip unnoticed:

It is, in fact, not possible to do proper French punctuation on any computer, and particularly with any Type 1 font, by using non-breaking spaces.  In a number of instances, the non-breaking space is too wide.  Most French users of Word don’t use non-breaking spaces at all.  They use advance field widths to get the proper spacing.  Fields don’t cause line breaks, because there is no character that can form a line break.  Fields merely change the width of the inter-character spacing.

(This is something that Rick writes in a comment on his own blog post, where he tries to explain to a fellow named Doug, who has the misfortune of trying to defend my views as expressed on my blog, just how wrong I am in his eyes.)

Now, I don’t know what kind of French-speaking Word users Rick Schaut happens to be acquainted with, but I wouldn’t be surprised to hear that they are green-skinned with big jug ears and antennas on their skulls. Because back on this planet, I sure have never even heard of a single French-speaking Word user who actually uses “advance field widths to get the proper spacing” in lieu of the good old non-breaking space character.

It really does boggle the mind to think that somebody working full time on Microsoft Office for the Mac can have such a warped view of the real world in which us plain folks happen to try and do our humble jobs. Even if it is actually possible to reproduce traditional French typographic rules more accurately using “advance field widths” instead of the non-breaking space, does Rick Schaut really believe that any ordinary user would want to use or indeed be capable of using such “advance field widths”? (Note that Rick Schaut doesn’t provide any specific examples illustrating his point.)

And does Rick Schaut not realize, as I took pains to explain in detail in my last post on the subject, that his own software, i.e. Microsoft Word, actually automatically uses the non-breaking space when you are typing in French or in Canadian French?

What is Rick Schaut telling us here? That the default behaviour in Word is wrong and that it should not be used? Even though it is an option that has been turned on by default in Microsoft Word for the past ten years or so?

It really does boggle the mind to think that Microsoft’s own developers don’t seem to be aware of their own application’s features.

Yes, it is true, as I pointed out in my own blog post about French typography earlier this week, that the non-breaking space is an imperfect solution. In many circumstances, the width of the non-breaking space is far too great compared to the width of the space used traditionally in French typography with French punctuation marks in the pre-computer days. I did mention the thin space in my own blog post. There is also the en space. (In many fonts, the non-breaking space has the same width as an em space, i.e. as the m letter in the font.)

To try and justify his position, Rick Schaut refers to a page about HTML authoring in French, which is apparently as far as his own research goes when it comes to French typography. Let’s ignore for now that we are talking about word processing, and not HTML authoring. Even on this page that Rick Schaut refers to, the author actually points out that, since Word 97, Microsoft Word has had this feature that automatically inserts the non-breaking space when typing in French!

In Microsoft Word 97 the non-breaking space U+00A0 is automatically inserted when the French language is selected and a guillemet is typed. Some French typographers prefer to use a non-breaking thin space (espace fine insécables) with the guillemets.

(The author does not specify how those French typographers manage to use the non-breaking thin space instead. One thing is for sure: They are not using Microsoft Word.)

The author also argues that “[t]he terms and rules for spacing in French ortography [sic] are somewhat confusing and mixed.” (They appear to be so confusing that he can’t spell orthography properly.) But the rules are not that confusing. There might be some hesitation about the use of the en space vs. the thin space. But other than that, the rules are pretty clear. The only confusion comes from the situation that Microsoft itself introduced by making the use of the non-breaking space in French automatic since Word 97.

Because of this, many people think that the non-breaking space is the right type of space to use in French typography. It is not. But unfortunately, thanks in no small part to the developing skills of Microsoft engineers past and present, that is the situation we are in today. The non-breaking space is the space that is used on computers today to try and mimic traditional French typography. It’s not ideal, but it certainly is better than no space at all, and it certainly is the only workable and user-friendly alternative that software engineers have been able to come up with.
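To make concrete what this kind of automatic substitution amounts to, here is a minimal Python sketch. It is not Word's actual implementation, which I have no way of inspecting; the function name french_spacing and the exact list of punctuation marks are assumptions chosen for illustration. It simply turns the ordinary space typed before a "high" punctuation mark, or just inside guillemets, into the non-breaking space U+00A0.

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE, the character Word inserts in French text

def french_spacing(text: str) -> str:
    """Hypothetical helper: swap ordinary spaces for NBSP around French punctuation.

    Only spaces that are already present are converted; nothing is inserted
    where the writer typed no space at all.
    """
    # Ordinary space before ; : ! ? or a closing guillemet becomes NBSP.
    text = re.sub(r" ([;:!?»])", NBSP + r"\1", text)
    # Ordinary space after an opening guillemet becomes NBSP.
    text = re.sub(r"« ", "«" + NBSP, text)
    return text

if __name__ == "__main__":
    sample = "Vraiment ? Oui : « bonjour » !"
    # repr() makes the otherwise invisible \xa0 characters visible.
    print(repr(french_spacing(sample)))
```

The sketch deliberately ignores the width question discussed above: it only makes the space unbreakable, which is exactly the compromise that the non-breaking space represents.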

By stating that “most French users of Word don’t use non-breaking spaces at all” and that they “use advance field widths to get the proper spacing,” Rick Schaut basically demonstrates his complete ignorance of the issue and of the way his own software works, as well as the typical approach of an engineer who’s completely disconnected from the reality of day-to-day computing. (Does he seriously think that most French-speaking Word users use “advance field widths to get the proper spacing”? Does he honestly think that most French users would even be able to do such a thing? This is utterly unbelievable.)

Like I pointed out in my post about the bug with Postscript fonts in Word 2004, it already is hard enough to get Microsoft developers to make sure that the non-breaking space itself remains usable in Word. As it stands, today, in Word 2004, when you type in a Postscript font, the non-breaking space is unusable, and French-speaking Word users are pretty much forced to use TrueType fonts if they want to retain their sanity.

Instead of admitting this plain and simple fact and actually doing something about it, Rick Schaut appears to be more interested in character assassination and flaunting his own utter ignorance of the specifics of the matters at hand. It really is beyond ridiculous.

I would also like to point out that Rick Schaut is not even professional and honest enough to provide a direct link to the “go-around” that he refers to in the first paragraph of his blog post, and instead provides a link to Drunken Batman’s typically illegible take on the topic, in which that blog’s writer can’t even spell my last name right and provides no evidence that he wouldn’t be the runaway winner himself in any “asshat” competition.

And I would also like to point out that Rick Schaut has only ever given me any credit once, and that was about the infamous “Disk is full” bug, only when it finally became obvious that they couldn’t go on denying the persistence of this embarrassing bug. And even then, he only gave me credit quite reluctantly—I quote, “after some prodding on my part,” as if I was the one who had made it difficult for them to identify the bug! I want to stress that this is a bug that affected Word users in very destructive ways for several years and I wrote extensively about the bug and about what could be done to work around it without losing precious hours of work—which the bug could indeed easily destroy—long before Rick Schaut finally did take the bug seriously enough.

I will not stoop so low as to throw French insults in Rick Schaut’s general direction, even though he fully deserves them. I will simply invite people to examine the facts, and make up their own minds about who is being utterly and irresponsibly “egonorant” here.


12 Responses to “French punctuation in Microsoft Word: Rick Schaut chimes in”

  1. Warren Beck says:

    Pierre:

    Le client a toujours raison. What else can I say.

  2. Pierre Igot says:

    Unfortunately the software industry as a whole has obviously invented its own rules, which have very little to do with customer satisfaction. And we as ordinary users are ultimately powerless against this situation. As long as computer users in general continue to endure this level of abuse without reacting, nothing will change.

  3. Warren Beck says:

    Keep up the good work, Pierre.

  4. ssp says:

    I’ll just ignore the Microsoft issues as I don’t care about the quality of their software.

    But I had a look around Unicode, and there only seems to be a ‘thin space’ (U+2009) and lines seem to be broken at that one. I couldn’t find a non-breaking thin space in there, how are the ‘proper’ French spacings stored in Unicode text then?

  5. Pierre Igot says:

    That’s the thing, ssp. There doesn’t appear to be a reliable way to achieve “proper” French spacings everywhere. I think that what happens is that each application relies on its own scheme to make such spaces unbreakable. For example, if I insert the Unicode thin space in a text document in TextEdit, it’s not treated as a non-breaking thin space. But if I insert the same Unicode thin space in a Pages document, it is treated as a non-breaking thin space. Since there is only one thin space in Unicode (there is no separate character for the non-breaking thin space), I guess we’ll be stuck with this situation for the foreseeable future.

    Adobe InDesign also supports various types of white space, including the thin space. I am not entirely sure it’s the same Unicode character, but it looks like it is. Again, Adobe InDesign does not allow line breaks next to a thin space.

    While there is no built-in support for the thin space in Word 2004, I am able to insert thin spaces in Word documents using Mac OS X’s Character Palette, and they are treated as non-breaking spaces there as well. But who knows what happens when you try and share a Word document containing such a thin space with a Word for Windows user… I certainly wouldn’t trust Microsoft to have ensured that this works OK.

    Interestingly, I am also able to use the thin space as a non-breaking space in Mail messages (Mail 2.0). But again, who knows what would happen to such spaces on the recipient’s computer.

    So the bottom line is that we seem to have some support for proper French punctuation and spacing rules (even though they can only be achieved manually), but we have no guarantees regarding what happens to texts using the appropriate spacing rules when they are shared with other people using other computers.

    One day, maybe, software engineers will take international typography seriously enough and design file formats and tools that make it easy to achieve great results and guarantee that the results are preserved when sharing documents with other people.

    But for now, we have a lead MacBU developer who still thinks that “most French users do not use non-breaking spaces,” so we clearly have a looong way to go.

  6. ssp says:

    Thanks for the info Pierre. I’m quite interested in Unicode but haven’t really thought about spacing in this context as I’m not writing in French.

    Should Unicode (by its own ideas about what it should do) include such special spacing characters or is this supposed to be handled by applications? Can you point me to a discussion of the technical side of this issue perhaps?

  7. Pierre Igot says:

    The problem is that language-specific issues tend to be discussed in that particular language :). This page is a good recap (in French) of the issues. The author concedes that, with the present situation, the only realistic options are the regular space and the non-breaking space. But even then, he fails to mention the problem with the colon, which normally requires a flexible space both before and after. The regular space is flexible, but breakable, whereas the non-breaking space is unbreakable, but not flexible. So even there you have to stray away from traditional French typography by using the non-breaking space, which makes the space between the word and the following colon not flexible (which it should be).

    The problem is that this particular problem (which is pretty fundamental) cannot be solved by a single character, whether it’s Unicode or another set. What we would need here is a flexible non-breaking space, i.e. a non-breaking space whose width is adjusted automatically by the algorithm that calculates the space between words in the line (in justified text, of course; in left-aligned text, this is not a problem, and the space before the colon should probably just be a non-breakable en space).

    Even if Unicode did include more characters, such as a non-breaking thin space, a non-breaking en space, etc. we would still have the problem with the fact that all these characters have a fixed width. So I suspect that, at this point in time, and in the foreseeable future, such issues will continue to be handled at the application level, in (unfortunately) application-specific ways.

    What we really need is a system-wide standard for calculating the width of the non-breaking space in the same way that the width of the regular space is calculated.

    Or alternatively, just use left-aligned text exclusively :).

  8. ssp says:

    Thanks for the link. That was interesting.

    I hadn’t found the U+202F character before (because I was looking for ‘non-break’ rather than ‘no-break’) and it looks like it should be the right thing from a technical point of view. Practically it’s not very useful of course because no more than four fonts on my system (three actually, as the space seems to be zero-width in Lucida Grande) contain this glyph.

    I think that the width computation needed by applications that do their own line-breaking is an issue that doesn’t need to be treated by Unicode or the fonts. As long as Unicode can encode that character (and thus the properties of the space) and the font can give a ‘suggested’ width for the space, it should be the responsibility of the type engine to ensure all the spacing works out correctly.

    And for left-aligned text: Yes, that does make things easier. And with the poor justification and hyphenation efforts made by many applications (which include a certain Office suite as well as my web browser), it shouldn’t be used in many situations. What about turning it off on these pages for better legibility? (For some reason the WordPress geeks made this bad decision and there has been a bad justification creep on the web since their software started being popular…)

  9. ssp says:

    Uh, and do I really need to mention that the situation in the stone-age arena of computing known as LaTeX looks much better?

  10. Pierre Igot says:

    Yeah, I suppose I should probably turn it off—although I don’t find it shockingly bad in Safari. What browser are you using?

  11. Pierre Igot says:

    Yes, I am aware of the LaTeX capabilities. The problem, as far as I know, is that LaTeX documents cannot really be shared in an office/work group type of situation and that the LaTeX solutions available are not exactly user-friendly. But they are used quite extensively in the academic world to produce printed documents with high-quality typography. It’s just not “for the rest of us.”

  12. ssp says:

    I’m using Safari and of course it’s not horrible on every line. But there will be bad lines on every page. Particularly as there are no hyphenation capabilities. Basically it looks – quite adequately perhaps on this page ;) – like many Word documents look, trying to be nice and using justified text, but being uninformed/incapable and not using hyphenation.

    I know we have had that discussion before and that you won’t be able to get an inane managerial type to use TeX, but the fact remains that it provides better compatibility than most other formats and that – for standard work – it’s not that hard to use. Not harder than graphical programs I’d say if you take into account all those little ‘tricks’ people have to use to ensure Word (or whatever other program) doesn’t eat their documents in the process of editing.
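
For readers who want to check the three space characters discussed in the comments above (U+00A0, U+2009 and U+202F), the following is a small Python sketch using the standard unicodedata module. The helper names describe and is_no_break are made up for this example. The "<noBreak>" tag it looks at is only what the Unicode character database records for a character; as the comments point out, whether a given application actually breaks a line at a thin space is decided by that application.

```python
import unicodedata

# The three spaces mentioned in the discussion.
SPACES = ["\u00a0", "\u2009", "\u202f"]

def describe(ch: str) -> tuple[str, str, str]:
    """Return (code point, official Unicode name, decomposition mapping)."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.decomposition(ch))

def is_no_break(ch: str) -> bool:
    """True if the character database tags the character as a no-break variant."""
    return unicodedata.decomposition(ch).startswith("<noBreak>")

if __name__ == "__main__":
    for ch in SPACES:
        code, name, decomposition = describe(ch)
        print(code, name, decomposition, "no-break" if is_no_break(ch) else "breakable")
```

Run as a script, this lists U+00A0 and U+202F as no-break variants of the ordinary space, and U+2009 as a plain compatibility variant, which matches the observation that a separate narrow no-break space does exist but that the thin space itself carries no such property.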
