Erik Schwiebert on localized versions of Microsoft Office for Mac OS X

Posted by Pierre Igot in: Macintosh, Microsoft
June 20th, 2006 • 10:59 am

MacBU developer Erik Schwiebert recently posted about localized versions of Microsoft Office for Mac OS X.

Of particular interest to me is this description of the “pseudo-localization” process that Microsoft uses to “test” Office applications before they are actually localized:

One of the ways we deal with that is a process called ‘pseudo-localization.’ This has nothing to do with ‘pseudo-code’; instead, it is a way of forcing text into some translation automatically, yet still have that text be mostly readable. It works by taking the normal Roman alphabet and changing each of the characters into some similar character, perhaps one with an accent, or a copyright symbol instead of a C. We also pad each string with extra text to make it wider to check for dialog mis-layout and string insertions.

So “pseudo-localization” might become “[=== Þšéü?ø-lõçålî?á?ïò? ===]” — still mostly humanly-readable, wider to force dialog layout, and bracketed so we can tell if a dev hardcoded string insertions. We can do this in an entirely automated fashion, and this technique lets us test perhaps 50% of Office as if it were localized, so that we can catch obvious dev mistakes right away. We thus reduce the risk of finding a bad localization bug so late in the release process that we can’t fix it safely.
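To make the technique concrete, here is a minimal Python sketch of the three steps Erik describes: swap characters for look-alikes, pad the string, and bracket it. The character table, the function names and the `===` padding width are my own assumptions for illustration; the post does not give the MacBU's actual mapping or tooling.

```python
# Hypothetical look-alike table; the real MacBU mapping is not published in the post.
LOOKALIKES = str.maketrans({
    "a": "å", "c": "ç", "d": "ð", "e": "é", "i": "î",
    "n": "ñ", "o": "õ", "p": "þ", "s": "š", "u": "ü", "z": "ž",
    "A": "Å", "C": "©", "E": "É", "O": "Ø", "P": "Þ", "S": "Š", "U": "Ü",
})


def pseudo_localize(text: str, pad: str = "===") -> str:
    """Swap letters for similar-looking ones, then pad and bracket the result."""
    return f"[{pad} {text.translate(LOOKALIKES)} {pad}]"


def fully_bracketed(displayed: str) -> bool:
    """True if the displayed text sits inside exactly one [ ... ] pair.

    Text outside the brackets suggests that code glued something onto a
    localizable string instead of using a proper string insertion.
    """
    return (
        displayed.startswith("[")
        and displayed.endswith("]")
        and displayed.count("[") == 1
        and displayed.count("]") == 1
    )


if __name__ == "__main__":
    print(pseudo_localize("Cancel"))
    # A hardcoded concatenation leaves text outside the brackets:
    print(fully_bracketed(pseudo_localize("Save as ") + "report.doc"))
```

A tester looking at a pseudo-localized build can then spot two kinds of problems at a glance: strings that are truncated because the dialog did not grow with the padding, and strings where something sits outside the brackets.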

The excuse for using this procedure instead of actually localizing the product first and then testing it properly is, apparently, the drastic compression of the delay between the release of the English version of Office and the localized versions of the product:

[…] over the last 10 years we’ve gone from a 6-month delay between English and localized releases to somewhere on the order of 2 weeks delay. This means that we’ve compressed the same amount of work into less time, so that the international Mac community gets their hands on the next version of Mac Office at almost the same time as English speakers do. That compression means that we don’t have much time to fix bugs in the main codebase that are only revealed by the localization process!

What Erik fails to explain, however, is the following.

First of all, Mac OS X has an entirely different way of handling localized versions of applications. Instead of releasing a different version of each software application for each language, Apple releases a single package that includes the code of the application and all the localizations.

Mac OS X itself is like this: When you buy Mac OS X, you automatically get the multilingual version, i.e. a version that already includes a large number of localizations. The only difference between the French version of Mac OS X and the English version of Mac OS X, for example, is the packaging (the text labels on the disc itself and the text on the packaging) and the printed materials. When you launch the Mac OS X installer on the disc, the very first thing it asks you is to select your language, with each supported language listed in its own language. So if you are French, you see a list that includes “English” and “Français” and you just select “Français” and then the remainder of the installation process is in French and the default language used throughout the system is French.

Even after you do this and install Mac OS X in French, you still have the option to switch to another language (provided that you didn’t do a customized installation of Mac OS X in which you removed the other additional languages that Mac OS X installs automatically by default). All you have to do to switch to another language is go to the “International” preference pane and change the order of the languages in the list of languages in the “Language” tab. After that, once you log out and log back in, your entire Mac OS X environment is in the new language you selected.

This also means that, if you have multiple users on the same machine, which of course Mac OS X fully supports, each user can have his own preferred language. You can use Mac OS X in English in your own user environment, and your wife can use the same Mac OS X in French in her own user environment. The only area of Mac OS X that is stuck in one language or the other is what lies beyond your own user realm, i.e. the main folder hierarchy at the root level on your startup volume, and the login window itself. Those areas use the language that you selected during the installation of Mac OS X itself. But everything else is flexible and switching from one language to another does not require any software installation or reinstallation.

This was all introduced more than five years ago.

Apple also extended this same multilingual architecture to software applications. When you get a software package such as iLife, for example, all the applications included in the package are themselves multilingual. Here again, the only difference between the French version of iLife and the English version is the text in the packaging and the printed materials. When you launch the installer to install the software, the installer automatically uses the language that you are currently using in Mac OS X, and the applications it installs will themselves automatically use the language of your system when you launch them. For example, if you install iTunes and the language in your user environment is English, then when you launch iTunes the user interface will be in English. But when your wife launches the same iTunes application in her own French environment, the user interface in iTunes will be in French.
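The behaviour described above can be modelled in a few lines of Python. This is only a simplified sketch of the idea (walk the user's ordered language list and take the first language the application provides), not Apple's actual implementation; the function name, the language labels and the sets of localizations are hypothetical.

```python
def pick_localization(preferred, available, fallback="English"):
    """Return the first language in the user's ordered list that the app ships."""
    for language in preferred:
        if language in available:
            return language
    return fallback


# Hypothetical data: one multilingual application, one single-language application.
MULTILINGUAL_APP = {"English", "Français", "Deutsch"}
ENGLISH_ONLY_APP = {"English"}

my_languages = ["English", "Français"]
her_languages = ["Français", "English"]

if __name__ == "__main__":
    # Same application, different users, different interface languages.
    print(pick_localization(my_languages, MULTILINGUAL_APP))
    print(pick_localization(her_languages, MULTILINGUAL_APP))
    # A single-language application ignores the user's first choice.
    print(pick_localization(her_languages, ENGLISH_ONLY_APP))
```

The point of the design is that the choice is made at launch time, per user, from a single installed copy of the application.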

Finally, the same thing applies to Apple’s software updates. When Apple releases a software update for Mac OS X itself or for one of its software applications, you don’t have to pay attention to the language of the software update itself. All software updates automatically include all localizations. In other words, even if you download the latest Mac OS X system update from the English-language Apple web site in the US, you still get a system update that includes all the localizations. You don’t have to download a different update if you are using Mac OS X in French. (This probably makes the software update files a bit larger than they would be if they only included one language, but in this day and age I think that it’s an acceptable trade-off.)

This same multilingual architecture can also be used by third-party developers. They too, if they want to be multilingual and support Mac users throughout the world, can release multilingual applications that automatically use the user’s preferred language when the user installs and launches them in his own user environment. And many third-party developers do indeed do just that.

But not Microsoft. Oh no, not Microsoft. Five years after Apple first introduced this new multilingual architecture with Mac OS X, Microsoft still does things the old-fashioned way. If you want to use Microsoft Word with a French interface, you have to buy the French version of Microsoft Office for Mac OS X, which is a different product altogether. And each time Microsoft releases a software update, you need to make sure that you are downloading the right one for your version of Office.

And once you get a version of Office in a specific language, be it English or another language, then the application always runs in that language, regardless of the language of the rest of the user environment. In other words, if you buy the English version of Office for your computer, and install it on your machine, the user interface will be in English both in your English user environment and in your wife’s French user environment. Conversely, if you buy the French version of Office, the user interface will be in French for all users of the machine, regardless of what their language of choice is.

This doesn’t mean that the software applications themselves are not multilingual. They are. Even if you install the English version of Word, you can still include in the installation the French dictionaries (they are not included by default, but you can include them with a customized installation), and you can still do word processing with French text and have access to all the French-language spell-checking and typographic features. But the user interface of the application itself will be in English, and many of the default behaviours will be the ones for English and will need to be changed. (Each user can have his/her own Word templates, of course, which means that you can configure Word so that the default language for Word documents is French in her user environment, and English in yours.)

The fact that the language of the user interface in Office applications does not change depending on the user’s preferred language in Mac OS X is annoying enough. But that’s not the worst of it. The worst of it is that Microsoft’s process for fixing bugs is completely screwed. And there is no better example of this than the bug with non-breaking spaces and PostScript fonts in Word 2004.

In a nutshell, what happens is that, if you are typing a Word document in a PostScript (Type 1) font, whenever you insert a non-breaking space in your document, Word changes the font to… Times New Roman!

It is extremely annoying, and makes the English version of Word 2004 completely unusable for typing French text in a PostScript font. Why? Because non-breaking spaces are an essential part of French typography! Non-breaking spaces might be rarely used in English typography (although they should probably be used more often, for example to avoid having “Mac OS” at the end of one line and “X” at the beginning of the next one in a paragraph of text, which does not look good and is not very reader-friendly), but they are used in French typography all the time, in combination with the colon, the exclamation mark, the question mark, the French quotation marks, etc.
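To show how pervasive these spaces are, here is a small Python sketch that applies the rule to a plain string: every ordinary space before a colon, exclamation mark, question mark or closing French quotation mark, and after an opening French quotation mark, becomes a non-breaking space. The function name is hypothetical, the semicolon is my own addition to the list, and the sketch uses U+00A0 everywhere even though careful typesetting often uses a narrower non-breaking space before some of these marks.

```python
import re

NBSP = "\u00a0"  # NO-BREAK SPACE


def french_spacing(text: str) -> str:
    """Turn ordinary spaces around French 'high' punctuation into non-breaking ones."""
    # Space(s) before : ; ! ? and before the closing quotation mark.
    text = re.sub(r" +([:;!?»])", NBSP + r"\1", text)
    # Space(s) after the opening quotation mark.
    text = re.sub(r"(«) +", r"\1" + NBSP, text)
    return text


if __name__ == "__main__":
    sample = "Attention : « Bonjour ! »"
    converted = french_spacing(sample)
    # Four non-breaking spaces in a four-word sentence.
    print(converted.count(NBSP))
```

Even a very short French sentence picks up several non-breaking spaces this way, which is why a bug triggered by that one character affects practically every paragraph.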

Needless to say, this means that the bug with non-breaking spaces and PostScript fonts in Word 2004 is an absolute disaster for French users of Word 2004. (This bug didn’t exist in Word X, and I suspect its appearance is related to changes made to the Word code to increase support for Unicode documents, which is, of course, a separate issue altogether.)

And yet it is such an obvious bug. How could anyone typing French text in Word 2004 using a PostScript font not notice it? Like I said, non-breaking spaces are used all the time in French. Indeed, even Word itself uses them automatically when you select the option to replace the straight quotation mark with the appropriate French quotation marks automatically when typing French text (which is an option that has been available for years in Word).

The fact that Microsoft did not catch that bug in Word 2004 before they released the product is a clear illustration of how flawed their testing processes are. Whatever benefits this “pseudo-localization” technique described by Erik Schwiebert provides, it is clearly not good enough to catch even such elementary bugs.

Now, it is quite possible that Microsoft did catch this bug during the process of testing the French version of Word 2004, and fixed it in the final French version of Word 2004 that was released back in 2004.

I do not know, because I bought the English version of Word 2004. I am a professional translator, and I work in both English and French all the time. Since the rest of my Mac OS X user interface is English, it makes no sense for me to buy the French version of Word.

What I do know, however, is that, two years and a dozen Office 2004 software updates later, Microsoft still has not fixed the bug in the English version of Word 2004! It is still there, and I am still completely unable to use PostScript fonts in Word documents on my computer.

Now what excuses does Erik Schwiebert have for this sorry state of affairs? I might find it acceptable (barely) that Microsoft is not able to catch such bugs in the hectic schedule that leads to the release of the initial versions of the product. But how can they justify not fixing the bugs in the next two years, even though the bugs are so obvious?

Oh, I am sure that, as a user of the English version of Word who actually does half of his typing in French, I am part of a minority. I probably don’t fit Microsoft’s profile of its “typical” Word user. But still… Does Microsoft really expect me to purchase a separate copy of the French version of Office 2004 just to see if, by any chance, the bug might have been fixed in that version—and then be forced to use a French-language interface in Word when the rest of my Mac OS X interface is in English?

This is, of course, utterly ridiculous.

We only have to look at Apple to see that it is perfectly possible to release multilingual software on schedule that automatically supports all of Mac OS X’s supported languages without requiring the user to buy separate versions for each language and download separate updates for each language. It is entirely Microsoft’s decision not to have embraced this multilingual approach and structure in their own products for Mac OS X. And it’s entirely their fault if obvious bugs related to Office’s multilingual features still aren’t addressed properly more than two years after the initial release of their product.

And for all this, once more, all I can say is: Shame on Microsoft, and shame on the MacBU. They have absolutely no excuses here.

8 Responses to “Erik Schwiebert on localized versions of Microsoft Office for Mac OS X”

  1. ssp says:

    Actually the names of the ‘main folder hierarchy’ (by which I assume you refer to the Users and Applications folders) are displayed localised as well. The actual names of the folders on the hard drive are just in English anyway (so that the stupid Unix underpinnings don’t break all the time I suppose) and just their names are localised in the UI according to your language preference.

    When dealing with MS, I wonder how large those ‘small’ groups of users are which require certain functions. With their complete market domination that should easily be many thousands if not millions.

  2. Pierre Igot says:

    I wasn’t sure about the folder name thing. I should have checked it. Indeed they are localized as well. I thought I remembered seeing them in English once even on a French system, but maybe my memory is playing tricks on me, or this was a problem in earlier versions of OS X that was addressed later on. Thanks for the clarification.

    I certainly feel that my own “bilingual” situation is not all that unusual, especially not outside the U.S. There are many European and Asian companies these days who use not just the local language but also English in business communications, etc. In that respect, Microsoft is still far too U.S.-centric—although they are hardly alone in that department.

  3. ssp says:

    I guess you could have ‘broken’ the localisation of the folder names by accidentally removing the .localized file inside them.

    Yes, multi-linguality is tough. While the situation with it is better today than it ever was (thanks to Cocoa’s relatively simple localisation techniques) it is still rather bleak. In particular as the technique of localising things is so simple now, we start seeing loads of rather bad localisations that make me want to use the English version instead.

  4. Warren Beck says:

    I think that it is extremely surprising that the non-breaking spaces bug with PostScript fonts is still not fixed. This is a fundamental flaw, yet apparently most people are not bothered at all by the situation because they only use the TrueType fonts that are provided either with the operating system by Apple or by Microsoft when one installs Office.

    I’ll admit that I no longer have a problem with this because I am more or less forced to use Times New Roman or Arial in my work (Chemistry) because the versions of these fonts that come with Office 2004 have Unicode tiers for Greek letters (even in italic and boldface) that most other typefaces (even Verdana) lack. I am not aware of any OpenType or oldstyle PostScript fonts with comparable features. Now, in the days of Word v.X, one was forced to obtain Greek letters for equations and math stuff in the main text from the so-called “symbol” fonts, such as Symbol and Adobe’s Universal Greek or Math Pi.

    So, Pierre, I guess that the MacBU would tell us that the “opportunity cost” of implementing proper support for OpenType and/or PostScript into Office 2004 at this point is just too high. I submit, however, that the MacBU didn’t even test for proper compatibility prior to the original shipment of Office 2004, and now they don’t care because, like it or not, only a few people actually use non-TrueType fonts.

  5. Pierre Igot says:

    Yes, of course, if one only uses TrueType fonts, the problem does not occur. But I don’t consider this an acceptable workaround. Many people work with a collection of fonts that they have accumulated over the years and have developed an attachment to certain fonts, no matter how “old” and obsolete the underlying technology of these fonts is and how incomplete their character set is. I, for example, have an invoice template that I designed a long time ago and that uses a particular PostScript font. I don’t really fancy changing the font of the template, simply because it’s part of my visual identity. So now I have to live with this stupid bug.

    I find it hard to believe that there aren’t still many people out there who want to continue using certain Type 1 fonts. I think the problem here is that it’s a combination of two factors: using Type 1 fonts and using the non-breaking space. Since the non-breaking space is mostly used by non-U.S. people, it makes the bug less important in the eyes of Microsoft.

    Again, I do not know whether this bug has been fixed in the French version of Word 2004. It’s quite possible that it has been fixed. But why they haven’t also fixed it in the English version is beyond me, especially after all this time.

    To be entirely fair, Apple’s own Pages also has a somewhat similar problem with certain Type 1 fonts. Only the symptoms in Pages are not as obvious, and they seem to only affect a subset of the category of Type 1 fonts (those that do not include a non-breaking space in their char set, I think). But it’s nearly as shameful for Apple as it is for Microsoft, especially since the symptoms have to do with line spacing and Apple has designed Pages as a page layout application.

    Given that the problem also affects Pages in some way, I wouldn’t be surprised if it were linked to an underlying cause in Mac OS X, and Microsoft just couldn’t be bothered to pressurize Apple to fix it.

  6. Schwieb » Blog Archive » Bugs stink! Yeah Yeah! says:

    […] Pierre Igot writes what I must politely describe as an impassioned discourse about my description of pseudo-localization, multi-lingual bundles in OS X, and MacBU testing. Pierre makes some good points, totally misses the idea of pseudo-loc in another, and generally castigates the MacBU for failing to fix a particular bug that is very important in French typography. He then invited me to comment. […]

  7. Schwieb says:

    Sorry to find you so frustrated, Pierre. I’ve posted a reply on my blog.

  8. Betalogue » Microsoft Word and non-breaking spaces: French Typography 101 says:

    […] his response to my post about the bug with non-breaking spaces and Postscript (Type 1) fonts in Word 2004 yesterday, Microsoft developer […]
