Word 2004: Very poor, non-standard handling of ‘.doc’ file extensions

Posted by Pierre Igot in: Microsoft, Pages
April 11th, 2007 • 6:21 pm

Microsoft is, quite obviously, the main reason why we Mac OS X users are forced to use file extensions such as “.doc” and “.gif” in our file names in the first place. In pre-Mac OS X days, file extensions were only required when you had to exchange files with Windows users. In Mac OS X, Apple introduced the mandatory use of file extensions for a number of its own file formats, and, in spite of several advances that might, eventually, one day, lead to a file-extension-free world for all computer users, for the foreseeable future we are stuck with them, whether we like it or not.

In light of this, the fact that file extensions are handled differently from application to application is rather frustrating, especially since one of the main culprits when it comes to refusing to handle file extensions properly (i.e. use the now standard Mac OS X way) is none other than… Microsoft itself (in its Mac OS X applications).

For the remainder of this post, I will assume that you are using your Mac OS X environment with the option to “Show all file extensions” checked in the Finder’s preferences. If you aren’t, you are obviously able to handle much more file extension insanity than I am. (If the option is not checked, Mac OS X attempts to hide file extensions from the user, but of course it’s not really a successful approach, because the user is still forced to deal with them explicitly in a variety of situations. The end result is an infuriating mess of now-you-see-them, now-you-don’t foolishness.)

Unfortunately, the very fact that “Show all file extensions” is an option in the Finder’s preferences and not in the system-wide System Preferences application tells its own story. It is a setting that only really applies to the Finder as a Mac OS X application. Whenever you are in another application (say, Apple’s Pages, or Microsoft Word 2004), you still have to deal with that particular application’s approach to handling file extensions, regardless of what your Finder preference is.

In most Mac OS X applications, including Pages, the file extension issue is handled right within Open and Save dialog boxes. When you are in Pages, for example, and you want to save a newly created document, the “Save As” dialog box includes a check box labelled “Hide Extension” (in the bottom-left corner):

Hide Extension check box

Frustratingly, even if you have elected to “Show all file extensions” in the Finder, the “Hide Extension” box is still checked by default the first time you use Pages for this (and the same is true in most other Mac OS X applications).

Fortunately, once you’ve unchecked the “Hide Extension” option, Pages remembers that your preference is not to hide the file extension, and from then on it makes the extension visible as “.pages” at the end of your file name in the “Save As:” file name field at the top of the dialog box:

Pages document file name with extension

The problem with file name extensions, however, is that, well, they are part of the file name. This means that, while you are editing the file name itself, you can easily accidentally edit or delete the file name extension as well.

And the quality of the software you are using will be reflected in how gracefully it handles such a situation. Here is how a typical Mac OS X application like Pages handles this situation: As soon as I delete or otherwise edit the “.pages” portion of the file name, Pages automatically checks the “Hide Extension” box.

What does this mean? This means that, as soon as I have indicated to Pages, through my stupidity and carelessness, that I don’t realize that “.pages” is actually not part of the file name and should not be edited, Pages reverts to its “Mac OS X for Dummies” mode and hides the “.pages” extension from view. If I now try to save my document with a file name that does not contain “.pages,” Pages automatically appends the file name extension to the file name I entered, so that when I look at the file in the Finder, it does indeed have the “.pages” file extension and is properly recognized by the system as a Pages document.

And the key thing here is that there is nothing I can do about it. I cannot force Pages to save a Pages document without the “.pages” file extension. Either I leave it visible and untouched in the file name field, or I accidentally delete it and then Pages adds it automatically and invisibly to my file (and switches the status of my “Hide Extension” option to reflect this).

Of course, I can always manually remove the “.pages” file extension from the file name after it’s been saved by editing its file name in the Finder, but I am doing this at my own risk and, when I am that stubborn, Mac OS X will not get in my way—although it will still throw in an alert dialog about the fact that removing the extension might make the file type unrecognizable, etc.

Now, let’s go back to that “Save As” dialog in Pages. What if I am even dumber and decide not only to remove the “.pages” file extension (which, remember, will automatically cause Pages to switch to “Hide Extension” mode), but also to add another file extension to the file name, like, you know, “test.pdf”? (Maybe I think that I can just create a PDF file of my document by saving it with a “.pdf” file extension! I have certainly seen this particular behaviour in some computer users in my time…)

This is particularly relevant in Mac OS X 10.4, where you have the option to click on an existing file name in the file list below the file name field to copy that file name to the file name field. If you do this (and if file extensions are visible), then Mac OS X copies the entire file name to the file name field, including whatever extension it has. So, for example, if I am saving a Pages document and I see a file called “Really Long Name That I Don’t Want To Retype.pdf” in the file list, I can click on that file and this will copy “Really Long Name That I Don’t Want To Retype.pdf” to the file name field.

What about the “.pages” file extension, though? Well, again, it has disappeared, and Mac OS X has checked the “Hide Extension” option again. But if I try to save the file now, it will try to save it as “Really Long Name That I Don’t Want To Retype.pdf.pages,” right?

Well, not quite. If you try to do such a thing, Mac OS X does display an alert:

Alert with two extensions

What this dialog is saying is that you can use two extensions, but really the one that’s needed here is “.pages” and the other one will not determine the file type in any way. (It might just be useful in some cases to indicate the origin of the file before it was saved in Pages format.)

Here again, it is actually impossible to save the Pages document without the “.pages” file extension appended to its file name. I’d say it’s a fairly fool-proof approach.
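The rule described above can be sketched in a few lines of Python. This is only a model of the behaviour as observed in this post, not Apple's actual code: the function name `plan_save` is hypothetical, and the sketch assumes that any trailing “.something” in the typed name counts as an extension, whereas the real dialog may well be pickier about which extensions it recognizes.

```python
import os.path

def plan_save(typed_name, required_ext=".pages"):
    """Return (final_name, show_alert) for a name typed into the save dialog.

    Hypothetical model of the behaviour described in this post.
    """
    # The required extension is still visible and intact: save the name as typed.
    if typed_name.lower().endswith(required_ext.lower()):
        return typed_name, False
    # Otherwise the required extension is always appended...
    final_name = typed_name + required_ext
    # ...and the alert is shown when the typed name already ends in another
    # extension. Assumption: any trailing ".something" counts as an extension.
    other_ext = os.path.splitext(typed_name)[1]
    return final_name, other_ext != ""

print(plan_save("test"))            # ('test.pages', False)
print(plan_save("test.pdf"))        # ('test.pdf.pages', True)
print(plan_save("test.pdf.pages"))  # ('test.pdf.pages', False)
```

Whatever the user types, the saved name always ends in the required extension, which is what makes the approach hard to break.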

Interestingly, though, if you actually manually type out another extension while the “.pages” file extension is visible in the file name field, Pages will save the file with the two extensions without showing the alert box. In other words, if I have a new Pages document and I bring up the “Save As” dialog box with “Hide Extension” off, the file name field will contain “untitled.pages” and the “untitled” portion of the file name will be selected by default, so that it is replaced by whatever I type next. But if I type “test.pdf” manually in there, before the visible “.pages” part, and then press “Save,” Pages saves the file with the name “test.pdf.pages” without warning me about the use of two extensions. So in that case it just assumes that I deliberately wanted to create a file name with a period in it (other than the period separating the name from the extension), and, well, I am allowed to do that, aren’t I? The period is not a “forbidden character” in file names in Mac OS X. It just happens to have a special meaning near the end of the file name, as a separator for the file extension part (when it’s visible).

This is essentially the way things work everywhere in Mac OS X, in all Apple applications, and in most third-party applications that follow Apple’s guidelines and standard UI conventions.

Let’s now turn to Microsoft. Because of the historical context mentioned above, Word has long had an option to add the “.doc” file extension to its file names. The option was actually already there in the later versions of Word for the classic Mac OS, because Mac Word users started sharing Word documents with Windows users long before Mac OS X was introduced. This option is called “Append file extension,” and it’s still there in today’s version of Word for Mac OS X:

Append file extension

It too is in the bottom-left corner of the “Save As” dialog box, and it too sticks once you’ve checked it, i.e. Word keeps the option checked for future file saving operations, until you uncheck it.

At this point, it should be noted that the “.doc” file extension has never been required for Word documents on the Mac, and still isn’t required today, in Mac OS X. You can still save a Word file using a file name without a file extension, and both Word and Mac OS X will know that it is a Word document, because the file still uses the older file creator and file type codes. These are invisible codes that are stored with the file, and they used to be the way the classic Mac OS kept track of file types and of which applications they should be opened with.

Thankfully, even though Apple forces you to use file extensions for most of its own applications, Mac OS X still supports the older scheme, and Word continues to use it to this day.

So the “.doc” file extension is only needed if you want to share your Word documents with Windows users, or if you want to make sure that other applications know that the files are Word documents. (Spotlight is smart enough to also index Word documents that don’t have the “.doc” file extension, and Pages is able to open Word documents that don’t have the extension, but Pages cannot open RTF files created by Word that don’t have the “.rtf” suffix, for example.)

The problem with Microsoft, as usual, is that, even though file extensions are now standard in Mac OS X and Mac OS X has a standard way of dealing with them, they have refused to adopt that standard and still insist on doing things their own way. In addition, again as usual, their own way is actually crappy, ugly and flawed, whereas Apple’s solution, although not perfect, is at least fool-proof and relatively elegant.

It is, unfortunately, very easy to cause Word’s approach to file extensions to completely break down when saving files in Word. Just try the following:

  1. Create a new Word document.
  2. Press command-S to save it.
  3. In the “Save As” dialog, check the “Append file extension” box if it’s not already checked. This will add a “.doc” file extension to your file name at the top.
  4. As the file name, type in “test.pdf” (with no period after it). This will result in a file name that actually reads “test.pdf.doc”.
  5. Click on “Save” to save the file. Word proceeds, without any warning about the two extensions in the file name.
  6. Now that the document is saved as “test.pdf.doc” and is still open in Word, press command-shift-S or select “Save As…” in the “File” menu. This will bring up another “Save As” dialog box allowing you to save the same document under a new name.
  7. Look at the default file name in the file name field now.

What does it read? “test.pdf,” with only “test” highlighted! In other words, Word’s scheme for file extensions has become totally confused, and is about to save my Word document with a “.pdf” suffix!

And indeed if I press “Save,” Word gleefully saves the document as “test.pdf,” and even gives it a nice little PDF document icon for the proxy icon in the title bar next to the file name!

The problem of course is that this is not a PDF file at all! Yet Mac OS X now clearly treats it as a PDF file, and if you double-click on it in the Finder Mac OS X will try to open it in Preview—and obviously fail, with an error message about the fact that it’s not a PDF file.

You can still open the “test.pdf” Word document with Word, but you have to force it by dragging the file onto the Word icon.

But really, how ridiculous is this? A moment ago, I was joking about hapless computer users trying to create PDF files by simply appending the “.pdf” file extension to them—and this is exactly what Word does here!

Dear oh dear. Of course, the core of the problem here is that, unlike Apple’s approach, Microsoft’s approach to file extension handling in its “Save As” dialog boxes is hopelessly crude. All it does is search for the first period that appears in the name and treat whatever comes right after it as the file extension…

Sadly, it is not just a hilariously bad flaw. It actually has a real impact on the daily working lives of Mac users. I frequently receive Word files authored and named by other users. And these other users frequently use periods in their file names, for example as the separator between the various elements of a date.

So I get a file called “press release 2007.04.11.doc” and I open it in Word and I want to save it under a new name. Guess what happens? With Word’s crappy file extension handling, the default name in the “Save As” dialog box becomes… “press release 2007.04,” with “.04” as the file extension!
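The two cases described above (“test.pdf.doc” and the dated press release) are consistent with a split on the first period rather than the last one. The Python sketch below contrasts the two approaches. The function `first_period_split` is a hypothetical model that merely reproduces what the “Save As” dialog shows in these two cases; it is a guess and not Word's actual code. The second function relies on the standard library's `os.path.splitext`, which splits on the last period.

```python
import os.path

def first_period_split(filename):
    """Hypothetical model of the observed behaviour: the name ends at the
    FIRST period, the next segment is taken as the extension, and anything
    after that is dropped."""
    parts = filename.split(".")
    if len(parts) == 1:
        return filename, ""
    return parts[0], "." + parts[1]

def last_period_split(filename):
    """Split on the LAST period, so earlier periods stay in the name."""
    return os.path.splitext(filename)

name = "press release 2007.04.11.doc"
print(first_period_split(name))            # ('press release 2007', '.04')
print(last_period_split(name))             # ('press release 2007.04.11', '.doc')
print(first_period_split("test.pdf.doc"))  # ('test', '.pdf')
print(last_period_split("test.pdf.doc"))   # ('test.pdf', '.doc')
```

Splitting on the last period keeps dates and other periods safely inside the name, and only the final segment is ever treated as the extension.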

It would be funny, if it weren’t so sad and so utterly frustrating. OK, Microsoft, you were here first with file extensions, and you added support for them even in the classic Mac OS (although I am sure this crappy flaw already existed back then too). But could you please now recognize that Apple has introduced a standard way of handling file extensions and that it is much better and much more fool-proof than your own?

(Needless to say, the other Office applications, Excel 2004 and PowerPoint 2004, are equally stupid about all this.)

Now, of course, this year we are going to see a new version of Office for Mac OS X. Do you think that this particular issue will be addressed? I seriously doubt it… For them, it probably is only a “minor flaw” that Mac OS X users can live with.

Yeah right.


3 Responses to “Word 2004: Very poor, non-standard handling of ‘.doc’ file extensions”

  1. henryn says:

    Sigh. Yes, I’ve been waiting for this to come around.

    I still remember when Microsoft appropriated the generic .doc extension and ceased using the informative .msw, and that’s sufficient to give me indigestion still.

    For what it’s worth, I don’t have a lot of complaints about the system now in place on MacOS 10.4. It works in almost all cases — though I have no real idea how — I don’t have to think about it and the few cases in which there is trouble are easy to fix.

    Except of course in the case of MS applications, as you document. I’ve lived with these, just worked around them, for –well– as long as you have.

    None of this surprises me. As you point out, MS has chosen to do a poor job in many operational details, and they continue to do so for the same reason… because they _can_. I wish it were not so. Nothing personal, but these conversations are exactly the same over and over again, and it is far beyond being tedious.

    Henry

  2. Pierre Igot says:

    However tedious it might be, I believe these things need to be properly documented in detail. It might be an obvious flaw to you, but it’s obviously not obvious enough for Microsoft to actually do something about it.

    I am not saying that my documenting it will make any difference, but the lack of appropriate documentation of all the flaws in Microsoft’s Mac products is, IMO, part of the problem. The more people document them and complain about them, the higher the chance that MS will finally do something.

  3. henryn says:

    If it were up to me, I would make your blog required reading for MS Mac product technical managers.

    Obviously, it isn’t up to me. Obviously, they do not read your excellent documentation.
