From Pages ’09 to the Web: An XML-based workflow (illustration)

Posted by Pierre Igot in: Blogging, Pages
April 23rd, 2011 • 2:52 pm

To complete the picture of the custom-designed workflow that I described in my two previous posts, I think it would be a good idea to provide a few screen captures that illustrate the process visually. (Click on each capture to view it at full size.)

So, to begin with, here’s a capture of my screen with an example of an article that I have written in Pages ’09 for my web site:

Original in Pages

As you can see in this picture, my style sheet includes several custom-designed paragraph and character styles, including paragraph styles for the coloured examples and character styles for words in English, words in French, superscript, and small caps.

Now here’s the exact same Pages ’09 document after I have gone through the process described in my initial post: I decompressed it into a folder with The Unarchiver, opened the enclosed index.xml file in BBEdit, applied my special script to it, saved it, renamed the enclosing folder with a .pages file extension, and reopened the resulting file in Pages ’09:

Tagged copy in Pages
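For readers who would rather automate the unpack, edit, and repackage round trip, here is a minimal sketch in Python. It is an illustration only, not the BBEdit script used for this article. It assumes that the .pages file is an ordinary zip archive with index.xml at its top level, and the function name, its parameters, and the temporary folder name are all hypothetical.

```python
import zipfile
from pathlib import Path


def retag_pages_document(source, destination, transform):
    """Unpack a zipped .pages file, rewrite its index.xml, and
    leave the result as a folder carrying the .pages extension.

    `transform` is any function that takes the XML text and returns
    the modified XML text (a stand-in for the tagging script).
    """
    source = Path(source)
    destination = Path(destination)  # expected to end in ".pages"
    work_dir = destination.with_suffix(".unpacked")  # hypothetical temporary name

    # Step 1: decompress the document into a plain folder.
    with zipfile.ZipFile(source) as archive:
        archive.extractall(work_dir)

    # Steps 2 to 4: open index.xml, apply the transformation, save it.
    index_path = work_dir / "index.xml"
    xml_text = index_path.read_text(encoding="utf-8")
    index_path.write_text(transform(xml_text), encoding="utf-8")

    # Step 5: rename the enclosing folder so that it ends in ".pages".
    work_dir.rename(destination)
    return destination
```

A call such as retag_pages_document("article.pages", "article-tagged.pages", my_transform) would leave the original file untouched and produce a separate tagged copy. The sketch does no error handling, and it expects that the destination does not exist yet.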

As you can see in this second capture, there are now div and span tags around all occurrences of my styles, as well as a href tags around hyperlinks. All of this was done automatically by my script.
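To give an idea of what such a tagging pass involves, here is a rough sketch in Python. It does not use the real Pages ’09 XML schema: the run element and its style attribute are a simplified, hypothetical markup invented for the example, and the actual patterns in the script are different. The one detail that does carry over is that the inserted tags have to be written as escaped entities, so that they show up as literal text when the document is reopened.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical, simplified markup: <run style="NAME">text</run>.
# The real index.xml uses a different and much more verbose schema.
RUN_PATTERN = re.compile(r'(<run style="([^"]+)">)(.*?)(</run>)', re.DOTALL)


def tag_character_styles(xml_text):
    """Wrap the text of every styled run in a visible span tag."""

    def wrap(match):
        opening, style, text, closing = match.groups()
        # Escape the angle brackets so that the tags are stored as
        # text content instead of being parsed as XML elements.
        start = escape(f'<span class="{style}">')
        end = escape("</span>")
        return f"{opening}{start}{text}{end}{closing}"

    return RUN_PATTERN.sub(wrap, xml_text)
```

The same approach would extend to paragraph styles (with div instead of span) and to hyperlinks, each with its own pattern. Style names that contain spaces would need to be converted into valid CSS class names first, which this sketch does not attempt.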

(Don’t pay any attention to the trailing </p> tag after the title. It’s there because there is a whole bunch of extra XML code above the actual text of the document in the index.xml file, and this causes my grep pattern for paragraphs to believe that the beginning of the document is part of a bigger paragraph that begins higher up. I could probably fix this, but I don’t really care, because it does not affect the usability of the result, and I can just erase the extra tag manually.)
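If erasing the stray tag by hand ever became tedious, a small cleanup pass could do it. The following Python sketch assumes that the tagged text uses plain p tags and simply drops any closing tag that has no matching opening tag before it. The function name is hypothetical, and this is one possible approach, not part of the script described here.

```python
import re

# Matches <p>, <p class="...">, and </p>, but not <pre> or <param>.
P_TAG = re.compile(r"</?p\b[^>]*>")


def drop_unmatched_closing_p(text):
    """Remove every </p> that is not preceded by an unclosed <p>."""
    depth = 0      # number of <p> tags currently open
    pieces = []
    last = 0
    for match in P_TAG.finditer(text):
        pieces.append(text[last:match.start()])
        tag = match.group()
        if tag.startswith("</"):
            if depth > 0:
                depth -= 1
                pieces.append(tag)
            # Otherwise the closing tag is a stray one: skip it.
        else:
            depth += 1
            pieces.append(tag)
        last = match.end()
    pieces.append(text[last:])
    return "".join(pieces)
```

Applied to a title line followed by a lone closing tag, this would remove the tag and leave properly paired paragraphs untouched.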

Then all I have to do with the tagged document in Pages ’09 is select all of it, copy it, and paste it into the HTML editor in my WordPress dashboard:

Tagged text in WordPress

As you can see, all the rich text formatting is gone, since we are in the HTML editor and not the visual/rich text editor, but I don’t care, because I have all the tags I need to get the formatting I want with my CSS style sheet. And here is what I get on the web after posting the article:

Article on the web

Pretty neat, isn’t it?
