Safari: Whatever happened to efficient page caching?

Posted by Pierre Igot in: Macintosh
January 16th, 2006 • 2:43 pm

Today, at 2 pm, someone from the local power utility showed up. He said he had to change my meter (didn’t say why) and would have to cut the power for about 10 seconds. I told him I appreciated the warning and went to shut down my computer.

At that point I had, as usual, a number of unread web pages open in Safari. These days, I use the free Safari add-on Stand. This add-on adds a number of features to the Safari interface, including the ability to quickly save all currently open Safari windows and tabs as a “workspace” on its “Bookmark Shelf.” In fact, Stand is smart enough to automatically save references to all open windows and tabs as the “Last Workspace” on the shelf when you quit Safari, even if you don’t do it manually yourself.

After the power outage, I restarted my computer and launched Safari. I then went to Stand’s “Bookmark Shelf,” selected the saved workspace, and clicked on the “Open in Tabs” button, which reopens all the windows and tabs in more or less the same arrangement as before. (Stand doesn’t remember the windows’ positions, and sometimes opens some pages as separate windows when they actually were tabs in the same window, but these are minor inconveniences.)

To my surprise, when I did this, some of the pages reloaded immediately. Now, you need to remember that I am still on dial-up, peaking at 28.8 kbps. In such a situation, opening multiple pages at once is not a particularly good idea, because the connection gets saturated very quickly, and some pages take a very long time to load, or Safari ends up experiencing server time-outs and failing to load the pages entirely.

So I was very surprised to see that some pages loaded immediately. And that got me thinking… Isn’t this the way that caching is actually supposed to work? After all, most of the pages that I had open in Safari before shutting my computer down were pages that I had loaded very recently, and the vast majority of them had not changed at all. Why does Safari need to reload them from scratch? Why can’t it reload them all instantly from its cache? Does Safari automatically empty its cache when you quit it? If so, why?

It seems to me that, given a number of pages that have not changed since they were last loaded in Safari, the process of opening them again should be nearly instantaneous. All Safari has to do is connect to the server in question and check whether the page has been changed or not since it was last loaded. This should take a fraction of a second. With 10 or 20 pages, this should take a few seconds at most, even on a dial-up connection.
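The check described above corresponds to what HTTP calls a conditional request: the browser sends back the validators it saved with its cached copy, and the server can answer with a short "304 Not Modified" instead of the full page. Here is a minimal sketch in Python, not a description of what Safari actually does. The function names and the `cached` dictionary layout are assumptions made for illustration; the `If-Modified-Since` and `If-None-Match` headers and the 304 status are standard HTTP.

```python
import urllib.request
from urllib.error import HTTPError


def conditional_headers(cached):
    """Build revalidation headers from the validators stored with a cached page.

    `cached` is a hypothetical dict with optional "last_modified" and "etag"
    keys, holding the values the server sent when the page was first loaded.
    """
    headers = {}
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    return headers


def revalidate(url, cached, opener=urllib.request.urlopen):
    """Return (body, served_from_cache) for a page that is already cached.

    `opener` is a parameter only so that the function can be exercised
    without a network connection.
    """
    request = urllib.request.Request(url, headers=conditional_headers(cached))
    try:
        with opener(request) as response:
            # 200 OK: the page changed (or the server ignored the validators).
            return response.read(), False
    except HTTPError as err:
        # urllib reports "304 Not Modified" as an HTTPError.
        if err.code == 304:
            return cached["body"], True
        raise
```

When the server answers 304, only a few hundred bytes of headers cross the wire, which is why this kind of check can be cheap even on a slow link. It only works, of course, if the server sent usable validators in the first place.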

So why does this not work reliably? Why does Safari insist on reloading all these pages from their respective servers even though they have not changed? And why does it only have a few stored in its cache that load instantly, and not all of them?

I suppose that part of the problem might be that the web servers on which these pages are located might have different ways of marking their “last modified” dates and times. But I would expect a modern web browser such as Safari to be familiar with these conventions and to be able to retrieve the information about the modification dates and times from various servers running various flavours of Unix, Mac OS X, Windows, etc.

I also suppose that, when you have a broadband connection, the fact that Safari reloads the pages from scratch from the servers instead of using its cache doesn’t have much impact, because the page loading takes very little time for most pages. But when you are on dial-up, you are particularly aware of such issues, and this particular situation is making me wonder why indeed Safari doesn’t appear to be using its cache more efficiently and effectively. I thought that this was what caches were for. If the cache is not used to store unchanged pages and reload them quickly in the event of a quit/relaunch action, then what is it good for?

Then again, unfortunately, I am more than accustomed to Apple’s ongoing ignorance of the needs of dial-up users. It’s nothing new. Ever since Mac OS X was launched six years ago, the needs of dial-up users have been consistently ignored or overlooked. I see very little hope for any improvement in that department. (And believe me, I am doing everything I can to finally get out of dial-up hell.)

Still, there is something about the apparent deficiency of this caching mechanism that is quite disappointing, regardless of issues of bandwidth. It’s about computing elegance and efficiency. If the page hasn’t changed, there is no need to reload it. So why does Safari reload it just the same?


2 Responses to “Safari: Whatever happened to efficient page caching?”

  1. ssp says:

    Caching is decidedly non-trivial. And many things apart from the browser can play a role in it. I don’t know the standards by heart but there seem to be several things that can happen. Like the server sending a header that marks the page as non-cacheable. Or the server sending a 304 response when the browser tries to re-fetch a page that hasn’t changed.

    As you’ll have to have an eye on all those details to determine whether Safari is doing the right thing, that’d be hard to do. But Safari’s behaviour feels quite reasonable to me (in fact, if anything, Safari used to be caching things too strongly in older versions).

    Just an example from my recent experiences: I started sending out my web pages in compressed form to save bandwidth (watch here for an upcoming report on the topic). I soon discovered that, while compression worked fine, the use of PHP to achieve it meant that browsers would always reload the page, because PHP kept sending the current time as the file’s modification time. So pages kept being reloaded despite not having changed. I assume that many other pages on the net share that fate, and that might explain some of the behaviour you are seeing.
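The effect described in this comment can be illustrated from the server's side. The sketch below is in Python rather than PHP, and it is only a guess at the general mechanism, not a reconstruction of the commenter's setup; the `respond` function and the timestamps are invented for the example. It shows how a server might answer a conditional request, and why stamping every response with the current time means the browser's saved date never matches.

```python
from datetime import timezone
from email.utils import formatdate, parsedate_to_datetime


def respond(if_modified_since, last_modified_ts):
    """Return (status, headers) for a GET that may carry If-Modified-Since.

    `last_modified_ts` is the Unix timestamp the server reports as the page's
    modification time. `if_modified_since` is the header value sent by the
    browser, or None if it sent no such header.
    """
    headers = {"Last-Modified": formatdate(last_modified_ts, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: fall back to a full response
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so compare whole seconds.
            if int(last_modified_ts) <= since.timestamp():
                return 304, headers
    return 200, headers


# Hypothetical values: the file was last edited at `file_mtime`, and the
# browser cached it along with that Last-Modified date.
file_mtime = 1137436980
browser_validator = formatdate(file_mtime, usegmt=True)

# Server reports the real modification time: the cached copy is still valid.
status_real, _ = respond(browser_validator, file_mtime)

# Script reports "now" (here, one hour later) on every request: the page
# always looks newer than the cached copy, so it is sent again in full.
status_now, _ = respond(browser_validator, file_mtime + 3600)
```

In this sketch `status_real` is 304 and `status_now` is 200, even though the page content is identical in both cases.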

  2. Pierre Igot says:

    Yes, I suppose there are plenty of factors involved. Another one that I was just thinking of is advertising. Even if the content of a page doesn’t change, its ads can change, and that makes it impossible for the browser to determine whether it should reload the page or not.

    On the other hand, I had a number of pages this morning that contain no advertising whatsoever, and don’t appear to involve any server-side scripts or anything. Safari still reloaded them. Hence my disappointment. But I suppose it is a rather complex issue.
