Time Capsule: Bad disk image causing kernel panics
Posted by Pierre Igot in: Macintosh • June 11th, 2008 • 10:26 am
I bought a 1 TB Time Capsule a couple of months ago, when they first came out, primarily as a seamless backup solution for my wife. I have my own backup strategies for my work station, but my wife depended on being regularly reminded of the importance of backups, etc. I figured that a Time Capsule would be the ideal solution for her, and I bought the 1 TB version just for the extra space that I might be able to use myself as additional storage space.
When the Time Capsule arrived, I set it up using the default options, just adding it to our existing wireless network. The initial backup from my wife’s MacBook Pro took a long time, as expected, but after that everything seemed to work normally.
It continued that way for a couple of months. The only times we had trouble with the Time Capsule were when I tried to fiddle with it myself from my own Mac Pro. Basically, if I didn’t touch it, it worked fine as a seamless backup solution for my wife’s laptop.
Then last week I read this item on John Gruber’s Daring Fireball about kernel panics caused by Time Machine.
As John indicates, it turned out that the kernel panics were apparently caused by a damaged sparse disk image on the Time Capsule hard drive.
At the time, I simply agreed with John that this was a “nasty bug.”
And then a few days later my wife called to say that her Time Machine menu had an exclamation mark indicating a failed backup. I tested it and indeed it was failing, even though the Time Capsule device itself appeared to be running fine.
I got her to reboot her laptop. It still wouldn’t work. So I tried to connect to the Time Capsule and mount the sparse disk image containing her backup, just to see if I could notice anything amiss. And bam! I got a kernel panic.
I rebooted and tried a second time and bam! Same thing.
I went back to my own Mac Pro, connected to the Time Capsule and mounted the sparse disk image of my wife’s backup and bam! I got a kernel panic on my own Mac Pro.
I then remembered the note on Daring Fireball and went back to check what the solution mentioned was. It was to mount the sparse disk image and repair it with Disk Utility. I went to my other Mac Pro, connected to the Time Capsule, and mounted the sparse disk image and bam! I got a kernel panic on my other Mac Pro.
In other words, I had a disk image that quite obviously was causing three different machines running Mac OS X 10.5.x to kernel panic as soon as the disk image was mounted on the desktop.
There wasn’t even a chance of my being able to repair the disk image. I got a kernel panic as soon as I mounted the disk image and, since it had to be mounted in order for Disk Utility to even attempt to repair it, I was stuck.
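For readers in the same situation, here is a minimal sketch of one troubleshooting avenue that was not tried here: attaching the image from the command line without mounting its file system, then checking the volume directly. It assumes that the panic is triggered by mounting the volume rather than by attaching the image, which is unverified. The image path and device node used below are hypothetical placeholders, and the helper function names are made up for illustration. By default the script only prints the commands instead of running them.

```python
import subprocess

def attach_without_mounting(image_path):
    """Build an hdiutil command that attaches the image as a raw device only.

    -nomount skips mounting the volume, -noverify skips checksum
    verification, and -noautofsck skips the automatic file system check.
    """
    return ["hdiutil", "attach", "-nomount", "-noverify", "-noautofsck", image_path]

def check_filesystem(device_node):
    """Build an fsck_hfs command that forces a check and answers yes to repairs."""
    return ["fsck_hfs", "-fy", device_node]

def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if dry_run:
        return None
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout

if __name__ == "__main__":
    # Hypothetical path to the backup image on the Time Capsule share.
    image = "/Volumes/TimeCapsule/MacBookPro.sparsebundle"
    run(attach_without_mounting(image))
    # Hypothetical device node; the real one is listed in hdiutil's output.
    run(check_filesystem("/dev/disk2s2"))
```

Whether this actually avoids the kernel panic with a given damaged image is an open question, so it is best tried on a machine with no unsaved work.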
I figured that, short of experimenting with all kinds of troubleshooting procedures (and enduring yet more kernel panics), the only solution was to trash the disk image altogether and start again from scratch. So that’s what we did, even though it obviously took a long time to rebuild the entire backup. And Time Machine and the Time Capsule appear to be working normally again now.
But this is worse than a nasty bug. This is a bug that systematically causes the worst possible consequence on any client machine that tries to mount the damaged disk image—a kernel panic—and a bug for which there are no apparent remedies.
And unfortunately, it is the type of bug that is simply too impractical to report to Apple. I cannot exactly send them a damaged 45 GB sparse disk image as an attachment for them to try and reproduce the problem at their end.
So until Apple somehow manages to reproduce this by themselves in-house, Time Capsule users are vulnerable. But regardless of what the cause of the bug is, surely there must be something that Apple can do to prevent a disk image from causing a kernel panic, no matter how damaged it is. In many years of using Mac OS X daily on a multitude of machines, I have never witnessed the mere fact of mounting a disk image causing such a disastrous consequence. There is something very wrong in Mac OS X 10.5 if it can kernel panic just because of a damaged disk image.
As for what caused the damage in the first place, I am afraid that there is no way to tell. All I remember is that, a few hours before the incident, I had noticed that my wife had left her laptop open while she was away for an extended period of time, and I had closed it so that it would not continue to run hourly backups to the Time Capsule for no real reason. (The simplest of changes in the work environment, such as a new e-mail message arriving in Mail, causes Time Machine to update its backup on the Time Capsule. I find that it is overkill and that daily backups would be sufficient, but there is no way to customize this as far as I can tell.)
At the time, I had not woken the monitor from sleep. I had just closed the laptop to put it to sleep. So maybe there was a backup in progress at the time I closed it. But surely Time Machine should be designed so that it degrades gracefully in such a scenario, simply aborting the backup and resuming it when the machine is reopened later on.
Maybe this has nothing to do with the damaged disk image. Maybe it has everything to do with it. I am afraid I have no way to tell, and I have no way to share the information I have with Apple in a useful form (i.e. in the form of a reproducible scenario).
But it is somewhat disappointing that a single-purpose device should fail so abruptly and so miserably, in such a user-hostile fashion.
June 12th, 2008 at 6:27 pm
Shudder.