Silent data corruption

Alternate title:  Apple’s file system engineers are sadly naive.

I was quite disappointed to see that APFS isn’t even trying to provide data integrity.  Data integrity is kind of step 0 of any file system, and checksums or use of ECC is pretty much standard in modern & leading-edge file systems.  APFS doesn’t want to be one of those, it seems.

Case in point why this matters:

I have a bunch of old backup drives, because drives are cheap and until recently I could just buy a new one once the current one filled, instead of ever deleting a backup.  Periodically I go back through these old backup drives and do some basic integrity checks (S.M.A.R.T. bad block scans, file system checks, etc).

I also run a comparison of key data between those backups and the current versions on my computer, for files which generally shouldn’t change or disappear – e.g. photos, videos, key documents, etc.
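For the curious, that comparison boils down to something like the following. This is a minimal sketch rather than the exact procedure used here: it assumes the backup and the live copy share the same relative folder layout, and the names `sha256_of` and `compare_trees` are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large videos need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(backup_root: Path, current_root: Path) -> dict[str, list[str]]:
    """Check every file in the backup against the same relative path in the live copy."""
    report: dict[str, list[str]] = {"missing": [], "different": []}
    for backup_file in sorted(backup_root.rglob("*")):
        if not backup_file.is_file():
            continue
        relative = backup_file.relative_to(backup_root)
        current_file = current_root / relative
        if not current_file.is_file():
            # Present in the backup but gone from the live copy.
            report["missing"].append(relative.as_posix())
        elif sha256_of(backup_file) != sha256_of(current_file):
            # Contents differ; this alone does not say which copy is the bad one.
            report["different"].append(relative.as_posix())
    return report
```

Calling `compare_trees(Path("/Volumes/OldBackup/Photos"), Path.home() / "Pictures")` (both paths hypothetical) returns the files that have vanished or changed. A mismatch only says the two copies differ, so each flagged file still has to be opened to see which version is the good one.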

And today I found that at least half a dozen valuable personal videos (and a few photos) were corrupt, in the versions on my computer.  Luckily, the versions in the ancient backups were still good, so I could replace the corrupt ones.

This corruption was completely silent, until my ‘paranoid’ and time-consuming checks discovered it.

It’s far from the first time.  A failing drive years back corrupted a huge portion of my music library – silently, as far as the file system & OS were concerned.  Periodically I’ve discovered photos (of which I have huge numbers – the majority of my data) which have become corrupt at some indeterminate point.  And I’ve of course had file system [metadata] corruption occur many times, sometimes requiring complete erasure of the disk, and recovery or rebuilds from backup (a few times I’ve had to use data recovery software, where backups weren’t available).

Most, if not all, of these issues would have been discovered by even the most trivial file integrity protections, in the file system.

The notion that modern disks somehow magically protect against all silent data corruption is abject poppycock.  They’re more likely to suffer from it than older disks – a byproduct of higher densities and market demand for cheaper, crappier storage products.

And the implicit assertion that Apple’s file system driver, and kernel overall, are somehow completely free of bugs… is just batshit crazy.

Addendum

Since Apple aren’t interested in protecting anyone’s valuable personal data, I’m on the look-out for other options.  Manual use of shasum is one, for now, but a more streamlined and fool-proof system would be better.  Alas, none seems to exist [1].  Yet.
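As a stopgap, the manual shasum routine can be scripted. The following is a rough sketch under my own assumptions, not an existing tool: it records a SHA-256 digest per file (the same digest `shasum -a 256` prints) in a JSON manifest, then re-checks the files against it later. The function names and the manifest format are invented for illustration.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its current digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def save_manifest(manifest: dict[str, str], manifest_path: Path) -> None:
    """Write the manifest as JSON; keep it outside the tree it describes."""
    manifest_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))


def verify_manifest(root: Path, manifest_path: Path) -> list[str]:
    """Return a list of problems: files that vanished or whose digest changed."""
    recorded = json.loads(manifest_path.read_text())
    problems = []
    for relative, expected in recorded.items():
        target = root / relative
        if not target.is_file():
            problems.append(f"missing: {relative}")
        elif file_digest(target) != expected:
            problems.append(f"changed: {relative}")
    return problems
```

Run `save_manifest(build_manifest(root), manifest_path)` once while the files are known to be good, then call `verify_manifest(root, manifest_path)` periodically. This only detects corruption; it cannot repair anything, and it cannot tell a legitimate edit from bit rot, so it suits files that should never change.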

  1. There is chkbit, but it relies on MD5… probably acceptable for this use case, but needless in the face of decades of better hash algorithms.  And it’s written in JavaScript.  Ew.

Encrypted RAID volumes in El Capitan

Apple crippled Disk Utility in El Capitan, in the usual name of making good, functional things pretty, and pretty useless.

Luckily I’m far from the first person to need to create RAID and/or encrypted CoreStorage volumes, in El Capitan.  Florian Knapp has a concise summary of how to set up an encrypted RAID volume.  Tom Nelson (of About.com) has a slightly more detailed tutorial for managing the RAID part.

Now I just wish the hard drive industry would actually push capacities up, like they once did, so that I don’t have to resort to striped RAID sets just to make a disk big enough for Time Machine backups.  It feels like we’ve been effectively stuck at 6 TB for many years now, and affordable 8+ TB drives aren’t really on the horizon (Seagate & Western Digital have offerings, but historically have been bad brands for drive reliability, going by Backblaze’s data plus my own personal experience with their drives).

Update:  macOS Sierra partially restores Disk Utility’s functionality, though not enough to be useful.  It adds a “RAID Assistant” which lets you create unencrypted RAID volumes.  The core Disk Utility app can also initiate manual repair of RAID mirrors, and delete RAID volumes.

It’s something of a mystery why you cannot create encrypted RAID volumes with the RAID Assistant.  It doesn’t offer any encrypted file systems as initialisation options, and attempting to erase the unencrypted RAID volume in Disk Utility, to replace it with an encrypted version, fails with the bullshit error message:

An internal state error occurred
Operation failed…

No shit.

Furthermore, encrypted RAID volumes (or more precisely, any RAID volume that’s part of a CoreStorage Logical Volume Group) don’t get recognised as RAID volumes in Sierra’s Disk Utility unless you connect the underlying drives while Disk Utility is running.  Even then it’s hit or miss whether it’ll correctly recognise not just that it is a RAID set but also that there’s an encrypted CoreStorage volume on the set.  And I’m not even going to try testing if it can actually repair a RAID mirror in that configuration.

To be clear, RAID volumes that don’t have CoreStorage volumes atop them seem to work fine.  It’s evident that Apple simply don’t support encrypted RAID volumes.  Maybe in next year’s macOS – it must be hard adding support for something you already fucking supported until you pointlessly removed it.

FWIW, here’s a howto from Macworld on how to use the new RAID Assistant, if encryption isn’t something you want.