tmutil is broken by SIP in Mojave

A diskutil bug unceremoniously erased an entire hard drive of mine a few weeks back.  While I was able to successfully (AFAICT) restore the drive’s contents to it from various backups, the erasure gave the drive a new identity (UUID, specifically).  The next time Time Machine ran, it compounded the diskutil bug by also unceremoniously deleting all my old backups (bar one, the latest), because it didn’t recognise the new drive with identical contents to the old drive as being the same drive, and tried to back it up again, requiring way more space, causing all existing backups to be purged, etc.

Sigh.

It turns out there’s actually a nominally supported way to address exactly this scenario – tmutil associatedisk (kudos to Simon Heimlicher for documenting this).  From the man page:

     associatedisk [-a] mount_point snapshot_volume

             Bind a snapshot volume directory to the specified local disk, thereby reconfigur-

             ing the backup history. Requires root privileges.

             In Mac OS X, HFS+ volumes have a persistent UUID that is assigned when the file

             system is created. Time Machine uses this identifier to make an association

             between a source volume and a snapshot volume. Erasing the source volume creates

             a new file system on the disk, and the previous UUID is not retained. The new

             UUID causes the source volume -> snapshot volume association to be broken. If one

             were just erasing the volume and starting over, it would likely be of no real

             consequence, and the new UUID would not be a concern; when erasing a volume in

             order to clone another volume to it, recreating the association may be desired.

             A concrete example of when and how you would use associatedisk:

             After having problems with a volume, you decide to erase it and manually restore

             its contents from a Time Machine backup or copy of another nature. (I.e., not via

             Time Machine System Restore or Migration Assistant.) On your next incremental

             backup, the data will be copied anew, as though none of it had been backed up

             before. Technically, it is true that the data has not been backed up, given the

             new UUID. However, this is probably not what you want Time Machine to do. You

             would then use associatedisk to reconfigure the backup so it appears that this

             volume has been backed up previously:

             thermopylae:~ thoth$ sudo tmutil associatedisk [-a] “/Volumes/MyNewStuffDisk”

             “/Volumes/Chronoton/Backups.backupdb/thermopylae/Latest/MyStuff”

             The result of the above command would associate the snapshot volume MyStuff in

             the specified snapshot with the source volume MyNewStuffDisk. The snapshot volume

             would also be renamed to match. The -a option tells associatedisk to find all

             snapshot volumes in the same machine directory that match the identity of

             MyStuff, and then perform the association on all of them.

Perfect – and I particularly like the subtext of the prose, which seems to be a subtle acknowledgment that this is a thing that happens frequently, and that macOS’s default behaviour is stupid… “recreating the association may be desired”.  No shit.

Unfortunately, that command doesn’t work in Mojave.  I’m apparently not the first person to notice.

It appears the tightened security, and in particular expansion of SIP to cover many more parts of the system including Time Machine backups, are to blame.  Even granting tmutil Full Disk Access etc in the system security settings is of no use (contrary to the stated purpose of Full Disk Access).

So you have to disable SIP first – which requires a reboot, obnoxiously – and only then does tmutil work again.  You’ll want to enable SIP again once you’re done, most likely, as the protections it provides are useful – it appears tmutil nve eeds to be updated to account for the new protections.

Apple Mail crashes on launch if connection logging is enabled

This was a fun one.  Mail started crashing on launch for absolutely no apparent reason – nothing had changed to its config or similar in a long time.  The crash logs were all fingering an identical culprit – -[IMAPTaskManager secondaryIdleMailboxName] called on the wrong GCD queue:

Process: Mail [19884]
Path: /Applications/Mail.app/Contents/MacOS/Mail
Identifier: com.apple.mail
Version: 11.3 (3445.6.18)
Build Info: Mail-3445006018000000~4
Code Type: X86-64 (Native)
Parent Process: ??? [1]
Responsible: Mail [19884]
User ID: …

Date/Time: 2018-04-19 16:44:45.717 -0700
OS Version: Mac OS X 10.13.4 (17E199)
Report Version: 12
Anonymous UUID: …

Sleep/Wake UUID: …

Time Awake Since Boot: 94000 seconds
Time Since Wake: 530 seconds

System Integrity Protection: enabled

Crashed Thread: 13 Dispatch queue: Task Manager Serialization Queue (QOS: UNSPECIFIED)

Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Exception Note: EXC_CORPSE_NOTIFY

Application Specific Information:
*** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'This method should only be called on the serialization queue'
terminating with uncaught exception of type NSException
abort() called

Application Specific Backtrace 1:
0 CoreFoundation 0x00007fff55da96bb __exceptionPreprocess + 171
1 libobjc.A.dylib 0x00007fff7d4c1942 objc_exception_throw + 48
2 CoreFoundation 0x00007fff55daf2a2 +[NSException raise:format:arguments:] + 98
3 Foundation 0x00007fff57ee7340 -[NSAssertionHandler handleFailureInMethod:object:file:lineNumber:description:] + 193
4 IMAP 0x00007fff6fe88959 -[IMAPTaskManager secondaryIdleMailboxName] + 216
5 IMAP 0x00007fff6fe880cb -[IMAPTask mailboxIsUserVisibleUsingDataSource:] + 180
6 IMAP 0x00007fff6fe6ab68 -[IMAPMailboxSyncTask _nextNetworkPriorityAndOperation:reservedNetworkPriority:] + 164
7 IMAP 0x00007fff6fe6c4ba -[IMAPMailboxSyncTask recalculatePriorities] + 398
8 IMAP 0x00007fff6fe67dd9 -[IMAPMailboxSyncTask initWithDataSource:taskManager:imapMailbox:fromStatus:forceFullSync:] + 766
9 IMAP 0x00007fff6fe931f7 -[IMAPTaskManager _syncMailboxWithDataSource:withIMAPMailbox:fromStatus:forceFullSync:userInitiated:] + 370
10 IMAP 0x00007fff6fe92e68 -[IMAPTaskManager syncMailboxWithDataSource:withIMAPMailbox:fromStatus:forceFullSync:userInitiated:] + 240
11 IMAP 0x00007fff6fe9631a -[IMAPTaskManager didAddMessagesWithUnknownUID:toDataSource:] + 872
12 Foundation 0x00007fff57e4a5df __NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 7
13 Foundation 0x00007fff57e4a441 -[NSBlockOperation main] + 68
14 Foundation 0x00007fff57e488ee -[__NSOperationInternal _start:] + 778
15 Foundation 0x00007fff57e44917 __NSOQSchedule_f + 369
16 libdispatch.dylib 0x00007fff7e09ee08 _dispatch_client_callout + 8
17 libdispatch.dylib 0x00007fff7e0b1ed1 _dispatch_continuation_pop + 472
18 libdispatch.dylib 0x00007fff7e0a9783 _dispatch_async_redirect_invoke + 703
19 libdispatch.dylib 0x00007fff7e0a09f9 _dispatch_root_queue_drain + 515
20 libdispatch.dylib 0x00007fff7e0a07a5 _dispatch_worker_thread3 + 101
21 libsystem_pthread.dylib 0x00007fff7e3f0169 _pthread_wqthread + 1387
22 libsystem_pthread.dylib 0x00007fff7e3efbe9 start_wqthread + 13

Long story short, the issue turns out to be having connection logging enabled.  That’d been turned on many months before in order to debug a different stupid Mail bug, and had been simply left on (deliberately IIRC, since Mail tends to bug-out quite often, so why not have logs already available when it comes time to debug it yet again?).

Connection logging is enabled or disabled by opening the “Connection Doctor” (Window menu > Connection Doctor) and using the checkbox titled “Log Connection Activity”.

So how do you get to that checkbox when Mail crashes on launch?  Well, in this specific instance I was able to disable all mail accounts via System Preference’s Accounts pane, launch Mail, disable logging, quit Mail, re-enable all mail accounts via System Preferences, and then relaunch Mail to have it finally actually work.

From even just brief web searching, it’s clear that this issue has been present and well-known in Mail for a really long time.  Sigh.  Apple’s protestations that they care about software quality, or the Mac, are relentlessly undermined by their actual actions.

Your system has run out of application memory HUR HUR HUR

I hate this dialog with the fire of a thousand suns.

When this appears, it basically means one (or both) of two things:

  1. Some application went nuts and chewed through all your memory and/or disk space.
  2. macOS got itself into a darkly comical & embarrassing deadlock.

Quitting any of the listed applications is rarely the correct move.  It’s often enough the case that none of them are the root cause, and you can kill all of them if you want, but it won’t fix the problem.

One important thing to clarify first, though, is that this dialog does not necessarily use the term ‘memory’ in the conventional sense – i.e. RAM.  It can also refer to disk space.  Unfortunately it doesn’t bother to distinguish between the two, which is particularly stupid of it since any possible resolution of the issue is highly dependent on which of the two cases it in fact is.

Thank goodness for iStatMenus, though, which in the most recent incident showed that I had ~20 GiB of RAM completely free (not even inactive, actually outright free).  So immediately that rules out what the daft bloody dialog’s actually saying.

The worst thing about all this is when it’s #2 the occurs.  For example, I had Lightroom do a 63-image panorama merge.  As Lightroom is a gross memory pig when doing panorama merging, it consumed something like 40 GiB of memory.  Which caused a bunch of stuff to page to disk.  Which consumed all the disk space.  Which led to that obnoxious dialog.  Which further led to macOS in its infinite fucking wisdom ‘pausing’ (SIGSTOPing) almost all running programs, including evidently whatever daemon actually handles paging.  Thus when Lightroom actually completed the panorama merge and released all that memory, I now had 20 GiB of free memory and the system refused to use any of it to page back in all that memory it’d paged out.  Because it was out of disk space.

The only solution – short of hard rebooting and hoping it resolves itself – was to delete a bunch of files I actually do still want, but which will now have to be recovered from a backup.  Great job macOS, thanks for all your help.

Of course, even once you do that and recover the system from the derpeche mode it put itself into, it won’t actually unpause any of the shit it broke.  You have to do that manually.  It pretends you can do that via that dialog that started the whole thing – assuming you left it open the entire time, blocking your view as you actually help the situation – but that only shows user-visible applications, not all the other system & background processes that it also rudely halted.

So, simple tip for resuming everything:

sudo killall -CONT -m '.'

Elegant, after a fashion.  Though every time, it reminds me that whomever named it ‘killall’ was either not very friendly or not very wise.

Note that the system will probably still be a bit broken in places, as despite what macOS thinks, you can’t just blindly pause random system tasks and not have things get really, really confused.  A reboot is always wise after seeing this dialog, to properly undo its fuckery.