Ceph OSDs fail to start at boot

Sometimes, after a reboot, some or even many OSD disks fail to be properly mounted and made available to the Ceph cluster.

A quick Google search shows that this issue has popped up several times over the past year, for example in this thread.

Since both the total number and the numeric IDs of the affected OSDs change at each reboot, we suspect this issue may somehow be related to some magic happening within systemd.

However, there is a simple procedure to recover:

Set the “noout” flag, to prevent the cluster from rebalancing data while the OSDs are down:

$ ceph --cluster <cluster_name> osd set noout
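To double-check that the flag is in effect, query the cluster health; the output should mention something like “noout flag(s) set”:

$ ceph --cluster <cluster_name> health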

Then execute the following command, as root, on each affected OSD server to activate all Ceph data partitions:

$ ceph-disk activate-all
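You can then verify that the affected OSDs are back “up” and “in”, for example with:

$ ceph --cluster <cluster_name> osd tree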

Once all OSDs are back up, remember to unset the “noout” flag you set earlier:

$ ceph --cluster <cluster_name> osd unset noout
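
Finally, a quick look at the overall cluster status should confirm that everything is back to normal (HEALTH_OK):

$ ceph --cluster <cluster_name> -s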