Ceph upgrade from Jewel to Luminous¶
Unlike with previous Ceph releases, for Luminous the official procedure requires upgrading the MON nodes first.
In the following we will proceed with:
upgrade MON
upgrade OSD (at this stage, OSDs will still be using Filestore)
add MGR daemons, running on the same MON nodes
upgrade OSD to BlueStore
It is assumed the cluster is managed via ceph-ansible, although some commands and the overall procedure are valid in general.
Upgrade MON¶
Perform the following actions on each MON node, one by one, so you can check that after the upgrade the node manages to join the cluster:
sed -i -e 's/jewel/luminous/' /etc/yum.repos.d/ceph_stable.repo
yum update
systemctl restart ceph-mon@<monID>
Verify the mon has joined the cluster (note that the output of this command has changed in Luminous):
ceph -m <monIP> -s
ceph -m <monIP> mon versions
Upgrade OSD¶
Proceed as above for the MON nodes, one node at a time: first update the package manager configuration file, then do a package upgrade.
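For example, mirroring the MON steps (this assumes the repository file has the same name on the OSD nodes):
# Switch the repository from Jewel to Luminous and update the packages
sed -i -e 's/jewel/luminous/' /etc/yum.repos.d/ceph_stable.repo
yum update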
Finally, restart all OSD daemons with:
systemctl restart ceph-osd.target
Check with:
ceph osd versions
Upgrade Admin node¶
If you have one, now it’s time to upgrade Ceph on your administration node (e.g., the one from which you run Ansible playbooks).
Re-run ceph-ansible¶
Update your ceph-ansible files. At the very least you should:
file cluster-primary/inventory: if you have not already done so, add [clients] hosts, to include your administration node
file cluster-primary/inventory: add [mgrs] hosts, colocating them with MON nodes
file group_vars/all.yml:
ceph_stable_release: luminous
# In the CONFIGURATION section:
mon allow pool delete = true
# Line below specific to our case, we have huge memory
bluestore_cache_size_hdd: 2*1024*1024*1024
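As a sketch, the inventory additions could look like the following (host names are placeholders, not from the original setup):
[clients]
admin-node

[mgrs]
mon1
mon2
mon3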
Execute the ceph-ansible playbook site.yml.
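A typical invocation, assuming the inventory path mentioned above, would be:
ansible-playbook -i cluster-primary/inventory site.yml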
Global cluster settings¶
After the upgrade, ceph -s will show HEALTH_WARN. To fix that you will have to set (note that huge rebalancing may happen):
ceph osd require-osd-release luminous
# Set this if you have reasonably up-to-date clients everywhere
ceph osd set-require-min-compat-client jewel
# this may cause some rebalancing
ceph osd crush tunables optimal
After this, ceph -s may still complain: in fact, with Luminous it is now mandatory to enable applications on pools. This is done via the command:
ceph osd application enable <pool_name> <app-name>
Execute ceph health detail to find which pools need to be enabled and what the valid app names are.
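For example, assuming a pool named rbd used for RBD images:
ceph osd application enable rbd rbd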
Upgrade to BlueStore¶
I performed the upgrade one host at a time, removing all OSDs without prior reweighting and adding them back in. This of course causes rebalancing, so schedule the activity to happen during off-peak hours.
You have replica 3, right? So your cluster should withstand losing one server for a while (24 hours, in my case). I think the alternative procedure:
reweight OSDs to some small fraction
remove OSDs
add them back in, which will reset their weight to default
would cause far more rebalancing, for a far longer time.
When using ceph-ansible you remove OSDs by running infrastructure-playbooks/shrink-osds.yml, which will also wipe the OSD disk partitions and make them ready to be discovered by a subsequent run of the site.yml playbook. Note that the playbook only deals with OSDs under root=default, so OSDs below other roots should be moved to root=default first and then removed (unless you like doing things by hand).
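Moving a whole host bucket under root=default can be done along these lines (the host name is hypothetical):
# Move the host bucket "osd-node1" (hypothetical) and everything below it under root=default
ceph osd crush move osd-node1 root=default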
Note
Upgrading to Luminous and BlueStore when using more than one root should be safe without any special precaution, as Luminous introduces shadow roots, which should make your special devices (ssd or big) always accessible. However, to be on the safe side I opted for upgrading 50% of my special OSD devices to BlueStore, then I modified the CRUSH ruleset as explained below, then I upgraded the rest of the devices.
Luminous introduced the concept of device classes (tags) for disks and is capable of guessing some default classes like ssd, hdd and nvme. Newly discovered disks will have such a class properly configured; however, if you need to change it, possibly because you are introducing a new class, you can do so with:
ceph osd crush rm-device-class osd.2 osd.3
ceph osd crush set-device-class ssd osd.2 osd.3
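You can check the resulting assignment with, for example:
ceph osd crush class ls
ceph osd tree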
After each node has been upgraded, wait until the status goes back to HEALTH_OK.
Upgrade CRUSHmap¶
Download the CRUSHmap and edit it so that the rulesets match the device class you intend to use. For example, in my replicated_ruleset I changed the line:
step take default
to:
step take default class hdd
Compile the CRUSHmap and apply it.
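A sketch of the full round trip (file names are arbitrary):
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt as described above
crushtool -c crushmap.txt -o crushmap-new.bin
ceph osd setcrushmap -i crushmap-new.bin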