Description of problem:

When using devices like /dev/sdc for journals, those names can change when a system reboots, which prevents the OSD from starting.

How reproducible:

Not consistent; triggering this bug depends on restarting the system enough times that the ordering of devices is no longer what it was when the journal was created.

Steps to Reproduce:
1. Deploy an OSD with `ceph-volume lvm`
2. Reboot until the OSD doesn't come up

Actual results:

2017-08-24 19:57:08.674929 7f33e226ce00 -1 journal read_header error decoding journal header
2017-08-24 19:57:08.675318 7f33e226ce00 -1 filestore(/var/lib/ceph/osd/ceph-0) mount(1821): failed to open journal /var/lib/ceph/osd/ceph-0/journal: (22) Invalid argument
2017-08-24 19:57:08.676046 7f33e226ce00 -1 osd.0 0 OSD:init: unable to mount object store
2017-08-24 19:57:08.676057 7f33e226ce00 -1  ** ERROR: osd init failed: (22) Invalid argument

Expected results:

The OSD is started correctly.

Additional info:

The way to fix this is to rely on LVM once again, making the journal device a PV so that we can capture its UUID and retrieve it later.
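As a rough sketch of that approach (illustrative only, not the exact ceph-volume implementation; the UUID below is a placeholder), making the journal partition a PV lets it be resolved by UUID no matter which /dev/sdX name the kernel assigns:

# Make the journal partition a PV and capture its UUID at creation time:
sudo pvcreate /dev/sdc1
sudo pvs --noheadings -o pv_uuid /dev/sdc1

# After any reboot, resolve whichever /dev name now holds that UUID instead of
# trusting the original /dev/sdc1 path (UUID shown is the placeholder value):
sudo pvs --noheadings -o pv_name -S 'pv_uuid=0aBcDe-fGhI-jKlM-nOpQ-rStU-vWxY-z12345'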
I verified this on RHEL 7.4 with 12.2.1-9.el7cp and on Xenial with 12.2.1-10redhat1xenial.

To verify, I deployed a cluster with ceph-ansible onto Vagrant VMs, using /dev/sdc1 as a journal for one of the lvm OSDs. After deployment I shut down the OSD VM, used the VirtualBox GUI to force a port change on /dev/sda, which causes all device names to change, and then restarted the VM. After the restart /dev/sdc1 was renamed to /dev/sdb1 and all OSDs were still up and running.

Sample output from the RHEL 7.4 test is below.

Before the port change:

[vagrant@osd0 ~]$ sudo ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       0.00778 root default
-3       0.00778     host osd0
 0   hdd 0.00519         osd.0      up  1.00000 1.00000
 1   hdd 0.00259         osd.1      up  1.00000 1.00000

[vagrant@osd0 ~]$ sudo lsblk
NAME                      MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                         8:0    0    8G  0 disk
├─sda1                      8:1    0    1G  0 part /boot
└─sda2                      8:2    0    7G  0 part
  ├─rhel-root             253:0    0  6.2G  0 lvm  /
  └─rhel-swap             253:1    0  820M  0 lvm  [SWAP]
sdb                         8:16   0 10.8G  0 disk
├─test_group-data--lv1    253:2    0  5.4G  0 lvm  /var/lib/ceph/osd/ceph-0
└─test_group-data--lv2    253:3    0  2.7G  0 lvm  /var/lib/ceph/osd/ceph-1
sdc                         8:32   0 10.8G  0 disk
├─sdc1                      8:33   0  5.4G  0 part
└─sdc2                      8:34   0  5.4G  0 part
  └─journals-journal1     253:4    0  5.4G  0 lvm
sdd                         8:48   0 10.8G  0 disk
loop0                       7:0    0  100G  0 loop
└─docker-253:0-12006-pool 253:5    0  100G  0 dm
loop1                       7:1    0    2G  0 loop
└─docker-253:0-12006-pool 253:5    0  100G  0 dm

[vagrant@osd0 ~]$ sudo service ceph-osd@0 status
Redirecting to /bin/systemctl status ceph-osd
● ceph-osd - Ceph object storage daemon osd.0
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-10-05 08:36:24 PDT; 3min 57s ago
 Main PID: 6024 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd
           └─6024 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph

Oct 05 08:36:24 osd0 systemd[1]: Starting Ceph object storage daemon osd.0...
Oct 05 08:36:24 osd0 systemd[1]: Started Ceph object storage daemon osd.0.
Oct 05 08:36:24 osd0 ceph-osd[6024]: starting osd.0 at - osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
Oct 05 08:36:24 osd0 ceph-osd[6024]: 2017-10-05 08:36:24.993907 7f4c1dd06d00 -1 osd.0 0 log_to_monitors {default=true}
Oct 05 08:36:25 osd0 ceph-osd[6024]: 2017-10-05 08:36:25.783976 7f4c0427c700 -1 osd.0 0 waiting for initial osdmap

[vagrant@osd0 ~]$ sudo service ceph-osd@1 status
Redirecting to /bin/systemctl status ceph-osd
● ceph-osd - Ceph object storage daemon osd.1
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-10-05 08:36:27 PDT; 4min 1s ago
 Main PID: 6371 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd
           └─6371 /usr/bin/ceph-osd -f --cluster ceph --id 1 --setuser ceph --setgroup ceph

Oct 05 08:36:27 osd0 systemd[1]: Starting Ceph object storage daemon osd.1...
Oct 05 08:36:27 osd0 systemd[1]: Started Ceph object storage daemon osd.1.
Oct 05 08:36:27 osd0 ceph-osd[6371]: starting osd.1 at - osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
Oct 05 08:36:27 osd0 ceph-osd[6371]: 2017-10-05 08:36:27.160526 7f4573b8dd00 -1 osd.1 0 log_to_monitors {default=true}
Oct 05 08:36:28 osd0 ceph-osd[6371]: 2017-10-05 08:36:28.804068 7f455a103700 -1 osd.1 0 waiting for initial osdmap

After the port change and reboot:

[vagrant@osd0 ~]$ sudo ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       0.00778 root default
-3       0.00778     host osd0
 0   hdd 0.00519         osd.0      up  1.00000 1.00000
 1   hdd 0.00259         osd.1      up  1.00000 1.00000

[vagrant@osd0 ~]$ sudo lsblk
NAME                      MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                         8:0    0    8G  0 disk
├─sda1                      8:1    0    1G  0 part /boot
└─sda2                      8:2    0    7G  0 part
  ├─rhel-root             253:0    0  6.2G  0 lvm  /
  └─rhel-swap             253:1    0  820M  0 lvm  [SWAP]
sdb                         8:16   0 10.8G  0 disk
├─sdb1                      8:17   0  5.4G  0 part
└─sdb2                      8:18   0  5.4G  0 part
  └─journals-journal1     253:3    0  5.4G  0 lvm
sdc                         8:32   0 10.8G  0 disk
sdd                         8:48   0 10.8G  0 disk
├─test_group-data--lv1    253:2    0  5.4G  0 lvm  /var/lib/ceph/osd/ceph-0
└─test_group-data--lv2    253:4    0  2.7G  0 lvm  /var/lib/ceph/osd/ceph-1

[vagrant@osd0 ~]$ sudo service ceph-osd@0 status
Redirecting to /bin/systemctl status ceph-osd
● ceph-osd - Ceph object storage daemon osd.0
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-10-05 08:43:29 PDT; 59s ago
  Process: 1289 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 1313 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd
           └─1313 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph

Oct 05 08:43:29 osd0 systemd[1]: Starting Ceph object storage daemon osd.0...
Oct 05 08:43:29 osd0 systemd[1]: Started Ceph object storage daemon osd.0.
Oct 05 08:43:29 osd0 ceph-osd[1313]: starting osd.0 at - osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
Oct 05 08:43:29 osd0 ceph-osd[1313]: 2017-10-05 08:43:29.686160 7fef23ea0d00 -1 osd.0 8 log_to_monitors {default=true}

[vagrant@osd0 ~]$ sudo service ceph-osd@1 status
Redirecting to /bin/systemctl status ceph-osd
● ceph-osd - Ceph object storage daemon osd.1
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-10-05 08:43:29 PDT; 1min 18s ago
  Process: 1290 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 1314 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd
           └─1314 /usr/bin/ceph-osd -f --cluster ceph --id 1 --setuser ceph --setgroup ceph

Oct 05 08:43:29 osd0 systemd[1]: Starting Ceph object storage daemon osd.1...
Oct 05 08:43:29 osd0 systemd[1]: Started Ceph object storage daemon osd.1.
Oct 05 08:43:29 osd0 ceph-osd[1314]: starting osd.1 at - osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
Oct 05 08:43:29 osd0 ceph-osd[1314]: 2017-10-05 08:43:29.687066 7f2391344d00 -1 osd.1 8 log_to_monitors {default=true}

You can see that all OSDs are still up and that lsblk reports a different device name for the journal partition.
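For anyone repeating this verification, a quick spot-check (a sketch only; the exact LVM tag names can differ between ceph-volume releases) is to confirm where the OSD's journal symlink resolves after the reboot and that the journal is referenced by UUID rather than by /dev name:

# The journal symlink in the OSD data dir should still resolve to a real device:
sudo readlink -f /var/lib/ceph/osd/ceph-0/journal

# The OSD data LVs carry the journal reference as LVM tags, which is what makes
# the /dev/sdc1 -> /dev/sdb1 rename harmless; the tags can be inspected with:
sudo lvs -o lv_name,vg_name,lv_tags --noheadings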
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3387