Bug 1485011

Summary: cannot consistently use non-lv devices as journals
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Alfredo Deza <adeza>
Component: Ceph-Volume
Assignee: Alfredo Deza <adeza>
Status: CLOSED ERRATA
QA Contact: Andrew Schoen <aschoen>
Severity: high
Docs Contact: Bara Ancincova <bancinco>
Priority: high
Version: 3.0
CC: adeza, ceph-eng-bugs, ceph-qe-bugs, gmeno, hnallurv, icolle, kdreyer
Target Milestone: rc
Target Release: 3.0
Hardware: Unspecified
OS: Unspecified
Fixed In Version: RHEL: ceph-12.2.0-2.el7cp; Ubuntu: ceph_12.2.0-3redhat1xenial
Doc Type: No Doc Update
Last Closed: 2017-12-05 23:41:09 UTC
Type: Bug

Description Alfredo Deza 2017-08-24 20:17:51 UTC
Description of problem: When a raw device like /dev/sdc is used as a journal, its device name can change after a system reboot, which prevents the OSD from starting.


How reproducible: Not consistent; triggering this bug depends on rebooting the system enough times that the device ordering is no longer what it was when the journal was created.


Steps to Reproduce:
1. Deploy an OSD with `ceph-volume lvm`, using a raw device or partition as the journal (see the sketch after these steps)
2. Reboot until the OSD doesn't come up
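
A minimal sketch for step 1, assuming the VG/LV name that appears in the verification output below (test_group/data-lv1) and a raw partition (/dev/sdc1) as the journal; exact flags may differ between ceph-volume releases:

# illustrative invocation only, not the exact command used in this report
sudo ceph-volume lvm create --data test_group/data-lv1 --journal /dev/sdc1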

Actual results:
2017-08-24 19:57:08.674929 7f33e226ce00 -1 journal read_header error decoding journal header
2017-08-24 19:57:08.675318 7f33e226ce00 -1 filestore(/var/lib/ceph/osd/ceph-0) mount(1821): failed to open journal /var/lib/ceph/osd/ceph-0/journal: (22) Invalid argument
2017-08-24 19:57:08.676046 7f33e226ce00 -1 osd.0 0 OSD:init: unable to mount object store
2017-08-24 19:57:08.676057 7f33e226ce00 -1  ** ERROR: osd init failed: (22) Invalid argument


Expected results:
The OSD is started correctly

Additional info: The fix is to rely on LVM once again: make the journal device a physical volume ("PV") so that its UUID can be captured and used to retrieve the device later.
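
The approach can be sketched with plain LVM commands (an illustration of the idea, not the actual ceph-volume implementation):

# label the journal partition as a PV and record its UUID, which is stable across reboots
sudo pvcreate /dev/sdc1
sudo pvs --noheadings -o pv_name,pv_uuid /dev/sdc1

# later, resolve the current device name from the stored UUID instead of the old /dev/sdc1 path
sudo pvs --noheadings -o pv_name -S pv_uuid=<stored-uuid>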

Comment 14 Andrew Schoen 2017-10-05 19:21:17 UTC
I verified this on RHEL 7.4 with 12.2.1-9.el7cp and on Xenial with 12.2.1-10redhat1xenial

To verify, I deployed a cluster with ceph-ansible onto Vagrant VMs, using /dev/sdc1 as a journal for one of the lvm OSDs. After deployment I shut down the OSD VM, used the VirtualBox GUI to force a port change on /dev/sda, which causes all device names to change, and then restarted the VM. After the restart /dev/sdc1 had been renamed to /dev/sdb1, and all OSDs were still up and running.

Sample output from the RHEL 7.4 test below:

Before the port change:
[vagrant@osd0 ~]$ sudo ceph osd tree
ID CLASS WEIGHT  TYPE NAME     STATUS REWEIGHT PRI-AFF
-1       0.00778 root default
-3       0.00778     host osd0
 0   hdd 0.00519         osd.0     up  1.00000 1.00000
 1   hdd 0.00259         osd.1     up  1.00000 1.00000
[vagrant@osd0 ~]$ sudo lsblk
NAME                      MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                         8:0    0    8G  0 disk
├─sda1                      8:1    0    1G  0 part /boot
└─sda2                      8:2    0    7G  0 part
  ├─rhel-root             253:0    0  6.2G  0 lvm  /
  └─rhel-swap             253:1    0  820M  0 lvm  [SWAP]
sdb                         8:16   0 10.8G  0 disk
├─test_group-data--lv1    253:2    0  5.4G  0 lvm  /var/lib/ceph/osd/ceph-0
└─test_group-data--lv2    253:3    0  2.7G  0 lvm  /var/lib/ceph/osd/ceph-1
sdc                         8:32   0 10.8G  0 disk
├─sdc1                      8:33   0  5.4G  0 part
└─sdc2                      8:34   0  5.4G  0 part
  └─journals-journal1     253:4    0  5.4G  0 lvm
sdd                         8:48   0 10.8G  0 disk
loop0                       7:0    0  100G  0 loop
└─docker-253:0-12006-pool 253:5    0  100G  0 dm
loop1                       7:1    0    2G  0 loop
└─docker-253:0-12006-pool 253:5    0  100G  0 dm
[vagrant@osd0 ~]$ sudo service ceph-osd@0 status
Redirecting to /bin/systemctl status ceph-osd
● ceph-osd - Ceph object storage daemon osd.0
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-10-05 08:36:24 PDT; 3min 57s ago
 Main PID: 6024 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd
           └─6024 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph

Oct 05 08:36:24 osd0 systemd[1]: Starting Ceph object storage daemon osd.0...
Oct 05 08:36:24 osd0 systemd[1]: Started Ceph object storage daemon osd.0.
Oct 05 08:36:24 osd0 ceph-osd[6024]: starting osd.0 at - osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
Oct 05 08:36:24 osd0 ceph-osd[6024]: 2017-10-05 08:36:24.993907 7f4c1dd06d00 -1 osd.0 0 log_to_monitors {default=true}
Oct 05 08:36:25 osd0 ceph-osd[6024]: 2017-10-05 08:36:25.783976 7f4c0427c700 -1 osd.0 0 waiting for initial osdmap
[vagrant@osd0 ~]$ sudo service ceph-osd@1 status
Redirecting to /bin/systemctl status ceph-osd
● ceph-osd - Ceph object storage daemon osd.1
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-10-05 08:36:27 PDT; 4min 1s ago
 Main PID: 6371 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd
           └─6371 /usr/bin/ceph-osd -f --cluster ceph --id 1 --setuser ceph --setgroup ceph

Oct 05 08:36:27 osd0 systemd[1]: Starting Ceph object storage daemon osd.1...
Oct 05 08:36:27 osd0 systemd[1]: Started Ceph object storage daemon osd.1.
Oct 05 08:36:27 osd0 ceph-osd[6371]: starting osd.1 at - osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
Oct 05 08:36:27 osd0 ceph-osd[6371]: 2017-10-05 08:36:27.160526 7f4573b8dd00 -1 osd.1 0 log_to_monitors {default=true}
Oct 05 08:36:28 osd0 ceph-osd[6371]: 2017-10-05 08:36:28.804068 7f455a103700 -1 osd.1 0 waiting for initial osdmap

After the port change and reboot:
[vagrant@osd0 ~]$ sudo ceph osd tree
ID CLASS WEIGHT  TYPE NAME     STATUS REWEIGHT PRI-AFF
-1       0.00778 root default
-3       0.00778     host osd0
 0   hdd 0.00519         osd.0     up  1.00000 1.00000
 1   hdd 0.00259         osd.1     up  1.00000 1.00000
[vagrant@osd0 ~]$ sudo lsblk
NAME                   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                      8:0    0    8G  0 disk
├─sda1                   8:1    0    1G  0 part /boot
└─sda2                   8:2    0    7G  0 part
  ├─rhel-root          253:0    0  6.2G  0 lvm  /
  └─rhel-swap          253:1    0  820M  0 lvm  [SWAP]
sdb                      8:16   0 10.8G  0 disk
├─sdb1                   8:17   0  5.4G  0 part
└─sdb2                   8:18   0  5.4G  0 part
  └─journals-journal1  253:3    0  5.4G  0 lvm
sdc                      8:32   0 10.8G  0 disk
sdd                      8:48   0 10.8G  0 disk
├─test_group-data--lv1 253:2    0  5.4G  0 lvm  /var/lib/ceph/osd/ceph-0
└─test_group-data--lv2 253:4    0  2.7G  0 lvm  /var/lib/ceph/osd/ceph-1
[vagrant@osd0 ~]$ sudo service ceph-osd@0 status
Redirecting to /bin/systemctl status ceph-osd
● ceph-osd - Ceph object storage daemon osd.0
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-10-05 08:43:29 PDT; 59s ago
  Process: 1289 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 1313 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd
           └─1313 /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph

Oct 05 08:43:29 osd0 systemd[1]: Starting Ceph object storage daemon osd.0...
Oct 05 08:43:29 osd0 systemd[1]: Started Ceph object storage daemon osd.0.
Oct 05 08:43:29 osd0 ceph-osd[1313]: starting osd.0 at - osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
Oct 05 08:43:29 osd0 ceph-osd[1313]: 2017-10-05 08:43:29.686160 7fef23ea0d00 -1 osd.0 8 log_to_monitors {default=true}
[vagrant@osd0 ~]$ sudo service ceph-osd@1 status
Redirecting to /bin/systemctl status ceph-osd
● ceph-osd - Ceph object storage daemon osd.1
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-10-05 08:43:29 PDT; 1min 18s ago
  Process: 1290 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 1314 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd
           └─1314 /usr/bin/ceph-osd -f --cluster ceph --id 1 --setuser ceph --setgroup ceph

Oct 05 08:43:29 osd0 systemd[1]: Starting Ceph object storage daemon osd.1...
Oct 05 08:43:29 osd0 systemd[1]: Started Ceph object storage daemon osd.1.
Oct 05 08:43:29 osd0 ceph-osd[1314]: starting osd.1 at - osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
Oct 05 08:43:29 osd0 ceph-osd[1314]: 2017-10-05 08:43:29.687066 7f2391344d00 -1 osd.1 8 log_to_monitors {default=true}

You can see that all OSDs are still up and that lsblk reports a different device name for the journal partition.
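
A quick way to double-check the journal mapping after the rename (assumed commands, using the paths from this report):

# the journal symlink in the OSD data directory should resolve to the renamed device
sudo readlink -f /var/lib/ceph/osd/ceph-0/journal

# ceph-volume can also report what it recorded for each OSD, if the installed release provides this subcommand
sudo ceph-volume lvm list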

Comment 20 errata-xmlrpc 2017-12-05 23:41:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387