Bug 1479797

Summary:	do not ignore non-zero exit status when activating
Product:	[Red Hat Storage] Red Hat Ceph Storage	Reporter:	Alfredo Deza <adeza>
Component:	Ceph-Volume	Assignee:	Alfredo Deza <adeza>
Status:	CLOSED ERRATA	QA Contact:	shylesh <shmohan>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	3.0	CC:	ceph-eng-bugs, ceph-qe-bugs, gmeno, hnallurv, icolle, kdreyer, rperiyas
Target Milestone:	rc
Target Release:	3.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	RHEL: ceph-12.1.4-1.el7cp Ubuntu: ceph_12.1.4-2redhat1xenial	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2017-12-05 23:39:05 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Alfredo Deza 2017-08-09 12:18:13 UTC

Description of problem: The activate-on-boot workflow relies on a non-zero exit status to detect that a volume has not come up yet, so that it can retry the mount a few times. Activate does not check the exit status of the commands it runs, so the failure is ignored, the workflow never retries, and volumes are not activated after a reboot.
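
The retry behaviour described above can be sketched roughly as follows. This is a minimal illustration and not the ceph-volume implementation: the function name `run_with_retries` and the `tries`, `interval`, and `sleep` parameters are hypothetical, chosen only to show a command being re-run while its exit status is non-zero and the failure being raised instead of ignored.

```python
import subprocess
import time


def run_with_retries(command, tries=5, interval=1.0, sleep=time.sleep):
    """Run `command` until it exits with status 0 or `tries` is exhausted.

    Returns the number of attempts used. Raises RuntimeError if every
    attempt exits non-zero. Names and defaults are hypothetical.
    """
    for attempt in range(1, tries + 1):
        # subprocess.call returns the exit status of the command
        returncode = subprocess.call(command)
        if returncode == 0:
            return attempt
        # Only wait when another attempt is still going to happen
        if attempt < tries:
            sleep(interval)
    # Surface the failure so the caller cannot mistake it for success
    raise RuntimeError(
        "command failed after %d tries (last exit status %d): %s"
        % (tries, returncode, " ".join(command))
    )
```

The point of raising at the end is that a caller such as a boot-time trigger sees the failure and can decide what to do, instead of continuing to later steps as if the mount had worked.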


How reproducible: Not always. It requires rebooting a system that is slow to boot.


Actual results:
[root@ceph-osd0 ceph]# tail ceph-volume-systemd.log
[2017-08-09 12:07:19,369][systemd][INFO  ] raw systemd input received: lvm-0-8138fb63-affc-4aae-b784-346b86d09439
[2017-08-09 12:07:19,369][systemd][INFO  ] parsed sub-command: lvm, extra data: 0-8138fb63-affc-4aae-b784-346b86d09439
[2017-08-09 12:07:19,369][ceph_volume.process][INFO  ] Running command: ceph-volume lvm trigger 0-8138fb63-affc-4aae-b784-346b86d09439
[2017-08-09 12:07:25,627][ceph_volume.process][INFO  ] stdout Running command: sudo lvs -o lv_tags,lv_path,lv_name,vg_name --reportformat=json
Running command: sudo mount -v /dev/test_group/test_volume /var/lib/ceph/osd/ceph-0
 stderr: mount: special device /dev/test_group/test_volume does not exist
Running command: chown -R ceph:ceph /dev/sdc
Running command: sudo systemctl enable ceph-volume@lvm-0-8138fb63-affc-4aae-b784-346b86d09439
Running command: sudo systemctl start ceph-osd@0
[2017-08-09 12:07:25,683][systemd][INFO  ] successfully trggered activation for: 0-8138fb63-affc-4aae-b784-346b86d09439


Expected results: The OSD is actually mounted and started

Comment 2 Alfredo Deza 2017-08-10 11:59:05 UTC

Merged to upstream master as part of pull request:

    https://github.com/ceph/ceph/pull/16919

Relevant commits:

ceph-volume: lvm activate should not ignore exit status codes
c866123017a1defac249bebe76cc7bbaddf3cf67

ceph-volume: util add a helper to check if device is mounted
d77d86aae11fba01834bb8d60633f3f49126c783

ceph-volume: lvm activate should check if the device is mounted to prevent errors from mount
c61aea41f1d07b824e169bf12328b7eb0055e23f
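
For readers unfamiliar with the idea behind the "check if device is mounted" commits, here is a rough sketch of such a check. It is not the helper added in the commits above: the name `is_mounted`, its parameters, and the choice to parse `/proc/mounts` are assumptions made for illustration. Paths are resolved with `os.path.realpath` because LVM device paths are commonly symlinks. The sketch does not handle escaped characters (such as spaces) in mount paths.

```python
import os


def is_mounted(source, destination=None, mounts_file="/proc/mounts"):
    """Return True if `source` is listed as a mounted device in `mounts_file`.

    If `destination` is given, the mount point has to match as well.
    Hypothetical helper for illustration only.
    """
    source = os.path.realpath(source)
    with open(mounts_file) as mounts:
        for line in mounts:
            fields = line.split()
            if len(fields) < 2:
                continue
            device, mount_point = fields[0], fields[1]
            # Compare resolved paths so symlinked device names still match
            if os.path.realpath(device) != source:
                continue
            if destination is None:
                return True
            if os.path.realpath(mount_point) == os.path.realpath(destination):
                return True
    return False
```

With a check like this, activation can skip the `mount` call when the volume is already mounted, so that a non-zero exit status from `mount` only signals a real failure.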

Comment 3 Ken Dreyer (Red Hat) 2017-08-10 14:43:14 UTC

At this point luminous is permanently branched from master (http://marc.info/?l=ceph-devel&m=150212189321868&w=2)

We need a PR to the luminous branch with the appropriate cherry-picks in order for this to land in v12.2.0 upstream.

Comment 4 Alfredo Deza 2017-08-10 15:35:34 UTC

PR targeting luminous: https://github.com/ceph/ceph/pull/16970/

Comment 9 Ramakrishnan Periyasamy 2017-11-06 11:25:50 UTC

Configured ceph-volume OSDs and rebooted the machine 15 to 20 times. On 3 of those reboots the boot was slow, taking up to 30-45 seconds.

No problems were observed. The OSDs came up without any issues.

Based on the above data, moving this bug to the verified state. Let me know if there are any objections.

Verified in build:
ceph version 12.2.1-39.el7cp (22e26be5a4920c95c43f647b31349484f663e4b9) luminous (stable)

Comment 12 errata-xmlrpc 2017-12-05 23:39:05 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387