Created attachment 1512212 [details]
ansible-playbook log

Description of problem:
purge-cluster.yml fails when initiated a second time:

TASK [zap and destroy osds created by ceph-volume with devices] ********************************************************************************
Thursday 06 December 2018  16:30:57 +0000 (0:00:00.142)       0:00:14.177 *****
failed: [mero005] (item=/dev/sda) => {"changed": false, "cmd": "ceph-volume lvm zap --destroy /dev/sda", "item": "/dev/sda", "msg": "[Errno 2] No such file or directory", "rc": 2}
....

Version-Release number of selected component (if applicable):
ceph-ansible: 3.2.0-0.1.rc8.el7cp

How reproducible:
1. Deploy a cluster with 3.2.0-0.1.rc8.el7cp.
2. Run purge-cluster.yml twice: the first purge works as expected, the second purge fails at the "zap and destroy osds" task.

Expected results:
The task should be skipped.
Does /dev/sda exist on your system?
I think this is happening because on the second purge ceph-volume doesn't exist on the system anymore. The playbook will need to be modified to skip that task if ceph-volume isn't installed.
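For illustration, here is a minimal sketch of that kind of guard. This is not the actual upstream change; the task name "check whether ceph-volume is still installed", the register variable ceph_volume_present, and the devices variable are illustrative assumptions:

# Hypothetical guard: only zap devices if the ceph-volume binary is still present.
- name: check whether ceph-volume is still installed
  shell: command -v ceph-volume
  register: ceph_volume_present
  failed_when: false
  changed_when: false

- name: zap and destroy osds created by ceph-volume with devices
  command: "ceph-volume lvm zap --destroy {{ item }}"
  with_items: "{{ devices }}"
  when: ceph_volume_present.rc == 0

With a guard like this, the zap task would simply be skipped on a host where ceph-volume has already been removed, which matches the expected result above.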
If you have reached the point where packages have been uninstalled, then it's pointless to run the purge a second time, but we can probably handle that error more elegantly.
With the latest build, 3.2.0-1.el7cp, I don't see the failure any more. It is now skipping the task:

TASK [zap and destroy osds created by ceph-volume with devices] ******************************************************
Thursday 13 December 2018  17:43:38 +0000 (0:00:00.160)       0:00:13.161 *****
skipping: [mero005] => (item=/dev/sda)
skipping: [mero005] => (item=/dev/sdb)
skipping: [mero005] => (item=/dev/sdc)
skipping: [mero005] => (item=/dev/sdd)

Was the fix for this checked into the latest build?
https://github.com/ceph/ceph-ansible/pull/3446 is now available in ceph-ansible v3.2.1.
Hi,

LVM volumes are not removed from Ubuntu machines when the playbook is initiated as a non-root user, because the 'command' in the task "see if ceph-volume is installed" (newly introduced as part of PR 3436) is not working as expected.

log - https://bugzilla.redhat.com/attachment.cgi?id=1520023

ansible user - ubuntu

$ sudo cat /etc/sudoers.d/ubuntu
ubuntu ALL = (root) NOPASSWD:ALL

ubuntu@magna029:~$ ls -l /etc/sudoers.d/ubuntu
-r--r----- 1 root root 33 Jan 11 06:57 /etc/sudoers.d/ubuntu

When 'command' is tried manually as the ansible user:

$ sudo command -v ceph-volume
sudo: command: command not found

Moving back to ASSIGNED state; request you to kindly look into this and let me know your views.

Regards,
Vasishta Shastry
QE, Ceph
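A likely explanation for the failure above: 'command' is a shell builtin, not an executable, so sudo cannot exec it directly, and Ansible's command module (which does not go through a shell) cannot either. The sketch below contrasts the failing pattern with a shell-based alternative; the task bodies are illustrative assumptions, not the actual ceph-ansible tasks:

# Fails on hosts without a standalone "command" binary: the command module
# execs its first word as an executable, and "command" is only a shell builtin.
- name: see if ceph-volume is installed
  command: command -v ceph-volume

# Works: the shell module runs the builtin inside a shell.
- name: see if ceph-volume is installed
  shell: command -v ceph-volume
  register: ceph_volume_present
  failed_when: false
  changed_when: false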
Yes, the fix is part of 3.2.3. Thanks
Hi Andrew,

Request you to kindly provide your views on Comment 13.

The fix for this BZ is blocking us from verifying the fix for Bug 1653307.
Created attachment 1558140 [details]
File contains log snippets

(The attachment contains the failure log snippet, the inventory with the lvm_volumes argument, lsblk output before the second run, and the playbook log.)

The playbook fails when initiated a second time if there were any existing LVs which were not part of the cluster.

Moving back to ASSIGNED state, reducing severity to low.

Regards,
Vasishta Shastry
QE, Ceph
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:0911