Bug 1656935
Summary: ceph-ansible: purge-cluster.yml fails when initiated second time
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Tiffany Nguyen <tunguyen>
Component: Ceph-Ansible
Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED NEXTRELEASE
QA Contact: Vasishta <vashastr>
Severity: low
Docs Contact: John Brier <jbrier>
Priority: low
Version: 3.2
CC: anharris, aschoen, ceph-eng-bugs, ceph-qe-bugs, edonnell, gabrioux, gmeno, hnallurv, jbrier, kdreyer, nthomas, pasik, sankarshan, seb, shan, tchandra, tserlin, tunguyen
Target Milestone: rc
Keywords: Reopened
Target Release: 3.3
Hardware: Unspecified
OS: Linux
Fixed In Version: RHEL: ceph-ansible-3.2.12-1.el7cp Ubuntu: ceph-ansible_3.2.12-2redhat1
Doc Type: Bug Fix
Doc Text: |
.The `ceph-ansible purge-cluster.yml` playbook no longer fails when run against a cluster that has already been purged
Previously, the `ceph-ansible purge-cluster.yml` playbook failed when run against a cluster that had already been purged. This was because `ceph-volume` had been removed during the first run, and the command could no longer be found. With this update, the underlying issue has been fixed, and running `ceph-ansible purge-cluster.yml` for a second time no longer fails.
|
Story Points: ---
Clones: 1722663 (view as bug list)
Last Closed: 2019-06-20 22:39:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Bug Blocks: 1722663
Does /dev/sda exist on your system? I think this is happening because on the second purge ceph-volume no longer exists on the system. The playbook will need to be modified to skip that task if ceph-volume isn't installed. If you have reached the point where packages have been uninstalled, then it is pointless to run purge a second time, but we can probably handle that error more elegantly.

With the latest build, 3.2.0-1.el7cp, I don't see the failure any more. It is now skipping the task:

    TASK [zap and destroy osds created by ceph-volume with devices] ******************************************************
    Thursday 13 December 2018 17:43:38 +0000 (0:00:00.160) 0:00:13.161 *****
    skipping: [mero005] => (item=/dev/sda)
    skipping: [mero005] => (item=/dev/sdb)
    skipping: [mero005] => (item=/dev/sdc)
    skipping: [mero005] => (item=/dev/sdd)

Did we check in the fix for this in the latest build?

https://github.com/ceph/ceph-ansible/pull/3446 is now available in ceph-ansible v3.2.1.

Hi,

LVMs are not removed from Ubuntu machines when the playbook is initiated as a non-root user, because the 'command' in the task "see if ceph-volume is installed" (newly introduced as part of PR 3436) is not working as expected.

log - https://bugzilla.redhat.com/attachment.cgi?id=1520023

ansible user - ubuntu

    $ sudo cat /etc/sudoers.d/ubuntu
    ubuntu ALL = (root) NOPASSWD:ALL
    ubuntu@magna029:~$ ls -l /etc/sudoers.d/ubuntu
    -r--r----- 1 root root 33 Jan 11 06:57 /etc/sudoers.d/ubuntu

When I tried 'command' manually as the ansible user -

    $ sudo command -v ceph-volume
    sudo: command: command not found

Moving back to ASSIGNED state; request you to kindly look into this and let me know your views.

Regards,
Vasishta Shastry
QE, Ceph

Yes, the fix is part of 3.2.3. Thanks

Hi Andrew,

Request you to kindly provide your views on Comment 13. The fix for this BZ is blocking verification of the fix for Bug 1653307.

Created attachment 1558140 [details]
File contains log snippets
(Attachment contains failure log snippet, inventory with lvm_volumes argument, lsblk before second run, playbook log)
The playbook fails when initiated a second time if there were any existing LVs that were not part of the cluster.
Moving back to ASSIGNED state, reducing severity to low.
Regards,
Vasishta Shastry
QE, Ceph
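The `sudo: command: command not found` error reported above occurs because `command` is a shell builtin, not an executable on `$PATH`: `sudo` looks for a binary named `command` and finds nothing. A minimal illustration (no Ceph involved; the `sh -c` workaround shown in the comment is a general shell pattern, not necessarily the exact change ceph-ansible made):

```shell
# 'command' is a builtin of the shell, not a program on $PATH,
# so there is no /usr/bin/command for sudo to execute directly.
type command
# -> e.g. "command is a shell builtin"

# Inside a shell the builtin works and prints the path of a real binary:
command -v sh

# Under privilege escalation, wrapping the builtin in a shell invocation
# makes it resolve, e.g.:
#   sudo sh -c 'command -v ceph-volume'
```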
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:0911
Created attachment 1512212 [details]
ansible-playbook log

Description of problem:
purge-cluster.yml fails when initiated a second time:

    TASK [zap and destroy osds created by ceph-volume with devices] ********************************************************************************
    Thursday 06 December 2018 16:30:57 +0000 (0:00:00.142) 0:00:14.177 *****
    failed: [mero005] (item=/dev/sda) => {"changed": false, "cmd": "ceph-volume lvm zap --destroy /dev/sda", "item": "/dev/sda", "msg": "[Errno 2] No such file or directory", "rc": 2}
    ....

Version-Release number of selected component (if applicable):
ceph-ansible: 3.2.0-0.1.rc8.el7cp

How reproducible:
1. Deploy a cluster with 3.2.0-0.1.rc8.el7cp.
2. Run purge-cluster.yml twice: the first purge works as expected; the second fails at the "zap and destroy osds" task.

Expected results:
The task should be skipped.
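The shape of the fix is a guard: register whether the ceph-volume binary is still present, and skip the zap task when it is not. A hypothetical Ansible sketch under those assumptions (task names taken from the log above; the `devices` variable and `shell` usage are illustrative, not the exact tasks merged in PR 3446):

```yaml
# Hypothetical sketch, not the exact ceph-ansible tasks.
- name: see if ceph-volume is installed
  # 'shell' (not 'command') so the 'command -v' builtin resolves,
  # including when running with become as a non-root user.
  shell: command -v ceph-volume
  register: ceph_volume_present
  failed_when: false
  changed_when: false

- name: zap and destroy osds created by ceph-volume with devices
  command: "ceph-volume lvm zap --destroy {{ item }}"
  with_items: "{{ devices }}"
  # Skip on a second purge, when the ceph-volume package is already gone.
  when: ceph_volume_present.rc == 0
```

With this guard, a second run produces the `skipping: [mero005] => (item=/dev/sda)` output seen in the verification comment instead of the `[Errno 2] No such file or directory` failure.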