Bug 1615872
Summary: | purge cluster: do not umount /var/lib/ceph | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Sébastien Han <shan> |
Component: | Ceph-Ansible | Assignee: | Sébastien Han <shan> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | subhash <vpoliset> |
Severity: | medium | Docs Contact: | Aron Gunn <agunn> |
Priority: | medium | ||
Version: | 3.1 | CC: | agunn, anharris, aschoen, ceph-eng-bugs, ceph-qe-bugs, gabrioux, gmeno, hgurav, hnallurv, jbrier, kdreyer, nthomas, sankarshan, shan, tserlin |
Target Milestone: | z1 | Keywords: | TestOnly |
Target Release: | 3.1 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | RHEL: ceph-ansible-3.1.3.el7cp Ubuntu: ceph-ansible_3.1.3-2redhat1 | Doc Type: | Bug Fix |
Doc Text: |
.Purging the cluster no longer unmounts a partition from /var/lib/ceph
Previously, if you mounted a partition to /var/lib/ceph, running the purge playbook caused a failure when it tried to unmount it.
With this update, partitions mounted to /var/lib/ceph are not unmounted during a cluster purge.
|
Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2019-01-08 17:26:42 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1584264 |
Description
Sébastien Han
2018-08-14 12:59:38 UTC
In https://github.com/ceph/ceph-ansible/releases/tag/v3.0.43 and https://github.com/ceph/ceph-ansible/releases/tag/v3.1.0rc18

Hi Sebastian, steps followed to verify:

0) Created a directory to mount in /var/lib/ceph/:
       mkdir /var/lib/ceph/mntdir
1) Mounted a partition on /var/lib/ceph (used a disk partition which is not part of the cluster):
       mount /dev/sdc1 /var/lib/ceph/mntdir
2) Purged the cluster (the playbook failed):

RUNNING HANDLER [remove data] **************************************************************
The full traceback is:
  File "/tmp/ansible_rai1lQ/ansible_module_file.py", line 278, in main
    shutil.rmtree(b_path, ignore_errors=False)
  File "/usr/lib64/python2.7/shutil.py", line 247, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "/usr/lib64/python2.7/shutil.py", line 256, in rmtree
    onerror(os.rmdir, path, sys.exc_info())
  File "/usr/lib64/python2.7/shutil.py", line 254, in rmtree
    os.rmdir(path)
fatal: [magna021]: FAILED! => {
    "changed": false,
    "failed": true,
    "invocation": {
        "module_args": {
            "attributes": null,
            "backup": null,
            "content": null,
            "delimiter": null,
            "diff_peek": null,
            "directory_mode": null,
            "follow": false,
            "force": false,
            "group": null,
            "mode": null,
            "original_basename": null,
            "owner": null,
            "path": "/var/lib/ceph",
            "recurse": false,
            "regexp": null,
            "remote_src": null,
            "selevel": null,
            "serole": null,
            "setype": null,
            "seuser": null,
            "src": null,
            "state": "absent",
            "unsafe_writes": null,
            "validate": null
        }
    },
    "msg": "rmtree failed: [Errno 16] Device or resource busy: '/var/lib/ceph/mntdir'"

3) The purge should be successful; only the OSD directories should be unmounted.
4) Recreate the cluster; the cluster should come up successfully.

Please let me know if you have any concerns with the steps. Thanks.

That's not the right approach; /var/lib/ceph itself should be the mountpoint.
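The rmtree failure above is the generic behaviour of recursive removal when a filesystem is still mounted somewhere under the tree being deleted. A minimal shell sketch of the same failure mode, reusing /dev/sdc1 and the paths from the comment above (illustration only, not the playbook's actual tasks):

```bash
# Reproduce the EBUSY outside Ansible (run as root, on a scratch machine only).
mkdir -p /var/lib/ceph/mntdir
mount /dev/sdc1 /var/lib/ceph/mntdir   # spare partition, not part of the cluster

# Recursive removal descends into the mounted filesystem, deletes what it can,
# then fails to rmdir the mountpoint itself -- the same EBUSY the file module hits.
# (Note: this also deletes the contents of the mounted partition before failing.)
rm -rf /var/lib/ceph
# rm: cannot remove '/var/lib/ceph/mntdir': Device or resource busy

# Unmounting first (or skipping any directory for which `mountpoint -q` succeeds)
# avoids the error:
umount /var/lib/ceph/mntdir
rm -rf /var/lib/ceph
```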
I have even tried mounting a partition at /var/lib/ceph and got this error while purging:

TASK [umount osd data partition] ***************************************************************
task path: /usr/share/ceph-ansible/purge-cluster.yml:283
Friday 24 August 2018 05:03:56 +0000 (0:00:02.419) 0:00:41.310 *********
Using module file /usr/lib/python2.7/site-packages/ansible/modules/commands/command.py
Using module file /usr/lib/python2.7/site-packages/ansible/modules/commands/command.py
Using module file /usr/lib/python2.7/site-packages/ansible/modules/commands/command.py
<magna029> ESTABLISH SSH CONNECTION FOR USER: None
<magna028> ESTABLISH SSH CONNECTION FOR USER: None
<magna021> ESTABLISH SSH CONNECTION FOR USER: None
<magna028> SSH: EXEC ssh -vvv -o ControlMaster=auto -o ControlPersist=600s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=30 -o ControlPath=/root/.ansible/cp/%h-%r-%p magna028 '/bin/sh -c '"'"'sudo -H -S -n -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-ijwlzcpfwvhdgjztrqeyktdgdxffjurf; /usr/bin/python'"'"'"'"'"'"'"'"' && sleep 0'"'"''
<magna029> SSH: EXEC ssh -vvv -o ControlMaster=auto -o ControlPersist=600s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=30 -o ControlPath=/root/.ansible/cp/%h-%r-%p magna029 '/bin/sh -c '"'"'sudo -H -S -n -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-sepaaedppavugaibmpfvqkczqadufrxv; /usr/bin/python'"'"'"'"'"'"'"'"' && sleep 0'"'"''
<magna021> SSH: EXEC ssh -vvv -o ControlMaster=auto -o ControlPersist=600s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=30 -o ControlPath=/root/.ansible/cp/%h-%r-%p magna021 '/bin/sh -c '"'"'sudo -H -S -n -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-trarpzyyqdpfrhouhisakoemvvxwauvc; /usr/bin/python'"'"'"'"'"'"'"'"' && sleep 0'"'"''
<magna021> (1, '\n{"changed": true, "end": "2018-08-24 05:03:58.585417", "stdout": "", "cmd": "umount /var/lib/ceph/osd-lockbox/a9ae88fa-56ae-4025-8330-7e2fc36b875c", "failed": true, "delta": "0:00:00.033318", "stderr": "umount: /var/lib/ceph/osd-lockbox/a9ae88fa-56ae-4025-8330-7e2fc36b875c: mountpoint not found", "rc": 32, "invocation": {"module_args": {"warn": true, "executable": null, "_uses_shell": true, "_raw_params": "umount /var/lib/ceph/osd-lockbox/a9ae88fa-56ae-4025-8330-7e2fc36b875c", "removes": null, "creates": null, "chdir": null, "stdin": null}}, "start": "2018-08-24 05:03:58.552099", "msg": "non-zero return code"}\n', 'OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017\r\ndebug1: Reading configuration data /root/.ssh/config\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 8: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 12793\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 2\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Received exit status from master 1\r\n')
failed: [magna021] (item=/var/lib/ceph/osd-lockbox/a9ae88fa-56ae-4025-8330-7e2fc36b875c) => {
    "changed": true,
    "cmd": "umount /var/lib/ceph/osd-lockbox/a9ae88fa-56ae-4025-8330-7e2fc36b875c",
    "delta": "0:00:00.033318",
    "end": "2018-08-24 05:03:58.585417",
    "failed": true,
    "invocation": {
        "module_args": {
            "_raw_params": "umount /var/lib/ceph/osd-lockbox/a9ae88fa-56ae-4025-8330-7e2fc36b875c",
            "_uses_shell": true,
            "chdir": null,
            "creates": null,
            "executable": null,
            "removes": null,
            "stdin": null,
            "warn": true
        }
    },
    "item": "/var/lib/ceph/osd-lockbox/a9ae88fa-56ae-4025-8330-7e2fc36b875c",
    "msg": "non-zero return code",
    "rc": 32,
    "start": "2018-08-24 05:03:58.552099",
    "stderr": "umount: /var/lib/ceph/osd-lockbox/a9ae88fa-56ae-4025-8330-7e2fc36b875c: mountpoint not found",
    "stderr_lines": [
        "umount: /var/lib/ceph/osd-lockbox/a9ae88fa-56ae-4025-8330-7e2fc36b875c: mountpoint not found"
    ],
    "stdout": "",

If you run ls in /var/lib/ceph after mounting a partition at that location, you only see a lost+found directory; none of the files that were there before are visible. Hence the error above.

The mountpoint /var/lib/ceph must contain the files from the original /var/lib/ceph/.

As discussed, followed the steps below to verify:
1) Created the /var/lib/ceph directory on nodeX and mounted a disk partition in it (the disk partition is not part of the Ceph cluster).
2) Deployed the Ceph cluster with ceph-ansible, with nodeX as one of the OSD nodes. The cluster deployed fine.
3) Purged the Ceph cluster. purge-cluster.yml errored at the task [remove ceph systemd unit files] --> running handler "remove data":
       "msg": "rmtree failed: [Errno 16] Device or resource busy: '/var/lib/ceph'"
   (the subsequent tasks worked fine)
Attaching logs. Version: ceph-ansible-3.1.0-0.1.rc21.el7cp.noarch

Followed the steps as per comment #12; the purge works fine. Moving to verified state.
[ubuntu@magna097 ~]$ rpm -qa | grep ansible
ansible-2.4.6.0-1.el7ae.noarch
ceph-ansible-3.1.9-1.el7cp.noarch

Updated Doc Text from Known Issue to Bug Fix.

Code landed in ceph-ansible v3.1.3; we shipped v3.1.5 in https://access.redhat.com/errata/RHBA-2018:2819

QE verified on ceph-ansible-3.1.9-1.el7cp. The latest available version is ceph-ansible-3.2.0-1.el7cp from http://access.redhat.com/errata/RHBA-2019:0020
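For reference, a condensed shell sketch of the verification flow from the comments above. /dev/sdc1 and the purge-cluster.yml path are taken from this report; the filesystem type and inventory name are placeholders:

```bash
# On the OSD node chosen for the test, before deploying the cluster:
mkfs.xfs /dev/sdc1              # spare partition NOT part of the cluster (fs type is an example)
mkdir -p /var/lib/ceph
mount /dev/sdc1 /var/lib/ceph   # /var/lib/ceph itself is the mountpoint

# Deploy the cluster with ceph-ansible, then purge it from the admin node:
#   ansible-playbook -i <inventory> /usr/share/ceph-ansible/purge-cluster.yml

# With the fix, the purge completes and the partition stays mounted:
mountpoint /var/lib/ceph        # expected: "/var/lib/ceph is a mountpoint"
findmnt /var/lib/ceph
```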