Bug 1337915 - purge-cluster.yml confused by presence of ceph installer, ceph kernel threads
Summary: purge-cluster.yml confused by presence of ceph installer, ceph kernel threads
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: rc
Target Release: 3.3
Assignee: Guillaume Abrioux
QA Contact: Vasishta
URL:
Whiteboard:
Depends On:
Blocks: 1726135
 
Reported: 2016-05-20 12:59 UTC by Ben England
Modified: 2019-08-21 15:10 UTC

Fixed In Version: RHEL: ceph-ansible-3.2.18-1.el7cp Ubuntu: ceph-ansible_3.2.18-2redhat1
Doc Type: Bug Fix
Doc Text:
.The `purge-cluster.yml` playbook no longer causes issues with redeploying a cluster
Previously, the `purge-cluster.yml` Ansible playbook did not clean up all Red Hat Ceph Storage kernel threads as it should have, and could leave CephFS mount points mounted and Ceph Block Devices mapped. This could prevent redeploying a cluster. With this update, the `purge-cluster.yml` Ansible playbook cleans up all Ceph kernel threads, unmounts all Ceph-related mount points on client nodes, and unmaps Ceph Block Devices so the cluster can be redeployed.
Clone Of:
Environment:
Last Closed: 2019-08-21 15:10:24 UTC
Embargoed:




Links
- Github ceph/ceph-ansible pull 4141 (closed): "purge: ensure no ceph kernel thread is present" (last updated 2020-08-13 15:17:06 UTC)
- Red Hat Product Errata RHSA-2019:2538 (last updated 2019-08-21 15:10:41 UTC)

Description Ben England 2016-05-20 12:59:45 UTC
Description of problem:

purge-cluster.yml ceph-ansible playbook does not complete on hosts that have ceph kernel threads or ceph installer present

So why do we care?  purge-cluster.yml is useful, almost essential, when you need to reconfigure the ceph cluster for testing, or at installation time when an install does not work as expected.

Version-Release number of selected component (if applicable):

RHEL7.2 GA
RHSCON-2 puddle from 5/18/16
CEPH-2 puddle from 5/19/16

How reproducible:

every time

Steps to Reproduce:
1. create a cluster in the usual way with ceph-ansible
2. mount ceph filesystem using KERNEL cephfs not FUSE
3. unmount ceph filesystem (kernel threads remain)
4. purge cluster using the purge-cluster.yml playbook

Actual results:

The task named "check for anything running ceph" fails even if ceph daemons are not running.  It's just not a good way to check that we shut them down; sorry I did it this way.  This can also happen if ceph-ansible is run from a bare metal host that is also being used to run ceph daemons such as ceph-mon. That is not a recommended practice, but it is sometimes useful if you have a few servers and no separate host or VM to use as a deployment host.

Expected results:

purge-cluster.yml should try to remove the ceph kernel modules and should fail if the kernel modules cannot be removed (which can happen if the cephfs kernel module or kernel RBD still has resources in use).


Additional info:

The patch that works for me so far just removes the "check for anything running ceph" task.  Instead, I would put this task up front:

+  - name: remove kernel ceph modules
+    shell: modprobe -rv ceph libceph

modprobe does not return an error status if the modules aren't there, so this will only fail if the modules are there and cannot be removed because they are in use.

Should there be something for kernel rbd in the modprobe list as well?
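
For illustration only, a sketch of the same task with rbd added to the removal list (untested; the dependent modules ceph and rbd come before libceph so that libceph can be unloaded last):

  - name: remove kernel ceph modules (sketch, with rbd included)
    shell: modprobe -rv ceph rbd libceph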

Comment 5 Sébastien Han 2019-01-10 16:15:25 UTC
Ramana, please take Ben's suggestion and send a PR for this.

Comment 6 Guillaume Abrioux 2019-02-11 10:40:23 UTC
Hi Ramana,

did you get a chance to take a look at this BZ? (see comment 5 from Sébastien).

Comment 7 Ram Raja 2019-02-12 11:53:56 UTC
(In reply to Guillaume Abrioux from comment #6)
> Hi Ramana,
> 
> did you get a chance to take a look at this BZ? (see comment 5 from
> Sébastien).

Yes, I did look at the BZ before, but I haven't started working on it. I am OK with someone else taking up this BZ if it is urgent.

Comment 8 Ram Raja 2019-02-27 11:10:18 UTC
I can no longer reproduce this issue. "purge-cluster.yml" completed on all hosts, including the ceph client machine where a Ceph filesystem was mounted and unmounted using kernel client.

The issue (back in 2016/05/20) seems to have been caused by the task "check for anything running ceph" in purge-cluster.yml working incorrectly. This particular task has undergone quite a few changes since then
and seems to work correctly now.

When this BZ was filed, the task was:
  - name: check for anything running ceph
    shell: "ps awux | grep -v grep | grep -q -- ceph-"
    register: check_for_running_ceph
    failed_when: check_for_running_ceph.rc == 0

   introduced by commit 90fd2c70

Now the task is:
  - name: check for anything running ceph
    command: "ps -u ceph -U ceph"
    register: check_for_running_ceph
    failed_when: check_for_running_ceph.rc == 0

    introduced by commit 5a3f95dfc

Also, I don't think we want to remove the libceph and rbd kernel modules, as ceph-ansible doesn't explicitly install them. The modules come bundled with the distribution kernel.
 

Ben, are you still hitting this issue with RHCS 3.2? If not we can close this BZ.

Comment 9 Ben England 2019-02-27 11:50:29 UTC
I haven't tried running purge-cluster or purge-docker-cluster playbooks this way lately.  So as far as I know it's still an issue.  

So it's true that ceph-ansible doesn't install libceph and rbd kernel module, and we really don't have to remove the modules, but we *do* have to unmap all RBD devices and unmount all Cephfs mountpoints before we install a new version or configuration of Ceph, right?  Otherwise these devices and mountpoints will have nothing supporting them and become difficult to clean up or re-initialize (may even require a reboot).  The "modprobe -rv" was put there only because it would fail if someone left /dev/rbd* or Cephfs mountpoints active on any host in the inventory.  

But in hindsight, I see that modprobe -rv is not sufficient to solve this problem, because of the way that ansible works - it will fail on that one host, stopping purge-cluster on that host, but other hosts will continue, and so the cluster will still be mostly taken down and the left-over RBD devices and mountpoints will still be problematic.   Can we change the modprobe -rv task so that it is done first, and if *any* host fails to do modprobe -rv rbd libceph, the whole playbook stops right there?  If not, the playbook should explicitly remove the /dev/rbd* and Cephfs mountpoints, with "rbd unmap" and "umount" commands, and do that at the very beginning of the purge-cluster run, before any other steps have been taken.  

RGW does not have this problem because the clients are using S3 or Swift, which are HTTP-based protocols, and there are no client resources that need to be removed in order to disconnect the clients from the servers.

Comment 10 Ram Raja 2019-03-01 15:01:30 UTC
(In reply to Ben England from comment #9)
> I haven't tried running purge-cluster or purge-docker-cluster playbooks this
> way lately.  So as far as I know it's still an issue.

Oh OK. So you're also concerned about a 'clean' purge of client machines that have
Ceph filesystems actively mounted with the cephfs kernel client, or devices mapped
with the rbd kernel client. I didn't get this while reading the bug description.
Thanks for the explanation.

> 
> So it's true that ceph-ansible doesn't install libceph and rbd kernel
> module, and we really don't have to remove the modules, but we *do* have to
> unmap all RBD devices and unmount all Cephfs mountpoints before we install a
> new version or configuration of Ceph, right?  Otherwise these devices and
> mountpoints will have nothing supporting them and become difficult to clean
> up or re-initialize (may even require a reboot).  The "modprobe -rv" was put
> there only because it would fail if someone left /dev/rbd* or Cephfs
> mountpoints active on any host in the inventory.  
> 
> But in hindsight, I see that modprobe -rv is not sufficient to solve this
> problem, because of the way that ansible works - it will fail on that one
> host, stopping purge-cluster on that host, but other hosts will continue,
> and so the cluster will still be mostly taken down and the left-over RBD
> devices and mountpoints will still be problematic.   Can we change the
> modprobe -rv task so that it is done first, and if *any* host fails to do
> modprobe -rv rbd libceph, the whole playbook stops right there?

Yes, I think we can do this by first checking on the client hosts whether the
rbd or ceph kernel modules are in use. If a kernel module is in use, then fail
on all hosts using the `any_errors_fatal: true` setting:

https://docs.ansible.com/ansible/latest/user_guide/playbooks_error_handling.html#aborting-the-play


> If not, the playbook should explicitly remove the /dev/rbd* and Cephfs mountpoints, with
> "rbd unmap" and "umount" commands, and do that at the very beginning of the
> purge-cluster run, before any other steps have been taken.  

Yes, we can do this too, using the following commands.

  umount -a -t ceph # unmount all kernel CephFS filesystems

  rbdmap unmap-all # unmap all devices 
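
A minimal sketch of how these pieces could fit together at the start of the purge (hypothetical host group and task names, not necessarily how ceph-ansible will implement it):

- hosts: clients
  become: true
  any_errors_fatal: true        # a failure on any host aborts the whole play
  tasks:
    - name: unmount all kernel CephFS mounts
      command: umount -a -t ceph

    - name: unmap all mapped RBD devices
      command: rbdmap unmap-all

    - name: make sure no ceph kernel modules are left loaded
      command: modprobe -rv ceph rbd libceph

With `any_errors_fatal: true`, a module that is still in use fails the modprobe task and stops the purge everywhere, instead of leaving orphaned mount points and mapped devices behind.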

> 
> RGW does not have this problem because the clients are using S3 or Swift,
> which are HTTP-based protocols, and there are no client resources that need
> to be removed in order to disconnect the clients from the servers.

Here we're worried only about kernel clients. Would there be similar
"difficult to clean up or re-initialize (may even require a reboot)" issues
with leftover RBD devices and CephFS mounts that use userspace ceph clients,
e.g. rbd-nbd/librbd or ceph-fuse/libcephfs?

Comment 11 Ben England 2019-03-01 15:36:34 UTC
I'll defer to you on the implementation method; all of the above sounds good. I didn't know about the rbdmap unmap-all command.

What about iSCSI? Is there a way to ask the iSCSI daemon if it's serving any remote clients before it is shut down?
 
How would NFS client mountpoints be handled if you uninstall Ceph while the Ganesha Ceph FSAL is serving NFS clients?  Is there a way to detect that an NFS Ganesha process is actively serving an NFS client mountpoint before you shut it down?

Don't forget about FUSE Cephfs mounts - some sites use both FUSE and kernel Cephfs.

Is nbd still in the picture?  I thought they decided not to implement it.  If nbd is being used, that's a kernel block device, so there should be some way to detect that it's attached to a Ceph daemon, right?

> Here we're worried only about kernel clients. Would there be similar
> "difficult to clean up or re-initialize (may even require a reboot)" issues
> with leftover RBD devices and CephFS mounts that use userspace ceph clients,
> e.g. rbd-nbd/librbd or ceph-fuse/libcephfs?

If you rip ceph out from underneath a process that's using libcephfs or the like, you may have to restart the client process, but you won't have to reboot the host.  And the client's TCP socket should get a disconnection event that should shut it down.

Thanks for looking at this. It's strange, but cluster teardown is important functionality, because sometimes you need to re-do an install (it's hard to get an install done right the first time), and purge*cluster.yml is the only practical way to achieve that.

Comment 19 Giridhar Ramaraju 2019-08-05 13:09:01 UTC
Updating the QA Contact to Hemant. Hemant will be rerouting them to the appropriate QE Associate.

Regards,
Giri

Comment 22 errata-xmlrpc 2019-08-21 15:10:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2538

