Description of problem:

The purge-cluster.yml ceph-ansible playbook does not complete on hosts that have ceph kernel threads or the ceph installer present.

So why do we care? purge-cluster.yml is useful, almost essential, when it's necessary to reconfigure the ceph cluster for testing purposes, or at installation time when an install does not work as expected.

Version-Release number of selected component (if applicable):

RHEL7.2 GA
RHSCON-2 puddle from 5/18/16
CEPH-2 puddle from 5/19/16

How reproducible:

every time

Steps to Reproduce:
1. create a cluster in the usual way with ceph-ansible
2. mount ceph filesystem using KERNEL cephfs, not FUSE
3. unmount ceph filesystem (kernel threads remain)
4. purge cluster using the purge-cluster.yml playbook

Actual results:

The task named "check for anything running ceph" fails even if ceph daemons are not running. It's just not a good way to check that we shut them down; sorry I did it this way. This can also happen if ceph-ansible is run from a bare metal host that is also being used to run ceph daemons such as ceph-mon. That is not a recommended practice, but it is sometimes useful if you have a few servers but no separate hosts or VMs to run a deployment host on.

Expected results:

purge-cluster.yml should try to remove the ceph kernel modules and should fail if the kernel modules cannot be removed (this can happen if the cephfs kernel module or kernel RBD has resources in use).

Additional info:

The patch that works for me so far just removes the "check for anything running ceph" task. Instead, I would put this task up front:

+    - name: remove kernel ceph modules
+      shell: modprobe -rv ceph libceph

modprobe does not return an error status if the modules aren't there, so this will only fail if the modules are there and cannot be removed because they are in use. Should there be something for kernel rbd in the modprobe list as well?

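A minimal sketch of why the check misfires, applying the same grep filter that the task used at the time (it is quoted later in this thread) to a few sample `ps` lines. The sample lines, PIDs, and the installer path are illustrative assumptions and were not captured from a real host; `ceph_check` is a hypothetical helper name.

```shell
#!/bin/sh
# Same filter the "check for anything running ceph" task applied to `ps awux` output.
# Exit status 0 means "something matched", which the playbook treated as a failure.
ceph_check() {
    grep -v grep | grep -q -- ceph-
}

# Illustrative ps lines (hypothetical, not taken from a real host).
kernel_thread='root   812  0.0  0.0      0     0 ?     S<  10:01 0:00 [ceph-msgr]'
installer='root  4021  1.2  0.5 301234 42000 pts/0 S+  10:05 0:03 /usr/bin/python /usr/bin/ansible-playbook /usr/share/ceph-ansible/purge-cluster.yml'
unrelated='root     1  0.0  0.1  19356  1544 ?     Ss  09:00 0:01 /usr/lib/systemd/systemd'

# Report which lines the filter would count as "ceph still running".
for line in "$kernel_thread" "$installer" "$unrelated"; do
    if printf '%s\n' "$line" | ceph_check; then
        echo "MATCH:    $line"
    else
        echo "no match: $line"
    fi
done
```

Neither the leftover kernel thread nor the installer process is a Ceph daemon, yet both contain the string "ceph-" and so both satisfy the filter.
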
Ramana, please take Ben's suggestion and send a PR for this.
Hi Ramana, did you get a chance to take a look at this BZ? (see C5 comment from Sebastien).
(In reply to Guillaume Abrioux from comment #6)
> Hi Ramana,
>
> did you get a chance to take a look at this BZ? (see C5 comment from
> Sebastien).

Yes, I did look at the BZ before, but I haven't started working on it. I am OK with someone else taking up this BZ if it is urgent.

I can no longer reproduce this issue. "purge-cluster.yml" completed on all hosts, including the ceph client machine where a Ceph filesystem was mounted and unmounted using the kernel client.

The issue (back in 2016/05/20) seems to have been caused by the task "check for anything running ceph" in purge-cluster.yml working incorrectly. This particular task has undergone quite a few changes since then and seems to work now.

When this BZ was filed, the task was:

    - name: check for anything running ceph
      shell: "ps awux | grep -v grep | grep -q -- ceph-"
      register: check_for_running_ceph
      failed_when: check_for_running_ceph.rc == 0

introduced by commit 90fd2c70.

Now the task is:

    - name: check for anything running ceph
      command: "ps -u ceph -U ceph"
      register: check_for_running_ceph
      failed_when: check_for_running_ceph.rc == 0

introduced by commit 5a3f95dfc.

Also, I don't think we want to remove the libceph and rbd kernel modules, as ceph-ansible doesn't explicitly install them. The modules come bundled with the distribution kernel.

Ben, are you still hitting this issue with RHCS 3.2? If not, we can close this BZ.

I haven't tried running purge-cluster or purge-docker-cluster playbooks this way lately. So as far as I know it's still an issue.

So it's true that ceph-ansible doesn't install libceph and rbd kernel module, and we really don't have to remove the modules, but we *do* have to unmap all RBD devices and unmount all Cephfs mountpoints before we install a new version or configuration of Ceph, right? Otherwise these devices and mountpoints will have nothing supporting them and become difficult to clean up or re-initialize (may even require a reboot). The "modprobe -rv" was put there only because it would fail if someone left /dev/rbd* or Cephfs mountpoints active on any host in the inventory.

But in hindsight, I see that modprobe -rv is not sufficient to solve this problem, because of the way that ansible works - it will fail on that one host, stopping purge-cluster on that host, but other hosts will continue, and so the cluster will still be mostly taken down and the left-over RBD devices and mountpoints will still be problematic. Can we change the modprobe -rv task so that it is done first, and if *any* host fails to do modprobe -rv rbd libceph, the whole playbook stops right there? If not, the playbook should explicitly remove the /dev/rbd* and Cephfs mountpoints, with "rbd unmap" and "umount" commands, and do that at the very beginning of the purge-cluster run, before any other steps have been taken.

RGW does not have this problem because the clients are using S3 or Swift, which are HTTP-based protocols, and there are no client resources that need to be removed in order to disconnect the clients from the servers.

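A rough sketch of the kind of up-front check described above: look for kernel CephFS mounts and /dev/rbd* devices on a host before any purge step runs, and return non-zero if either is found. The function names and the overridable paths are hypothetical and exist only so the logic can be exercised against sample data; this is not what purge-cluster.yml does.

```shell
#!/bin/sh
# Print kernel CephFS mountpoints found in a mounts table (default: /proc/mounts).
# Field 3 of each line is the filesystem type; kernel CephFS mounts use "ceph".
list_cephfs_kernel_mounts() {
    awk '$3 == "ceph" { print $2 }' "${1:-/proc/mounts}"
}

# Print mapped kernel RBD device nodes (rbd0, rbd1, ...) found in a device
# directory (default: /dev). The directory is overridable for testing only.
list_mapped_rbd() {
    dev_dir="${1:-/dev}"
    for dev in "$dev_dir"/rbd[0-9]*; do
        [ -e "$dev" ] && echo "$dev"
    done
    return 0
}

# Return non-zero if any kernel client resources are still active on this host.
# Arguments: [mounts table] [device directory], both optional.
preflight_check() {
    mounts=$(list_cephfs_kernel_mounts "$1")
    rbds=$(list_mapped_rbd "$2")
    if [ -n "$mounts" ] || [ -n "$rbds" ]; then
        echo "active Ceph kernel clients found:" >&2
        [ -n "$mounts" ] && echo "$mounts" >&2
        [ -n "$rbds" ] && echo "$rbds" >&2
        return 1
    fi
    return 0
}

# Example use on a real host (commented out so the sketch has no side effects):
# preflight_check || exit 1
```

This only detects kernel clients. FUSE mounts appear in the mounts table with a different filesystem type and are not matched by this filter.
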
(In reply to Ben England from comment #9)
> I haven't tried running purge-cluster or purge-docker-cluster playbooks this
> way lately. So as far as I know it's still an issue.

Oh OK. So you're also concerned about a 'clean' purge of client machines with actively mounted Ceph filesystems (cephfs kernel client) or mapped devices (rbd kernel client). I didn't get this while reading the bug description. Thanks for the explanation.

> So it's true that ceph-ansible doesn't install libceph and rbd kernel
> module, and we really don't have to remove the modules, but we *do* have to
> unmap all RBD devices and unmount all Cephfs mountpoints before we install a
> new version or configuration of Ceph, right? Otherwise these devices and
> mountpoints will have nothing supporting them and become difficult to clean
> up or re-initialize (may even require a reboot). The "modprobe -rv" was put
> there only because it would fail if someone left /dev/rbd* or Cephfs
> mountpoints active on any host in the inventory.
>
> But in hindsight, I see that modprobe -rv is not sufficient to solve this
> problem, because of the way that ansible works - it will fail on that one
> host, stopping purge-cluster on that host, but other hosts will continue,
> and so the cluster will still be mostly taken down and the left-over RBD
> devices and mountpoints will still be problematic. Can we change the
> modprobe -rv task so that it is done first, and if *any* host fails to do
> modprobe -rv rbd libceph, the whole playbook stops right there?

Yes, I think we can do this by first checking on the client hosts whether the rbd or ceph kernel modules are in use. If a kernel module is in use, then fail on all hosts using the `any_errors_fatal: true` setting:
https://docs.ansible.com/ansible/latest/user_guide/playbooks_error_handling.html#aborting-the-play

> If not, the playbook should explicitly remove the /dev/rbd* and Cephfs mountpoints, with
> "rbd unmap" and "umount" commands, and do that at the very beginning of the
> purge-cluster run, before any other steps have been taken.

Yes, we can do this too, using the following commands:

    umount -a -t ceph   # unmount ceph filesystems
    rbdmap unmap-all    # unmap all devices

> RGW does not have this problem because the clients are using S3 or Swift,
> which are HTTP-based protocols, and there are no client resources that need
> to be removed in order to disconnect the clients from the servers.

Here we're worried only about kernel clients. Would there be similar "difficult to clean up or re-initialize (may even require a reboot)" issues with leftover RBD devices and CephFS mounts that use userspace ceph clients, e.g. rbd-nbd/librbd or ceph-fuse/libcephfs?

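A small sketch of the explicit cleanup step, using the two commands named in this comment, with the unmount written as `umount -a -t ceph` (`-t` is the standard filesystem-type option). The `run` wrapper, the `DRY_RUN` variable, and the function name are hypothetical additions so the sequence can be previewed without touching the host.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1 (hypothetical helper for previewing).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Disconnect kernel clients before anything else is purged; stop at the first failure
# so that a busy mount or device is reported instead of being left behind.
disconnect_kernel_clients() {
    run umount -a -t ceph || return 1   # unmount all kernel CephFS mounts
    run rbdmap unmap-all || return 1    # unmap all kernel RBD devices
}

# Preview only:  DRY_RUN=1; disconnect_kernel_clients
# Real run:      disconnect_kernel_clients || exit 1
```

A non-zero return on one host is the point at which a playbook-level setting such as `any_errors_fatal: true` would stop the run on all hosts.
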
I'll defer to you on implementation method; all the above sounds good. I didn't know about the rbdmap unmap-all command.

What about iSCSI? Is there a way to ask the iSCSI daemon if it's serving any remote clients before it is shut down?

How would NFS client mountpoints be handled if you uninstall Ceph while the Ganesha Ceph FSAL is serving NFS clients? Is there a way to detect that an NFS Ganesha process is actively serving an NFS client mountpoint before you shut it down?

Don't forget about FUSE Cephfs mounts - some sites use both FUSE and kernel Cephfs.

Is nbd still in the picture? I thought they decided not to implement it. If nbd is being used, that's a kernel block device, so there should be some way to detect that it's attached to a Ceph daemon?

> Here we're worried about only kernel clients. Would there be similar
> "difficult to clean up or re-initialize (may even require a reboot)" issues
> with leftover RBD devices and CephFS mounts that use userspace ceph clients.
> e.g. rbd-nbd/librbd, ceph-fuse/libcephfs?

If you rip ceph out from underneath a process that's using libcephfs or the like, you may have to restart the client process, but you won't have to reboot the host. And the client's TCP socket should get a disconnection event that should shut it down.

Thanks for looking at this. It's strange, but cluster teardown is important functionality because sometimes you need to re-do an install (because it's hard to get an install done right the first time), and purge*cluster.yml is the only practical way to achieve that.

Updating the QA Contact to Hemant. Hemant will be rerouting them to the appropriate QE Associate. Regards, Giri
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:2538