Bug 1418980 - [ceph-ansible]: purge cluster on ubuntu fails to purge client node.
Status: CLOSED ERRATA
Product: Red Hat Storage Console
Classification: Red Hat
Component: ceph-ansible
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Assigned To: Andrew Schoen
QA Contact: ceph-qe-bugs
Depends On:
Blocks:
Reported: 2017-02-03 05:45 EST by Tejas
Modified: 2017-03-14 11:54 EDT (History)
CC: 11 users

See Also:
Fixed In Version: ceph-ansible-2.1.7-1.el7scon
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-03-14 11:54:14 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
ansible playbook log (19.04 KB, text/plain)
2017-02-03 05:45 EST, Tejas


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:0515 normal SHIPPED_LIVE Important: ansible and ceph-ansible security, bug fix, and enhancement update 2017-04-18 17:12:31 EDT

Description Tejas 2017-02-03 05:45:14 EST
Created attachment 1247397 [details]
ansible playbook log

Description of problem:
    I am running purge-cluster.yml on an Ubuntu cluster with 3 MON, 3 OSD, and 1 client node.

Seeing this failure on the client node:

TASK [detect init system] ******************************************************
changed: [magna031]
changed: [magna028]
changed: [magna058]
changed: [magna052]
changed: [magna046]
fatal: [magna061]: FAILED! => {"changed": false, "cmd": "ceph-detect-init", "failed": true, "msg": "[Errno 2] No such file or directory", "rc": 2}


Version-Release number of selected component (if applicable):
ceph-ansible-2.1.6-1.el7scon.noarch

magna061:~# ceph -v
ceph version 10.2.5-12redhat1xenial (b765e9865c0ee5b5260cd08c65c7581258a5b3c1)


How reproducible:
Always

Steps to Reproduce:
1. Run purge-cluster.yml on an Ubuntu cluster.
2. The playbook fails on the client node.


Additional info:

When I ran this command manually on another Ubuntu cluster:

MON node:
root@magna063:~# ceph-detect-init
systemd

Client:
magna087:~# ceph-detect-init
The program 'ceph-detect-init' is currently not installed. You can install it by typing:
apt install ceph

Attaching the ansible playbook log in this bug.
Comment 3 Andrew Schoen 2017-02-03 07:19:41 EST
Tejas,

Could you share your hosts file? How was this client node created? I'm looking at the purge-cluster.yml playbook and it doesn't operate against a 'clients' group.

Thanks,
Andrew
Comment 5 seb 2017-02-03 08:38:46 EST
I don't see magna061 in your hosts file, and that is the node that is failing.
Comment 7 seb 2017-02-06 05:56:57 EST
Ahhh, I understand now: magna061 is also a radosgw, and this is why the task fails. This is annoying. I think we should force this package installation; I don't want to end up with yet another way to detect the init system. So yes, you need to install a package before purging the cluster, which I know is weird. I'll add a check and fail if the command is not present.

Ken, can we somehow make this package a dependency of the radosgw installation package?
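The pre-purge check described above could look roughly like the following shell sketch. This is only an illustration of the idea, not the actual ceph-ansible change; the function name and error message are assumptions.

```shell
#!/bin/sh
# Hypothetical pre-purge check (not the actual patch): verify that
# ceph-detect-init is available before attempting to purge a node.
check_detect_init() {
    if command -v ceph-detect-init >/dev/null 2>&1; then
        # command found on PATH; safe to proceed with the purge
        echo present
    else
        # fail early with a clear hint instead of a mid-play ENOENT
        echo "ceph-detect-init not found; install it (e.g. via ceph-base) before purging" >&2
        echo missing
    fi
}

check_detect_init
```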
Comment 9 Andrew Schoen 2017-02-06 07:04:08 EST
(In reply to seb from comment #7)
> Ahhh I understand now, magna061 is also a radosgw this is why the task
> fails. This is annoying. 

But we're purging rgw nodes in our upstream tests and have never run into this. Is this packaging issue specific to downstream only?

Also, there is still the issue of purge cluster not operating against a 'clients' group. I would not expect a dedicated 'client' node to be purged at all by this playbook. Do we need to add support for purging of client nodes for this release?
Comment 11 seb 2017-02-06 08:33:39 EST
True, we haven't come across that in our upstream CI, maybe RHCS packages have different dependencies downstream?

I'd say yes we should add the support for purging clients as well.
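Client-purge support of the kind discussed here might look roughly like the Ansible fragment below. This is a hypothetical sketch only, not the eventual patch; the group name, package list, and paths are assumptions.

```
# Hypothetical sketch: an extra play in purge-cluster.yml targeting a
# 'clients' group, so dedicated client nodes are purged too.
- hosts: clients
  become: true
  tasks:
    - name: remove ceph client packages
      package:
        name:
          - ceph-common
          - ceph-fuse
        state: absent

    - name: remove ceph configuration
      file:
        path: /etc/ceph
        state: absent
```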
Comment 12 seb 2017-02-06 08:54:14 EST
FYI upstream packages have ceph-detect-init on the client so we won't hit that issue on our upstream CI. The following gets installed:

[centos@ceph-client ~]$ rpm -qa | grep ceph
ceph-fuse-10.2.5-0.el7.x86_64
libcephfs1-10.2.5-0.el7.x86_64
ceph-common-10.2.5-0.el7.x86_64
ceph-selinux-10.2.5-0.el7.x86_64
ceph-release-1-1.el7.noarch
python-cephfs-10.2.5-0.el7.x86_64
ceph-base-10.2.5-0.el7.x86_64

I suspect this is due to the ceph-base package? Ken?
Comment 13 seb 2017-02-06 09:01:53 EST
upstream PR https://github.com/ceph/ceph-ansible/pull/1281
Comment 14 Ken Dreyer (Red Hat) 2017-02-06 10:05:42 EST
ceph-detect-init ships in the ceph-base package. This is the case upstream and downstream.

ceph-radosgw does not depend on the ceph-base package. In fact ceph-base is not present in the Tools repository - you need a Ceph product subscription to Mon or OSD repos to get this.

Why would purging an RGW node require ceph-detect-init?
Comment 15 seb 2017-02-06 10:32:50 EST
So we know it's, e.g., systemd, and then we can properly stop the rgw.
Comment 16 Ken Dreyer (Red Hat) 2017-02-07 12:25:36 EST
Since ceph-ansible only supports a limited set of distros (and even fewer in the downstream product: Ubuntu Xenial and RHEL 7), we should be able to determine whether the init system is upstart or systemd without ceph-detect-init.
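The detection Ken suggests can be sketched in a few lines of shell. This is an illustrative approximation, not the actual fix (that landed via the PR in the next comments); the markers checked (/run/systemd/system, /sbin/init --version) are conventional heuristics assumed here.

```shell
#!/bin/sh
# Hypothetical sketch: detect the init system without ceph-detect-init.
# The supported platforms only ever use systemd or upstart.
detect_init() {
    if [ -d /run/systemd/system ]; then
        # systemd creates this directory at boot
        echo systemd
    elif /sbin/init --version 2>/dev/null | grep -q upstart; then
        # upstart's init reports itself via --version
        echo upstart
    else
        # conservative fallback for anything else
        echo sysvinit
    fi
}

detect_init
```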
Comment 17 Gregory Meno 2017-02-07 12:52:19 EST
Sounds like we already do this in the rolling_update playbook. Andrew is working on a patch
Comment 18 Andrew Schoen 2017-02-07 13:04:35 EST
PR opened upstream: https://github.com/ceph/ceph-ansible/pull/1284
Comment 21 Tejas 2017-02-12 22:38:01 EST
Verified on build:
ceph-ansible-2.1.7-1.el7scon.noarch
Comment 23 errata-xmlrpc 2017-03-14 11:54:14 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:0515
