Bug 1633563

Summary: purge playbooks don't clean /var/lib/ceph/* properly
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Guillaume Abrioux <gabrioux>
Component: Ceph-Ansible Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED ERRATA QA Contact: subhash <vpoliset>
Severity: medium Docs Contact: Bara Ancincova <bancinco>
Priority: unspecified    
Version: 3.1 CC: aschoen, ceph-eng-bugs, ceph-qe-bugs, gabrioux, gmeno, jharriga, kdreyer, nthomas, rperiyas, sankarshan, tchandra
Target Milestone: z1   
Target Release: 3.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: RHEL: ceph-ansible-3.1.8-1.el7cp Ubuntu: ceph-ansible_3.1.8-2redhat1 Doc Type: Bug Fix
Doc Text:
.Files in `/var/lib/ceph/` are now properly removed when purging clusters
The Ansible playbook for purging clusters did not properly remove files in the `/var/lib/ceph/` directory. Consequently, when trying to redeploy a containerized cluster, Ansible assumed that a cluster was already running and tried to join it because the Monitor container detected some existing files in the `/var/lib/ceph/<cluster-name>-<monitor-name>/` directory. With this update, the files in `/var/lib/ceph/` are properly removed, and redeploying a cluster works as expected.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-11-09 01:00:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1584264, 1641792    

Description Guillaume Abrioux 2018-09-27 09:52:57 UTC
Description of problem:

The purge cluster playbooks don't clean /var/lib/ceph/*.
They leave some files in place, which causes an error when trying to redeploy a cluster:

Sep 26 13:18:16 mon0 docker[31316]: 2018-09-26 13:18:16.932393 7f15b0d74700 -1 auth: unable to find a keyring on /etc/ceph/test.client.admin.keyring,/etc/ceph/test.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory

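This error occurs because ceph-container checks for a keyring at /var/lib/ceph/mon/<cluster>-<mon-name>/keyring to detect whether we are trying to join an existing cluster. When the purge leaves that file behind, the redeployed monitor assumes a cluster already exists and tries to join it instead of bootstrapping a new one.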
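
To make the detection concrete, here is a minimal Python sketch of the check described above. It is not ceph-container's actual code; the function names `mon_keyring_path` and `looks_like_existing_cluster` are hypothetical, and the only thing taken from this report is the path layout /var/lib/ceph/mon/<cluster>-<mon-name>/keyring.

```python
from pathlib import Path


def mon_keyring_path(cluster: str, mon_name: str, root: str = "/var/lib/ceph") -> Path:
    """Build the monitor keyring path using the layout quoted in this report.

    Hypothetical helper for illustration only.
    """
    return Path(root) / "mon" / f"{cluster}-{mon_name}" / "keyring"


def looks_like_existing_cluster(cluster: str, mon_name: str, root: str = "/var/lib/ceph") -> bool:
    """Return True if a monitor keyring is present on disk.

    A leftover keyring after a purge makes this return True even though
    no cluster is running, which is the situation reported here.
    """
    return mon_keyring_path(cluster, mon_name, root).is_file()


if __name__ == "__main__":
    # "test" and "mon0" are the cluster and monitor names seen in the log above.
    print(looks_like_existing_cluster("test", "mon0"))
```

With a stale keyring left in place, a check of this kind reports an existing cluster, so the redeploy takes the "join" path rather than the "bootstrap" path.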


How reproducible:
100%

Steps to Reproduce:
1. Deploy a containerized cluster
2. Run purge-docker-cluster.yml
3. Try to redeploy a containerized cluster

Actual results:
Redeploying a cluster after running the purge playbook fails because of leftover files in /var/lib/ceph/.

Expected results:
We can redeploy a cluster after a purge.
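
Between steps 2 and 3, one way to confirm whether the purge left anything behind is to list the files remaining under /var/lib/ceph on each node. The following is a rough sketch only, not part of ceph-ansible; `find_leftovers` is a hypothetical helper name, and it assumes that any non-directory entry remaining under the directory counts as a leftover.

```python
from pathlib import Path


def find_leftovers(root: str = "/var/lib/ceph") -> list:
    """Return a sorted list of non-directory entries remaining under root.

    Hypothetical helper: an empty list means no files were left behind.
    A missing root directory is treated as clean.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(p for p in base.rglob("*") if not p.is_dir())


if __name__ == "__main__":
    leftovers = find_leftovers()
    if leftovers:
        print("Leftover files after purge:")
        for path in leftovers:
            print(f"  {path}")
    else:
        print("No leftover files found.")
```

If the list is non-empty after running purge-docker-cluster.yml, a redeploy on that node can be expected to hit the problem described in this report.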

Comment 9 subhash 2018-10-22 00:45:53 UTC
Followed the steps below to reproduce; the issue no longer occurs. Moving to verified state.

Steps to Reproduce:
1. Deploy a containerized cluster
2. Run purge-docker-cluster.yml
3. Try to redeploy a containerized cluster

The cluster gets deployed fine.

[ubuntu@magna021 ceph-ansible]$ rpm -qa | grep ansible
ceph-ansible-3.1.9-1.el7cp.noarch
ansible-2.4.6.0-1.el7ae.noarch

Comment 14 Sébastien Han 2018-10-24 15:49:51 UTC
*** Bug 1642026 has been marked as a duplicate of this bug. ***

Comment 16 errata-xmlrpc 2018-11-09 01:00:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3530