Bug 1901865

Summary: [ceph-ansible] : switch from rpm to containerized - services collocated with OSDs are stopped and stay down if the playbook fails
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vasishta <vashastr>
Component: Ceph-Ansible
Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED ERRATA
QA Contact: Vasishta <vashastr>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.2
CC: amsyedha, aschoen, ceph-eng-bugs, gmeno, kdreyer, nthomas, vereddy, ykaul
Target Milestone: ---
Target Release: 4.2z1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ceph-ansible-4.0.43-1.el7cp ceph-ansible-4.0.43-1.el8cp
Doc Type: Bug Fix
Doc Text:
Cause: In the OSD play of the `switch-from-non-containerized-to-containerized-ceph-daemons` playbook, ceph-ansible both disabled *and* stopped the systemd unit `ceph.target`.
Consequence: Stopping `ceph.target` stopped every other Ceph daemon on the node, although only the OSD services were meant to be stopped.
Fix: With this fix, ceph-ansible no longer stops `ceph.target`; it only disables it.
Result: Other Ceph daemons collocated with OSDs are no longer stopped by the OSD play. Each daemon type has its own play that manages its transition to a container.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-04-28 20:12:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Vasishta 2020-11-26 09:57:59 UTC
Description of problem:
The switch-to-containerized playbook failed because of an unrelated issue. However, the MDS collocated with the OSDs had already been stopped by ceph-ansible and was not started again, even though all the OSDs on that node had been containerized and were up and running.

Version-Release number of selected component (if applicable):
ceph-ansible-4.0.40-1.el7cp.noarch

How reproducible:
Tried once

Steps to Reproduce:
1. Configure a bare-metal cluster with an OSD collocated with an MDS
2. Run the switch-to-containerized-daemons playbook
3. Induce a playbook failure after the OSDs are containerized

Actual results:
Services collocated with OSDs are stopped

Expected results:
All services should remain up and running unless there is a reason for them to be down.

Additional info:
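The difference between stopping and disabling `ceph.target` can be illustrated with a small toy model. This is only a sketch, not ceph-ansible or systemd code. The unit names, the `PART_OF` table, and the helpers `affected_by_stop` and `still_active` are hypothetical and exist only for this illustration. It assumes that each daemon unit is tied to a per-type target, which is in turn tied to `ceph.target`, so that a stop request propagates downwards, while disabling a unit only changes whether it starts at boot.

```python
# Toy model of systemd stop propagation; not real systemd or ceph-ansible code.
# Assumption: each daemon unit is "part of" a per-type target, and each
# per-type target is "part of" ceph.target. Unit names are illustrative.
PART_OF = {
    "ceph-osd.target": "ceph.target",
    "ceph-mds.target": "ceph.target",
    "ceph-osd@0.service": "ceph-osd.target",
    "ceph-mds@node1.service": "ceph-mds.target",
}


def affected_by_stop(unit, part_of=PART_OF):
    """Return every unit that goes down when `unit` is stopped."""
    stopped = {unit}
    changed = True
    while changed:
        changed = False
        for child, parent in part_of.items():
            if parent in stopped and child not in stopped:
                stopped.add(child)
                changed = True
    return stopped


def still_active(action, unit, active, part_of=PART_OF):
    """Return the set of units still running after `action` on `unit`."""
    if action == "stop":
        return set(active) - affected_by_stop(unit, part_of)
    if action == "disable":
        # Disabling only affects boot-time enablement; running units stay up.
        return set(active)
    raise ValueError(f"unknown action: {action}")


running = {"ceph-osd@0.service", "ceph-mds@node1.service"}
# Stopping the umbrella target takes the collocated MDS down with the OSD.
print(sorted(still_active("stop", "ceph.target", running)))
# Disabling the umbrella target leaves every running daemon untouched.
print(sorted(still_active("disable", "ceph.target", running)))
```

In this model, stopping only `ceph-osd.target` leaves the MDS unit running, which matches the behaviour the report expects from the OSD play.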

Comment 5 Ameena Suhani S H 2021-02-18 02:21:32 UTC
Verified using

ceph-base-14.2.11-121.el7cp.x86_64
ceph-ansible-4.0.46-1.el7cp.noarch

Comment 7 errata-xmlrpc 2021-04-28 20:12:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage security, bug fix, and enhancement Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1452