Bug 1968177
| Field | Value |
|---|---|
| Summary: | switch-to-containerized fails and leaves cluster in degraded state |
| Product: | [Red Hat Storage] Red Hat Ceph Storage |
| Component: | Ceph-Ansible |
| Status: | CLOSED ERRATA |
| Severity: | high |
| Priority: | unspecified |
| Version: | 4.2 |
| Target Milestone: | --- |
| Target Release: | 4.2z3 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Reporter: | Heðin <hej> |
| Assignee: | Dimitri Savineau <dsavinea> |
| QA Contact: | Ameena Suhani S H <amsyedha> |
| CC: | aschoen, ceph-eng-bugs, gabrioux, gmeno, nthomas, tserlin, vereddy, ykaul |
| Fixed In Version: | ceph-ansible-4.0.61-1.el8cp, ceph-ansible-4.0.61-1.el7cp |
| Doc Type: | If docs needed, set a value |
| Last Closed: | 2021-09-27 18:26:24 UTC |
| Type: | Bug |
| Regression: | --- |
Set priority to high because the cluster is left in a degraded state.

v4.0.59 available upstream.

Verified using ansible-2.9.24-1.el8ae.noarch and ceph-ansible-4.0.62-1.el8cp.noarch.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 4.2 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3670
Created attachment 1789117 [details]
See 2021-06-06 12:11:47,842

Description of problem:
switch-to-containerized fails on RHCS 4 when the following is not set in all.yml:

    ceph_docker_registry: "registry.redhat.io"
    ceph_docker_registry_auth: true
    ceph_docker_registry_username:
    ceph_docker_registry_password:

However, it does not fail until after the non-containerized mon service has been removed. This leaves the cluster missing a monitor, and the playbook fails on subsequent runs because it cannot find the removed mon service.

Version-Release number of selected component (if applicable):
ceph-ansible.noarch 4.0.41-1.el7cp @rhel-7-server-rhceph-4-tools-rpms

How reproducible:
I deployed RHCS 3 non-containerized, upgraded to RHCS 4 non-containerized, then ran infrastructure-playbooks/switch-from-non-containerized-to-containerized-ceph-daemons.yml without setting the above-mentioned ceph_docker_registry variables.

Steps to Reproduce:
1. Install RHCS 3 non-containerized.
2. Upgrade to RHCS 4.
3. Convert to containerized by running infrastructure-playbooks/switch-from-non-containerized-to-containerized-ceph-daemons.yml with --limit rhceph01 (rhceph01 is the first monitor).

Actual results:
The mon on rhceph01 is removed and the cluster is left with two functioning mons and HEALTH_WARN.

Expected results:
The playbook should fail early, with a message pointing out that registry.redhat.io requires these variables to be set, while keeping the cluster HEALTH_OK.

Additional info:
See the line timestamped 2021-06-06 12:11:47,842 in the attached ansible.log.
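The early failure requested above could be implemented as a pre-flight validation task that runs before any daemon is stopped. The sketch below is an assumption for illustration, not the actual fix shipped in ceph-ansible-4.0.61; the task name and placement are hypothetical:

```yaml
# Hypothetical pre-flight check: abort the playbook before any
# non-containerized daemon is removed when registry authentication
# is enabled but the credential variables are unset.
- name: validate container registry credentials
  hosts: all
  gather_facts: false
  tasks:
    - name: fail early if registry auth variables are missing
      ansible.builtin.fail:
        msg: >
          ceph_docker_registry_auth is true but
          ceph_docker_registry_username and/or
          ceph_docker_registry_password are not set; define them in
          all.yml before running
          switch-from-non-containerized-to-containerized-ceph-daemons.yml.
      when:
        - ceph_docker_registry_auth | default(false) | bool
        - (ceph_docker_registry_username is not defined) or
          (ceph_docker_registry_password is not defined)
```

Running such a check first would leave the cluster HEALTH_OK on a misconfigured run instead of removing a monitor and then failing.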