Description of problem:
* While upgrading to RHCS 4, the rolling-update.yml playbook fails at task 'ceph-container-common : container registry authentication'.

Version-Release number of selected component (if applicable):
* RHCS 3 to RHCS 4 upgrade.

How reproducible:
* Upgrade from RHCS 3 to RHCS 4 with a newly added grafana-server.

Steps to Reproduce:
1. Install an RHCS 3.3 cluster.
   - In the ansible inventory file, define [mons], [mgrs] and [osds] [a].
2. Upgrade to the latest 3.x (if needed) [b].
3. Upgrade from 3.x (latest) to 4.x (latest).
   - For RHCS 4, add a [grafana-server] section and list the grafana server details.
   - Update all.yml and osds.yml accordingly and run rolling-update.yml [c].

Reference:
[a]. https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/container_guide/deploying-red-hat-ceph-storage-in-containers#installing-a-red-hat-ceph-storage-cluster-in-containers
[b]. https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/container_guide/upgrading-red-hat-ceph-storage-within-containers
[c]. https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html-single/installation_guide/index#upgrading-a-red-hat-ceph-storage-cluster

Actual results:
- The play fails on the newly added grafana-server node with the error below:
---
TASK [ceph-container-common : container registry authentication] *************************************************************************************************************
Tuesday 24 November 2020 17:35:19 +0530 (0:00:01.431) 0:01:59.875 ******
fatal: [10.10.10.1]: FAILED! => changed=false
  censored: 'the output has been hidden due to the fact that ''no_log: true'' was specified for this result'
---

Expected results:
- Upgrade should complete without any error.

Additional info:
- This issue occurs when grafana-server is placed on a new node that has no docker package pre-installed.
- The play is failing at:
~~~
- name: container registry authentication
  command: '{{ container_binary }} login -u {{ ceph_docker_registry_username }} -p {{ ceph_docker_registry_password }} {{ ceph_docker_registry }}'
  changed_when: false
  no_log: true
~~~
- Because "no_log: true" is set on the task, no verbose error is shown.
- After removing 'no_log: true' from the task, the error 'docker service/socket not found' is reported.

**Workaround**
- Install `docker` manually on the `grafana-server` node and start/enable the service.
~~~
$ sudo yum install docker -y
$ sudo systemctl restart docker.service
$ sudo systemctl enable docker.service
~~~
- After starting the docker service, run the playbook again.
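- Because the failing task hides its output, it may help to confirm the state of the container runtime on the grafana-server node before rerunning the playbook. The following is only a minimal sketch: the helper name `check_container_runtime` is hypothetical and not part of ceph-ansible, and it assumes that `<binary> info` exits non-zero when the daemon or socket is unavailable (which is the usual behaviour of `docker info`).

```shell
#!/bin/sh
# Hypothetical pre-flight helper; NOT part of ceph-ansible.
# Usage: check_container_runtime [binary]   (binary defaults to "docker")
# Returns 0 if the binary exists and "<binary> info" succeeds,
#         1 if the binary is not installed,
#         2 if it is installed but the daemon/socket is not reachable.
check_container_runtime() {
    runtime_bin="${1:-docker}"

    # Step 1: is the binary on PATH at all?
    if ! command -v "$runtime_bin" >/dev/null 2>&1; then
        echo "missing: $runtime_bin is not installed"
        return 1
    fi

    # Step 2: "info" needs to talk to the daemon, so it is assumed to fail
    # when the service is stopped or the socket does not exist.
    if ! "$runtime_bin" info >/dev/null 2>&1; then
        echo "not running: $runtime_bin is installed but the daemon is not reachable"
        return 2
    fi

    echo "ok: $runtime_bin is installed and reachable"
    return 0
}
```

  Sourcing this on the grafana-server node and running `check_container_runtime docker` should report `missing` or `not running` in the situation described above, and `ok` once the workaround has been applied.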
I updated this bug to MODIFIED for RHCS 5.0, but this bug is actually targeted to 4.2 z2, so I will reset this back in order to track fixing it in RHCS 4.
Verified using:
ansible-2.9.22-1.el7ae.noarch
ceph-ansible-4.0.56-1.el7cp.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat Ceph Storage 4.2 Security and Bug Fix Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2445