Bug 1903504 - [GSS][ceph-ansible] rolling-update.yml fails with: TASK [ceph-container-common : container registry authentication]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 4.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.2z2
Assignee: Dimitri Savineau
QA Contact: Ameena Suhani S H
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-02 08:16 UTC by Geo Jose
Modified: 2021-06-15 17:13 UTC (History)

Fixed In Version: ceph-ansible-4.0.54-1.el8cp, ceph-ansible-4.0.54-1.el7cp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-15 17:13:09 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | ceph ceph-ansible pull 6240 | 0 | None | closed | [skip ci] monitoring: add missing repository and requirements | 2021-02-19 10:34:08 UTC
Red Hat Knowledge Base (Solution) | 5618941 | 0 | None | None | None | 2020-12-02 08:30:50 UTC
Red Hat Product Errata | RHSA-2021:2445 | 0 | None | None | None | 2021-06-15 17:13:27 UTC

Description Geo Jose 2020-12-02 08:16:39 UTC
Description of problem:
 * While upgrading to RHCS 4, the rolling-update.yml playbook fails at task 'ceph-container-common : container registry authentication'.

Version-Release number of selected component (if applicable):
 * RHCS 3 to RHCS 4 upgrade.

How reproducible:
 * Upgrade from RHCS 3 to RHCS 4 with a newly added grafana-server node.

Steps to Reproduce:
1. Install RHCS 3.3 cluster.
  - In the Ansible inventory file, define the [mons], [mgrs] and [osds] sections [a].

2. Upgrade to the latest 3.x release (if needed) [b].

3. Upgrade from 3.x (latest) to 4.x (latest).
  - For RHCS 4, add a [grafana-server] section to the inventory and list the grafana server host in it.
  - Update all.yml and osds.yml accordingly and run rolling-update.yml [c].


Reference: 
[a]. https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/container_guide/deploying-red-hat-ceph-storage-in-containers#installing-a-red-hat-ceph-storage-cluster-in-containers

[b]. https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/container_guide/upgrading-red-hat-ceph-storage-within-containers

[c]. https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html-single/installation_guide/index#upgrading-a-red-hat-ceph-storage-cluster


Actual results:

 - The play fails on the newly added grafana-server node with the following error:
---
TASK [ceph-container-common : container registry authentication] *************************************************************************************************************
Tuesday 24 November 2020  17:35:19 +0530 (0:00:01.431)   0:01:59.875 ******
fatal: [10.10.10.1]: FAILED! => changed=false
  censored: 'the output has been hidden due to the fact that ''no_log: true'' was specified for this result'
---

Expected results:
 - Upgrade should complete without any error.

Additional info:
 - This issue occurs when grafana-server is placed on a new node that does not have the docker package pre-installed.

Comment 1 Geo Jose 2020-12-02 08:18:36 UTC
- The play is failing at:
~~~
- name: container registry authentication
  command: '{{ container_binary }} login -u {{ ceph_docker_registry_username }} -p {{ ceph_docker_registry_password }} {{ ceph_docker_registry }}'
  changed_when: false
  no_log: true
~~~
- Because "no_log: true" is set on the task, the real error is hidden even in verbose output.
 
- After removing 'no_log: true' from the task, the play reports the error 'docker service/socket not found'.
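
A minimal sketch of how the hidden failure could be checked by hand on the grafana-server node, without editing the playbook. The function name `check_container_runtime` and the default socket path `/var/run/docker.sock` are assumptions made for illustration and are not part of ceph-ansible; a node using podman would not have this socket.

```shell
# Hypothetical helper (not part of ceph-ansible).
# Usage: check_container_runtime BINARY [SOCKET]
# Prints one of: missing-binary, missing-socket, ok
check_container_runtime() {
    bin="$1"
    # Assumed default location of the docker daemon socket.
    sock="${2:-/var/run/docker.sock}"

    # The registry login task cannot work if the binary is absent.
    if ! command -v "$bin" >/dev/null 2>&1; then
        echo "missing-binary"
        return 1
    fi

    # The binary exists but the daemon socket does not:
    # the service is probably not running.
    if [ ! -S "$sock" ]; then
        echo "missing-socket"
        return 2
    fi

    echo "ok"
    return 0
}
```

Running `check_container_runtime docker` on the affected node should print `missing-binary` when the docker package is not installed, which would be consistent with the error seen once 'no_log: true' is removed.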

**Workaround**
 
- Install `docker` manually on the `grafana-server` node and start/enable the service.
~~~
$ sudo yum install docker -y
$ sudo systemctl restart docker.service
$ sudo systemctl enable docker.service
~~~
 
- After starting the docker service, run the playbook again.
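
The workaround above could also be wrapped so that it only installs docker when it is missing. This is a sketch under assumptions: the function name `ensure_docker` and the `DRY_RUN` variable are hypothetical, and it uses `systemctl start` rather than `restart` so an already running daemon is left alone. By default it only prints the commands it would run.

```shell
# Hypothetical helper: apply the workaround only where needed.
# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 executes them.
ensure_docker() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        run="echo"
    else
        run=""
    fi

    # Install the package only when the docker binary is not found.
    if ! command -v docker >/dev/null 2>&1; then
        $run sudo yum install -y docker
    fi

    # Start the service now and enable it for subsequent boots.
    $run sudo systemctl start docker.service
    $run sudo systemctl enable docker.service
}
```

Calling `ensure_docker` first shows the planned commands; `DRY_RUN=0 ensure_docker` on the grafana-server node would then execute them before the playbook is run again.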

Comment 5 Ken Dreyer (Red Hat) 2021-02-24 19:08:23 UTC
I updated this bug to MODIFIED for RHCS 5.0, but this bug is actually targeted to 4.2 z2, so I will reset this back in order to track fixing it in RHCS 4.

Comment 10 Ameena Suhani S H 2021-06-01 13:18:45 UTC
Verified using 
ansible-2.9.22-1.el7ae.noarch
ceph-ansible-4.0.56-1.el7cp.noarch

Comment 12 errata-xmlrpc 2021-06-15 17:13:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 4.2 Security and Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2445

