Bug 1794195 - ceph-container-common role is skipped with containerized HCI environment
Summary: ceph-container-common role is skipped with containerized HCI environment
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: rc
Target Release: 4.0
Assignee: Dimitri Savineau
QA Contact: Vasishta
URL:
Whiteboard:
Depends On:
Blocks: 1642481
 
Reported: 2020-01-22 21:34 UTC by Dimitri Savineau
Modified: 2020-02-05 02:34 UTC
CC: 12 users

Fixed In Version: ceph-ansible-4.0.12-1.el8cp, ceph-ansible-4.0.12-1.el7cp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-31 12:48:52 UTC
Embargoed:




Links
GitHub ceph/ceph-ansible pull 4976 (closed) - site-container: don't skip ceph-container-common - last updated 2020-12-15 14:21:06 UTC
Red Hat Product Errata RHBA-2020:0312 - last updated 2020-01-31 12:49:04 UTC

Description Dimitri Savineau 2020-01-22 21:34:07 UTC
Description of problem:

In a containerized HCI environment, the OSD and Client nodes are collocated.
The ceph-container-common role is only executed on the first Client node (for keyring purposes), so on the other Client nodes multiple tasks are skipped (see the sketch after this list), including:
  - registry authentication (if set to true)
  - udev rules removal
  - set facts (ceph and docker version) used in later roles
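
For context, a minimal sketch of the kind of first-client condition in site-container.yml that produces the skip; the group and variable names follow ceph-ansible conventions but this is illustrative, not the verbatim playbook (the fix is tracked in the linked pull request 4976, "site-container: don't skip ceph-container-common"):

---------
# Illustrative only: gating the role import on the first client host means
# that on collocated HCI nodes hci-1 and hci-2 every task inside
# ceph-container-common reports "skipping".
- hosts: clients
  become: true
  tasks:
    - import_role:
        name: ceph-container-common
      when: inventory_hostname == groups.get(client_group_name, []) | first
---------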

Version-Release number of selected component (if applicable):
ceph-ansible 4.0.11

How reproducible:
100%

Steps to Reproduce:
1. deploy containerized HCI ceph with ceph-ansible (see additional info)


Actual results:

The result differs depending on whether docker or podman is used.

a) docker

TASK [ceph-osd : generate ceph osd docker run script] **************************
Tuesday 21 January 2020  23:01:03 +0300 (0:00:01.415)       0:24:14.961 *******
fatal: [hci-1]: FAILED! => changed=false
  msg: 'AnsibleUndefinedVariable: ''dict object'' has no attribute ''split'''
fatal: [hci-2]: FAILED! => changed=false
  msg: 'AnsibleUndefinedVariable: ''dict object'' has no attribute ''split'''

because the ceph_docker_version variable isn't set: the ceph-container-common role was skipped everywhere except on hci-0.

TASK [ceph-container-common : include prerequisites.yml] ***********************
Wednesday 21 January 2020  22:55:21 +0300 (0:00:01.505)       0:04:07.762 ***** 
skipping: [hci-1] => changed=false 
  skip_reason: Conditional result was False
skipping: [hci-2] => changed=false 
  skip_reason: Conditional result was False
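
To illustrate the failure mode, here is a hedged sketch (not the exact ceph-ansible tasks) of how ceph-container-common normally registers the container runtime version as the ceph_docker_version fact, which later roles treat as a plain string and split; when the role is skipped on a host, the fact never exists there and the templating above fails:

---------
# Hedged sketch; task names and the exact command are illustrative.
- name: get docker version
  command: docker version --format '{{ "{{.Server.Version}}" }}'
  register: docker_version_out
  changed_when: false

- name: set ceph_docker_version fact
  set_fact:
    ceph_docker_version: "{{ docker_version_out.stdout }}"

# Later roles then do something along the lines of
#   {{ ceph_docker_version.split('.')[0] }}
# which can only work on hosts where the fact above was actually set.
---------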

b) podman

If ceph_docker_registry_auth=false, the playbook succeeds but some important tasks aren't executed.
If ceph_docker_registry_auth=true, the playbook fails on the task trying to pull the container image:


fatal: [hci-2]: FAILED! => changed=true
  (...)
  stderr: |-
    Trying to pull registry.redhat.io/rhceph-beta/rhceph-4-rhel8:latest...time="2020-01-22T16:27:23-05:00" level=error msg="Error pulling image ref //registry.redhat.io/rhceph-beta/rhceph-4-rhel8:latest: Error initializing source docker://registry.redhat.io/rhceph-beta/rhceph-4-rhel8:latest: unable to retrieve auth token: invalid username/password"
    Failed
    Error: unable to pull registry.redhat.io/rhceph-beta/rhceph-4-rhel8:latest: unable to pull image: Error initializing source docker://registry.redhat.io/rhceph-beta/rhceph-4-rhel8:latest: unable to retrieve auth token: invalid username/password

because the registry auth task has been skipped.
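
For reference, a minimal sketch of the registry login step that ceph-container-common performs when ceph_docker_registry_auth is true (the variable names follow ceph-ansible conventions and are shown for illustration only); since the role was skipped on hci-1/hci-2, no login to registry.redhat.io ever happens on those hosts and the image pull fails as shown above:

---------
# Illustrative sketch of a registry login task; not the verbatim role task.
- name: login to the container registry
  command: >
    {{ container_binary }} login
    --username {{ ceph_docker_registry_username }}
    --password {{ ceph_docker_registry_password }}
    {{ ceph_docker_registry }}
  changed_when: false
  no_log: true
---------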

Expected results:

The ceph-container-common role isn't skipped on any of the collocated client nodes.


Additional info:

---------
containerized_deployment: true
(optional)
ceph_docker_registry_auth: true
---------

---------
[osds]
hci-0
hci-1
hci-2

[clients]
hci-0
hci-1
hci-2
---------

Comment 2 Federico Lucifredi 2020-01-23 04:12:57 UTC
This is a blocker.

Comment 10 Yogev Rabl 2020-01-29 14:29:40 UTC
Verified

Comment 12 errata-xmlrpc 2020-01-31 12:48:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0312

