Bug 1849559

Summary: [13->16.1 ffwd2] Overcloud Operating System upgrade failed in controller-0 when docker2podman playbook is executed
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Francesco Pantano <fpantano>
Component: Ceph-Ansible
Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED ERRATA
QA Contact: Yogev Rabl <yrabl>
Severity: high
Docs Contact:
Priority: high
Version: 4.1
CC: aschoen, ceph-eng-bugs, ceph-qe-bugs, dsavinea, gcharot, gfidente, gmeno, johfulto, nthomas, tchandra, tserlin, ykaul, yrabl
Target Milestone: z1
Keywords: Triaged
Target Release: 4.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ceph-ansible-4.0.25-1.el8cp, ceph-ansible-4.0.25-1.el7cp
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-07-20 14:21:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1760354

Description Francesco Pantano 2020-06-22 08:41:11 UTC
Description of problem:

The upgrade process fails on the overcloud OS upgrade task for the Controller when the docker2podman playbook is executed.
Here is the relevant log:


2020-06-18 20:02:54 |         "TASK [get docker version] ******************************************************",
2020-06-18 20:02:54 |         "task path: /usr/share/ceph-ansible/infrastructure-playbooks/docker-to-podman.yml:63",
2020-06-18 20:02:54 |         "Thursday 18 June 2020  20:02:22 -0400 (0:00:00.100)       0:00:36.741 ********* ",
2020-06-18 20:02:54 |         "TASK [set_fact ceph_docker_version ceph_docker_version.stdout.split] ***********",
2020-06-18 20:02:54 |         "task path: /usr/share/ceph-ansible/infrastructure-playbooks/docker-to-podman.yml:69",
2020-06-18 20:02:54 |         "Thursday 18 June 2020  20:02:23 -0400 (0:00:00.086)       0:00:36.828 ********* ",
2020-06-18 20:02:54 |         "TASK [set_fact docker2podman and container_binary] *****************************",
2020-06-18 20:02:54 |         "task path: /usr/share/ceph-ansible/infrastructure-playbooks/docker-to-podman.yml:73",
2020-06-18 20:02:54 |         "Thursday 18 June 2020  20:02:23 -0400 (0:00:00.085)       0:00:36.913 ********* ",
2020-06-18 20:02:54 |         "ok: [controller-0] => {\"ansible_facts\": {\"container_binary\": \"podman\", \"docker2podman\": true}, \"changed\": false}",
2020-06-18 20:02:54 |         "TASK [pulling undercloud-0.ctlplane.redhat.local:8787/rhceph/rhceph-3-rhel7:3-40 image from docker daemon] ***",
2020-06-18 20:02:54 |         "task path: /usr/share/ceph-ansible/infrastructure-playbooks/docker-to-podman.yml:87",
2020-06-18 20:02:54 |         "Thursday 18 June 2020  20:02:23 -0400 (0:00:00.085)       0:00:36.998 ********* ",
2020-06-18 20:02:54 |         "FAILED - RETRYING: pulling undercloud-0.ctlplane.redhat.local:8787/rhceph/rhceph-3-rhel7:3-40 image from docker daemon (3 retries left).",
2020-06-18 20:02:54 |         "FAILED - RETRYING: pulling undercloud-0.ctlplane.redhat.local:8787/rhceph/rhceph-3-rhel7:3-40 image from docker daemon (2 retries left).",
2020-06-18 20:02:54 |         "FAILED - RETRYING: pulling undercloud-0.ctlplane.redhat.local:8787/rhceph/rhceph-3-rhel7:3-40 image from docker daemon (1 retries left).",
2020-06-18 20:02:54 |         "fatal: [controller-0]: FAILED! => {\"attempts\": 3, \"changed\": false, \"cmd\": [\"timeout\", \"--foreground\", \"-s\", \"KILL\", \"300s\", \"podman\", \"pull\", \"docker-daemon:undercloud-0.ctlplane.redhat.local:8787/rhceph/rhceph-3-rhel7:3-40\"], \"delta\": \"0:00:00.007294\", \"end\": \"2020-06-19 00:02:54.116774\", \"msg\": \"non-zero return code\", \"rc\": 127, \"start\": \"2020-06-19 00:02:54.109480\", \"stderr\": \"timeout: failed to run command 'podman': No such file or directory\", \"stderr_lines\": [\"timeout: failed to run command 'podman': No such file or directory\"], \"stdout\": \"\", \"stdout_lines\": []}",


This appears to have been introduced by ceph-ansible v4.0.24 [1]: when the task [2] is executed, container_binary is already set to "podman", but podman is not yet installed on the overcloud node at this stage of the upgrade.


[1] http://cougar11.scl.lab.tlv.redhat.com/DFG-compute-nova-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp_3ceph-ipv4-vxlan-compute-tempest/6/undercloud-0.tar.gz?undercloud-0/var/log/rpm.list
[2] https://github.com/ceph/ceph-ansible/blob/v4.0.24/infrastructure-playbooks/docker-to-podman.yml#L87
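
The failure mode described above (container_binary set to "podman" before podman exists on the node) can be illustrated with a minimal Python sketch. This is not the ceph-ansible fix shipped in 4.0.25; pick_container_binary is a hypothetical helper, and the candidate order is an assumption made purely for illustration. It only shows the idea of confirming that a binary is on PATH before building a command around it.

```python
import shutil
from typing import Optional, Sequence


def pick_container_binary(
    candidates: Sequence[str] = ("podman", "docker"),
) -> Optional[str]:
    """Return the first candidate that is actually installed, else None.

    Hypothetical helper: the candidate names and their order are
    assumptions, not taken from ceph-ansible.
    """
    for name in candidates:
        # shutil.which returns None when the executable is not on PATH,
        # which is the situation behind "failed to run command 'podman'".
        if shutil.which(name) is not None:
            return name
    return None


if __name__ == "__main__":
    binary = pick_container_binary()
    if binary is None:
        print("no container runtime found on PATH")
    else:
        print(f"would use container binary: {binary}")
```

On a controller that still has only docker installed, this check would select "docker" instead of "podman", so the pull command would not be launched against a missing executable.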



Comment 8 Yogev Rabl 2020-07-08 13:55:01 UTC
verified

Comment 10 errata-xmlrpc 2020-07-20 14:21:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3003