Description of problem:

The FFU job from 13 to 16.1 is failing during the last controller overcloud upgrade run step, in the following task:

2020-06-19 06:05:34 | TASK [restart pacemaker resource for haproxy] **********************************
2020-06-19 06:05:34 | Friday 19 June 2020 06:05:33 -0400 (0:00:00.241) 0:01:48.907 ***********
2020-06-19 06:05:34 | skipping: [controller-0] => {"changed": false, "skip_reason": "Conditional result was False"}
2020-06-19 06:05:34 | skipping: [controller-1] => {"changed": false, "skip_reason": "Conditional result was False"}
2020-06-19 06:05:34 | skipping: [controller-2] => {"changed": false, "skip_reason": "Conditional result was False"}
2020-06-19 06:05:34 |
2020-06-19 06:05:34 | TASK [copy certificate from host to container] *********************************
2020-06-19 06:05:34 | Friday 19 June 2020 06:05:33 -0400 (0:00:00.236) 0:01:49.144 ***********
2020-06-19 06:05:34 | skipping: [controller-2] => {"changed": false, "skip_reason": "Conditional result was False"}
2020-06-19 06:05:34 | fatal: [controller-1]: FAILED! => {"changed": true, "cmd": "podman cp /etc/pki/tls/private/overcloud_endpoint.pem ad9612da1a4c\n0d7665445110:/etc/pki/tls/private/overcloud_endpoint.pem", "delta": "0:00:00.098841", "end": "2020-06-19 10:05:34.085772", "msg": "non-zero return code", "rc": 127, "start": "2020-06-19 10:05:33.986931", "stderr": "Error: invalid arguments /etc/pki/tls/private/overcloud_endpoint.pem, ad9612da1a4c you must use just one container\n/bin/sh: line 1: 0d7665445110:/etc/pki/tls/private/overcloud_endpoint.pem: No such file or directory", "stderr_lines": ["Error: invalid arguments /etc/pki/tls/private/overcloud_endpoint.pem, ad9612da1a4c you must use just one container", "/bin/sh: line 1: 0d7665445110:/etc/pki/tls/private/overcloud_endpoint.pem: No such file or directory"], "stdout": "", "stdout_lines": []}
2020-06-19 06:05:34 | changed: [controller-0] => {"changed": true, "cmd": "podman cp /etc/pki/tls/private/overcloud_endpoint.pem 3b4bafa53b98:/etc/pki/tls/private/overcloud_endpoint.pem", "delta": "0:00:00.473766", "end": "2020-06-19 10:05:34.391347", "rc": 0, "start": "2020-06-19 10:05:33.917581", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
2020-06-19 06:05:34 |
2020-06-19 06:05:34 | NO MORE HOSTS LEFT *************************************************************
2020-06-19 06:05:34 |
2020-06-19 06:05:34 | PLAY RECAP *********************************************************************
2020-06-19 06:05:34 | controller-0 : ok=103 changed=24 unreachable=0 failed=0 skipped=51 rescued=0 ignored=0
2020-06-19 06:05:34 | controller-1 : ok=95 changed=23 unreachable=0 failed=1 skipped=51 rescued=0 ignored=0
2020-06-19 06:05:34 | controller-2 : ok=95 changed=23 unreachable=0 failed=0 skipped=52 rescued=0 ignored=0
2020-06-19 06:05:34 |
2020-06-19 06:05:34 | Friday 19 June 2020 06:05:34 -0400 (0:00:00.791) 0:01:49.935 ***********

Log:
http://cougar11.scl.lab.tlv.redhat.com/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv6-vxlan-HA-no-ceph/12/undercloud-0.tar.gz?undercloud-0/home/stack/overcloud_upgrade_run_controller-2,controller-1,controller-0.log

Curiously, when upgrading the other two controllers this task was skipped, so the failure was not hit before.

Job logs: http://cougar11.scl.lab.tlv.redhat.com/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv6-vxlan-HA-no-ceph/12/

There is a recently merged patch which could be the reason we did not observe this failure earlier: https://review.opendev.org/#/c/724863/

Version-Release number of selected component (if applicable):

How reproducible:
Run the CI job: https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/upgrades/view/ffu/job/DFG-upgrades-ffu-ffu-upgrade-13-16.1_director-rhel-virthost-3cont_2comp-ipv6-vxlan-HA-no-ceph/

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Looks like the "get container_id" task returns two container IDs instead of the single one the "copy certificate from host to container" task assumes; a fix should probably iterate over the result as a list.
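A minimal sketch of the failure mode and the suggested fix, runnable outside Ansible. The two IDs are taken from the log above; the `echo` stands in for the real `podman cp` call, and the variable name `container_ids` is illustrative, not the name used in the actual task:

```shell
#!/bin/sh
# Simulate a "get container_id" step that matched TWO containers.
# The original task interpolated this whole newline-separated value into one
# "podman cp SRC <id>:DEST" command, so the shell saw two commands:
#   podman cp ... ad9612da1a4c            -> "you must use just one container"
#   0d7665445110:/etc/pki/...pem          -> "No such file or directory", rc=127
container_ids="ad9612da1a4c
0d7665445110"

# Suggested shape of the fix: treat the value as a list and copy the
# certificate into each matching container in turn. In the real task the
# loop body would be:
#   podman cp /etc/pki/tls/private/overcloud_endpoint.pem "${id}:/etc/pki/tls/private/overcloud_endpoint.pem"
for id in $container_ids; do
  echo "copy overcloud_endpoint.pem into container ${id}"
done
```

In the Ansible task itself the equivalent would be looping over the `stdout_lines` of the registered "get container_id" result rather than interpolating its `stdout` directly (an assumption about how the task is written, to be confirmed against the template).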
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3148