Description of problem:
If a minor update or even a simple redeploy of a 16.1 overcloud is performed, it fails on the task "copy certificate, chgrp, restart haproxy". The actual error output is:

2021-07-29 21:40:34 | TASK [copy certificate, chgrp, restart haproxy] ********************************
2021-07-29 21:40:34 | Thursday 29 July 2021 21:40:29 +0000 (0:00:00.079) 0:16:45.289 *********
2021-07-29 21:40:34 | failed: [central-controller0-0] (item=ae049b5377eb) => {"ansible_loop_var": "item", "changed": true, "cmd": "set -e\nif podman ps -f \"id=ae049b5377eb\" --format \"{{.Names}}\" | grep -q \"^haproxy-bundle\"; then\n tar -c /etc/pki/tls/private/overcloud_endpoint.pem | podman exec -i ae049b5377eb tar -C / -xv\nelse\n podman cp /etc/pki/tls/private/overcloud_endpoint.pem ae049b5377eb:/etc/pki/tls/private/overcloud_endpoint.pem\nfi\npodman exec --user root ae049b5377eb chgrp haproxy /etc/pki/tls/private/overcloud_endpoint.pem\npodman kill --signal=HUP ae049b5377eb\n", "delta": "0:00:00.588193", "end": "2021-07-29 21:40:30.936972", "item": "ae049b5377eb", "msg": "non-zero return code", "rc": 2, "start": "2021-07-29 21:40:30.348779", "stderr": "tar: Removing leading `/' from member names\ntar: This does not look like a tar archive\ntar: Exiting with failure status due to previous errors\ntime=\"2021-07-29T21:40:30Z\" level=error msg=\"read unixpacket @->/var/run/libpod/socket/6b6873c5d71e9cc2db80da8b02040a47645e74ddf34a5f986a25e5517edabfd5/attach: read: connection reset by peer\"\nError: non zero exit code: 2: OCI runtime error", "stderr_lines": ["tar: Removing leading `/' from member names", "tar: This does not look like a tar archive", "tar: Exiting with failure status due to previous errors", "time=\"2021-07-29T21:40:30Z\" level=error msg=\"read unixpacket @->/var/run/libpod/socket/6b6873c5d71e9cc2db80da8b02040a47645e74ddf34a5f986a25e5517edabfd5/attach: read: connection reset by peer\"", "Error: non zero exit code: 2: OCI runtime error"], "stdout": "", "stdout_lines": []}
2021-07-29 21:40:34 | changed: [central-controller0-0] => (item=ca8162173539) => {"ansible_loop_var": "item", "changed": true, "cmd": "set -e\nif podman ps -f \"id=ca8162173539\" --format \"{{.Names}}\" | grep -q \"^haproxy-bundle\"; then\n tar -c /etc/pki/tls/private/overcloud_endpoint.pem | podman exec -i ca8162173539 tar -C / -xv\nelse\n podman cp /etc/pki/tls/private/overcloud_endpoint.pem ca8162173539:/etc/pki/tls/private/overcloud_endpoint.pem\nfi\npodman exec --user root ca8162173539 chgrp haproxy /etc/pki/tls/private/overcloud_endpoint.pem\npodman kill --signal=HUP ca8162173539\n", "delta": "0:00:01.065111", "end": "2021-07-29 21:40:32.379248", "item": "ca8162173539", "rc": 0, "start": "2021-07-29 21:40:31.314137", "stderr": "", "stderr_lines": [], "stdout": "ca8162173539e4d54863a70a80e63b997b4d55e50b67c5d1b5819cfeb910a799", "stdout_lines": ["ca8162173539e4d54863a70a80e63b997b4d55e50b67c5d1b5819cfeb910a799"]}
2021-07-29 21:40:34 | changed: [central-controller0-0] => (item=654f1e423705) => {"ansible_loop_var": "item", "changed": true, "cmd": "set -e\nif podman ps -f \"id=654f1e423705\" --format \"{{.Names}}\" | grep -q \"^haproxy-bundle\"; then\n tar -c /etc/pki/tls/private/overcloud_endpoint.pem | podman exec -i 654f1e423705 tar -C / -xv\nelse\n podman cp /etc/pki/tls/private/overcloud_endpoint.pem 654f1e423705:/etc/pki/tls/private/overcloud_endpoint.pem\nfi\npodman exec --user root 654f1e423705 chgrp haproxy /etc/pki/tls/private/overcloud_endpoint.pem\npodman kill --signal=HUP 654f1e423705\n", "delta": "0:00:01.034272", "end": "2021-07-29 21:40:33.776165", "item": "654f1e423705", "rc": 0, "start": "2021-07-29 21:40:32.741893", "stderr": "", "stderr_lines": [], "stdout": "654f1e4237056ea10dc87fed5e09b321b3cfd9abd852e802eafc187443c5622d", "stdout_lines": ["654f1e4237056ea10dc87fed5e09b321b3cfd9abd852e802eafc187443c5622d"]}

It seems that the task steps were changed recently, which may have an impact:
https://review.opendev.org/c/openstack/tripleo-heat-templates/+/783949/

Version-Release number of selected component (if applicable):
openstack-tripleo-common-11.4.1-1.20210719133309.75bd92a.el8ost.noarch
openstack-tripleo-common-containers-11.4.1-1.20210719133309.75bd92a.el8ost.noarch
openstack-tripleo-heat-templates-11.3.2-1.20210720153309.29a02c1.el8ost.noarch
openstack-tripleo-image-elements-10.6.2-1.20210528012405.el8ost.noarch
openstack-tripleo-puppet-elements-11.2.2-1.20210528065605.f061f90.el8ost.noarch
openstack-tripleo-validations-11.3.2-1.20210715133307.4db92ba.el8ost.noarch
puppet-tripleo-11.5.0-1.20210622133307.f716ef5.el8ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy a 16.1 overcloud with TLS-everywhere
2. Redeploy the overcloud, or perform a minor update, by running the overcloud deploy command line again

Actual results:
The redeploy fails with the error shown above

Expected results:
Successful redeploy/update of the overcloud stack

Additional info:
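For readability, the "cmd" string from the failed item can be expanded from its JSON-escaped form into the shell sketch below. The commands are copied from the log; the function wrapper, its name copy_cert_to_container, and the optional certificate-path argument are illustrative additions and not part of tripleo-heat-templates. The original script relies on "set -e", which is approximated here with explicit "|| return". The tar messages in stderr indicate that the failed item took the first (tar) branch; the two successful items printed only a container ID, which would be consistent with the podman cp branch.

```shell
#!/bin/sh
# Readable form of the "cmd" field from the task output.
# copy_cert_to_container and its arguments are hypothetical names used
# only for this sketch; the podman/tar invocations come from the log.
copy_cert_to_container() {
    container_id="$1"
    # Default path is the one used in the log.
    cert="${2:-/etc/pki/tls/private/overcloud_endpoint.pem}"

    if podman ps -f "id=${container_id}" --format "{{.Names}}" | grep -q "^haproxy-bundle"; then
        # Container name starts with "haproxy-bundle": stream the file in via tar.
        # This is the step that returned rc=2 for item ae049b5377eb.
        tar -c "${cert}" | podman exec -i "${container_id}" tar -C / -xv || return
    else
        # Any other container: copy the file with podman cp.
        podman cp "${cert}" "${container_id}:${cert}" || return
    fi

    # Fix group ownership, then ask haproxy to reload via SIGHUP.
    podman exec --user root "${container_id}" chgrp haproxy "${cert}" || return
    podman kill --signal=HUP "${container_id}"
}
```

Calling copy_cert_to_container ae049b5377eb on the affected controller would rerun the same sequence for the container that failed.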
See https://github.com/containers/podman/issues/5046. Is that command run on podman 1.6.x, or after podman gets upgraded?
Another fix, https://github.com/containers/conmon/pull/131, was submitted for conmon. I am not sure which conmon version it ended up in.
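One way to answer the version question on the affected controller is sketched below. It assumes podman and conmon are in PATH and that the host is RPM-based; report_version is a hypothetical helper name used only here.

```shell
#!/bin/sh
# Print the version of a tool, or a note if it is not installed.
# report_version is an illustrative helper, not an existing utility.
report_version() {
    tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        "$tool" --version
    else
        echo "$tool: not found in PATH"
    fi
}

report_version podman
report_version conmon

# On RPM-based hosts, also show the exact package builds, if rpm is available.
if command -v rpm >/dev/null 2>&1; then
    rpm -q podman conmon || true
fi
```

Running this before and after the update would show whether the failing task ran against the old or the upgraded podman.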
I have tested the change and was able to confirm that we can get away with it.

So let's just work around this in OSP 16.1 for the time being.

thanks,
Michele
(In reply to Michele Baldessari from comment #8)
> I have tested the change and was able to confirm that we can get away with
> it.

The patch worked for me too.

> So let's just workaround this in OSP 16.1 for the time being.
>
> thanks,
> Michele
*** Bug 1992630 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.1.7 (Train) bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3762