Bug 2218455 - HAProxy fails to restart during update from 16.1.8 to 16.2.5 - task "copy certificate, chgrp, restart haproxy"
Keywords:
Status: MODIFIED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.2 (Train)
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: z6
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Luca Miccini
QA Contact: Joe H. Rahme
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-06-29 08:08 UTC by Eric Nothen
Modified: 2023-08-10 11:14 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-11.6.1-2.20230808225213.9adcac6.el8osttrunk
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Links
System ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit 887360 | 0 | None | MERGED | Exclude neutron haproxy containers when updating TLS certificates | 2023-08-07 05:37:13 UTC
OpenStack gerrit 887678 | 0 | None | NEW | Exclude neutron haproxy containers when updating TLS certificates | 2023-08-07 05:37:33 UTC
Red Hat Issue Tracker OSP-26217 | 0 | None | None | None | 2023-06-29 08:20:20 UTC

Description Eric Nothen 2023-06-29 08:08:41 UTC
Description of problem:

HAProxy fails to restart during the update from 16.1.8 to 16.2.5, which eventually causes the update job to fail on controller-0.

Version-Release number of selected component (if applicable):
16.2.5

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

At the time HAProxy fails to restart, all of the RPMs have been updated on the controller, and all of the new container images have been pre-fetched.

Most of the failures show the same error, a container that does not exist:

$ grep FATAL 0020-mistral.tar.gz/var/log/containers/mistral/package_update.log | grep -c "does not exist"
19

~~~
{
  "ansible_loop_var": "item",
  "changed": true,
  "cmd": "set -e\nif podman ps -f \"id=01c2b754fee8\" --format \"{{.Names}}\" | grep -q \"^haproxy-bundle\"; then\n  tar -c /etc/pki/tls/private/overcloud_endpoint.pem | podman exec -i 01c2b754fee8 tar -C / -xv\nelse\n  podman cp /etc/pki/tls/private/overcloud_endpoint.pem 01c2b754fee8:/etc/pki/tls/private/overcloud_endpoint.pem\nfi\npodman exec --user root 01c2b754fee8 chgrp haproxy /etc/pki/tls/private/overcloud_endpoint.pem\npodman kill --signal=HUP 01c2b754fee8\n",
  "delta": "0:00:00.569902",
  "end": "2023-06-28 12:29:29.541934",
  "failed_when_result": true,
  "item": "01c2b754fee8",
  "msg": "non-zero return code",
  "rc": 125,
  "start": "2023-06-28 12:29:28.972032",
  "stderr": "Error: container \"01c2b754fee8\" does not exist",
  "stderr_lines": [
    "Error: container \"01c2b754fee8\" does not exist"
  ],
  "stdout": "",
  "stdout_lines": []
}
~~~
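
For readability, the escaped "cmd" string above can be unfolded into the script the task runs. This is a sketch derived from that string: the function wrapper, its name `copy_cert_and_reload`, and the `cid` and `cert` variables are introduced here for illustration and are not part of the original task.

```shell
# Unfolded form of the "cmd" string from the failing task.
# The function name and the cid/cert variables are hypothetical;
# the commands themselves are taken from the log output.
copy_cert_and_reload() (
  cid="$1"
  cert=/etc/pki/tls/private/overcloud_endpoint.pem
  set -e
  # haproxy-bundle containers get the file streamed in with tar;
  # any other container gets it through "podman cp".
  if podman ps -f "id=${cid}" --format "{{.Names}}" | grep -q "^haproxy-bundle"; then
    tar -c "$cert" | podman exec -i "$cid" tar -C / -xv
  else
    podman cp "$cert" "${cid}:${cert}"
  fi
  # Fix group ownership, then ask haproxy to reload with SIGHUP.
  podman exec --user root "$cid" chgrp haproxy "$cert"
  podman kill --signal=HUP "$cid"
)
```

Because of `set -e`, the script stops at the first podman command that fails. That is consistent with the single "does not exist" line in stderr and the empty stdout, which suggests the container ID passed to the loop was already gone when the script ran.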

One failure is slightly different; here podman exec fails with an OCI runtime error (rc 255):
~~~
{
  "ansible_loop_var": "item",
  "changed": true,
  "cmd": "set -e\nif podman ps -f \"id=0185c92bf4a9\" --format \"{{.Names}}\" | grep -q \"^haproxy-bundle\"; then\n  tar -c /etc/pki/tls/private/overcloud_endpoint.pem | podman exec -i 0185c92bf4a9 tar -C / -xv\nelse\n  podman cp /etc/pki/tls/private/overcloud_endpoint.pem 0185c92bf4a9:/etc/pki/tls/private/overcloud_endpoint.pem\nfi\npodman exec --user root 0185c92bf4a9 chgrp haproxy /etc/pki/tls/private/overcloud_endpoint.pem\npodman kill --signal=HUP 0185c92bf4a9\n",
  "delta": "0:00:00.696201",
  "end": "2023-06-28 12:29:58.321678",
  "failed_when_result": true,
  "item": "0185c92bf4a9",
  "msg": "non-zero return code",
  "rc": 255,
  "start": "2023-06-28 12:29:57.625477",
  "stderr": "Error: OCI runtime error: exec failed: container_linux.go:380: starting container process caused: process_linux.go:130: executing setns process caused: exit status 1",
  "stderr_lines": [
    "Error: OCI runtime error: exec failed: container_linux.go:380: starting container process caused: process_linux.go:130: executing setns process caused: exit status 1"
  ],
  "stdout": "",
  "stdout_lines": []
}
~~~
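
Both variants appear to come from running the script against a container that is not a usable HAProxy endpoint container: one ID no longer exists, and the other cannot be entered with podman exec. The linked gerrit change is titled "Exclude neutron haproxy containers when updating TLS certificates". A minimal sketch of that kind of filtering follows. It is not the merged patch: the helper name `filter_haproxy_ids` and the `neutron-haproxy` name prefix are assumptions made for illustration.

```shell
# Hypothetical helper: reads "ID NAME" lines on stdin and prints only the
# IDs whose container name does not start with the assumed
# "neutron-haproxy" prefix.
filter_haproxy_ids() {
  while read -r id name; do
    case "$name" in
      neutron-haproxy*) ;;            # assumed sidecar naming; skip it
      *) printf '%s\n' "$id" ;;       # keep every other container ID
    esac
  done
}

# Example use (needs podman, so it is left as a comment):
#   podman ps --format '{{.ID}} {{.Names}}' | filter_haproxy_ids
```

Filtering on the container name before the loop keeps the copy, chgrp, and HUP steps away from containers that were never meant to receive the endpoint certificate.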

