Description of problem:
We're seeing this failure in our tests:

tripleo_ironic_neutron_agent_healthcheck.service - ironic_neutron_agent healthcheck
   Loaded: loaded (/etc/systemd/system/tripleo_ironic_neutron_agent_healthcheck.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2019-12-06 18:28:05 UTC; 4s ago
  Process: 275678 ExecStart=/usr/bin/podman exec ironic_neutron_agent /openstack/healthcheck 5672 (code=exited, status=1/FAILURE)
 Main PID: 275678 (code=exited, status=1/FAILURE)

Dec 06 18:28:05 undercloud-0.redhat.local systemd[1]: Starting ironic_neutron_agent healthcheck...
Dec 06 18:28:05 undercloud-0.redhat.local podman[275678]: exec failed: container_linux.go:345: starting container process caused "exec: \"/openstack/healthcheck\": stat /openstack/healthcheck: no such file or directory"
Dec 06 18:28:05 undercloud-0.redhat.local podman[275678]: time="2019-12-06T18:28:05Z" level=error msg="Error removing exit file for container af5ba0b398aef2b4cc19bc96c167571673fd09818d0e304d3e1cb3e321fe8a8b exec session 2839de7b0cdd77e150e070e0bce4730748158e8416dcf6a2828a62df49f91073: remove /var/run/containers/storage/overlay-containers/af5ba0b398aef2b4cc19bc96c167571673fd09818d0e304d3e1cb3e321fe8a8b/userdata/exec_pid_2839de7b0cdd77e150e070e0bce4730748158e8416dcf6a2828a62df49f91073: no such file or directory"
Dec 06 18:28:05 undercloud-0.redhat.local podman[275678]: Error: exit status 1
Dec 06 18:28:05 undercloud-0.redhat.local systemd[1]: tripleo_ironic_neutron_agent_healthcheck.service: Main process exited, code=exited, status=1/FAILURE
Dec 06 18:28:05 undercloud-0.redhat.local systemd[1]: tripleo_ironic_neutron_agent_healthcheck.service: Failed with result 'exit-code'.
Dec 06 18:28:05 undercloud-0.redhat.local systemd[1]: Failed to start ironic_neutron_agent healthcheck.

Version-Release number of selected component (if applicable):
Container is: rhosp16-openstack-ironic-neutron-agent:20191202.1

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
This is probably a downstream container issue, as upstream has a block in tripleo-common for this container:

{% block ironic_neutron_agent_footer %}
RUN mkdir -p /openstack && \
    ln -s /usr/share/openstack-tripleo-common/healthcheck/ironic-neutron-agent /openstack/healthcheck && \
    chmod a+rx /openstack/healthcheck
{% endblock %}
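To confirm whether a given image actually carries the symlink that the upstream block creates, one option is to inspect an unpacked copy of the image root. The following is a minimal sketch, not part of tripleo-common: the function name `healthcheck_status` and the idea of pointing it at an unpacked root directory are assumptions made for illustration. It resolves only one level of symlink and treats absolute link targets as relative to the image root rather than the host root.

```python
import os
from pathlib import Path


def healthcheck_status(rootfs: Path, link: str = "openstack/healthcheck") -> str:
    """Classify the healthcheck entry inside an unpacked image root.

    Hypothetical helper: returns "missing" if the entry does not exist,
    "dangling" if it is a symlink whose target is absent, "ok" otherwise.
    """
    entry = rootfs / link
    # lexists() is true for a symlink even when its target is absent.
    if not os.path.lexists(entry):
        return "missing"
    if entry.is_symlink():
        target = os.readlink(entry)
        if os.path.isabs(target):
            # Absolute targets refer to the image root, not the host root.
            resolved = rootfs / target.lstrip("/")
        else:
            resolved = entry.parent / target
        return "ok" if resolved.exists() else "dangling"
    return "ok"
```

A result of "missing" would correspond to the `stat /openstack/healthcheck: no such file or directory` error in this report, while "dangling" would suggest the link exists but the tripleo-common healthcheck script was not installed in the image.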
https://review.opendev.org/#/c/693899/
I see the patch https://review.opendev.org/#/c/698580/ is installed in RHOS_TRUNK-16.0-RHEL-8-20200113.n.0; for example, it's in:

/var/lib/containers/storage/overlay/3b6e177aadc2704ead5708dc50b882c7d638beb296839ebafa8042a0574b34cc/diff/usr/share/openstack-tripleo-common-containers/container-images/tripleo_kolla_template_overrides.j2

However, I'm still seeing the failure in /var/log/messages:

Jan 14 18:27:41 undercloud-0 systemd[1]: Starting ironic_neutron_agent healthcheck...
Jan 14 18:27:41 undercloud-0 podman[79336]: exec failed: container_linux.go:345: starting container process caused "exec: \"/openstack/healthcheck\": stat /openstack/healthcheck: no such file or directory"
Jan 14 18:27:41 undercloud-0 podman[79336]: time="2020-01-14T18:27:41Z" level=error msg="Error removing exit file for container e82b0e8eb016d90dd60eb60b61dacb04cb8eff9b3c4b471c546b6363c1581d3f exec session 59103bca15cbb84fc30808137cd3ac682b020a6535d158fbecb384a94780e4e9: remove /var/run/containers/storage/overlay-containers/e82b0e8eb016d90dd60eb60b61dacb04cb8eff9b3c4b471c546b6363c1581d3f/userdata/exec_pid_59103bca15cbb84fc30808137cd3ac682b020a6535d158fbecb384a94780e4e9: no such file or directory"
Jan 14 18:27:41 undercloud-0 podman[79336]: Error: exit status 1
Jan 14 18:27:41 undercloud-0 systemd[1]: tripleo_ironic_neutron_agent_healthcheck.service: Main process exited, code=exited, status=1/FAILURE
Jan 14 18:27:41 undercloud-0 systemd[1]: tripleo_ironic_neutron_agent_healthcheck.service: Failed with result 'exit-code'.
Jan 14 18:27:41 undercloud-0 systemd[1]: Failed to start ironic_neutron_agent healthcheck.
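For triaging logs like the ones quoted in this report, a small scanner can pull out which exec path was missing and which units failed. This is a rough sketch under stated assumptions: the two regular expressions are derived only from the message formats shown here and may not match other podman or systemd versions, and `summarize_healthcheck_failures` is a hypothetical name.

```python
import re

# Patterns modelled on the log lines quoted in this report (an assumption;
# other podman/systemd versions may word these messages differently).
MISSING_EXEC = re.compile(
    r"exec failed: .*stat (?P<path>\S+): no such file or directory"
)
UNIT_FAILED = re.compile(
    r"systemd\[1\]: (?P<unit>\S+\.service): Failed with result '(?P<result>[^']+)'"
)


def summarize_healthcheck_failures(lines):
    """Return (missing_paths, failed_units) found in syslog-style lines."""
    missing, failed = [], []
    for line in lines:
        m = MISSING_EXEC.search(line)
        if m:
            missing.append(m.group("path"))
        m = UNIT_FAILED.search(line)
        if m:
            failed.append(m.group("unit"))
    return missing, failed
```

It could be fed the contents of /var/log/messages, for example `summarize_healthcheck_failures(open("/var/log/messages", errors="replace"))`. Two empty lists would be consistent with the healthcheck no longer failing in this particular way.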
Moving back to ASSIGNED, as we need this fix in the downstream container.
Verified. The healthcheck now starts:

Jan 31 14:46:13 hardprov-dl360-g9-01 systemd[1]: Started ironic_neutron_agent healthcheck.

and there are no errors in /var/log/messages.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0659