+++ This bug was initially created as a clone of Bug #2097685 +++

Description of problem:
I think we have a bug with Ironic agent restarting on the node, it might be not removing the old container and failing to restart.
I think it's in continuation of https://github.com/openshift/image-customization-controller/pull/34

Log:
Jun 16 09:09:48 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211421]: b619b974e5cb4330c628c8ee5f2bb3dadd8ef8cf9e56e48e300f7c6f8120d5ba
Jun 16 09:09:48 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: Started Ironic Agent.
Jun 16 09:09:49 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211472]: Error: error creating container storage: the container name "ironic-agent" is already in use by "6d87213ed9c456f20ae70dee343caf377dc0ba217a63f4a2e9d2b13a9faa06ca". You have to remove that container to be able to reuse that name.: that name is already in use
Jun 16 09:09:49 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: ironic-agent.service: Main process exited, code=exited, status=125/n/a
Jun 16 09:09:49 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: ironic-agent.service: Failed with result 'exit-code'.
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: ironic-agent.service: Service RestartSec=5s expired, scheduling restart.
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: ironic-agent.service: Scheduled restart job, restart counter is at 11096.
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: Stopped Ironic Agent.
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: Starting Ironic Agent...
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211529]: Trying to pull quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a2e56178af2677f1e825a1ede3d8ea80f34d6a89d203f083a5840ebf6abd3e17...
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211529]: Getting image source signatures
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211529]: Copying blob sha256:21a86dbf0e5a8583d4e1818a201dc0fc18e9ae20a1b98a71e43f9b60fc543466
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211529]: Copying blob sha256:545277d800059b32cf03377a9301094e9ac8aa4bb42d809766d7355ca9aa8652
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211529]: Copying blob sha256:de516cc59493b713e0b33a4954f7eb500383e59642e2897d02e63992d4576720
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211529]: Copying blob sha256:f70d60810c69edad990aaf0977a87c6d2bcc9cd52904fa6825f08507a9b6e7bc
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211529]: Copying config sha256:b619b974e5cb4330c628c8ee5f2bb3dadd8ef8cf9e56e48e300f7c6f8120d5ba
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211529]: Writing manifest to image destination
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211529]: Storing signatures
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211529]: b619b974e5cb4330c628c8ee5f2bb3dadd8ef8cf9e56e48e300f7c6f8120d5ba
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: Started Ironic Agent.
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com podman[1211581]: Error: error creating container storage: the container name "ironic-agent" is already in use by "6d87213ed9c456f20ae70dee343caf377dc0ba217a63f4a2e9d2b13a9faa06ca". You have to remove that container to be able to reuse that name.: that name is already in use
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: ironic-agent.service: Main process exited, code=exited, status=125/n/a
Jun 16 09:09:54 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: ironic-agent.service: Failed with result 'exit-code'.
Jun 16 09:09:59 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: ironic-agent.service: Service RestartSec=5s expired, scheduling restart.
Jun 16 09:09:59 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: ironic-agent.service: Scheduled restart job, restart counter is at 11097.
Jun 16 09:09:59 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: Stopped Ironic Agent.
Jun 16 09:09:59 cnfdf05.telco5gran.eng.rdu2.redhat.com systemd[1]: Starting Ironic Agent...

And container:
$ sudo podman ps -a
CONTAINER ID  IMAGE  COMMAND  CREATED  STATUS  PORTS  NAMES
6d87213ed9c4  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a2e56178af2677f1e825a1ede3d8ea80f34d6a89d203f083a5840ebf6abd3e17  16 hours ago  Exited (1) 16 hours ago  ironic-agent

Version-Release number of selected component (if applicable):
4.11 nightly

How reproducible:
always

Steps to Reproduce:
1. Start introspection and make it fail; the ironic-agent service will then keep trying to restart.

Actual results:
ironic-agent cannot restart

Expected results:
ironic-agent restarts successfully

--- Additional comment from Riccardo Pittau on 2022-06-16 11:28:17 UTC ---

this was discussed on slack https://coreos.slack.com/archives/CFP6ST0A3/p1655371117596509

--- Additional comment from OpenShift Automated Release Tooling on 2022-06-17 20:11:01 UTC ---

Elliott changed bug status from MODIFIED to ON_QA.
This bug is expected to ship in the next 4.11 release.
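
For illustration only: the "name is already in use" error in the log means the exited container still owns the name "ironic-agent", so every restart attempt fails with status 125 until that container is removed. A minimal sketch of a pre-start cleanup step follows. It is an assumption about one possible workaround, not necessarily what the referenced pull requests implement. The function name remove_stale_container and the PODMAN override variable are hypothetical; "podman container exists" and "podman rm -f" are existing podman subcommands.

```shell
#!/bin/sh
# Hypothetical pre-start cleanup sketch; not taken from the actual fix.
# PODMAN may be overridden (for example with a stub) when testing.
PODMAN="${PODMAN:-podman}"

# Remove a leftover container with the given name, if one exists.
# "podman container exists" exits 0 when the container is present
# and 1 when it is not.
remove_stale_container() {
    name="$1"
    if "$PODMAN" container exists "$name"; then
        echo "removing stale container: $name"
        "$PODMAN" rm -f "$name"
    fi
}
```

Calling remove_stale_container ironic-agent before the container is started again would free the name, so the next start does not hit the conflict shown in the log.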
We need to backport https://github.com/openshift/image-customization-controller/pull/55 to 4.10 after https://github.com/openshift/image-customization-controller/pull/54 merges.
Issue seems to be resolved

(.venv) [kni@titan52 ocp-edge-auto]$ oc get clusterversion
NAME     VERSION                              AVAILABLE  PROGRESSING  SINCE  STATUS
version  4.10.0-0.nightly-2022-06-08-150219  True       False        25h    Cluster version is 4.10.0-0.nightly-2022-06-08-150219
(.venv) [kni@titan52 ocp-edge-auto]$ oc ^C
(.venv) [kni@titan52 ocp-edge-auto]$ sudo podman ps -a
CONTAINER ID  IMAGE  COMMAND  CREATED  STATUS  PORTS  NAMES
d25a3e5af285  quay.io/ocp-edge-qe/nexus3:latest6  sh -c ${SONATYPE_...  26 hours ago  Up 26 hours ago  registry
ec4e821235dd  quay.io/ocp-edge-qe/httpd:latest  httpd-foreground  26 hours ago  Up 26 hours ago  image-cache

Ironic log inspection added.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.10.23 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:5568