+++ This bug was initially created as a clone of Bug #1857224 +++

Description of problem:

After rebooting a node, it sometimes never transitions to the Ready state. This may happen more frequently under load. Typical messages are:

Jun 25 14:08:07 worker-2.ostest.test.metalkube.org podman[1424]: Error: error creating container storage: layer not known
Jun 25 14:08:07 worker-2.ostest.test.metalkube.org systemd[1]: nodeip-configuration.service: Main process exited, code=exited, status=125/n/a
Jun 25 14:08:07 worker-2.ostest.test.metalkube.org systemd[1]: nodeip-configuration.service: Failed with result 'exit-code'.
Jun 25 14:08:07 worker-2.ostest.test.metalkube.org systemd[1]: Failed to start Writes IP address configuration so that kubelet and crio services select a valid node IP.

The workaround is to ssh to the node, stop the crio and kubelet services, rm -rf /var/lib/containers, and restart crio and kubelet.

Version-Release number of selected component (if applicable):

4.5

How reproducible:

Infrequent to frequent

Steps to Reproduce:
1. Have active, running node
2. Reboot it until this happens
3.

Actual results:

Node stays not ready, with above messages

Expected results:

Node reboots and becomes ready

Additional info:
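The workaround from the description could be scripted roughly as follows. This is only a sketch, assuming it is run as root on the affected node after ssh-ing in. The DRY_RUN variable and the run and recover_node helper names are introduced here for illustration and are not part of the original report. Stopping kubelet before crio is also an assumption; the report does not specify an order. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the workaround from the bug description.
# DRY_RUN is a hypothetical switch added for this example: it defaults to 1,
# so nothing is changed unless the script is invoked with DRY_RUN=0.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

recover_node() {
    # Stop the services that use container storage.
    run systemctl stop kubelet
    run systemctl stop crio
    # Remove the container storage that reports "layer not known".
    run rm -rf /var/lib/containers
    # Bring the services back up.
    run systemctl start crio kubelet
}

recover_node
```

Running the script as-is lists the four commands. Running it with DRY_RUN=0 executes them, which deletes all local container storage on the node, so images are pulled again afterwards.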
*** Bug 1860984 has been marked as a duplicate of this bug. ***
Given that clone BZ 1857224 is CLOSED ERRATA for 4.5.3, we would normally assume that the patch has already been brought into the 4.6 nightly builds. Can someone confirm that that is the case, and maybe this BZ can be moved into ON_QA or VERIFIED?
The PR is not merged yet, but it is in the merge queue.
The PR was merged.
Issue not reproduced following https://bugzilla.redhat.com/show_bug.cgi?id=1857224#c37

$ oc version
Client Version: 4.5.2
Server Version: 4.6.0-0.nightly-2020-08-18-165040
Kubernetes Version: v1.19.0-rc.2+99cb93a-dirty
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196
Removing UpgradeBlocker from this older bug, to remove it from the suspect queue described in [1]. If you feel like this bug still needs to be a suspect, please add keyword again.

[1]: https://github.com/openshift/enhancements/pull/475