Description of problem:
Pod stuck in ContainerCreating due to error (from `oc get events`):
1- failed to create pod network sandbox
2- netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input

Version-Release number of selected component (if applicable):
3.11.272 and 3.11.232

How reproducible:
Unknown at this stage. It seems to appear on worker nodes that have many applications running and have been running for a longer period of time.

Steps to Reproduce (uncertain):
1. Install OCP3.11 with Kuryr on OSP13 with CRI-O
2. Put a load on the cluster (applications) and then deploy more applications

Actual results:
New applications are stuck in ContainerCreating.

Expected results:
New applications are created and running.

Additional info:
The problem is resolved or removed by draining each node, performing the steps below, then uncordoning the node:

sudo systemctl disable crio
sudo systemctl disable atomic-openshift-node.service
sudo reboot
sudo rm -fr /var/lib/containers/*
sudo systemctl enable crio
sudo systemctl enable atomic-openshift-node.service
sudo systemctl start atomic-openshift-node.service
sudo systemctl start crio

We think it might have to do with the Kuryr CNI. The kuryr controller allocates the ports on OpenStack, and annotates the pods with the new IPs, but the kuryr-cni is unable to attach the network to the pods.
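The per-node workaround above, including the drain and uncordon steps that are only described in prose, could be written out as an ordered plan. This is a minimal sketch, not a tested procedure: the function name `print_node_reset_plan` is hypothetical, and the `--ignore-daemonsets --delete-local-data` drain flags are an assumption (the report does not say how the nodes were drained). The function only prints the commands; it does not run them, since the reboot in the middle has to be handled by hand.

```shell
# Hypothetical helper: print (do not execute) the workaround for one node.
# Usage: print_node_reset_plan <node-name>
print_node_reset_plan() {
  node="$1"
  cat <<EOF
oc adm drain $node --ignore-daemonsets --delete-local-data
# run the following on $node itself
sudo systemctl disable crio
sudo systemctl disable atomic-openshift-node.service
sudo reboot
sudo rm -fr /var/lib/containers/*
sudo systemctl enable crio
sudo systemctl enable atomic-openshift-node.service
sudo systemctl start atomic-openshift-node.service
sudo systemctl start crio
oc adm uncordon $node
EOF
}
```

The drain flags are an assumption; adjust them to whatever the cluster's drain policy requires before using the printed plan.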
Ran the tempest tests on v3.11.380 and all passed (with Docker, not CRI-O).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 3.11.380 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:0274
When updating from v3.11.346 to v3.11.386 I got the following:

(shiftstack) [stack@undercloud-0 ~]$ oc get pods
NAME                       READY     STATUS    RESTARTS   AGE
demo-68dbc445d-8dt5m       1/1       Running   0          7h
demo-68dbc445d-cw8p5       1/1       Running   0          7h
demo-68dbc445d-nrfxt       0/1       Error     0          7h
docker-registry-1-cm2wk    1/1       Running   0          8h
registry-console-1-h2lv9   0/1       Error     0          8h
router-1-8mkt2             1/1       Running   0          8h
router-1-9mtbp             1/1       Running   0          8h
router-1-bkcjf             1/1       Running   0          8h

and

(shiftstack) [stack@undercloud-0 ~]$ oc get pods -n kuryr
NAME                                READY     STATUS             RESTARTS   AGE
kuryr-cni-ds-4g78t                  1/2       CrashLoopBackOff   21         1h
kuryr-cni-ds-565df                  2/2       Running            0          8h
kuryr-cni-ds-7gm75                  1/2       CrashLoopBackOff   19         1h
kuryr-cni-ds-j4nrl                  2/2       Running            0          8h
kuryr-cni-ds-jqt4j                  1/2       CrashLoopBackOff   23         1h
kuryr-cni-ds-l99xw                  2/2       Running            0          8h
kuryr-cni-ds-n5n8h                  2/2       Running            0          8h
kuryr-cni-ds-q9fr7                  2/2       Running            0          8h
kuryr-controller-74c988b946-tldhv   0/1       Running            21         1h
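To pick the failing pods out of listings like the ones above, the table output can be filtered on the READY and STATUS columns. This is a rough sketch: the function name `unhealthy_pods` is hypothetical, and it assumes the default five-column `oc get pods` layout with a header line, as shown in this comment.

```shell
# Hypothetical helper: print the names of pods that are not fully ready
# or not in the Running state. Reads `oc get pods` table output on stdin.
unhealthy_pods() {
  awk 'NR > 1 {
    # $2 is READY in the form "ready/total"; $3 is STATUS
    split($2, ready, "/")
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Example (requires a logged-in oc client):
#   oc get pods -n kuryr | unhealthy_pods
```

Checking both columns matters here because kuryr-controller reports STATUS Running while READY is 0/1, so a filter on STATUS alone would miss it.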
Opened a bug: https://bugzilla.redhat.com/show_bug.cgi?id=1929170
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days