Description of problem:
Currently, during an upgrade from 4.4.26 to 4.5.12, the upgrade is stuck on the rollout of the ovs/ovn daemonsets. On one particular worker node, the OVS pod is stuck in a crash loop, and as a result ovnkube-node is unable to start as well. Increasing initialDelaySeconds lets the ovs container come up correctly, and the upgrade proceeds after that.

Version-Release number of selected component (if applicable):
4.5.12

How reproducible:
Very likely on worker nodes in clusters with a large number of nodes

Steps to Reproduce:
1. Deploy a large environment with OVNKubernetes
2. Perform an upgrade from 4.4.26 to 4.5.12
3.

Actual results:
Upgrade stuck on network operator due to ovs daemon rollout

Expected results:
Upgrade should proceed and the ovs container should come up correctly without being stuck in CrashLoopBackOff

Additional info:
Logs from ovnkube-node

2021-02-01T15:46:28Z|08085|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:46:36Z|12358|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:46:36Z|08086|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:46:44Z|12359|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:46:44Z|08087|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:46:52Z|12360|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:46:52Z|08088|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:00Z|12361|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:00Z|08089|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:08Z|12362|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:08Z|08090|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:16Z|12363|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:16Z|08091|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:24Z|12364|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:24Z|08092|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:32Z|12365|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:32Z|08093|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:40Z|12366|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:40Z|08094|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:48Z|12367|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:48Z|08095|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:48Z|08096|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected
2021-02-01T15:47:49Z|08097|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connection closed by peer
2021-02-01T15:47:56Z|12368|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:47:56Z|08098|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:04Z|12369|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:04Z|08099|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:12Z|12370|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:12Z|08100|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:20Z|12371|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:20Z|08101|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:28Z|12372|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:28Z|08102|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:36Z|12373|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:36Z|08103|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:44Z|12374|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:44Z|08104|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)
2021-02-01T15:48:52Z|12375|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refu

============================================================================

Logs from ovs pod

[kni@e16-h18-b03-fc640 ~]$ oc get pods -o wide | grep -i crash
ovnkube-node-kjgsx   1/2   CrashLoopBackOff   147   12h   192.168.220.44   worker031-fc640   <none>   <none>
ovs-node-9mj8m       0/1   CrashLoopBackOff   241   12h   192.168.220.44   worker031-fc640   <none>   <none>
[kni@e16-h18-b03-fc640 ~]$ oc logs ovs-node-9mj8m
Starting ovsdb-server.
Configuring Open vSwitch system IDs.
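
For triage, the pipe-delimited ovnkube-node records above can be summarized with a short script. The following is only a minimal sketch: it assumes each record has the form timestamp|sequence|module|LEVEL|message as seen in the pasted logs, and the function names (parse_vlog_line, summarize_failures) are hypothetical helpers, not part of any OVS or OpenShift tooling.

```python
from collections import Counter
from datetime import datetime


def parse_vlog_line(line):
    """Split one 'timestamp|seq|module|LEVEL|message' record into a dict.

    Returns None for lines that do not match that layout (assumed format,
    inferred from the log excerpt in this report).
    """
    parts = line.strip().split("|", 4)
    if len(parts) != 5:
        return None
    stamp, seq, module, level, message = parts
    try:
        when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return None
    return {"time": when, "seq": seq, "module": module,
            "level": level, "message": message}


def summarize_failures(lines):
    """Count WARN 'connection failed' records per module and report the
    number of seconds between the earliest and latest such record."""
    records = [r for r in map(parse_vlog_line, lines) if r is not None]
    failures = [r for r in records
                if r["level"] == "WARN" and "connection failed" in r["message"]]
    per_module = Counter(r["module"] for r in failures)
    if failures:
        times = [r["time"] for r in failures]
        span = (max(times) - min(times)).total_seconds()
    else:
        span = 0.0
    return per_module, span


# A few records copied from the ovnkube-node log in this report.
SAMPLE = [
    "2021-02-01T15:46:28Z|08085|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)",
    "2021-02-01T15:46:36Z|12358|rconn(ovn_pinctrl0)|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)",
    "2021-02-01T15:46:36Z|08086|rconn|WARN|unix:/var/run/openvswitch/br-int.mgmt: connection failed (Connection refused)",
    "2021-02-01T15:47:48Z|08096|reconnect|INFO|unix:/var/run/openvswitch/db.sock: connected",
]

if __name__ == "__main__":
    counts, seconds = summarize_failures(SAMPLE)
    for module, count in counts.most_common():
        print(f"{module}: {count} connection failures")
    print(f"failures span {seconds:.0f} seconds")
```

Feeding it the full output of oc logs (one record per line) would show how long br-int.mgmt stayed unreachable and which threads were retrying.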
*** Bug 1921561 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days