Bug 1943336

Summary: [4.7] [openshift-sdn] node pod should taint NoSchedule on termination; clear on startup
Product: OpenShift Container Platform Reporter: Dan Williams <dcbw>
Component: Networking Assignee: Surya Seetharaman <surya>
Networking sub component: openshift-sdn QA Contact: zhaozhanqi <zzhao>
Status: CLOSED DEFERRED Docs Contact:
Severity: high    
Priority: high CC: aconstan, anbhat, anusaxen, bbennett
Version: 4.7   
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1943334 Environment:
Last Closed: 2021-09-14 07:02:35 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1943334    
Bug Blocks:    

Description Dan Williams 2021-03-25 19:54:21 UTC
+++ This bug was initially created as a clone of Bug #1943334 +++

When an ovnkube node pod is upgraded, the old pods are killed and new ones are started some time later. The observed gap between old and new can be 1m or more. During this time no pods can be started on the node, but the node is still available for scheduling, so pods do get scheduled there and time out. They will get retried, but it's pointless to try running pods while the node networking is down.

One fix could be to taint the node NoSchedule in the ovnkube-node container termination hook, and clear any existing taint when ovnkube-node starts. ovnkube containers (and anything else network-y like multus) might have to tolerate this taint.
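A toleration for the proposed taint might look like the fragment below. This is only a sketch: the taint key is the one proposed in this bug, and which pod specs (ovnkube, multus, etc.) would actually need it is an assumption, not something that has been checked.

```yaml
# Hypothetical addition to the pod spec of network-critical daemonsets,
# so they can still be scheduled while the node carries the taint.
tolerations:
- key: k8s.ovn.org/network-unavailable
  operator: Exists
  effect: NoSchedule
```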

For example, the termination hook could be:

        lifecycle:
          preStop:
            exec:
              command:
              - /bin/bash
              - -c
              - |
                rm -f /etc/cni/net.d/10-ovn-kubernetes.conf
                kubectl taint nodes ${K8S_NODE} "k8s.ovn.org/network-unavailable:NoSchedule"

and then programmatically remove the taint in ovnkube-node after writing out the CNI config file when everything is initialized.
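
The removal side could be sketched as the shell equivalent below. The proposal is to do this inside ovnkube-node itself once initialization is done; the shell version just illustrates the operation. The function name and the KUBECTL override variable are hypothetical, and it assumes kubectl is available and allowed to modify the node.

```shell
#!/bin/bash
# Taint key as proposed in this bug.
TAINT_KEY="k8s.ovn.org/network-unavailable"

# clear_network_taint NODE
# Hypothetical helper: removes the NoSchedule taint from NODE.
# KUBECTL can be overridden (e.g. for dry runs); defaults to kubectl.
clear_network_taint() {
  local node="$1"
  # A trailing "-" on the taint spec tells "kubectl taint" to remove it.
  # Removal can fail when the taint is not present (e.g. first start),
  # so the failure is ignored here.
  "${KUBECTL:-kubectl}" taint nodes "${node}" "${TAINT_KEY}:NoSchedule-" || true
}
```

It would be called as `clear_network_taint "${K8S_NODE}"` after the CNI config file has been written.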

----

Same strategy could likely be done for openshift-sdn's node process.

Comment 3 Surya Seetharaman 2021-09-14 07:02:35 UTC
Closing this bug in favour of https://issues.redhat.com/browse/SDN-2241. The solution will have to be implemented in CRI-O.