Bug 1986370
Summary: | network-metrics-daemon pods are scheduled to nodes where network is not ready | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Oleg Bulatov <obulatov> |
Component: | Networking | Assignee: | Tomofumi Hayashi <tohayash> |
Networking sub component: | multus | QA Contact: | Weibin Liang <weliang> |
Status: | CLOSED NOTABUG | Docs Contact: | |
Severity: | medium | ||
Priority: | unspecified | CC: | anbhat |
Version: | 4.9 | ||
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2021-10-23 15:06:02 UTC | Type: | Bug |
Description
Oleg Bulatov
2021-07-27 11:56:53 UTC
ns/openshift-network-diagnostics pod/network-check-target-shj2x node/ip-10-0-180-110.us-west-2.compute.internal - reason/NetworkNotReady network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: No CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?

These events seem to be similar in nature.

The test mentioned has not been merged yet; the PR with the test is https://github.com/openshift/origin/pull/26323

This issue is not a bug.

Looking at the CI run, the OCP cluster adds a new node just before the error message appears. The kubelet starts early in the node deployment, and the daemonset pods, including the network-related pods, are deployed right away. At that point the network is not ready, because the CNI plugin is installed by the multus and ovn/openshift-sdn pods. In addition, the network-metrics-daemon daemonset uses a fairly lightweight container image, so its pod can be ready to start before the CNI plugin has been installed.

After several minutes, the multus/ovn/openshift-sdn pods install the CNI plugins on the node. The kubelet then stops reporting the error and starts network-metrics-daemon and the other pods. In other words, the kubelet reports this error (container runtime network not ready) because the network really is not ready yet.

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet.go#L2347

To prevent the error message, cri-o or the kubelet would need to recognize pod dependencies (i.e. hold back every pod except openshift-sdn/ovn/multus until the network is ready). However, this has not been discussed, designed, or implemented upstream, because upstream does not consider the message an error.
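The readiness condition behind the event above can be illustrated with a small sketch. This is not the kubelet's or cri-o's actual code. It only assumes that the runtime reports the network as ready once at least one CNI configuration file exists in the configured directory, and that such files end in `.conf`, `.conflist`, or `.json` (the usual CNI loader convention). The function names `cni_config_files` and `network_ready` are hypothetical, and `00-multus.conf` is used purely as an example file name.

```python
import tempfile
from pathlib import Path

# Assumption: file suffixes that CNI config loaders conventionally accept.
CNI_CONFIG_SUFFIXES = {".conf", ".conflist", ".json"}


def cni_config_files(conf_dir):
    """Return the sorted names of CNI config files found in conf_dir."""
    path = Path(conf_dir)
    if not path.is_dir():
        return []
    return sorted(
        entry.name
        for entry in path.iterdir()
        if entry.is_file() and entry.suffix in CNI_CONFIG_SUFFIXES
    )


def network_ready(conf_dir):
    """Hypothetical stand-in for the runtime's NetworkReady condition.

    Returns (ready, message). The network counts as ready only when the
    directory holds at least one CNI config file.
    """
    files = cni_config_files(conf_dir)
    if not files:
        return False, f"No CNI configuration file in {conf_dir}"
    return True, f"using {files[0]}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as conf_dir:
        # Fresh node: no network pod has written a CNI config yet.
        print(network_ready(conf_dir))
        # Later, a network pod writes its config (example file name).
        (Path(conf_dir) / "00-multus.conf").write_text("{}")
        print(network_ready(conf_dir))
```

Run against a temporary directory, the first call reports not ready and the second reports ready. This mirrors the sequence in the comment: pods that need the pod network wait with the NetworkNotReady event until the network pods have written the CNI configuration to the node.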