Description of problem:

hybrid-overlay-node.exe is crashing trying to list pods on Windows Server 2019 Datacenter:

PS C:\var\log\hybrid-overlay> C:\\k\\hybrid-overlay-node.exe --node ip-10-0-158-179.us-east-2.compute.internal --k8s-kubeconfig c:\\k\\kubeconfig --logfile C:\\var\\log\\hybrid-overlay\\hybrid-overlay2.log
I1130 21:22:58.700822 2756 cert_rotation.go:137] Starting client certificate rotation controller
time="2020-11-30T21:22:58Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 10"
time="2020-11-30T21:22:58Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 12"
time="2020-11-30T21:22:58Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 10"
time="2020-11-30T21:22:58Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 12"
time="2020-11-30T21:22:58Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 13"
time="2020-11-30T21:22:58Z" level=info msg="currentVersion.Major < versionRange.MinVersion.Major: 9, 10"
panic: node "ip-10-0-158-179.us-east-2.compute.internal" not found

goroutine 1 [running]:
main.main.func1(0xc0003da140, 0xc000370500, 0x60)
	/build/windows-machine-config-operator/ovn-kubernetes/go-controller/hybrid-overlay/cmd/hybrid-overlay-node/hybrid-overlay-node.go:53 +0x126
github.com/urfave/cli/v2.(*App).RunContext(0xc0002a6d80, 0x1995b80, 0xc000034028, 0xc000128000, 0x7, 0x8, 0x0, 0x0)
	/build/windows-machine-config-operator/ovn-kubernetes/go-controller/vendor/github.com/urfave/cli/v2/app.go:315 +0x72d
main.main()
	/build/windows-machine-config-operator/ovn-kubernetes/go-controller/hybrid-overlay/cmd/hybrid-overlay-node/hybrid-overlay-node.go:58 +0x705

This is happening even though the node object is present on the cluster:

windows-machine-config-operator update-submodules {1} U:1 oc get nodes ip-10-0-158-179.us-east-2.compute.internal
NAME                                         STATUS   ROLES    AGE    VERSION
ip-10-0-158-179.us-east-2.compute.internal   Ready    worker   132m   v1.19.2-1007+ad738ba548b6d6

How reproducible:
Always

Steps to Reproduce:
1. Bring up a Windows node
2. Install the kubelet service
3. Run hybrid-overlay-node.exe

Actual results:

Expected results:

Additional info:
Commit "pkg/factory, pkg/node: let nodes watch less, skip headless services" [0] made changes in this area. The following patch fixes the issue:

diff --git a/go-controller/hybrid-overlay/pkg/controller/node_windows.go b/go-controller/hybrid-overlay/pkg/controller/node_windows.go
index d4c0209a..fe01c677 100644
--- a/go-controller/hybrid-overlay/pkg/controller/node_windows.go
+++ b/go-controller/hybrid-overlay/pkg/controller/node_windows.go
@@ -53,7 +53,7 @@ func newNodeController(kube kube.Interface,
 			"UDP port. Please make sure you install all the KB updates on your system.")
 	}

-	node, err := nodeLister.Get(nodeName)
+	node, err := kube.GetNode(nodeName)
 	if err != nil {
 		return nil, err
 	}

I am not suggesting that this is the correct fix but posting it here as a data point.

[0] https://github.com/openshift/ovn-kubernetes/commit/99e7c6a14e0a81bcc2ad6724ca693251218b5e00
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633