Description of problem:
In scenarios where hybrid-overlay or kube-proxy fails to come up when WMCO is configuring the Windows instances, the node still incorrectly shows as ready.

How reproducible:
Always

Steps to Reproduce:
1. Create a Windows node and force either hybrid-overlay or kube-proxy to fail during configuration

Actual results:
Node is shown as ready

Expected results:
Node should be marked as not ready
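The expected result can be restated as a small rule. The following is only an illustrative Python sketch of that rule, not WMCO's implementation: the function name, the `NETWORK_SERVICES` tuple, and the idea of passing in a set of failed service names are all assumptions made for the example.

```python
# Illustrative only: restates the "Expected results" of this report as code.
# The service names and this helper are assumptions, not WMCO identifiers.
NETWORK_SERVICES = ("hybrid-overlay-node", "kube-proxy")


def expected_node_status(failed_services):
    """Return the status the node should report after configuration.

    failed_services: iterable of names of services that failed to come up.
    """
    failed = sorted(s for s in NETWORK_SERVICES if s in set(failed_services))
    status = "NotReady" if failed else "Ready"
    return status, failed


if __name__ == "__main__":
    print(expected_node_status([]))
    print(expected_node_status(["hybrid-overlay-node"]))
```

With no failed services the rule yields Ready; if either networking service fails, it yields NotReady, which is the behavior this bug asks for.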
This bug has been verified on OCP 4.8 + vSphere + Windows Server 2019 and passed, thanks.

Version-Release number of selected component (if applicable):
WMCO built from https://github.com/openshift/windows-machine-config-operator/commit/1ca41c250ff937d1543559ba19e805a7473d45bf
OCP version 4.8.0-0.nightly-2021-04-30-201824

Steps:
1. Install OCP 4.8 on vSphere, then build and install WMCO; refer to https://github.com/openshift/windows-machine-config-operator/blob/master/docs/HACKING.md
2. Create a Windows machineset with Windows Server 2019
3. Check the WMCO log and watch the Windows node status

1) Once the kubelet service has started, the Windows node is Ready but cordoned.
$ oc logs -f deployment.apps/windows-machine-config-operator -n openshift-windows-machine-config-operator
...
2021-05-06T11:59:48.281Z INFO VM 172.31.249.149 configured kubelet {"cmd": "C:\\k\\\\wmcb.exe initialize-kubelet --ignition-file C:\\Windows\\Temp\\worker.ign --kubelet-path C:\\k\\kubelet.exe", "output": "Bootstrapping completed successfully"}
$ oc get nodes -l kubernetes.io/os=windows -owide
NAME              STATUS                     ROLES    AGE   VERSION                            INTERNAL-IP      EXTERNAL-IP      OS-IMAGE                       KERNEL-VERSION    CONTAINER-RUNTIME
winworker-zk6s4   Ready,SchedulingDisabled   worker   14s   v1.21.0-rc.0.1190+e22a836a8b2659   172.31.249.149   172.31.249.149   Windows Server 2019 Standard   10.0.17763.1697   docker://19.3.14

2) After the hybrid-overlay-node service fails to run, the Windows node is NotReady and cordoned.
2021-05-06T12:13:04.920Z ERROR controller-runtime.manager.controller.machine Reconciler error {"reconciler group": "machine.openshift.io", "reconciler kind": "Machine", "name": "winworker-zk6s4", "namespace": "openshift-machine-api", "error": "failed to configure Windows VM 422c050e-a0bc-b215-2a89-3986cbc84aab: configuring node network failed: error waiting for k8s.ovn.org/hybrid-overlay-distributed-router-gateway-mac node annotation for winworker-zk6s4: timeout waiting for k8s.ovn.org/hybrid-overlay-distributed-router-gateway-mac node annotation: timed out waiting for the condition", "errorVerbose": "timed out waiting for the condition\ntimeout waiting for k8s.ovn.org/hybrid-overlay-distributed-router-gateway-mac node annotation\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).waitForNodeAnnotation\n\t/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:306\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).configureNetwork\n\t/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:225\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).Configure.func1\n\t/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:170\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).Configure\n\t/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:193\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).addWorkerNode\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:440\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).Reconcile\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:374\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/intern
al/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371\nerror waiting for k8s.ovn.org/hybrid-overlay-distributed-router-gateway-mac node annotation for winworker-zk6s4\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).configureNetwork\n\t/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:226\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).Configure.func1\n\t/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:170\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).Configure\n\t/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:193\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).addWorkerNode\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:440\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).Reconcile\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:374\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/contro
ller.(*Controller).Start.func2.2\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371\nconfiguring node network failed\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).Configure.func1\n\t/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:171\ngithub.com/openshift/windows-machine-config-operator/pkg/nodeconfig.(*nodeConfig).Configure\n\t/build/windows-machine-config-operator/pkg/nodeconfig/nodeconfig.go:193\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).addWorkerNode\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:440\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).Reconcile\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:374\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371\nfailed to configure Windows VM 
422c050e-a0bc-b215-2a89-3986cbc84aab\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).addWorkerNode\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:442\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).Reconcile\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:374\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
    /build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:214
$ oc get nodes -l kubernetes.io/os=windows -owide
NAME              STATUS                        ROLES    AGE   VERSION                            INTERNAL-IP      EXTERNAL-IP      OS-IMAGE                       KERNEL-VERSION    CONTAINER-RUNTIME
winworker-zk6s4   NotReady,SchedulingDisabled   worker   53m   v1.21.0-rc.0.1190+e22a836a8b2659   172.31.249.149   172.31.249.149   Windows Server 2019 Standard   10.0.17763.1697   docker://19.3.14
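
The Reconciler error in step 2) is a timed-out wait for the k8s.ovn.org/hybrid-overlay-distributed-router-gateway-mac node annotation (waitForNodeAnnotation in the stack trace). For readers unfamiliar with the pattern, a poll-until-deadline wait can be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not WMCO's Go code: `get_annotations` is a hypothetical callable standing in for whatever fetches the node's annotations, and the timeout and interval defaults are arbitrary.

```python
import time


def wait_for_node_annotation(get_annotations, key, timeout=5.0, interval=0.5):
    """Poll until `key` appears in the node annotations or the deadline passes.

    get_annotations: hypothetical zero-argument callable returning a dict of
    the node's current annotations (an assumption for this sketch).
    Returns the annotation value, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = get_annotations().get(key)
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"timeout waiting for {key} node annotation")
        time.sleep(interval)


if __name__ == "__main__":
    key = "k8s.ovn.org/hybrid-overlay-distributed-router-gateway-mac"
    try:
        # An empty dict simulates a node where hybrid-overlay never came up.
        wait_for_node_annotation(lambda: {}, key, timeout=1.0, interval=0.2)
    except TimeoutError as err:
        print(err)
```

In the verified run above, the annotation never appears because hybrid-overlay-node failed, so a wait of this kind ends in the timeout branch and the node is left NotReady.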
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Platform for Windows Containers 3.0.0 security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3001