+++ This bug was initially created as a clone of Bug #1996615 +++ Description of problem: After added a BYOH Windows node, I'm trying to upgrade WMCO and get panic error: # oc logs -f deployment.apps/windows-machine-config-operator -n openshift-windows-machine-config-operator ... 2021-08-23T09:52:12.230Z INFO controllers.configmap instance requires upgrade {"node": "ip-10-0-135-226.us-east-2.compute.internal", "version": "3.0.0+af53aa0", "expected version": "3.0.0+4708e9d"} panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x4efbf7] goroutine 761 [running]: fmt.Fprintf(0x0, 0x0, 0x1dc7078, 0x13, 0xc0035db700, 0x2, 0x2, 0x1db7d54, 0xc0035db388, 0x40d97b) /usr/local/go/src/fmt/print.go:205 +0x77 k8s.io/kubectl/pkg/drain.(*Helper).evictPods.func1(0xc000bd2a90, 0x203c760, 0xc000421e00, 0xc001c4f468, 0xc001c3c9b0, 0xc0017f9137, 0x9, 0x0, 0x0, 0x0, ...) /build/windows-machine-config-operator/vendor/k8s.io/kubectl/pkg/drain/drain.go:276 +0xd99 created by k8s.io/kubectl/pkg/drain.(*Helper).evictPods /build/windows-machine-config-operator/vendor/k8s.io/kubectl/pkg/drain/drain.go:269 +0x233 # oc get pod -n openshift-windows-machine-config-operator NAME READY STATUS RESTARTS AGE windows-machine-config-operator-9f56f78f-57rwx 0/1 CrashLoopBackOff 6 9m55s Version-Release number of selected component (if applicable): OCP version: 4.8.0-0.nightly-2021-08-22-035234 WMCO release-4.8 branch commit 4708e9d0ba70ac04727217f8e64bea9cb49bb5f0 How reproducible: Always Steps to Reproduce: 1, Build WMCO locally with an older commit in release-4.8 branch, e.g. WMCO v3.1.0 2, Bootstrap a BYOH Windows node 3, Build WMCO locally with latest commit in release-4.8 branch, e.g. WMCO v3.1.1 4, Uninstall WMCO v3.1.0 and change catalogsource to v3.1.1 and install WMCO v3.1.1 Actual results: WMCO panic with BYOH Windows node Expected results: WMCO should not panic and reconfigure BYOH Windows node Additional info: --- Additional comment from sgao on 2021-08-24 05:24:15 UTC --- Found this bug is not related to upgrade, if I deconfigure a BYOH Windows instance, also got panic on OCP 4.8 Full log: # oc logs -f deployment.apps/windows-machine-config-operator -n openshift-windows-machine-config-operator 2021-08-24T05:02:17.399Z INFO version operator {"version": "3.0.0+af53aa0"} 2021-08-24T05:02:17.399Z INFO version go {"version": "go1.16 linux/amd64"} 2021-08-24T05:02:17.427Z INFO leader Trying to become the leader. I0824 05:02:18.478106 1 request.go:668] Waited for 1.044207723s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/user.openshift.io/v1?timeout=32s 2021-08-24T05:02:20.283Z DEBUG leader Found podname {"Pod.Name": "windows-machine-config-operator-85ffc8d564-tdr7n"} 2021-08-24T05:02:20.288Z DEBUG leader Found Pod {"Pod.Namespace": "openshift-windows-machine-config-operator", "Pod.Name": "windows-machine-config-operator-85ffc8d564-tdr7n"} 2021-08-24T05:02:20.291Z INFO leader No pre-existing lock was found. 2021-08-24T05:02:20.296Z INFO leader Became the leader. 2021-08-24T05:02:23.157Z INFO controller-runtime.metrics metrics server is starting to listen {"addr": "0.0.0.0:9182"} 2021-08-24T05:02:23.159Z ERROR controller.secret Unable to retrieve private key, please ensure it is created {"error": "the cache is not started, can not read objects"} main.main /build/windows-machine-config-operator/main.go:176 runtime.main /usr/local/go/src/runtime/proc.go:225 2021-08-24T05:02:23.198Z INFO setup starting manager 2021-08-24T05:02:23.499Z INFO controller-runtime.manager starting metrics server {"path": "/metrics"} 2021-08-24T05:02:23.499Z INFO controller-runtime.manager.controller.secret Starting EventSource {"reconciler group": "", "reconciler kind": "Secret", "source": "kind source: /, Kind="} 2021-08-24T05:02:23.500Z INFO controller-runtime.manager.controller.secret Starting EventSource {"reconciler group": "", "reconciler kind": "Secret", "source": "kind source: /, Kind="} 2021-08-24T05:02:23.500Z INFO controller-runtime.manager.controller.secret Starting Controller {"reconciler group": "", "reconciler kind": "Secret"} 2021-08-24T05:02:23.500Z INFO controller-runtime.manager.controller.configmap Starting EventSource {"reconciler group": "", "reconciler kind": "ConfigMap", "source": "kind source: /, Kind="} 2021-08-24T05:02:23.500Z INFO controller-runtime.manager.controller.configmap Starting EventSource {"reconciler group": "", "reconciler kind": "ConfigMap", "source": "kind source: /, Kind="} 2021-08-24T05:02:23.500Z INFO controller-runtime.manager.controller.configmap Starting Controller {"reconciler group": "", "reconciler kind": "ConfigMap"} 2021-08-24T05:02:23.500Z INFO controller-runtime.manager.controller.machine Starting EventSource {"reconciler group": "machine.openshift.io", "reconciler kind": "Machine", "source": "kind source: /, Kind="} 2021-08-24T05:02:23.500Z INFO controller-runtime.manager.controller.machine Starting EventSource {"reconciler group": "machine.openshift.io", "reconciler kind": "Machine", "source": "kind source: /, Kind="} 2021-08-24T05:02:23.500Z INFO controller-runtime.manager.controller.machine Starting Controller {"reconciler group": "machine.openshift.io", "reconciler kind": "Machine"} 2021-08-24T05:02:23.500Z INFO controller-runtime.manager.controller.certificatesigningrequest Starting EventSource {"reconciler group": "certificates.k8s.io", "reconciler kind": "CertificateSigningRequest", "source": "kind source: /, Kind="} 2021-08-24T05:02:23.501Z INFO controller-runtime.manager.controller.certificatesigningrequest Starting Controller {"reconciler group": "certificates.k8s.io", "reconciler kind": "CertificateSigningRequest"} 2021-08-24T05:02:23.801Z INFO controller-runtime.manager.controller.configmap Starting workers {"reconciler group": "", "reconciler kind": "ConfigMap", "worker count": 1} 2021-08-24T05:02:23.801Z INFO controllers.configmap processing {"instances in": "windows-instances"} 2021-08-24T05:02:23.801Z INFO controller-runtime.manager.controller.secret Starting workers {"reconciler group": "", "reconciler kind": "Secret", "worker count": 1} 2021-08-24T05:02:23.801Z INFO controller-runtime.manager.controller.machine Starting workers {"reconciler group": "machine.openshift.io", "reconciler kind": "Machine", "worker count": 1} 2021-08-24T05:02:23.802Z INFO controller-runtime.manager.controller.certificatesigningrequest Starting workers {"reconciler group": "certificates.k8s.io", "reconciler kind": "CertificateSigningRequest", "worker count": 1} 2021-08-24T05:02:23.814Z INFO metrics Prometheus configured {"endpoints": "windows-exporter", "port": 9182, "name": "metrics"} 2021-08-24T05:02:51.992Z DEBUG controller.windowsmachine Machine has no phase associated with it {"name": "sgao-8-5hc8g-windows-byoh-us-east-2a-w6ddm"} 2021-08-24T05:02:52.048Z DEBUG controller.windowsmachine Machine has no phase associated with it {"name": "sgao-8-5hc8g-windows-byoh-us-east-2a-w6ddm"} 2021-08-24T05:02:52.162Z DEBUG controller.windowsmachine invalid Machine {"name": "sgao-8-5hc8g-windows-byoh-us-east-2a-w6ddm", "error": "no IP addresses defined", "errorVerbose": "no IP addresses defined\ngithub.com/openshift/windows-machine-config-operator/controllers.getInternalIPAddress\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:487\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).isValidMachine\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:193\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).SetupWithManager.func2\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:107\nsigs.k8s.io/controller-runtime/pkg/predicate.Funcs.Update\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/predicate/predicate.go:87\nsigs.k8s.io/controller-runtime/pkg/source/internal.EventHandler.OnUpdate\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/source/internal/eventsource.go:88\nk8s.io/client-go/tools/cache.(*processorListener).run.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:775\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90\nk8s.io/client-go/tools/cache.(*processorListener).run\n\t/build/windows-machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:771\nk8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:73\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"} 2021-08-24T05:02:54.163Z DEBUG controller.windowsmachine invalid Machine {"name": "sgao-8-5hc8g-windows-byoh-us-east-2a-w6ddm", "error": "no IP addresses defined", "errorVerbose": "no IP addresses defined\ngithub.com/openshift/windows-machine-config-operator/controllers.getInternalIPAddress\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:487\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).isValidMachine\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:193\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).SetupWithManager.func2\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:107\nsigs.k8s.io/controller-runtime/pkg/predicate.Funcs.Update\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/predicate/predicate.go:87\nsigs.k8s.io/controller-runtime/pkg/source/internal.EventHandler.OnUpdate\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/source/internal/eventsource.go:88\nk8s.io/client-go/tools/cache.(*processorListener).run.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:775\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90\nk8s.io/client-go/tools/cache.(*processorListener).run\n\t/build/windows-machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:771\nk8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:73\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"} 2021-08-24T05:05:18.652Z INFO controllers.configmap processing {"instances in": "windows-instances"} 2021-08-24T05:05:18.665Z INFO metrics Prometheus configured {"endpoints": "windows-exporter", "port": 9182, "name": "metrics"} 2021-08-24T05:05:23.246Z INFO controllers.configmap processing {"instances in": "windows-instances"} 2021-08-24T05:05:23.246Z DEBUG wc 10.0.159.251 initializing SSH connection 2021-08-24T05:07:34.780Z DEBUG wc 10.0.159.251 SSH dial {"IP Address": "10.0.159.251", "error": "dial tcp 10.0.159.251:22: connect: connection timed out"} 2021-08-24T05:09:36.620Z INFO wc 10.0.159.251 configuring 2021-08-24T05:09:36.755Z DEBUG wc 10.0.159.251 run {"cmd": "if not exist C:\\k\\ mkdir C:\\k\\", "out": ""} 2021-08-24T05:09:36.770Z DEBUG wc 10.0.159.251 run {"cmd": "if not exist C:\\Temp\\ mkdir C:\\Temp\\", "out": ""} 2021-08-24T05:09:36.785Z DEBUG wc 10.0.159.251 run {"cmd": "if not exist C:\\k\\cni\\ mkdir C:\\k\\cni\\", "out": ""} 2021-08-24T05:09:36.800Z DEBUG wc 10.0.159.251 run {"cmd": "if not exist C:\\k\\cni\\config\\ mkdir C:\\k\\cni\\config\\", "out": ""} 2021-08-24T05:09:36.818Z DEBUG wc 10.0.159.251 run {"cmd": "if not exist C:\\var\\log\\ mkdir C:\\var\\log\\", "out": ""} 2021-08-24T05:09:36.833Z DEBUG wc 10.0.159.251 run {"cmd": "if not exist C:\\var\\log\\kube-proxy\\ mkdir C:\\var\\log\\kube-proxy\\", "out": ""} 2021-08-24T05:09:36.848Z DEBUG wc 10.0.159.251 run {"cmd": "if not exist C:\\var\\log\\hybrid-overlay\\ mkdir C:\\var\\log\\hybrid-overlay\\", "out": ""} 2021-08-24T05:09:36.848Z INFO wc 10.0.159.251 transferring files 2021-08-24T05:09:39.976Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\\\hybrid-overlay-node.exe", "out": "False\r\n"} 2021-08-24T05:09:39.976Z DEBUG wc 10.0.159.251 copy {"local file": "/payload/hybrid-overlay-node.exe", "remote dir": "C:\\k\\"} 2021-08-24T05:09:46.984Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\\\kube-proxy.exe", "out": "False\r\n"} 2021-08-24T05:09:46.984Z DEBUG wc 10.0.159.251 copy {"local file": "/payload//kube-node/kube-proxy.exe", "remote dir": "C:\\k\\"} 2021-08-24T05:09:53.883Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\\\wmcb.exe", "out": "False\r\n"} 2021-08-24T05:09:53.883Z DEBUG wc 10.0.159.251 copy {"local file": "/payload/wmcb.exe", "remote dir": "C:\\k\\"} 2021-08-24T05:09:57.469Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\cni\\\\flannel.exe", "out": "False\r\n"} 2021-08-24T05:09:57.469Z DEBUG wc 10.0.159.251 copy {"local file": "/payload//cni/flannel.exe", "remote dir": "C:\\k\\cni\\"} 2021-08-24T05:09:59.975Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\Temp\\\\hns.psm1", "out": "False\r\n"} 2021-08-24T05:09:59.975Z DEBUG wc 10.0.159.251 copy {"local file": "/payload//powershell/hns.psm1", "remote dir": "C:\\Temp\\"} 2021-08-24T05:10:01.769Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\cni\\\\host-local.exe", "out": "False\r\n"} 2021-08-24T05:10:01.769Z DEBUG wc 10.0.159.251 copy {"local file": "/payload//cni/host-local.exe", "remote dir": "C:\\k\\cni\\"} 2021-08-24T05:10:03.939Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\cni\\\\win-overlay.exe", "out": "False\r\n"} 2021-08-24T05:10:03.939Z DEBUG wc 10.0.159.251 copy {"local file": "/payload//cni/win-overlay.exe", "remote dir": "C:\\k\\cni\\"} 2021-08-24T05:10:06.409Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\Temp\\\\wget-ignore-cert.ps1", "out": "False\r\n"} 2021-08-24T05:10:06.409Z DEBUG wc 10.0.159.251 copy {"local file": "/payload//powershell/wget-ignore-cert.ps1", "remote dir": "C:\\Temp\\"} 2021-08-24T05:10:08.290Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\\\windows_exporter.exe", "out": "False\r\n"} 2021-08-24T05:10:08.290Z DEBUG wc 10.0.159.251 copy {"local file": "/payload/windows_exporter.exe", "remote dir": "C:\\k\\"} 2021-08-24T05:10:11.978Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\cni\\\\win-bridge.exe", "out": "False\r\n"} 2021-08-24T05:10:11.978Z DEBUG wc 10.0.159.251 copy {"local file": "/payload//cni/win-bridge.exe", "remote dir": "C:\\k\\cni\\"} 2021-08-24T05:10:14.560Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\\\kubelet.exe", "out": "False\r\n"} 2021-08-24T05:10:14.560Z DEBUG wc 10.0.159.251 copy {"local file": "/payload//kube-node/kubelet.exe", "remote dir": "C:\\k\\"} 2021-08-24T05:10:26.089Z DEBUG wc 10.0.159.251 run {"cmd": "sc.exe create windows_exporter binPath=\"C:\\k\\windows_exporter.exe --collectors.enabled cpu,cs,logical_disk,net,os,service,system,textfile,container,memory,cpu_info\" start=auto", "out": "[SC] CreateService SUCCESS\r\n"} 2021-08-24T05:10:26.127Z DEBUG wc 10.0.159.251 run {"cmd": "sc.exe query windows_exporter", "out": "\r\nSERVICE_NAME: windows_exporter \r\n TYPE : 10 WIN32_OWN_PROCESS \r\n STATE : 1 STOPPED \r\n WIN32_EXIT_CODE : 1077 (0x435)\r\n SERVICE_EXIT_CODE : 0 (0x0)\r\n CHECKPOINT : 0x0\r\n WAIT_HINT : 0x0\r\n"} 2021-08-24T05:10:46.231Z DEBUG wc 10.0.159.251 run {"cmd": "sc.exe start windows_exporter", "out": "\r\nSERVICE_NAME: windows_exporter \r\n TYPE : 10 WIN32_OWN_PROCESS \r\n STATE : 2 START_PENDING \r\n (NOT_STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)\r\n WIN32_EXIT_CODE : 0 (0x0)\r\n SERVICE_EXIT_CODE : 0 (0x0)\r\n CHECKPOINT : 0x0\r\n WAIT_HINT : 0x7d0\r\n PID : 3260\r\n FLAGS : \r\n"} 2021-08-24T05:10:46.231Z INFO wc 10.0.159.251 configured {"service": "windows_exporter", "args": "--collectors.enabled cpu,cs,logical_disk,net,os,service,system,textfile,container,memory,cpu_info\""} 2021-08-24T05:10:48.949Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass C:\\Temp\\wget-ignore-cert.ps1 -server https://api-int.sgao-8.qe.devcluster.openshift.com:22623/config/worker -output C:\\Windows\\Temp\\worker.ign -acceptHeader application/vnd.coreos.ignition+json`;version=3.1.0", "out": ""} 2021-08-24T05:10:52.138Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass C:\\k\\\\wmcb.exe initialize-kubelet --ignition-file C:\\Windows\\Temp\\worker.ign --kubelet-path C:\\k\\kubelet.exe", "out": "Bootstrapping completed successfully"} 2021-08-24T05:10:52.138Z INFO wc 10.0.159.251 configured kubelet {"cmd": "C:\\k\\\\wmcb.exe initialize-kubelet --ignition-file C:\\Windows\\Temp\\worker.ign --kubelet-path C:\\k\\kubelet.exe", "output": "Bootstrapping completed successfully"} 2021-08-24T05:10:52.493Z INFO controllers.CertificateSigningRequests CSR approved {"CSR": "csr-4gtl7"} 2021-08-24T05:10:55.454Z INFO controllers.CertificateSigningRequests CSR approved {"CSR": "csr-c5vk4"} 2021-08-24T05:11:22.181Z INFO wc 10.0.159.251 configure {"service": "hybrid-overlay-node", "args": "--node ip-10-0-159-251.us-east-2.compute.internal --k8s-kubeconfig c:\\k\\kubeconfig --windows-service --logfile C:\\var\\log\\hybrid-overlay\\hybrid-overlay.log\" depend= kubelet"} 2021-08-24T05:11:22.237Z DEBUG wc 10.0.159.251 run {"cmd": "sc.exe create hybrid-overlay-node binPath=\"C:\\k\\hybrid-overlay-node.exe --node ip-10-0-159-251.us-east-2.compute.internal --k8s-kubeconfig c:\\k\\kubeconfig --windows-service --logfile C:\\var\\log\\hybrid-overlay\\hybrid-overlay.log\" depend= kubelet start=auto", "out": "[SC] CreateService SUCCESS\r\n"} 2021-08-24T05:11:22.263Z DEBUG wc 10.0.159.251 run {"cmd": "sc.exe query hybrid-overlay-node", "out": "\r\nSERVICE_NAME: hybrid-overlay-node \r\n TYPE : 10 WIN32_OWN_PROCESS \r\n STATE : 1 STOPPED \r\n WIN32_EXIT_CODE : 1077 (0x435)\r\n SERVICE_EXIT_CODE : 0 (0x0)\r\n CHECKPOINT : 0x0\r\n WAIT_HINT : 0x0\r\n"} 2021-08-24T05:11:22.739Z DEBUG wc 10.0.159.251 run {"cmd": "sc.exe start hybrid-overlay-node", "out": "\r\nSERVICE_NAME: hybrid-overlay-node \r\n TYPE : 10 WIN32_OWN_PROCESS \r\n STATE : 2 START_PENDING \r\n (NOT_STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)\r\n WIN32_EXIT_CODE : 0 (0x0)\r\n SERVICE_EXIT_CODE : 0 (0x0)\r\n CHECKPOINT : 0x0\r\n WAIT_HINT : 0x7d0\r\n PID : 1016\r\n FLAGS : \r\n"} 2021-08-24T05:11:22.783Z DEBUG wc 10.0.159.251 run {"cmd": "sc.exe query hybrid-overlay-node", "out": "\r\nSERVICE_NAME: hybrid-overlay-node \r\n TYPE : 10 WIN32_OWN_PROCESS \r\n STATE : 4 RUNNING \r\n (STOPPABLE, NOT_PAUSABLE, ACCEPTS_SHUTDOWN)\r\n WIN32_EXIT_CODE : 0 (0x0)\r\n SERVICE_EXIT_CODE : 0 (0x0)\r\n CHECKPOINT : 0x0\r\n WAIT_HINT : 0x0\r\n"} 2021-08-24T05:13:26.027Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Get-HnsNetwork", "out": "\r\n\r\nActivityId : 3F65AADE-E980-44A7-9512-4E0B7C6EB812\r\nAdditionalParams : \r\nCurrentEndpointCount : 0\r\nExtensions : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False; Name=Microsoft Windows Filtering \r\n Platform}, @{Id=E9B59CFA-2BE1-4B21-828F-B6FBDBDDC017; IsEnabled=False; Name=Microsoft Azure \r\n VFP Switch Extension}, @{Id=EA24CD6C-D17A-4348-9190-09F0D5BE83DD; IsEnabled=True; \r\n Name=Microsoft NDIS Capture}}\r\nFlags : 0\r\nHealth : @{AddressNotificationMissedCount=0; AddressNotificationSequenceNumber=0; \r\n InterfaceNotificationMissedCount=0; InterfaceNotificationSequenceNumber=0; LastErrorCode=0; \r\n LastUpdateTime=132742550242152330; RouteNotificationMissedCount=0; \r\n RouteNotificationSequenceNumber=0}\r\nID : E06E62E8-C35E-462A-870D-8C0BECE58CC7\r\nIPv6 : False\r\nLayeredOn : 4FFEC140-8139-4482-A787-AAC40F2FB5ED\r\nMacPools : {@{EndMacAddress=00-15-5D-9B-1F-FF; StartMacAddress=00-15-5D-9B-10-00}}\r\nMaxConcurrentEndpoints : 0\r\nName : nat\r\nNatName : ICSA21632B3-526B-4199-A32D-B87EDCEF1DC3\r\nPolicies : {}\r\nResources : @{AdditionalParams=; AllocationOrder=2; Allocators=System.Object[]; Health=; \r\n ID=3F65AADE-E980-44A7-9512-4E0B7C6EB812; PortOperationTime=0; State=1; SwitchOperationTime=0; \r\n VfpOperationTime=0; parentId=D7848AFB-E3F6-4302-8FB3-02AF4102B8FD}\r\nState : 1\r\nSubnets : {@{AdditionalParams=; AddressPrefix=172.24.192.0/20; GatewayAddress=172.24.192.1; Health=; \r\n ID=14DFA1DA-8F7B-4095-9484-EF6287E51ABB; Policies=System.Object[]; State=0}}\r\nTotalEndpoints : 0\r\nType : nat\r\nVersion : 38654705669\r\n\r\nActivityId : EE77D950-8550-49D6-B4EB-E2EC090643A3\r\nAdditionalParams : \r\nCurrentEndpointCount : 0\r\nDNSServerCompartment : 3\r\nDrMacAddress : 00-15-5D-BD-1B-0D\r\nExtensions : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False; Name=Microsoft Windows Filtering \r\n Platform}, @{Id=E9B59CFA-2BE1-4B21-828F-B6FBDBDDC017; IsEnabled=True; Name=Microsoft Azure \r\n VFP Switch Extension}, @{Id=EA24CD6C-D17A-4348-9190-09F0D5BE83DD; IsEnabled=True; \r\n Name=Microsoft NDIS Capture}}\r\nFlags : 0\r\nHealth : @{LastErrorCode=0; LastUpdateTime=132742554829538077}\r\nID : 74184937-92DE-405E-B968-7F184CF93F13\r\nIPv6 : False\r\nInterfaceConstraint : @{InterfaceGuid=00000000-0000-0000-0000-000000000000}\r\nLayeredOn : FE8BDB4A-FD4A-4245-B7BB-7D1DEC18669F\r\nMacPools : {@{EndMacAddress=00-15-5D-DD-BF-FF; StartMacAddress=00-15-5D-DD-B0-00}}\r\nManagementIP : 10.0.159.251\r\nMaxConcurrentEndpoints : 0\r\nName : BaseOVNKubernetesHybridOverlayNetwork\r\nPolicies : {}\r\nResources : @{AdditionalParams=; AllocationOrder=1; Allocators=System.Object[]; Health=; \r\n ID=EE77D950-8550-49D6-B4EB-E2EC090643A3; PortOperationTime=0; State=1; SwitchOperationTime=0; \r\n VfpOperationTime=0; parentId=70258D96-A17E-4E4C-9727-42887EACA4A4}\r\nState : 1\r\nSubnets : {@{AdditionalParams=; AddressPrefix=100.64.0.0/30; GatewayAddress=100.64.0.2; Health=; \r\n ID=239CF3FE-1CE2-402F-9AB7-16971E29F09D; ObjectType=5; Policies=System.Object[]; State=0}}\r\nTotalEndpoints : 0\r\nType : Overlay\r\nVersion : 38654705669\r\n\r\nActivityId : 9D121761-9A0E-472B-9B75-238EEB516B52\r\nAdditionalParams : \r\nAutomaticDNS : True\r\nCurrentEndpointCount : 0\r\nDNSServerCompartment : 4\r\nDrMacAddress : 00-15-5D-BD-1B-0D\r\nExtensions : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False; Name=Microsoft Windows Filtering \r\n Platform}, @{Id=E9B59CFA-2BE1-4B21-828F-B6FBDBDDC017; IsEnabled=True; Name=Microsoft Azure \r\n VFP Switch Extension}, @{Id=EA24CD6C-D17A-4348-9190-09F0D5BE83DD; IsEnabled=True; \r\n Name=Microsoft NDIS Capture}}\r\nFlags : 8\r\nHealth : @{LastErrorCode=0; LastUpdateTime=132742554970525883}\r\nID : 2574A48D-FC38-4C5C-AFFC-837C0196432F\r\nIPv6 : False\r\nInterfaceConstraint : @{InterfaceGuid=00000000-0000-0000-0000-000000000000}\r\nLayeredOn : FE8BDB4A-FD4A-4245-B7BB-7D1DEC18669F\r\nMacPools : {@{EndMacAddress=00-15-5D-D5-7F-FF; StartMacAddress=00-15-5D-D5-70-00}}\r\nManagementIP : 10.0.159.251\r\nMaxConcurrentEndpoints : 0\r\nName : OVNKubernetesHybridOverlayNetwork\r\nPolicies : {@{Type=HostRoute}, @{DestinationPrefix=10.131.0.0/23; \r\n DistributedRouterMacAddress=0a:58:0a:83:00:03; IsolationId=4097; \r\n ProviderAddress=10.0.153.243; Type=RemoteSubnetRoute}, @{DestinationPrefix=10.128.0.0/23; \r\n DistributedRouterMacAddress=0a:58:0a:80:00:03; IsolationId=4097; \r\n ProviderAddress=10.0.154.199; Type=RemoteSubnetRoute}, @{DestinationPrefix=10.129.2.0/23; \r\n DistributedRouterMacAddress=0a:58:0a:81:02:03; IsolationId=4097; \r\n ProviderAddress=10.0.171.166; Type=RemoteSubnetRoute}...}\r\nResources : @{AdditionalParams=; AllocationOrder=1; Allocators=System.Object[]; Health=; \r\n ID=9D121761-9A0E-472B-9B75-238EEB516B52; PortOperationTime=0; State=1; SwitchOperationTime=0; \r\n VfpOperationTime=0; parentId=70258D96-A17E-4E4C-9727-42887EACA4A4}\r\nState : 1\r\nSubnets : {@{AdditionalParams=; AddressPrefix=10.132.8.0/24; GatewayAddress=10.132.8.1; Health=; \r\n ID=5FE59656-544F-427F-85A2-1F970ABDC085; ObjectType=5; Policies=System.Object[]; State=0}}\r\nTotalEndpoints : 0\r\nType : Overlay\r\nVersion : 38654705669\r\n\r\n\r\n\r\n"} 2021-08-24T05:13:26.027Z INFO wc 10.0.159.251 configured {"service": "hybrid-overlay-node", "args": "--node ip-10-0-159-251.us-east-2.compute.internal --k8s-kubeconfig c:\\k\\kubeconfig --windows-service --logfile C:\\var\\log\\hybrid-overlay\\hybrid-overlay.log\" depend= kubelet"} 2021-08-24T05:13:43.196Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\cni\\config\\\\cni.conf", "out": "False\r\n"} 2021-08-24T05:13:43.197Z DEBUG wc 10.0.159.251 copy {"local file": "/tmp/cni015561117/cni.conf", "remote dir": "C:\\k\\cni\\config\\"} 2021-08-24T05:14:16.480Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass C:\\k\\wmcb.exe configure-cni --cni-dir=\"C:\\k\\cni\\ --cni-config=\"C:\\k\\cni\\config\\cni.conf", "out": "CNI configuration completed successfully"} 2021-08-24T05:14:16.481Z INFO wc 10.0.159.251 configured kubelet for CNI {"cmd": "C:\\k\\wmcb.exe configure-cni --cni-dir=\"C:\\k\\cni\\ --cni-config=\"C:\\k\\cni\\config\\cni.conf", "output": "CNI configuration completed successfully"} 2021-08-24T05:14:34.285Z DEBUG wc 10.0.159.251 run {"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass \"Import-Module -DisableNameChecking C:\\Temp\\hns.psm1; $net = (Get-HnsNetwork | where { $_.Name -eq 'OVNKubernetesHybridOverlayNetwork' }); $endpoint = New-HnsEndpoint -NetworkId $net.ID -Name VIPEndpoint; Attach-HNSHostEndpoint -EndpointID $endpoint.ID -CompartmentID 1; (Get-NetIPConfiguration -AllCompartments -All -Detailed | where { $_.NetAdapter.LinkLayerAddress -eq $endpoint.MacAddress }).IPV4Address.IPAddress.Trim()\"", "out": "10.132.8.141\r\n"} 2021-08-24T05:14:34.377Z DEBUG wc 10.0.159.251 run {"cmd": "sc.exe create kube-proxy binPath=\"C:\\k\\kube-proxy.exe --windows-service --v=4 --proxy-mode=kernelspace --feature-gates=WinOverlay=true --hostname-override=ip-10-0-159-251.us-east-2.compute.internal --kubeconfig=c:\\k\\kubeconfig --cluster-cidr=10.132.8.0/24 --log-dir=C:\\var\\log\\kube-proxy\\ --logtostderr=false --network-name=OVNKubernetesHybridOverlayNetwork --source-vip=10.132.8.141 --enable-dsr=false --feature-gates=IPv6DualStack=false\" depend= hybrid-overlay-node start=auto", "out": "[SC] CreateService SUCCESS\r\n"} 2021-08-24T05:14:34.422Z DEBUG wc 10.0.159.251 run {"cmd": "sc.exe query kube-proxy", "out": "\r\nSERVICE_NAME: kube-proxy \r\n TYPE : 10 WIN32_OWN_PROCESS \r\n STATE : 1 STOPPED \r\n WIN32_EXIT_CODE : 1077 (0x435)\r\n SERVICE_EXIT_CODE : 0 (0x0)\r\n CHECKPOINT : 0x0\r\n WAIT_HINT : 0x0\r\n"} 2021-08-24T05:14:35.137Z DEBUG wc 10.0.159.251 run {"cmd": "sc.exe start kube-proxy", "out": "\r\nSERVICE_NAME: kube-proxy \r\n TYPE : 10 WIN32_OWN_PROCESS \r\n STATE : 2 START_PENDING \r\n (NOT_STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)\r\n WIN32_EXIT_CODE : 0 (0x0)\r\n SERVICE_EXIT_CODE : 0 (0x0)\r\n CHECKPOINT : 0x0\r\n WAIT_HINT : 0x7d0\r\n PID : 5104\r\n FLAGS : \r\n"} 2021-08-24T05:14:35.137Z INFO wc 10.0.159.251 configured {"service": "kube-proxy", "args": "--windows-service --v=4 --proxy-mode=kernelspace --feature-gates=WinOverlay=true --hostname-override=ip-10-0-159-251.us-east-2.compute.internal --kubeconfig=c:\\k\\kubeconfig --cluster-cidr=10.132.8.0/24 --log-dir=C:\\var\\log\\kube-proxy\\ --logtostderr=false --network-name=OVNKubernetesHybridOverlayNetwork --source-vip=10.132.8.141 --enable-dsr=false --feature-gates=IPv6DualStack=false\" depend= hybrid-overlay-node"} 2021-08-24T05:14:50.228Z INFO nc 10.0.159.251 instance has been configured as a worker node {"version": "3.0.0+af53aa0"} 2021-08-24T05:14:50.238Z DEBUG controller-runtime.manager.events Normal {"object": {"kind":"ConfigMap","namespace":"openshift-windows-machine-config-operator","name":"windows-instances","apiVersion":"v1"}, "reason": "InstanceSetup", "message": "Configured instance with address 10.0.159.251 as a worker node"} 2021-08-24T05:14:50.282Z INFO metrics Prometheus configured {"endpoints": "windows-exporter", "port": 9182, "name": "metrics"} 2021-08-24T05:14:50.299Z INFO controllers.configmap processing {"instances in": "windows-instances"} 2021-08-24T05:14:50.299Z INFO controllers.configmap instance is up to date {"node": "ip-10-0-159-251.us-east-2.compute.internal", "version": "3.0.0+af53aa0"} 2021-08-24T05:14:50.301Z DEBUG controller-runtime.manager.events Normal {"object": {"kind":"ConfigMap","namespace":"openshift-windows-machine-config-operator","name":"windows-instances","apiVersion":"v1"}, "reason": "InstanceSetup", "message": "Configured instance with address 10.0.159.251 as a worker node"} 2021-08-24T05:14:50.310Z INFO metrics Prometheus configured {"endpoints": "windows-exporter", "port": 9182, "name": "metrics"} 2021-08-24T05:15:30.906Z INFO controllers.configmap processing {"instances in": "windows-instances"} 2021-08-24T05:15:30.907Z DEBUG wc 10.0.159.251 initializing SSH connection panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x4efbf7] goroutine 1015 [running]: fmt.Fprintf(0x0, 0x0, 0x1dc7077, 0x13, 0xc004f51700, 0x2, 0x2, 0x20527e8, 0xc000571650, 0xc00432fb30) /usr/local/go/src/fmt/print.go:205 +0x77 k8s.io/kubectl/pkg/drain.(*Helper).evictPods.func1(0xc0004a4b60, 0x203c7c0, 0xc0004a87e0, 0xc001253138, 0xc001217f40, 0xc0012523a7, 0x9, 0x0, 0x0, 0x0, ...) /build/windows-machine-config-operator/vendor/k8s.io/kubectl/pkg/drain/drain.go:276 +0xd99 created by k8s.io/kubectl/pkg/drain.(*Helper).evictPods /build/windows-machine-config-operator/vendor/k8s.io/kubectl/pkg/drain/drain.go:269 +0x233 --- Additional comment from jvaldes on 2021-08-24 15:48:19 UTC --- @gaoshang, could you attach the `must-gather` ? --- Additional comment from sgao on 2021-08-25 09:47:02 UTC --- Another found that this bug only show up when only 1 byoh Windows node exist and be deleted, if we have other Windows node exist(ie. Window node created by machineset), it can not be reproduced. @jvaldes must-gather attached to http://file.apac.redhat.com/sgao/must-gather.tar.gz
This bug has been verified on OCP 4.8.0-0.nightly-2021-09-06-042819 and passed, thanks. Steps: 1, Configure only 1 BYOH Windows node in cluster and create Windows workload on it 2, Delete config map, check above panic should not exist, the Windows workload is evicted and BYOH windows node de-configured. # oc logs -f deployment.apps/windows-machine-config-operator -n openshift-windows-machine-config-operator ... 2021-09-08T10:04:36.525Z INFO controllers.configmap processing {"instances in": "windows-instances"} 2021-09-08T10:04:36.526Z DEBUG wc 10.0.48.121 initializing SSH connection 2021-09-08T10:04:46.678Z DEBUG nc 10.0.48.121 evicting pod openshift-windows-machine-config-operator/win-webserver-549cd7495d-q5762 2021-09-08T10:04:46.733Z INFO wc 10.0.48.121 deconfiguring
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Windows Container Support for Red Hat OpenShift 3.1.0 product release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3215