Bug 2001163 - WMCO panic when upgrading with BYOH Windows node
Summary: WMCO panic when upgrading with BYOH Windows node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Windows Containers
Version: 4.8
Hardware: Unspecified
OS: Unspecified
high
urgent
Target Milestone: ---
: 4.8.z
Assignee: jvaldes
QA Contact: gaoshang
URL:
Whiteboard:
Depends On: 1996615
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-09-04 00:06 UTC by OpenShift BugZilla Robot
Modified: 2021-09-21 11:11 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-09-21 11:11:45 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift windows-machine-config-operator pull 660 0 None None None 2021-09-07 03:14:08 UTC
Red Hat Product Errata RHBA-2021:3215 0 None None None 2021-09-21 11:11:52 UTC

Description OpenShift BugZilla Robot 2021-09-04 00:06:39 UTC
+++ This bug was initially created as a clone of Bug #1996615 +++

Description of problem:
After added a BYOH Windows node, I'm trying to upgrade WMCO and get panic error:
# oc logs -f deployment.apps/windows-machine-config-operator -n openshift-windows-machine-config-operator
...
2021-08-23T09:52:12.230Z	INFO	controllers.configmap	instance requires upgrade	{"node": "ip-10-0-135-226.us-east-2.compute.internal", "version": "3.0.0+af53aa0", "expected version": "3.0.0+4708e9d"}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x4efbf7]

goroutine 761 [running]:
fmt.Fprintf(0x0, 0x0, 0x1dc7078, 0x13, 0xc0035db700, 0x2, 0x2, 0x1db7d54, 0xc0035db388, 0x40d97b)
	/usr/local/go/src/fmt/print.go:205 +0x77
k8s.io/kubectl/pkg/drain.(*Helper).evictPods.func1(0xc000bd2a90, 0x203c760, 0xc000421e00, 0xc001c4f468, 0xc001c3c9b0, 0xc0017f9137, 0x9, 0x0, 0x0, 0x0, ...)
	/build/windows-machine-config-operator/vendor/k8s.io/kubectl/pkg/drain/drain.go:276 +0xd99
created by k8s.io/kubectl/pkg/drain.(*Helper).evictPods
	/build/windows-machine-config-operator/vendor/k8s.io/kubectl/pkg/drain/drain.go:269 +0x233

# oc get pod -n openshift-windows-machine-config-operator
NAME                                             READY   STATUS             RESTARTS   AGE
windows-machine-config-operator-9f56f78f-57rwx   0/1     CrashLoopBackOff   6          9m55s

Version-Release number of selected component (if applicable):
OCP version: 4.8.0-0.nightly-2021-08-22-035234
WMCO release-4.8 branch commit 4708e9d0ba70ac04727217f8e64bea9cb49bb5f0

How reproducible:
Always

Steps to Reproduce:
1, Build WMCO locally with an older commit in release-4.8 branch, e.g. WMCO v3.1.0
2, Bootstrap a BYOH Windows node
3, Build WMCO locally with latest commit in release-4.8 branch, e.g. WMCO v3.1.1
4, Uninstall WMCO v3.1.0 and change catalogsource to v3.1.1 and install WMCO v3.1.1

Actual results:
WMCO panic with BYOH Windows node

Expected results:
WMCO should not panic and reconfigure BYOH Windows node

Additional info:

--- Additional comment from sgao on 2021-08-24 05:24:15 UTC ---

Found this bug is not related to upgrade, if I deconfigure a BYOH Windows instance, also got panic on OCP 4.8

Full log:

# oc logs -f deployment.apps/windows-machine-config-operator -n openshift-windows-machine-config-operator
2021-08-24T05:02:17.399Z	INFO	version	operator	{"version": "3.0.0+af53aa0"}
2021-08-24T05:02:17.399Z	INFO	version	go	{"version": "go1.16 linux/amd64"}
2021-08-24T05:02:17.427Z	INFO	leader	Trying to become the leader.
I0824 05:02:18.478106       1 request.go:668] Waited for 1.044207723s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/user.openshift.io/v1?timeout=32s
2021-08-24T05:02:20.283Z	DEBUG	leader	Found podname	{"Pod.Name": "windows-machine-config-operator-85ffc8d564-tdr7n"}
2021-08-24T05:02:20.288Z	DEBUG	leader	Found Pod	{"Pod.Namespace": "openshift-windows-machine-config-operator", "Pod.Name": "windows-machine-config-operator-85ffc8d564-tdr7n"}
2021-08-24T05:02:20.291Z	INFO	leader	No pre-existing lock was found.
2021-08-24T05:02:20.296Z	INFO	leader	Became the leader.
2021-08-24T05:02:23.157Z	INFO	controller-runtime.metrics	metrics server is starting to listen	{"addr": "0.0.0.0:9182"}
2021-08-24T05:02:23.159Z	ERROR	controller.secret	Unable to retrieve private key, please ensure it is created	{"error": "the cache is not started, can not read objects"}
main.main
	/build/windows-machine-config-operator/main.go:176
runtime.main
	/usr/local/go/src/runtime/proc.go:225
2021-08-24T05:02:23.198Z	INFO	setup	starting manager
2021-08-24T05:02:23.499Z	INFO	controller-runtime.manager	starting metrics server	{"path": "/metrics"}
2021-08-24T05:02:23.499Z	INFO	controller-runtime.manager.controller.secret	Starting EventSource	{"reconciler group": "", "reconciler kind": "Secret", "source": "kind source: /, Kind="}
2021-08-24T05:02:23.500Z	INFO	controller-runtime.manager.controller.secret	Starting EventSource	{"reconciler group": "", "reconciler kind": "Secret", "source": "kind source: /, Kind="}
2021-08-24T05:02:23.500Z	INFO	controller-runtime.manager.controller.secret	Starting Controller	{"reconciler group": "", "reconciler kind": "Secret"}
2021-08-24T05:02:23.500Z	INFO	controller-runtime.manager.controller.configmap	Starting EventSource	{"reconciler group": "", "reconciler kind": "ConfigMap", "source": "kind source: /, Kind="}
2021-08-24T05:02:23.500Z	INFO	controller-runtime.manager.controller.configmap	Starting EventSource	{"reconciler group": "", "reconciler kind": "ConfigMap", "source": "kind source: /, Kind="}
2021-08-24T05:02:23.500Z	INFO	controller-runtime.manager.controller.configmap	Starting Controller	{"reconciler group": "", "reconciler kind": "ConfigMap"}
2021-08-24T05:02:23.500Z	INFO	controller-runtime.manager.controller.machine	Starting EventSource	{"reconciler group": "machine.openshift.io", "reconciler kind": "Machine", "source": "kind source: /, Kind="}
2021-08-24T05:02:23.500Z	INFO	controller-runtime.manager.controller.machine	Starting EventSource	{"reconciler group": "machine.openshift.io", "reconciler kind": "Machine", "source": "kind source: /, Kind="}
2021-08-24T05:02:23.500Z	INFO	controller-runtime.manager.controller.machine	Starting Controller	{"reconciler group": "machine.openshift.io", "reconciler kind": "Machine"}
2021-08-24T05:02:23.500Z	INFO	controller-runtime.manager.controller.certificatesigningrequest	Starting EventSource	{"reconciler group": "certificates.k8s.io", "reconciler kind": "CertificateSigningRequest", "source": "kind source: /, Kind="}
2021-08-24T05:02:23.501Z	INFO	controller-runtime.manager.controller.certificatesigningrequest	Starting Controller	{"reconciler group": "certificates.k8s.io", "reconciler kind": "CertificateSigningRequest"}
2021-08-24T05:02:23.801Z	INFO	controller-runtime.manager.controller.configmap	Starting workers	{"reconciler group": "", "reconciler kind": "ConfigMap", "worker count": 1}
2021-08-24T05:02:23.801Z	INFO	controllers.configmap	processing	{"instances in": "windows-instances"}
2021-08-24T05:02:23.801Z	INFO	controller-runtime.manager.controller.secret	Starting workers	{"reconciler group": "", "reconciler kind": "Secret", "worker count": 1}
2021-08-24T05:02:23.801Z	INFO	controller-runtime.manager.controller.machine	Starting workers	{"reconciler group": "machine.openshift.io", "reconciler kind": "Machine", "worker count": 1}
2021-08-24T05:02:23.802Z	INFO	controller-runtime.manager.controller.certificatesigningrequest	Starting workers	{"reconciler group": "certificates.k8s.io", "reconciler kind": "CertificateSigningRequest", "worker count": 1}
2021-08-24T05:02:23.814Z	INFO	metrics	Prometheus configured	{"endpoints": "windows-exporter", "port": 9182, "name": "metrics"}
2021-08-24T05:02:51.992Z	DEBUG	controller.windowsmachine	Machine has no phase associated with it	{"name": "sgao-8-5hc8g-windows-byoh-us-east-2a-w6ddm"}
2021-08-24T05:02:52.048Z	DEBUG	controller.windowsmachine	Machine has no phase associated with it	{"name": "sgao-8-5hc8g-windows-byoh-us-east-2a-w6ddm"}
2021-08-24T05:02:52.162Z	DEBUG	controller.windowsmachine	invalid Machine	{"name": "sgao-8-5hc8g-windows-byoh-us-east-2a-w6ddm", "error": "no IP addresses defined", "errorVerbose": "no IP addresses defined\ngithub.com/openshift/windows-machine-config-operator/controllers.getInternalIPAddress\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:487\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).isValidMachine\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:193\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).SetupWithManager.func2\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:107\nsigs.k8s.io/controller-runtime/pkg/predicate.Funcs.Update\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/predicate/predicate.go:87\nsigs.k8s.io/controller-runtime/pkg/source/internal.EventHandler.OnUpdate\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/source/internal/eventsource.go:88\nk8s.io/client-go/tools/cache.(*processorListener).run.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:775\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90\nk8s.io/client-go/tools/cache.(*processorListener).run\n\t/build/windows-machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:771\nk8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:73\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"}
2021-08-24T05:02:54.163Z	DEBUG	controller.windowsmachine	invalid Machine	{"name": "sgao-8-5hc8g-windows-byoh-us-east-2a-w6ddm", "error": "no IP addresses defined", "errorVerbose": "no IP addresses defined\ngithub.com/openshift/windows-machine-config-operator/controllers.getInternalIPAddress\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:487\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).isValidMachine\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:193\ngithub.com/openshift/windows-machine-config-operator/controllers.(*WindowsMachineReconciler).SetupWithManager.func2\n\t/build/windows-machine-config-operator/controllers/windowsmachine_controller.go:107\nsigs.k8s.io/controller-runtime/pkg/predicate.Funcs.Update\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/predicate/predicate.go:87\nsigs.k8s.io/controller-runtime/pkg/source/internal.EventHandler.OnUpdate\n\t/build/windows-machine-config-operator/vendor/sigs.k8s.io/controller-runtime/pkg/source/internal/eventsource.go:88\nk8s.io/client-go/tools/cache.(*processorListener).run.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:775\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90\nk8s.io/client-go/tools/cache.(*processorListener).run\n\t/build/windows-machine-config-operator/vendor/k8s.io/client-go/tools/cache/shared_informer.go:771\nk8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1\n\t/build/windows-machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:73\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"}
2021-08-24T05:05:18.652Z	INFO	controllers.configmap	processing	{"instances in": "windows-instances"}
2021-08-24T05:05:18.665Z	INFO	metrics	Prometheus configured	{"endpoints": "windows-exporter", "port": 9182, "name": "metrics"}
2021-08-24T05:05:23.246Z	INFO	controllers.configmap	processing	{"instances in": "windows-instances"}
2021-08-24T05:05:23.246Z	DEBUG	wc 10.0.159.251	initializing SSH connection
2021-08-24T05:07:34.780Z	DEBUG	wc 10.0.159.251	SSH dial	{"IP Address": "10.0.159.251", "error": "dial tcp 10.0.159.251:22: connect: connection timed out"}
2021-08-24T05:09:36.620Z	INFO	wc 10.0.159.251	configuring
2021-08-24T05:09:36.755Z	DEBUG	wc 10.0.159.251	run	{"cmd": "if not exist C:\\k\\ mkdir C:\\k\\", "out": ""}
2021-08-24T05:09:36.770Z	DEBUG	wc 10.0.159.251	run	{"cmd": "if not exist C:\\Temp\\ mkdir C:\\Temp\\", "out": ""}
2021-08-24T05:09:36.785Z	DEBUG	wc 10.0.159.251	run	{"cmd": "if not exist C:\\k\\cni\\ mkdir C:\\k\\cni\\", "out": ""}
2021-08-24T05:09:36.800Z	DEBUG	wc 10.0.159.251	run	{"cmd": "if not exist C:\\k\\cni\\config\\ mkdir C:\\k\\cni\\config\\", "out": ""}
2021-08-24T05:09:36.818Z	DEBUG	wc 10.0.159.251	run	{"cmd": "if not exist C:\\var\\log\\ mkdir C:\\var\\log\\", "out": ""}
2021-08-24T05:09:36.833Z	DEBUG	wc 10.0.159.251	run	{"cmd": "if not exist C:\\var\\log\\kube-proxy\\ mkdir C:\\var\\log\\kube-proxy\\", "out": ""}
2021-08-24T05:09:36.848Z	DEBUG	wc 10.0.159.251	run	{"cmd": "if not exist C:\\var\\log\\hybrid-overlay\\ mkdir C:\\var\\log\\hybrid-overlay\\", "out": ""}
2021-08-24T05:09:36.848Z	INFO	wc 10.0.159.251	transferring files
2021-08-24T05:09:39.976Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\\\hybrid-overlay-node.exe", "out": "False\r\n"}
2021-08-24T05:09:39.976Z	DEBUG	wc 10.0.159.251	copy	{"local file": "/payload/hybrid-overlay-node.exe", "remote dir": "C:\\k\\"}
2021-08-24T05:09:46.984Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\\\kube-proxy.exe", "out": "False\r\n"}
2021-08-24T05:09:46.984Z	DEBUG	wc 10.0.159.251	copy	{"local file": "/payload//kube-node/kube-proxy.exe", "remote dir": "C:\\k\\"}
2021-08-24T05:09:53.883Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\\\wmcb.exe", "out": "False\r\n"}
2021-08-24T05:09:53.883Z	DEBUG	wc 10.0.159.251	copy	{"local file": "/payload/wmcb.exe", "remote dir": "C:\\k\\"}
2021-08-24T05:09:57.469Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\cni\\\\flannel.exe", "out": "False\r\n"}
2021-08-24T05:09:57.469Z	DEBUG	wc 10.0.159.251	copy	{"local file": "/payload//cni/flannel.exe", "remote dir": "C:\\k\\cni\\"}
2021-08-24T05:09:59.975Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\Temp\\\\hns.psm1", "out": "False\r\n"}
2021-08-24T05:09:59.975Z	DEBUG	wc 10.0.159.251	copy	{"local file": "/payload//powershell/hns.psm1", "remote dir": "C:\\Temp\\"}
2021-08-24T05:10:01.769Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\cni\\\\host-local.exe", "out": "False\r\n"}
2021-08-24T05:10:01.769Z	DEBUG	wc 10.0.159.251	copy	{"local file": "/payload//cni/host-local.exe", "remote dir": "C:\\k\\cni\\"}
2021-08-24T05:10:03.939Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\cni\\\\win-overlay.exe", "out": "False\r\n"}
2021-08-24T05:10:03.939Z	DEBUG	wc 10.0.159.251	copy	{"local file": "/payload//cni/win-overlay.exe", "remote dir": "C:\\k\\cni\\"}
2021-08-24T05:10:06.409Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\Temp\\\\wget-ignore-cert.ps1", "out": "False\r\n"}
2021-08-24T05:10:06.409Z	DEBUG	wc 10.0.159.251	copy	{"local file": "/payload//powershell/wget-ignore-cert.ps1", "remote dir": "C:\\Temp\\"}
2021-08-24T05:10:08.290Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\\\windows_exporter.exe", "out": "False\r\n"}
2021-08-24T05:10:08.290Z	DEBUG	wc 10.0.159.251	copy	{"local file": "/payload/windows_exporter.exe", "remote dir": "C:\\k\\"}
2021-08-24T05:10:11.978Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\cni\\\\win-bridge.exe", "out": "False\r\n"}
2021-08-24T05:10:11.978Z	DEBUG	wc 10.0.159.251	copy	{"local file": "/payload//cni/win-bridge.exe", "remote dir": "C:\\k\\cni\\"}
2021-08-24T05:10:14.560Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\\\kubelet.exe", "out": "False\r\n"}
2021-08-24T05:10:14.560Z	DEBUG	wc 10.0.159.251	copy	{"local file": "/payload//kube-node/kubelet.exe", "remote dir": "C:\\k\\"}
2021-08-24T05:10:26.089Z	DEBUG	wc 10.0.159.251	run	{"cmd": "sc.exe create windows_exporter binPath=\"C:\\k\\windows_exporter.exe --collectors.enabled cpu,cs,logical_disk,net,os,service,system,textfile,container,memory,cpu_info\" start=auto", "out": "[SC] CreateService SUCCESS\r\n"}
2021-08-24T05:10:26.127Z	DEBUG	wc 10.0.159.251	run	{"cmd": "sc.exe query windows_exporter", "out": "\r\nSERVICE_NAME: windows_exporter \r\n        TYPE               : 10  WIN32_OWN_PROCESS  \r\n        STATE              : 1  STOPPED \r\n        WIN32_EXIT_CODE    : 1077  (0x435)\r\n        SERVICE_EXIT_CODE  : 0  (0x0)\r\n        CHECKPOINT         : 0x0\r\n        WAIT_HINT          : 0x0\r\n"}
2021-08-24T05:10:46.231Z	DEBUG	wc 10.0.159.251	run	{"cmd": "sc.exe start windows_exporter", "out": "\r\nSERVICE_NAME: windows_exporter \r\n        TYPE               : 10  WIN32_OWN_PROCESS  \r\n        STATE              : 2  START_PENDING \r\n                                (NOT_STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)\r\n        WIN32_EXIT_CODE    : 0  (0x0)\r\n        SERVICE_EXIT_CODE  : 0  (0x0)\r\n        CHECKPOINT         : 0x0\r\n        WAIT_HINT          : 0x7d0\r\n        PID                : 3260\r\n        FLAGS              : \r\n"}
2021-08-24T05:10:46.231Z	INFO	wc 10.0.159.251	configured	{"service": "windows_exporter", "args": "--collectors.enabled cpu,cs,logical_disk,net,os,service,system,textfile,container,memory,cpu_info\""}
2021-08-24T05:10:48.949Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass C:\\Temp\\wget-ignore-cert.ps1 -server https://api-int.sgao-8.qe.devcluster.openshift.com:22623/config/worker -output C:\\Windows\\Temp\\worker.ign -acceptHeader application/vnd.coreos.ignition+json`;version=3.1.0", "out": ""}
2021-08-24T05:10:52.138Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass C:\\k\\\\wmcb.exe initialize-kubelet --ignition-file C:\\Windows\\Temp\\worker.ign --kubelet-path C:\\k\\kubelet.exe", "out": "Bootstrapping completed successfully"}
2021-08-24T05:10:52.138Z	INFO	wc 10.0.159.251	configured kubelet	{"cmd": "C:\\k\\\\wmcb.exe initialize-kubelet --ignition-file C:\\Windows\\Temp\\worker.ign --kubelet-path C:\\k\\kubelet.exe", "output": "Bootstrapping completed successfully"}
2021-08-24T05:10:52.493Z	INFO	controllers.CertificateSigningRequests	CSR approved	{"CSR": "csr-4gtl7"}
2021-08-24T05:10:55.454Z	INFO	controllers.CertificateSigningRequests	CSR approved	{"CSR": "csr-c5vk4"}
2021-08-24T05:11:22.181Z	INFO	wc 10.0.159.251	configure	{"service": "hybrid-overlay-node", "args": "--node ip-10-0-159-251.us-east-2.compute.internal --k8s-kubeconfig c:\\k\\kubeconfig --windows-service --logfile C:\\var\\log\\hybrid-overlay\\hybrid-overlay.log\" depend= kubelet"}
2021-08-24T05:11:22.237Z	DEBUG	wc 10.0.159.251	run	{"cmd": "sc.exe create hybrid-overlay-node binPath=\"C:\\k\\hybrid-overlay-node.exe --node ip-10-0-159-251.us-east-2.compute.internal --k8s-kubeconfig c:\\k\\kubeconfig --windows-service --logfile C:\\var\\log\\hybrid-overlay\\hybrid-overlay.log\" depend= kubelet start=auto", "out": "[SC] CreateService SUCCESS\r\n"}
2021-08-24T05:11:22.263Z	DEBUG	wc 10.0.159.251	run	{"cmd": "sc.exe query hybrid-overlay-node", "out": "\r\nSERVICE_NAME: hybrid-overlay-node \r\n        TYPE               : 10  WIN32_OWN_PROCESS  \r\n        STATE              : 1  STOPPED \r\n        WIN32_EXIT_CODE    : 1077  (0x435)\r\n        SERVICE_EXIT_CODE  : 0  (0x0)\r\n        CHECKPOINT         : 0x0\r\n        WAIT_HINT          : 0x0\r\n"}
2021-08-24T05:11:22.739Z	DEBUG	wc 10.0.159.251	run	{"cmd": "sc.exe start hybrid-overlay-node", "out": "\r\nSERVICE_NAME: hybrid-overlay-node \r\n        TYPE               : 10  WIN32_OWN_PROCESS  \r\n        STATE              : 2  START_PENDING \r\n                                (NOT_STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)\r\n        WIN32_EXIT_CODE    : 0  (0x0)\r\n        SERVICE_EXIT_CODE  : 0  (0x0)\r\n        CHECKPOINT         : 0x0\r\n        WAIT_HINT          : 0x7d0\r\n        PID                : 1016\r\n        FLAGS              : \r\n"}
2021-08-24T05:11:22.783Z	DEBUG	wc 10.0.159.251	run	{"cmd": "sc.exe query hybrid-overlay-node", "out": "\r\nSERVICE_NAME: hybrid-overlay-node \r\n        TYPE               : 10  WIN32_OWN_PROCESS  \r\n        STATE              : 4  RUNNING \r\n                                (STOPPABLE, NOT_PAUSABLE, ACCEPTS_SHUTDOWN)\r\n        WIN32_EXIT_CODE    : 0  (0x0)\r\n        SERVICE_EXIT_CODE  : 0  (0x0)\r\n        CHECKPOINT         : 0x0\r\n        WAIT_HINT          : 0x0\r\n"}
2021-08-24T05:13:26.027Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Get-HnsNetwork", "out": "\r\n\r\nActivityId             : 3F65AADE-E980-44A7-9512-4E0B7C6EB812\r\nAdditionalParams       : \r\nCurrentEndpointCount   : 0\r\nExtensions             : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False; Name=Microsoft Windows Filtering \r\n                         Platform}, @{Id=E9B59CFA-2BE1-4B21-828F-B6FBDBDDC017; IsEnabled=False; Name=Microsoft Azure \r\n                         VFP Switch Extension}, @{Id=EA24CD6C-D17A-4348-9190-09F0D5BE83DD; IsEnabled=True; \r\n                         Name=Microsoft NDIS Capture}}\r\nFlags                  : 0\r\nHealth                 : @{AddressNotificationMissedCount=0; AddressNotificationSequenceNumber=0; \r\n                         InterfaceNotificationMissedCount=0; InterfaceNotificationSequenceNumber=0; LastErrorCode=0; \r\n                         LastUpdateTime=132742550242152330; RouteNotificationMissedCount=0; \r\n                         RouteNotificationSequenceNumber=0}\r\nID                     : E06E62E8-C35E-462A-870D-8C0BECE58CC7\r\nIPv6                   : False\r\nLayeredOn              : 4FFEC140-8139-4482-A787-AAC40F2FB5ED\r\nMacPools               : {@{EndMacAddress=00-15-5D-9B-1F-FF; StartMacAddress=00-15-5D-9B-10-00}}\r\nMaxConcurrentEndpoints : 0\r\nName                   : nat\r\nNatName                : ICSA21632B3-526B-4199-A32D-B87EDCEF1DC3\r\nPolicies               : {}\r\nResources              : @{AdditionalParams=; AllocationOrder=2; Allocators=System.Object[]; Health=; \r\n                         ID=3F65AADE-E980-44A7-9512-4E0B7C6EB812; PortOperationTime=0; State=1; SwitchOperationTime=0; \r\n                         VfpOperationTime=0; parentId=D7848AFB-E3F6-4302-8FB3-02AF4102B8FD}\r\nState                  : 1\r\nSubnets                : {@{AdditionalParams=; AddressPrefix=172.24.192.0/20; GatewayAddress=172.24.192.1; Health=; \r\n                         ID=14DFA1DA-8F7B-4095-9484-EF6287E51ABB; Policies=System.Object[]; State=0}}\r\nTotalEndpoints         : 0\r\nType                   : nat\r\nVersion                : 38654705669\r\n\r\nActivityId             : EE77D950-8550-49D6-B4EB-E2EC090643A3\r\nAdditionalParams       : \r\nCurrentEndpointCount   : 0\r\nDNSServerCompartment   : 3\r\nDrMacAddress           : 00-15-5D-BD-1B-0D\r\nExtensions             : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False; Name=Microsoft Windows Filtering \r\n                         Platform}, @{Id=E9B59CFA-2BE1-4B21-828F-B6FBDBDDC017; IsEnabled=True; Name=Microsoft Azure \r\n                         VFP Switch Extension}, @{Id=EA24CD6C-D17A-4348-9190-09F0D5BE83DD; IsEnabled=True; \r\n                         Name=Microsoft NDIS Capture}}\r\nFlags                  : 0\r\nHealth                 : @{LastErrorCode=0; LastUpdateTime=132742554829538077}\r\nID                     : 74184937-92DE-405E-B968-7F184CF93F13\r\nIPv6                   : False\r\nInterfaceConstraint    : @{InterfaceGuid=00000000-0000-0000-0000-000000000000}\r\nLayeredOn              : FE8BDB4A-FD4A-4245-B7BB-7D1DEC18669F\r\nMacPools               : {@{EndMacAddress=00-15-5D-DD-BF-FF; StartMacAddress=00-15-5D-DD-B0-00}}\r\nManagementIP           : 10.0.159.251\r\nMaxConcurrentEndpoints : 0\r\nName                   : BaseOVNKubernetesHybridOverlayNetwork\r\nPolicies               : {}\r\nResources              : @{AdditionalParams=; AllocationOrder=1; Allocators=System.Object[]; Health=; \r\n                         ID=EE77D950-8550-49D6-B4EB-E2EC090643A3; PortOperationTime=0; State=1; SwitchOperationTime=0; \r\n                         VfpOperationTime=0; parentId=70258D96-A17E-4E4C-9727-42887EACA4A4}\r\nState                  : 1\r\nSubnets                : {@{AdditionalParams=; AddressPrefix=100.64.0.0/30; GatewayAddress=100.64.0.2; Health=; \r\n                         ID=239CF3FE-1CE2-402F-9AB7-16971E29F09D; ObjectType=5; Policies=System.Object[]; State=0}}\r\nTotalEndpoints         : 0\r\nType                   : Overlay\r\nVersion                : 38654705669\r\n\r\nActivityId             : 9D121761-9A0E-472B-9B75-238EEB516B52\r\nAdditionalParams       : \r\nAutomaticDNS           : True\r\nCurrentEndpointCount   : 0\r\nDNSServerCompartment   : 4\r\nDrMacAddress           : 00-15-5D-BD-1B-0D\r\nExtensions             : {@{Id=E7C3B2F0-F3C5-48DF-AF2B-10FED6D72E7A; IsEnabled=False; Name=Microsoft Windows Filtering \r\n                         Platform}, @{Id=E9B59CFA-2BE1-4B21-828F-B6FBDBDDC017; IsEnabled=True; Name=Microsoft Azure \r\n                         VFP Switch Extension}, @{Id=EA24CD6C-D17A-4348-9190-09F0D5BE83DD; IsEnabled=True; \r\n                         Name=Microsoft NDIS Capture}}\r\nFlags                  : 8\r\nHealth                 : @{LastErrorCode=0; LastUpdateTime=132742554970525883}\r\nID                     : 2574A48D-FC38-4C5C-AFFC-837C0196432F\r\nIPv6                   : False\r\nInterfaceConstraint    : @{InterfaceGuid=00000000-0000-0000-0000-000000000000}\r\nLayeredOn              : FE8BDB4A-FD4A-4245-B7BB-7D1DEC18669F\r\nMacPools               : {@{EndMacAddress=00-15-5D-D5-7F-FF; StartMacAddress=00-15-5D-D5-70-00}}\r\nManagementIP           : 10.0.159.251\r\nMaxConcurrentEndpoints : 0\r\nName                   : OVNKubernetesHybridOverlayNetwork\r\nPolicies               : {@{Type=HostRoute}, @{DestinationPrefix=10.131.0.0/23; \r\n                         DistributedRouterMacAddress=0a:58:0a:83:00:03; IsolationId=4097; \r\n                         ProviderAddress=10.0.153.243; Type=RemoteSubnetRoute}, @{DestinationPrefix=10.128.0.0/23; \r\n                         DistributedRouterMacAddress=0a:58:0a:80:00:03; IsolationId=4097; \r\n                         ProviderAddress=10.0.154.199; Type=RemoteSubnetRoute}, @{DestinationPrefix=10.129.2.0/23; \r\n                         DistributedRouterMacAddress=0a:58:0a:81:02:03; IsolationId=4097; \r\n                         ProviderAddress=10.0.171.166; Type=RemoteSubnetRoute}...}\r\nResources              : @{AdditionalParams=; AllocationOrder=1; Allocators=System.Object[]; Health=; \r\n                         ID=9D121761-9A0E-472B-9B75-238EEB516B52; PortOperationTime=0; State=1; SwitchOperationTime=0; \r\n                         VfpOperationTime=0; parentId=70258D96-A17E-4E4C-9727-42887EACA4A4}\r\nState                  : 1\r\nSubnets                : {@{AdditionalParams=; AddressPrefix=10.132.8.0/24; GatewayAddress=10.132.8.1; Health=; \r\n                         ID=5FE59656-544F-427F-85A2-1F970ABDC085; ObjectType=5; Policies=System.Object[]; State=0}}\r\nTotalEndpoints         : 0\r\nType                   : Overlay\r\nVersion                : 38654705669\r\n\r\n\r\n\r\n"}
2021-08-24T05:13:26.027Z	INFO	wc 10.0.159.251	configured	{"service": "hybrid-overlay-node", "args": "--node ip-10-0-159-251.us-east-2.compute.internal --k8s-kubeconfig c:\\k\\kubeconfig --windows-service --logfile C:\\var\\log\\hybrid-overlay\\hybrid-overlay.log\" depend= kubelet"}
2021-08-24T05:13:43.196Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass Test-Path C:\\k\\cni\\config\\\\cni.conf", "out": "False\r\n"}
2021-08-24T05:13:43.197Z	DEBUG	wc 10.0.159.251	copy	{"local file": "/tmp/cni015561117/cni.conf", "remote dir": "C:\\k\\cni\\config\\"}
2021-08-24T05:14:16.480Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass C:\\k\\wmcb.exe configure-cni --cni-dir=\"C:\\k\\cni\\ --cni-config=\"C:\\k\\cni\\config\\cni.conf", "out": "CNI configuration completed successfully"}
2021-08-24T05:14:16.481Z	INFO	wc 10.0.159.251	configured kubelet for CNI	{"cmd": "C:\\k\\wmcb.exe configure-cni --cni-dir=\"C:\\k\\cni\\ --cni-config=\"C:\\k\\cni\\config\\cni.conf", "output": "CNI configuration completed successfully"}


2021-08-24T05:14:34.285Z	DEBUG	wc 10.0.159.251	run	{"cmd": "powershell.exe -NonInteractive -ExecutionPolicy Bypass \"Import-Module -DisableNameChecking C:\\Temp\\hns.psm1; $net = (Get-HnsNetwork | where { $_.Name -eq 'OVNKubernetesHybridOverlayNetwork' }); $endpoint = New-HnsEndpoint -NetworkId $net.ID -Name VIPEndpoint; Attach-HNSHostEndpoint -EndpointID $endpoint.ID -CompartmentID 1; (Get-NetIPConfiguration -AllCompartments -All -Detailed | where { $_.NetAdapter.LinkLayerAddress -eq $endpoint.MacAddress }).IPV4Address.IPAddress.Trim()\"", "out": "10.132.8.141\r\n"}
2021-08-24T05:14:34.377Z	DEBUG	wc 10.0.159.251	run	{"cmd": "sc.exe create kube-proxy binPath=\"C:\\k\\kube-proxy.exe --windows-service --v=4 --proxy-mode=kernelspace --feature-gates=WinOverlay=true --hostname-override=ip-10-0-159-251.us-east-2.compute.internal --kubeconfig=c:\\k\\kubeconfig --cluster-cidr=10.132.8.0/24 --log-dir=C:\\var\\log\\kube-proxy\\ --logtostderr=false --network-name=OVNKubernetesHybridOverlayNetwork --source-vip=10.132.8.141 --enable-dsr=false --feature-gates=IPv6DualStack=false\" depend= hybrid-overlay-node start=auto", "out": "[SC] CreateService SUCCESS\r\n"}
2021-08-24T05:14:34.422Z	DEBUG	wc 10.0.159.251	run	{"cmd": "sc.exe query kube-proxy", "out": "\r\nSERVICE_NAME: kube-proxy \r\n        TYPE               : 10  WIN32_OWN_PROCESS  \r\n        STATE              : 1  STOPPED \r\n        WIN32_EXIT_CODE    : 1077  (0x435)\r\n        SERVICE_EXIT_CODE  : 0  (0x0)\r\n        CHECKPOINT         : 0x0\r\n        WAIT_HINT          : 0x0\r\n"}
2021-08-24T05:14:35.137Z	DEBUG	wc 10.0.159.251	run	{"cmd": "sc.exe start kube-proxy", "out": "\r\nSERVICE_NAME: kube-proxy \r\n        TYPE               : 10  WIN32_OWN_PROCESS  \r\n        STATE              : 2  START_PENDING \r\n                                (NOT_STOPPABLE, NOT_PAUSABLE, IGNORES_SHUTDOWN)\r\n        WIN32_EXIT_CODE    : 0  (0x0)\r\n        SERVICE_EXIT_CODE  : 0  (0x0)\r\n        CHECKPOINT         : 0x0\r\n        WAIT_HINT          : 0x7d0\r\n        PID                : 5104\r\n        FLAGS              : \r\n"}
2021-08-24T05:14:35.137Z	INFO	wc 10.0.159.251	configured	{"service": "kube-proxy", "args": "--windows-service --v=4 --proxy-mode=kernelspace --feature-gates=WinOverlay=true --hostname-override=ip-10-0-159-251.us-east-2.compute.internal --kubeconfig=c:\\k\\kubeconfig --cluster-cidr=10.132.8.0/24 --log-dir=C:\\var\\log\\kube-proxy\\ --logtostderr=false --network-name=OVNKubernetesHybridOverlayNetwork --source-vip=10.132.8.141 --enable-dsr=false --feature-gates=IPv6DualStack=false\" depend= hybrid-overlay-node"}
2021-08-24T05:14:50.228Z	INFO	nc 10.0.159.251	instance has been configured as a worker node	{"version": "3.0.0+af53aa0"}
2021-08-24T05:14:50.238Z	DEBUG	controller-runtime.manager.events	Normal	{"object": {"kind":"ConfigMap","namespace":"openshift-windows-machine-config-operator","name":"windows-instances","apiVersion":"v1"}, "reason": "InstanceSetup", "message": "Configured instance with address 10.0.159.251 as a worker node"}
2021-08-24T05:14:50.282Z	INFO	metrics	Prometheus configured	{"endpoints": "windows-exporter", "port": 9182, "name": "metrics"}
2021-08-24T05:14:50.299Z	INFO	controllers.configmap	processing	{"instances in": "windows-instances"}
2021-08-24T05:14:50.299Z	INFO	controllers.configmap	instance is up to date	{"node": "ip-10-0-159-251.us-east-2.compute.internal", "version": "3.0.0+af53aa0"}
2021-08-24T05:14:50.301Z	DEBUG	controller-runtime.manager.events	Normal	{"object": {"kind":"ConfigMap","namespace":"openshift-windows-machine-config-operator","name":"windows-instances","apiVersion":"v1"}, "reason": "InstanceSetup", "message": "Configured instance with address 10.0.159.251 as a worker node"}
2021-08-24T05:14:50.310Z	INFO	metrics	Prometheus configured	{"endpoints": "windows-exporter", "port": 9182, "name": "metrics"}
2021-08-24T05:15:30.906Z	INFO	controllers.configmap	processing	{"instances in": "windows-instances"}
2021-08-24T05:15:30.907Z	DEBUG	wc 10.0.159.251	initializing SSH connection
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x4efbf7]

goroutine 1015 [running]:
fmt.Fprintf(0x0, 0x0, 0x1dc7077, 0x13, 0xc004f51700, 0x2, 0x2, 0x20527e8, 0xc000571650, 0xc00432fb30)
	/usr/local/go/src/fmt/print.go:205 +0x77
k8s.io/kubectl/pkg/drain.(*Helper).evictPods.func1(0xc0004a4b60, 0x203c7c0, 0xc0004a87e0, 0xc001253138, 0xc001217f40, 0xc0012523a7, 0x9, 0x0, 0x0, 0x0, ...)
	/build/windows-machine-config-operator/vendor/k8s.io/kubectl/pkg/drain/drain.go:276 +0xd99
created by k8s.io/kubectl/pkg/drain.(*Helper).evictPods
	/build/windows-machine-config-operator/vendor/k8s.io/kubectl/pkg/drain/drain.go:269 +0x233

--- Additional comment from jvaldes on 2021-08-24 15:48:19 UTC ---

@gaoshang, could you attach the `must-gather` ?

--- Additional comment from sgao on 2021-08-25 09:47:02 UTC ---

Another found that this bug only show up when only 1 byoh Windows node exist and be deleted, if we have other Windows node exist(ie. Window node created by machineset), it can not be reproduced.

@jvaldes must-gather attached to http://file.apac.redhat.com/sgao/must-gather.tar.gz

Comment 2 gaoshang 2021-09-08 10:10:36 UTC
This bug has been verified on OCP 4.8.0-0.nightly-2021-09-06-042819 and passed, thanks.

Steps:

1, Configure only 1 BYOH Windows node in cluster and create Windows workload on it
2, Delete config map, check above panic should not exist, the Windows workload is evicted and BYOH windows node de-configured. 

# oc logs -f deployment.apps/windows-machine-config-operator -n openshift-windows-machine-config-operator
...
2021-09-08T10:04:36.525Z	INFO	controllers.configmap	processing	{"instances in": "windows-instances"}
2021-09-08T10:04:36.526Z	DEBUG	wc 10.0.48.121	initializing SSH connection
2021-09-08T10:04:46.678Z	DEBUG	nc 10.0.48.121	evicting pod openshift-windows-machine-config-operator/win-webserver-549cd7495d-q5762
2021-09-08T10:04:46.733Z	INFO	wc 10.0.48.121	deconfiguring

Comment 4 errata-xmlrpc 2021-09-21 11:11:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Windows Container Support for Red Hat OpenShift 3.1.0 product release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3215


Note You need to log in before you can comment on or make changes to this bug.