Description of problem:
WMCO is unable to configure Windows nodes: the ignition file download fails with a (400) Bad Request error. This is blocking our CI.

Local run output:
2020-07-06T15:12:46.560Z DEBUG windows ignition file download {"cmd": "C:\\Temp\\wget-ignore-cert.ps1 -server https://api-int.ravig126.devcluster.openshift.com:22623/config/worker -output C:\\Windows\\Temp\\worker.ign", "output": "wget : The remote server returned an error: (400) Bad Request.\r\nAt C:\\Temp\\wget-ignore-cert.ps1:31 char:9\r\n+ $null | wget $server -o $output > $null\r\n+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\n + CategoryInfo : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-WebRequest], WebExc \r\n eption\r\n + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand\r\n \r\n"}
2020-07-06T15:12:47.829Z DEBUG windows output from wmcb {"output": ""}
2020-07-06T15:12:47.829Z DEBUG controller_wmc destroying the Windows VM {"ID": "i-0ce63d14fb481b9af"}
2020-07-06T15:14:03.624Z INFO controller_wmc Windows worker has been removed from the cluster {"ID": "i-0ce63d14fb481b9af"}
2020-07-06T15:14:03.624Z ERROR controller_wmc error adding a Windows worker node {"error": "VMConfigurationFailure: failed to configure Windows VM: configuring the Windows VM failed: error running bootstrapper: Process exited with status 1"}
github.com/go-logr/zapr.(*zapLogger).Error
    /home/ravig/go/src/pkg/mod/github.com/go-logr/zapr.1/zapr.go:128
github.com/openshift/windows-machine-config-operator/pkg/controller/windowsmachineconfig.(*ReconcileWindowsMachineConfig).addWorkerNodes
    windows-machine-config-operator/pkg/controller/windowsmachineconfig/windowsmachineconfig_controller.go:358
github.com/openshift/windows-machine-config-operator/pkg/controller/windowsmachineconfig.(*ReconcileWindowsMachineConfig).reconcileWindowsNodes
    windows-machine-config-operator/pkg/controller/windowsmachineconfig/windowsmachineconfig_controller.go:250
github.com/openshift/windows-machine-config-operator/pkg/controller/windowsmachineconfig.(*ReconcileWindowsMachineConfig).Reconcile
    windows-machine-config-operator/pkg/controller/windowsmachineconfig/windowsmachineconfig_controller.go:211
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
    /home/ravig/go/src/pkg/mod/sigs.k8s.io/controller-runtime.0/pkg/internal/controller/controller.go:256
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
    /home/ravig/go/src/pkg/mod/sigs.k8s.io/controller-runtime.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
    /home/ravig/go/src/pkg/mod/sigs.k8s.io/controller-runtime.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1
    /home/ravig/go/src/pkg/mod/k8s.io/apimachinery.2/pkg/util/wait/wait.go:155
k8s.io/apimachinery/pkg/util/wait.BackoffUntil
    /home/ravig/go/src/pkg/mod/k8s.io/apimachinery.2/pkg/util/wait/wait.go:156
k8s.io/apimachinery/pkg/util/wait.JitterUntil
    /home/ravig/go/src/pkg/mod/k8s.io/apimachinery.2/pkg/util/wait/wait.go:133
k8s.io/apimachinery/pkg/util/wait.Until
    /home/ravig/go/src/pkg/mod/k8s.io/apimachinery.2/pkg/util/wait/wait.go:90

CI Run:
https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_windows-machine-config-operator/75/pull-ci-openshift-windows-machine-config-operator-master-e2e-operator/1279959578544443392/artifacts/e2e-operator/gather-extra/pods/windows-machine-config-operator_windows-machine-config-operator-57cf95787c-qcn7k_windows-machine-config-operator.log

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
WMCO fails to configure the Windows VM.

Expected results:
WMCO configures the Windows node properly, and the Windows node joins the existing OpenShift cluster.

Additional info:
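For triage, the HTTP status buried in the escaped "output" field of the log can be pulled out mechanically. The following is a minimal Python sketch, not part of WMCO. It assumes every entry has the layout seen in the excerpt above (timestamp, level, logger name, free-text message, then a JSON object); the names parse_entry and http_status are hypothetical helpers introduced for illustration.

```python
import json
import re

# Assumed entry layout, taken from the excerpt above:
#   <timestamp> <LEVEL> <logger> <message> {json fields}
LOG_ENTRY = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<logger>\S+)\s+"
    r"(?P<msg>.*?)\s+(?P<fields>\{.*\})\s*$",
    re.DOTALL,
)


def parse_entry(line):
    """Split one log entry into its parts; return None if the layout differs."""
    match = LOG_ENTRY.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # The trailing object is plain JSON, so escaped \\ and \r\n are decoded here.
    entry["fields"] = json.loads(entry["fields"])
    return entry


def http_status(entry):
    """Return the HTTP status quoted in the PowerShell error text, if any."""
    output = entry["fields"].get("output", "")
    match = re.search(r"returned an error: \((\d{3})\)", output)
    return int(match.group(1)) if match else None
```

Applied to the "ignition file download" entry, http_status returns 400, which identifies the Machine Config Server response as the failing step rather than the bootstrapper itself.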
Verified on 4.6.0-0.nightly-2020-07-14-092216

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-07-14-092216   True        False         9h      Cluster version is 4.6.0-0.nightly-2020-07-14-092216

1. Remove the iptables rule that is blocking requests to the MCS from pods on a master node

$ oc get nodes
NAME                                         STATUS   ROLES    AGE    VERSION
ip-10-0-131-12.us-west-2.compute.internal    Ready    master   9h     v1.18.3+b9ac23f
ip-10-0-133-198.us-west-2.compute.internal   Ready    worker   141m   v1.18.3+b9ac23f
ip-10-0-136-229.us-west-2.compute.internal   Ready    worker   9h     v1.18.3+b9ac23f
ip-10-0-182-165.us-west-2.compute.internal   Ready    master   9h     v1.18.3+b9ac23f
ip-10-0-186-178.us-west-2.compute.internal   Ready    worker   9h     v1.18.3+b9ac23f
ip-10-0-192-116.us-west-2.compute.internal   Ready    worker   9h     v1.18.3+b9ac23f
ip-10-0-210-115.us-west-2.compute.internal   Ready    master   9h     v1.18.3+b9ac23f

$ oc debug node/ip-10-0-182-165.us-west-2.compute.internal
Starting pod/ip-10-0-182-165us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# /sbin/iptables -L -v -n --line-numbers
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
num   pkts  bytes target                       prot opt in      out   source         destination
1    2430K  4865M KUBE-SERVICES                all  --  *       *     0.0.0.0/0      0.0.0.0/0      ctstate NEW /* kubernetes service portals */
2    2430K  4865M KUBE-EXTERNAL-SERVICES       all  --  *       *     0.0.0.0/0      0.0.0.0/0      ctstate NEW /* kubernetes externally-visible service portals */
3      37M    18G KUBE-NODEPORT-NON-LOCAL      all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* Ensure that non-local NodePort traffic can flow */
4      37M    18G OPENSHIFT-FIREWALL-ALLOW     all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* firewall overrides */
5      33M    13G KUBE-FIREWALL                all  --  *       *     0.0.0.0/0      0.0.0.0/0

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
num   pkts  bytes target                       prot opt in      out   source         destination
1    3855K  3053M KUBE-FORWARD                 all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* kubernetes forwarding rules */
2    28782  3194K KUBE-SERVICES                all  --  *       *     0.0.0.0/0      0.0.0.0/0      ctstate NEW /* kubernetes service portals */
3    10901   743K OPENSHIFT-ADMIN-OUTPUT-RULES all  --  tun0    !tun0 0.0.0.0/0      0.0.0.0/0      /* administrator overrides */
4    28244  3161K OPENSHIFT-FIREWALL-FORWARD   all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* firewall overrides */
5        0      0 OPENSHIFT-BLOCK-OUTPUT       all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* firewall overrides */

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
num   pkts  bytes target                       prot opt in      out   source         destination
1    1986K   444M KUBE-SERVICES                all  --  *       *     0.0.0.0/0      0.0.0.0/0      ctstate NEW /* kubernetes service portals */
2      37M    21G OPENSHIFT-BLOCK-OUTPUT       all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* firewall overrides */
3      37M    21G KUBE-FIREWALL                all  --  *       *     0.0.0.0/0      0.0.0.0/0

Chain KUBE-FIREWALL (2 references)
num   pkts  bytes target                       prot opt in      out   source         destination
1        0      0 DROP                         all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
2        0      0 DROP                         all  --  *       *     !127.0.0.0/8   127.0.0.0/8    /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT

Chain KUBE-KUBELET-CANARY (0 references)
num   pkts  bytes target                       prot opt in      out   source         destination

Chain OPENSHIFT-BLOCK-OUTPUT (2 references)
num   pkts  bytes target                       prot opt in      out   source         destination
1       31   1860 REJECT                       tcp  --  *       *     0.0.0.0/0      0.0.0.0/0      tcp dpt:22623 reject-with icmp-port-unreachable
2       31   1860 REJECT                       tcp  --  *       *     0.0.0.0/0      0.0.0.0/0      tcp dpt:22624 reject-with icmp-port-unreachable

Chain OPENSHIFT-FIREWALL-FORWARD (1 references)
num   pkts  bytes target                       prot opt in      out   source         destination
1        0      0 DROP                         all  --  *       *     10.128.0.0/14  0.0.0.0/0      /* attempted resend after connection close */ ctstate INVALID
2    17343  2418K ACCEPT                       all  --  *       *     0.0.0.0/0      10.128.0.0/14  /* forward traffic from SDN */
3    10901   743K ACCEPT                       all  --  *       *     10.128.0.0/14  0.0.0.0/0      /* forward traffic to SDN */

Chain OPENSHIFT-ADMIN-OUTPUT-RULES (1 references)
num   pkts  bytes target                       prot opt in      out   source         destination

Chain OPENSHIFT-FIREWALL-ALLOW (1 references)
num   pkts  bytes target                       prot opt in      out   source         destination
1    1931K  4835M ACCEPT                       udp  --  *       *     0.0.0.0/0      0.0.0.0/0      udp dpt:4789 /* VXLAN incoming */
2    1585K   338M ACCEPT                       all  --  tun0    *     0.0.0.0/0      0.0.0.0/0      /* from SDN to localhost */
3        0      0 ACCEPT                       all  --  docker0 *     0.0.0.0/0      0.0.0.0/0      /* from docker to localhost */

Chain KUBE-PROXY-CANARY (0 references)
num   pkts  bytes target                       prot opt in      out   source         destination

Chain KUBE-NODEPORT-NON-LOCAL (1 references)
num   pkts  bytes target                       prot opt in      out   source         destination

Chain KUBE-EXTERNAL-SERVICES (1 references)
num   pkts  bytes target                       prot opt in      out   source         destination

Chain KUBE-SERVICES (3 references)
num   pkts  bytes target                       prot opt in      out   source         destination

Chain KUBE-FORWARD (1 references)
num   pkts  bytes target                       prot opt in      out   source         destination
1        6    240 DROP                         all  --  *       *     0.0.0.0/0      0.0.0.0/0      ctstate INVALID
2        5    570 ACCEPT                       all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* kubernetes forwarding rules */ mark match 0x1/0x1
3     175K   127M ACCEPT                       all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
4        0      0 ACCEPT                       all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED

Chain OPENSHIFT-SDN-CANARY (0 references)
num   pkts  bytes target                       prot opt in      out   source         destination

sh-4.4# /sbin/iptables -D OPENSHIFT-BLOCK-OUTPUT 1
sh-4.4# /sbin/iptables -L -v -n --line-numbers
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
num   pkts  bytes target                       prot opt in      out   source         destination
1    2431K  4865M KUBE-SERVICES                all  --  *       *     0.0.0.0/0      0.0.0.0/0      ctstate NEW /* kubernetes service portals */
2    2431K  4865M KUBE-EXTERNAL-SERVICES       all  --  *       *     0.0.0.0/0      0.0.0.0/0      ctstate NEW /* kubernetes externally-visible service portals */
3      37M    18G KUBE-NODEPORT-NON-LOCAL      all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* Ensure that non-local NodePort traffic can flow */
4      37M    18G OPENSHIFT-FIREWALL-ALLOW     all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* firewall overrides */
5      33M    13G KUBE-FIREWALL                all  --  *       *     0.0.0.0/0      0.0.0.0/0

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
num   pkts  bytes target                       prot opt in      out   source         destination
1    3859K  3055M KUBE-FORWARD                 all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* kubernetes forwarding rules */
2    28795  3195K KUBE-SERVICES                all  --  *       *     0.0.0.0/0      0.0.0.0/0      ctstate NEW /* kubernetes service portals */
3    10906   744K OPENSHIFT-ADMIN-OUTPUT-RULES all  --  tun0    !tun0 0.0.0.0/0      0.0.0.0/0      /* administrator overrides */
4    28257  3163K OPENSHIFT-FIREWALL-FORWARD   all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* firewall overrides */
5        0      0 OPENSHIFT-BLOCK-OUTPUT       all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* firewall overrides */

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
num   pkts  bytes target                       prot opt in      out   source         destination
1    1987K   444M KUBE-SERVICES                all  --  *       *     0.0.0.0/0      0.0.0.0/0      ctstate NEW /* kubernetes service portals */
2      37M    21G OPENSHIFT-BLOCK-OUTPUT       all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* firewall overrides */
3      37M    21G KUBE-FIREWALL                all  --  *       *     0.0.0.0/0      0.0.0.0/0

Chain KUBE-FIREWALL (2 references)
num   pkts  bytes target                       prot opt in      out   source         destination
1        0      0 DROP                         all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
2        0      0 DROP                         all  --  *       *     !127.0.0.0/8   127.0.0.0/8    /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT

Chain KUBE-KUBELET-CANARY (0 references)
num   pkts  bytes target                       prot opt in      out   source         destination

Chain OPENSHIFT-BLOCK-OUTPUT (2 references)
num   pkts  bytes target                       prot opt in      out   source         destination
1       37   2220 REJECT                       tcp  --  *       *     0.0.0.0/0      0.0.0.0/0      tcp dpt:22624 reject-with icmp-port-unreachable

Chain OPENSHIFT-FIREWALL-FORWARD (1 references)
num   pkts  bytes target                       prot opt in      out   source         destination
1        0      0 DROP                         all  --  *       *     10.128.0.0/14  0.0.0.0/0      /* attempted resend after connection close */ ctstate INVALID
2    17351  2419K ACCEPT                       all  --  *       *     0.0.0.0/0      10.128.0.0/14  /* forward traffic from SDN */
3    10906   744K ACCEPT                       all  --  *       *     10.128.0.0/14  0.0.0.0/0      /* forward traffic to SDN */

Chain OPENSHIFT-ADMIN-OUTPUT-RULES (1 references)
num   pkts  bytes target                       prot opt in      out   source         destination

Chain OPENSHIFT-FIREWALL-ALLOW (1 references)
num   pkts  bytes target                       prot opt in      out   source         destination
1    1932K  4835M ACCEPT                       udp  --  *       *     0.0.0.0/0      0.0.0.0/0      udp dpt:4789 /* VXLAN incoming */
2    1586K   338M ACCEPT                       all  --  tun0    *     0.0.0.0/0      0.0.0.0/0      /* from SDN to localhost */
3        0      0 ACCEPT                       all  --  docker0 *     0.0.0.0/0      0.0.0.0/0      /* from docker to localhost */

Chain KUBE-PROXY-CANARY (0 references)
num   pkts  bytes target                       prot opt in      out   source         destination

Chain KUBE-NODEPORT-NON-LOCAL (1 references)
num   pkts  bytes target                       prot opt in      out   source         destination

Chain KUBE-EXTERNAL-SERVICES (1 references)
num   pkts  bytes target                       prot opt in      out   source         destination

Chain KUBE-SERVICES (3 references)
num   pkts  bytes target                       prot opt in      out   source         destination

Chain KUBE-FORWARD (1 references)
num   pkts  bytes target                       prot opt in      out   source         destination
1        6    240 DROP                         all  --  *       *     0.0.0.0/0      0.0.0.0/0      ctstate INVALID
2        5    570 ACCEPT                       all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* kubernetes forwarding rules */ mark match 0x1/0x1
3     179K   129M ACCEPT                       all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
4        0      0 ACCEPT                       all  --  *       *     0.0.0.0/0      0.0.0.0/0      /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED

Chain OPENSHIFT-SDN-CANARY (0 references)
num   pkts  bytes target                       prot opt in      out   source         destination

sh-4.4# exit
sh-4.2# exit
exit

Removing debug pod ...

2. Quickly open a debug shell on a worker node and run the curl commands below. You can get a token by logging into the web console. Verify that 2.2.0 ignition is served when no User-agent header is added. Verify that 3.1.0 ignition is served when the User-agent is Ignition/2.3.0.

$ oc debug node/ip-10-0-136-229.us-west-2.compute.internal
Starting pod/ip-10-0-136-229us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# curl -kH "Authorization: Bearer <TOKEN>" "https://api.mnguyen46.devcluster.openshift.com:22623/config/worker"
{"ignition":{"version":"2.2.0"},"passwd":{"users":[{"name":"core","sshAuthorizedKeys": ..snip..

sh-4.4# curl -kH "Authorization: Bearer <TOKEN>" -H "User-agent:Ignition/2.3.0" "https://api.mnguyen46.devcluster.openshift.com:22623/config/worker"
{"ignition":{"version":"3.1.0"},"passwd":{"users":[{"name":"core","sshAuthorizedKeys" ..snip..
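
The check in step 2 can also be scripted. Below is a minimal Python sketch, assuming the same endpoint, bearer token, and User-agent values as the curl commands above; build_mcs_request, ignition_version, and fetch_version are hypothetical helper names, not part of any OpenShift tooling. Note that urllib sends its own default User-agent (Python-urllib/x.y) when none is set, much as curl sends curl/x.y, so the "no User-agent" case here means "no Ignition User-agent".

```python
import json
import ssl
import urllib.request
from typing import Optional


def build_mcs_request(url: str, token: str,
                      user_agent: Optional[str] = None) -> urllib.request.Request:
    """Build the GET request for the MCS config endpoint, mirroring the curl flags."""
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"Bearer {token}")
    if user_agent is not None:
        # e.g. "Ignition/2.3.0", the value used in step 2
        req.add_header("User-Agent", user_agent)
    return req


def ignition_version(body: str) -> str:
    """Return the ignition.version field of a served config."""
    return json.loads(body)["ignition"]["version"]


def fetch_version(url: str, token: str, user_agent: Optional[str] = None) -> str:
    """Fetch the config and report its spec version (needs cluster access)."""
    ctx = ssl.create_default_context()
    # Same effect as curl -k: skip certificate checks. Test clusters only.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    req = build_mcs_request(url, token, user_agent)
    with urllib.request.urlopen(req, context=ctx) as resp:
        return ignition_version(resp.read().decode("utf-8"))
```

On the verified build, fetch_version would be expected to report 2.2.0 without an Ignition User-agent and 3.1.0 with "Ignition/2.3.0", matching the curl output above.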
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196