Description of problem:
Install a fresh cluster, add a Windows worker, then enable CCM. The Windows nodes' kubelet does not run with --cloud-provider=external.
If a fresh cluster is instead installed with CCM already enabled and a Windows worker is added afterwards, the Windows nodes' kubelet runs with --cloud-provider=external as expected.

Version-Release number of selected component (if applicable):
4.11.0-0.nightly-2022-06-30-005428

How reproducible:
Always

Steps to Reproduce:
1. Install a fresh cluster and add a Windows worker.

liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-06-30-005428   True        False         41m     Cluster version is 4.11.0-0.nightly-2022-06-30-005428
liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME                                                STATUS   ROLES    AGE   VERSION
huliu-azure71a-4wmh2-master-0                       Ready    master   71m   v1.24.0+9ddc8b1
huliu-azure71a-4wmh2-master-1                       Ready    master   71m   v1.24.0+9ddc8b1
huliu-azure71a-4wmh2-master-2                       Ready    master   71m   v1.24.0+9ddc8b1
huliu-azure71a-4wmh2-worker-southcentralus1-9d6lb   Ready    worker   57m   v1.24.0+9ddc8b1
huliu-azure71a-4wmh2-worker-southcentralus2-mqpqq   Ready    worker   54m   v1.24.0+9ddc8b1
huliu-azure71a-4wmh2-worker-southcentralus3-md7pc   Ready    worker   57m   v1.24.0+9ddc8b1
windows-bgpxw                                       Ready    worker   27m   v1.24.0-2323+01aa0f3f6052c9
windows-dz85l                                       Ready    worker   21m   v1.24.0-2323+01aa0f3f6052c9

2. Enable CCM.

liuhuali@Lius-MacBook-Pro huali-test % oc edit featuregate
featuregate.config.openshift.io/cluster edited
liuhuali@Lius-MacBook-Pro huali-test % oc get deploy -n openshift-cloud-controller-manager
NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
azure-cloud-controller-manager   2/2     2            2           10m
liuhuali@Lius-MacBook-Pro huali-test % oc get pod -n openshift-cloud-controller-manager
NAME                                              READY   STATUS    RESTARTS   AGE
azure-cloud-controller-manager-5946ff4bb9-6hc5k   1/1     Running   0          10m
azure-cloud-controller-manager-5946ff4bb9-qdsb7   1/1     Running   0          10m
azure-cloud-node-manager-62srs                    1/1     Running   0          9m44s
azure-cloud-node-manager-72gjm                    1/1     Running   0          10m
azure-cloud-node-manager-k4bdb                    1/1     Running   0          6m50s
azure-cloud-node-manager-tlk4c                    1/1     Running   0          10m
azure-cloud-node-manager-tpwhc                    1/1     Running   0          10m
azure-cloud-node-manager-vpgw6                    1/1     Running   0          10m
liuhuali@Lius-MacBook-Pro huali-test % oc get node
NAME                                                STATUS   ROLES    AGE   VERSION
huliu-azure71a-4wmh2-master-0                       Ready    master   99m   v1.24.0+9ddc8b1
huliu-azure71a-4wmh2-master-1                       Ready    master   98m   v1.24.0+9ddc8b1
huliu-azure71a-4wmh2-master-2                       Ready    master   99m   v1.24.0+9ddc8b1
huliu-azure71a-4wmh2-worker-southcentralus1-9d6lb   Ready    worker   84m   v1.24.0+9ddc8b1
huliu-azure71a-4wmh2-worker-southcentralus2-mqpqq   Ready    worker   81m   v1.24.0+9ddc8b1
huliu-azure71a-4wmh2-worker-southcentralus3-md7pc   Ready    worker   84m   v1.24.0+9ddc8b1
windows-bgpxw                                       Ready    worker   54m   v1.24.0-2323+01aa0f3f6052c9
windows-dz85l                                       Ready    worker   48m   v1.24.0-2323+01aa0f3f6052c9

3. SSH to a Windows node and check the kubelet service and the cloud-node-manager service.

liuhuali@Lius-MacBook-Pro huali-test % oc debug node/huliu-azure71a-4wmh2-master-0
W0701 11:40:06.697694 61245 warnings.go:70] would violate PodSecurity "restricted:v1.24": host namespaces (hostNetwork=true, hostPID=true), privileged (container "container-00" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (container "container-00" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "container-00" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volume "host" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or container "container-00" must set securityContext.runAsNonRoot=true), runAsUser=0 (container "container-00" must not set runAsUser=0), seccompProfile (pod or container "container-00" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
Starting pod/huliu-azure71a-4wmh2-master-0-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.0.7
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# cd ~
sh-4.4# ssh -i /tmp/openshift-qe.pem capi.128.7 powershell
Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.

PS C:\Users\capi> Get-Item -path HKLM:HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\kubelet
Get-Item -path HKLM:HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\kubelet

    Hive: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services

Name                           Property
----                           --------
kubelet                        Type            : 16
                               Start           : 2
                               ErrorControl    : 1
                               ImagePath       : c:\k\kubelet.exe --config=c:\k\kubelet.conf --bootstrap-kubeconfig=c:\k\bootstrap-kubeconfig --kubeconfig=c:\k\kubeconfig --cert-dir=c:\var\lib\kubelet\pki\ --windows-service --logtostderr=false --log-file=C:\var\log\kubelet\kubelet.log --register-with-taints=os=Windows:NoSchedule --node-labels=node.openshift.io/os_id=Windows --container-runtime=remote --container-runtime-endpoint=npipe://./pipe/containerd-containerd --resolv-conf= --cloud-provider=azure --v=3 --cloud-config=c:\k\cloud.conf
                               DependOnService : {containerd}
                               ObjectName      : LocalSystem
                               Description     : OpenShift managed kubelet
                               FailureActions  : {88, 2, 0, 0...}

PS C:\Users\capi> Get-Service cloud-node-manager
Get-Service cloud-node-manager
Get-Service : Cannot find any service with service name 'cloud-node-manager'.
At line:1 char:1
+ Get-Service cloud-node-manager
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (cloud-node-manager:String) [Get-Service], ServiceCommandException
    + FullyQualifiedErrorId : NoServiceFoundForGivenName,Microsoft.PowerShell.Commands.GetServiceCommand

PS C:\Users\capi>

Actual results:
The kubelet runs with --cloud-provider=azure, and there is no cloud-node-manager service.

Expected results:
The kubelet runs with --cloud-provider=external, and the cloud-node-manager service exists.

Additional info:
The issue reproduces on AWS (4.11.0-0.nightly-2022-06-30-005428), vSphere (4.11.0-0.nightly-2022-06-30-005428), and Azure (4.10.0-fc.0, 4.10.0-0.nightly-2022-06-08-150219, 4.11.0-0.nightly-2022-06-30-005428).

Also checked on AWS (4.11.0-0.nightly-2022-06-30-005428), vSphere (4.11.0-0.nightly-2022-06-30-005428), and Azure (4.11.0-0.nightly-2022-06-30-005428): when a fresh cluster is installed with CCM and a Windows worker is added afterwards, the Windows nodes' kubelet runs with --cloud-provider=external as expected.

PS C:\Users\capi> Get-Service cloud-node-manager
Get-Service cloud-node-manager

Status   Name               DisplayName
------   ----               -----------
Running  cloud-node-manager cloud-node-manager

PS C:\Users\capi> Get-Item -path HKLM:HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\kubelet
Get-Item -path HKLM:HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\kubelet

    Hive: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services

Name                           Property
----                           --------
kubelet                        Type            : 16
                               Start           : 2
                               ErrorControl    : 1
                               ImagePath       : c:\k\kubelet.exe --config=c:\k\kubelet.conf --bootstrap-kubeconfig=c:\k\bootstrap-kubeconfig --kubeconfig=c:\k\kubeconfig --cert-dir=c:\var\lib\kubelet\pki\ --windows-service --logtostderr=false --log-file=C:\var\log\kubelet\kubelet.log --register-with-taints=os=Windows:NoSchedule --node-labels=node.openshift.io/os_id=Windows --container-runtime=remote --container-runtime-endpoint=npipe://./pipe/containerd-containerd --resolv-conf= --cloud-provider=external --v=3
                               DependOnService : {containerd}
                               ObjectName      : LocalSystem
                               Description     : OpenShift managed kubelet
                               FailureActions  : {88, 2, 0, 0...}

PS C:\Users\capi>

Must-gather:
azure (install a fresh cluster, add Windows worker, then enable CCM) - https://drive.google.com/file/d/1N2InQFe_mDIqayfUCqMyP-U8OE-2wss2/view?usp=sharing
azure (install a fresh cluster with CCM, then add Windows worker) - https://drive.google.com/file/d/1iHR9LzQuCmwtRxsVz6oBMGCCJrJFwTYC/view?usp=sharing
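
The manual check in step 3 comes down to reading the --cloud-provider value out of the kubelet service's ImagePath. A minimal sketch of that check in POSIX shell follows. The function name kubelet_cloud_provider is hypothetical, and the sketch assumes the ImagePath string has already been copied off the node, for example from the Get-Item output above. It does not query the registry itself.

```shell
#!/bin/sh
# Hypothetical helper, not part of WMCO or oc.
# Prints the value of --cloud-provider found in a kubelet command line
# passed as the first argument. Exits non-zero if the flag is absent.
kubelet_cloud_provider() {
    # Split the command line into one argument per line, then keep only
    # the text after "--cloud-provider=" (first occurrence wins).
    kcp_value=$(printf '%s\n' "$1" \
        | tr ' ' '\n' \
        | sed -n 's/^--cloud-provider=//p' \
        | head -n 1)
    # Succeed and print only when a value was found.
    [ -n "$kcp_value" ] && printf '%s\n' "$kcp_value"
}
```

Passing the ImagePath from the failing node to this function would print azure, while the ImagePath from the cluster installed with CCM would print external, so a comparison against external is enough to flag an affected node.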
After discussion with Mikhail, this appears to be a limitation of WMCO: it applies the kubelet configuration when a Windows node is created but does not update that configuration afterwards. Could the WMCO team confirm that this is a limitation and let us know if they need help resolving it? We would like to see this working by the end of 4.12.
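
To make the missing update step concrete, here is a minimal sketch of the command-line change an update would need to produce, judging only from the two ImagePath values in this report: --cloud-provider becomes external and --cloud-config is dropped. The function name to_external_cloud_provider is hypothetical. This is not WMCO's implementation, and it does not touch the Windows service or install cloud-node-manager.

```shell
#!/bin/sh
# Hypothetical helper illustrating the expected kubelet argument change.
# Takes a kubelet command line as the first argument and prints it with
# --cloud-provider set to external and any --cloud-config flag removed.
# If --cloud-provider is absent, it is not added.
to_external_cloud_provider() {
    printf '%s\n' "$1" \
        | tr ' ' '\n' \
        | sed -e 's/^--cloud-provider=.*$/--cloud-provider=external/' \
              -e '/^--cloud-config=/d' \
        | paste -s -d ' ' -
}
```

Applied to the ImagePath from the failing node, this yields the same arguments as the ImagePath from the cluster that was installed with CCM enabled.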
The team discussed this, and it is indeed a limitation of WMCO. The work has not yet been prioritized.
OpenShift has moved to Jira for its defect tracking! This bug can now be found in the OCPBUGS project in Jira. https://issues.redhat.com/browse/OCPBUGS-9356