Description of Problem:
AWS EBS CSI driver node pods should not run on Windows nodes.

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-09-21-030155

How Reproducible:
Always

Steps to Reproduce:
1. Set up an OCP cluster with Windows nodes
2. Check the AWS EBS CSI operator and driver pods

Actual Results:
CSO is stuck in Progressing status. AWS EBS CSI driver node pods scheduled to Windows nodes are stuck in "ContainerCreating" status:

  Warning  FailedMount  10m (x3 over 49m)    kubelet, ip-10-0-154-194.us-east-2.compute.internal  Unable to attach or mount volumes: unmounted volumes=[device-dir], unattached volumes=[aws-ebs-csi-driver-node-sa-token-g56qc registration-dir kubelet-dir plugin-dir device-dir]: timed out waiting for the condition
  Warning  FailedMount  6m12s (x5 over 42m)  kubelet, ip-10-0-154-194.us-east-2.compute.internal  Unable to attach or mount volumes: unmounted volumes=[device-dir], unattached volumes=[device-dir aws-ebs-csi-driver-node-sa-token-g56qc registration-dir kubelet-dir plugin-dir]: timed out waiting for the condition
  Warning  FailedMount  18s (x33 over 51m)   kubelet, ip-10-0-154-194.us-east-2.compute.internal  MountVolume.SetUp failed for volume "device-dir" : hostPath type check failed: /dev is not a directory

Expected Results:
AWS EBS CSI driver node pods should not be scheduled to Windows nodes. CSO runs successfully.
Additional info:
$ oc get nodes -o wide
NAME                                         STATUS   ROLES           AGE   VERSION                            INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                                   KERNEL-VERSION                 CONTAINER-RUNTIME
ip-10-0-131-208.us-east-2.compute.internal   Ready    worker          53m   v1.19.0-rc.2.1023+f5121a6a6a02dd   10.0.131.208   <none>        Windows Server 2019 Datacenter                             10.0.17763.1339                docker://19.3.5
ip-10-0-154-124.us-east-2.compute.internal   Ready    worker          21h   v1.19.0+7f9e863                    10.0.154.124   <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ip-10-0-154-194.us-east-2.compute.internal   Ready    worker          57m   v1.19.0-rc.2.1023+f5121a6a6a02dd   10.0.154.194   <none>        Windows Server 2019 Datacenter                             10.0.17763.1339                docker://19.3.5
ip-10-0-157-223.us-east-2.compute.internal   Ready    master,worker   21h   v1.19.0+7f9e863                    10.0.157.223   <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ip-10-0-175-185.us-east-2.compute.internal   Ready    master,worker   21h   v1.19.0+7f9e863                    10.0.175.185   <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ip-10-0-175-60.us-east-2.compute.internal    Ready    worker          21h   v1.19.0+7f9e863                    10.0.175.60    <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ip-10-0-193-252.us-east-2.compute.internal   Ready    master,worker   21h   v1.19.0+7f9e863                    10.0.193.252   <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8
ip-10-0-195-129.us-east-2.compute.internal   Ready    worker          21h   v1.19.0+7f9e863                    10.0.195.129   <none>        Red Hat Enterprise Linux CoreOS 46.82.202009182140-0 (Ootpa)   4.18.0-193.23.1.el8_2.x86_64   cri-o://1.19.0-18.rhaos4.6.gitd802e19.el8

The Manila CSI operator and oVirt CSI operator should have the same issue, but since 4.6 does not support Windows workers on OpenStack and oVirt, it is OK to fix the Manila CSI operator and oVirt CSI operator later.
For the record, Windows node labels:

  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/instance-type: m5a.large
    beta.kubernetes.io/os: windows
    failure-domain.beta.kubernetes.io/region: us-east-2
    failure-domain.beta.kubernetes.io/zone: us-east-2a
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: ec2amaz-79adi55
    kubernetes.io/os: windows
    node-role.kubernetes.io/worker: ""
    node.kubernetes.io/instance-type: m5a.large
    node.kubernetes.io/windows-build: 10.0.17763
    node.openshift.io/os_id: Windows
    topology.kubernetes.io/region: us-east-2
    topology.kubernetes.io/zone: us-east-2a

Linux node labels:

  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/instance-type: m5.large
    beta.kubernetes.io/os: linux
    failure-domain.beta.kubernetes.io/region: us-east-2
    failure-domain.beta.kubernetes.io/zone: us-east-2a
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: ip-10-0-154-124
    kubernetes.io/os: linux
    node-role.kubernetes.io/worker: ""
    node.kubernetes.io/instance-type: m5.large
    node.openshift.io/os_id: rhcos
    topology.ebs.csi.aws.com/zone: us-east-2a
    topology.kubernetes.io/region: us-east-2
    topology.kubernetes.io/zone: us-east-2a

The driver should target "kubernetes.io/os: linux".
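A fix along these lines would add a Linux node selector to the driver's node DaemonSet pod template. The snippet below is a minimal sketch only; the DaemonSet name and namespace are illustrative placeholders, not values taken from this report:

```yaml
# Sketch: restrict the CSI node DaemonSet to Linux nodes so it is
# never scheduled onto Windows workers. Name/namespace are
# placeholders for illustration.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: aws-ebs-csi-driver-node
  namespace: openshift-cluster-csi-drivers
spec:
  selector:
    matchLabels:
      app: aws-ebs-csi-driver-node
  template:
    metadata:
      labels:
        app: aws-ebs-csi-driver-node
    spec:
      nodeSelector:
        kubernetes.io/os: linux   # skip nodes labeled kubernetes.io/os: windows
      containers:
      - name: csi-driver
        image: registry.example.com/aws-ebs-csi-driver:latest  # placeholder image
```

With this selector, the scheduler never places the node pods on Windows nodes, so the hostPath mount of /dev is only attempted on Linux hosts where it is a directory.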
Verified with: 4.6.0-0.nightly-2020-09-23-022756
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196