Description of problem:
After scaling up RHEL workers on OCP 4.4, a pod that uses an azure-file volume through a PVC cannot mount the volume when it runs on a RHEL worker.

Version-Release number of selected component (if applicable):
4.4.0-rc.11

How reproducible:
100%

Steps to Reproduce:
1. Scale up RHEL workers.
[wduan@MINT 01_general]$ oc get node -o wide
NAME                                STATUS   ROLES    AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                CONTAINER-RUNTIME
wduan425-rcbmd-master-0             Ready    master   16h    v1.17.1   10.0.0.6      <none>        Red Hat Enterprise Linux CoreOS 44.81.202004221930-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.17.4-2.dev.rhaos4.4.gitfe61deb.el8
wduan425-rcbmd-master-1             Ready    master   16h    v1.17.1   10.0.0.7      <none>        Red Hat Enterprise Linux CoreOS 44.81.202004221930-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.17.4-2.dev.rhaos4.4.gitfe61deb.el8
wduan425-rcbmd-master-2             Ready    master   16h    v1.17.1   10.0.0.8      <none>        Red Hat Enterprise Linux CoreOS 44.81.202004221930-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.17.4-2.dev.rhaos4.4.gitfe61deb.el8
wduan425-rcbmd-rhel-0               Ready    worker   126m   v1.17.1   10.0.1.6      <none>        Red Hat Enterprise Linux Server 7.8 (Maipo)                    3.10.0-1127.el7.x86_64        cri-o://1.17.4-8.dev.rhaos4.4.git5f5c5e4.el7
wduan425-rcbmd-rhel-1               Ready    worker   125m   v1.17.1   10.0.1.7      <none>        Red Hat Enterprise Linux Server 7.8 (Maipo)                    3.10.0-1127.el7.x86_64        cri-o://1.17.4-8.dev.rhaos4.4.git5f5c5e4.el7
wduan425-rcbmd-worker-centralus-1   Ready    worker   16h    v1.17.1   10.0.1.5      <none>        Red Hat Enterprise Linux CoreOS 44.81.202004221930-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.17.4-2.dev.rhaos4.4.gitfe61deb.el8
wduan425-rcbmd-worker-centralus-2   Ready    worker   16h    v1.17.1   10.0.1.4      <none>        Red Hat Enterprise Linux CoreOS 44.81.202004221930-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.17.4-2.dev.rhaos4.4.gitfe61deb.el8
2. Create a deployment that uses an azure-file volume and make sure it is assigned to a RHEL worker.
3. Check the pod; it should be in Running status.

Actual results:
The pod is pending and the volume mount failed.

Expected results:
The pod should be in Running status with the volume mounted successfully.

Master Log:

Node Log (of failed PODs):
Kubelet log:
Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: Mounting command: systemd-run
Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/0c57c025-96b5-4139-a271-30f8a065f611/volumes/kubernetes.io~azure-file/pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786 --scope -- mount -t cifs -o gid=1500,mfsymlinks,uid=1500,username=f8810d9cbb651421da713f2,password=1FsNHo+17jC/TMrpYBNOQ++elqZgUIw7pJF8Wouw7Tb48CAch2UG1SAXb8I/CaGlsTAt4rNodPIK0xkZwxQVXA==,file_mode=0777,dir_mode=0777,vers=3.0 //f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786 /var/lib/kubelet/pods/0c57c025-96b5-4139-a271-30f8a065f611/volumes/kubernetes.io~azure-file/pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786
Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: Output: Running scope as unit run-72149.scope.
Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: mount: wrong fs type, bad option, bad superblock on //f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786,
Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: missing codepage or helper program, or other error
Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: (for several filesystems (e.g. nfs, cifs) you might
Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: need a /sbin/mount.<type> helper program)
Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: In some cases useful info is found in syslog - try
Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: dmesg | tail or so.
Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: E0426 04:26:31.835127 1641 nestedpendingoperations.go:270] Operation for "\"kubernetes.io/azure-file/0c57c025-96b5-4139-a271-30f8a065f611-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786\" (\"0c57c025-96b5-4139-a271-30f8a065f611\")" failed. No retries permitted until 2020-04-26 04:26:32.335048171 +0000 UTC m=+1642.921325586 (durationBeforeRetry 500ms). Error: "MountVolume.SetUp failed for volume \"pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786\" (UniqueName: \"kubernetes.io/azure-file/0c57c025-96b5-4139-a271-30f8a065f611-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786\") pod \"azurefile-deployment-02-5dfcc6fffd-t6vbx\" (UID: \"0c57c025-96b5-4139-a271-30f8a065f611\") : mount failed: exit status 32\nMounting command: systemd-run\nMounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/0c57c025-96b5-4139-a271-30f8a065f611/volumes/kubernetes.io~azure-file/pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786 --scope -- mount -t cifs -o gid=1500,mfsymlinks,uid=1500,username=f8810d9cbb651421da713f2,password=1FsNHo+17jC/TMrpYBNOQ++elqZgUIw7pJF8Wouw7Tb48CAch2UG1SAXb8I/CaGlsTAt4rNodPIK0xkZwxQVXA==,file_mode=0777,dir_mode=0777,vers=3.0 //f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786 /var/lib/kubelet/pods/0c57c025-96b5-4139-a271-30f8a065f611/volumes/kubernetes.io~azure-file/pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786\nOutput: Running scope as unit run-72149.scope.\nmount: wrong fs type, bad option, bad superblock on //f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786,\n missing codepage or helper program, or other error\n (for several filesystems (e.g. nfs, cifs) you might\n need a /sbin/mount.<type> helper program)\n\n In some cases useful info is found in syslog - try\n dmesg | tail or so.\n"
Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: I0426 04:26:31.835151 1641 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"wduan", Name:"azurefile-deployment-02-5dfcc6fffd-t6vbx", UID:"0c57c025-96b5-4139-a271-30f8a065f611", APIVersion:"v1", ResourceVersion:"276140", FieldPath:""}): type: 'Warning' reason: 'FailedMount' MountVolume.SetUp failed for volume "pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786" : mount failed: exit status 32

dmesg (not sure whether it is related):
[ 1655.712829] Key type cifs.spnego registered
[ 1655.715851] Key type cifs.idmap registered
[ 1655.719186] Unable to determine destination address.
[ 1656.321542] Unable to determine destination address.
[ 1657.433242] Unable to determine destination address.

PV Dump:
[wduan@MINT 01_general]$ oc get pv pvc-bf6ae94e-987a-43bc-88b1-7c3843112962 -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubernetes.io/createdby: azure-file-dynamic-provisioner
    pv.kubernetes.io/bound-by-controller: "yes"
    pv.kubernetes.io/provisioned-by: kubernetes.io/azure-file
  creationTimestamp: "2020-04-26T04:55:54Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-bf6ae94e-987a-43bc-88b1-7c3843112962
  resourceVersion: "284981"
  selfLink: /api/v1/persistentvolumes/pvc-bf6ae94e-987a-43bc-88b1-7c3843112962
  uid: 8ad5d160-e57a-49f0-a1cf-c701ad29a60d
spec:
  accessModes:
  - ReadWriteMany
  azureFile:
    secretName: azure-storage-account-f8810d9cbb651421da713f2-secret
    secretNamespace: wduan
    shareName: wduan425-rcbmd-dynamic-pvc-bf6ae94e-987a-43bc-88b1-7c3843112962
  capacity:
    storage: 1Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: azurefile-ocp-01
    namespace: wduan
    resourceVersion: "284966"
    uid: bf6ae94e-987a-43bc-88b1-7c3843112962
  mountOptions:
  - uid=1500
  - gid=1500
  - mfsymlinks
  persistentVolumeReclaimPolicy: Delete
  storageClassName: azurefile-ocp
  volumeMode: Filesystem
status:
  phase: Bound

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):
[wduan@MINT 01_general]$ oc get sc azurefile-ocp -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2020-04-26T04:26:04Z"
  name: azurefile-ocp
  resourceVersion: "275981"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/azurefile-ocp
  uid: d30d3dbd-c5f3-4c81-9182-aa778c3eecd9
mountOptions:
- uid=1500
- gid=1500
- mfsymlinks
parameters:
  skuName: Standard_LRS
provisioner: kubernetes.io/azure-file
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

Additional info:
I tested the mount command on both a RHEL node and a CoreOS node. It works on CoreOS, but on RHEL it fails with the same error as in the kubelet log.
After installing cifs-utils, the Azure file share can be mounted manually:
sh-4.2# mount -t cifs -o gid=1500,mfsymlinks,uid=1500,username=f8810d9cbb651421da713f2,password=1FsNHo+17jC/TMrpYBNOQ++elqZgUIw7pJF8Wouw7Tb48CAch2UG1SAXb8I/CaGlsTAt4rNodPIK0xkZwxQVXA==,file_mode=0777,dir_mode=0777,vers=3.0 //f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786 /mnt
sh-4.2# mount | grep mnt
/dev/sdb1 on /mnt type ext4 (rw,relatime,seclabel,data=ordered)
//f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786 on /mnt type cifs (rw,relatime,vers=3.0,cache=strict,username=f8810d9cbb651421da713f2,domain=X,uid=1500,forceuid,gid=1500,forcegid,addr=52.165.136.44,file_mode=0777,dir_mode=0777,soft,persistenthandles,nounix,serverino,mapposix,mfsymlinks,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)
The Azure file share could not be mounted because the cifs-utils package is not installed on the RHEL worker. That is why we see the error message 'mount: wrong fs type, bad option, bad superblock on //f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786,'. After installing cifs-utils, the issue is fixed, so please install cifs-utils during installation.
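To confirm this diagnosis on a node, one could check whether the mount helper named in the error message (/sbin/mount.<type>) exists. The following is only a minimal sketch: the function name has_mount_helper is hypothetical, and the directories to search are passed in as arguments as an assumption so that the check can also be pointed at a host filesystem seen from a debug shell.

```shell
#!/bin/sh
# Hypothetical helper, not part of OCP: report whether a mount helper
# for the given filesystem type exists in any of the given directories.
# The error in the kubelet log hints at a missing /sbin/mount.<type>.
has_mount_helper() {
    fstype=$1
    shift
    for dir in "$@"; do
        if [ -x "$dir/mount.$fstype" ]; then
            echo "found $dir/mount.$fstype"
            return 0
        fi
    done
    echo "mount.$fstype not found"
    return 1
}

# Example call (not executed here):
#   has_mount_helper cifs /sbin /usr/sbin
```

On the RHEL workers in this report, the cifs check would be expected to fail until cifs-utils is installed.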
Workaround: install cifs-utils on RHEL7 workers.
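A minimal sketch of the workaround, to be run as root on each RHEL 7 worker. The function name ensure_cifs_utils and the DRY_RUN variable are hypothetical and exist only for this illustration; it is also an assumption that the node has a repository providing cifs-utils enabled.

```shell
#!/bin/sh
# Hypothetical wrapper: install cifs-utils if it is not already present.
# With DRY_RUN=1 (the default in this sketch) the command is only printed.
ensure_cifs_utils() {
    # rpm -q exits non-zero when the package is not installed.
    if rpm -q cifs-utils >/dev/null 2>&1; then
        echo "cifs-utils already installed"
        return 0
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: yum install -y cifs-utils"
    else
        yum install -y cifs-utils
    fi
}

# Example call (not executed here):
#   DRY_RUN=0 ensure_cifs_utils
```

After installation, pods that failed with FailedMount should be retried by the kubelet; the result can be checked with rpm -qa | grep -i cifs-utils.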
Verified pass
[wduan@MINT config]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-05-08-015855   True        False         93m     Cluster version is 4.5.0-0.nightly-2020-05-08-015855
sh-4.2# more /etc/redhat-release
Red Hat Enterprise Linux Server release 7.8 (Maipo)
sh-4.2# rpm -qa | grep -i cifs-utils
cifs-utils-6.2-10.el7.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409