Bug 1827982 - Azure-file mount fail on RHEL worker due to missing cifs-utils package
Summary: Azure-file mount fail on RHEL worker due to missing cifs-utils package
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 4.5.0
Assignee: Russell Teague
QA Contact: Wei Duan
URL:
Whiteboard:
Depends On:
Blocks: 1845819
Reported: 2020-04-26 06:32 UTC by Wei Duan
Modified: 2020-07-13 17:31 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: On the Azure platform, the cifs-utils package is required to create volume mounts for pods. Consequence: Pod volume mounts fail if the package is not installed. Fix: Add cifs-utils to the list of packages installed for RHEL7 hosts when installing OpenShift. Result: Pod volume mounts are created successfully when deploying on the Azure platform.
Clone Of:
: 1845819 1845821 1845841
Environment:
Last Closed: 2020-07-13 17:31:30 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift openshift-ansible pull 12151 | 0 | None | closed | Bug 1827982: Install cifs-utils, required for azure-file mounts | 2020-12-17 22:45:34 UTC
Red Hat Product Errata | RHBA-2020:2409 | 0 | None | None | None | 2020-07-13 17:31:47 UTC

Description Wei Duan 2020-04-26 06:32:54 UTC
Description of problem:
After scaling up RHEL workers on OCP 4.4, an azure-file volume used by a PVC/pod cannot be mounted on the RHEL worker.

Version-Release number of selected component (if applicable):
4.4.0-rc.11

How reproducible:
100%

Steps to Reproduce:
1. Scale up the cluster with RHEL workers.
[wduan@MINT 01_general]$ oc get node -o wide
NAME                                STATUS   ROLES    AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION                CONTAINER-RUNTIME
wduan425-rcbmd-master-0             Ready    master   16h    v1.17.1   10.0.0.6      <none>        Red Hat Enterprise Linux CoreOS 44.81.202004221930-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.17.4-2.dev.rhaos4.4.gitfe61deb.el8
wduan425-rcbmd-master-1             Ready    master   16h    v1.17.1   10.0.0.7      <none>        Red Hat Enterprise Linux CoreOS 44.81.202004221930-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.17.4-2.dev.rhaos4.4.gitfe61deb.el8
wduan425-rcbmd-master-2             Ready    master   16h    v1.17.1   10.0.0.8      <none>        Red Hat Enterprise Linux CoreOS 44.81.202004221930-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.17.4-2.dev.rhaos4.4.gitfe61deb.el8
wduan425-rcbmd-rhel-0               Ready    worker   126m   v1.17.1   10.0.1.6      <none>        Red Hat Enterprise Linux Server 7.8 (Maipo)                    3.10.0-1127.el7.x86_64        cri-o://1.17.4-8.dev.rhaos4.4.git5f5c5e4.el7
wduan425-rcbmd-rhel-1               Ready    worker   125m   v1.17.1   10.0.1.7      <none>        Red Hat Enterprise Linux Server 7.8 (Maipo)                    3.10.0-1127.el7.x86_64        cri-o://1.17.4-8.dev.rhaos4.4.git5f5c5e4.el7
wduan425-rcbmd-worker-centralus-1   Ready    worker   16h    v1.17.1   10.0.1.5      <none>        Red Hat Enterprise Linux CoreOS 44.81.202004221930-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.17.4-2.dev.rhaos4.4.gitfe61deb.el8
wduan425-rcbmd-worker-centralus-2   Ready    worker   16h    v1.17.1   10.0.1.4      <none>        Red Hat Enterprise Linux CoreOS 44.81.202004221930-0 (Ootpa)   4.18.0-147.8.1.el8_1.x86_64   cri-o://1.17.4-2.dev.rhaos4.4.gitfe61deb.el8

2. Create a deployment that uses an azure-file volume, and make sure its pod is scheduled on a RHEL worker.

3. Check the pod status; it should be Running.

Actual results:
The pod stays Pending and the volume mount fails.

Expected results:
The pod should be Running with the volume mounted successfully.

Master Log:

Node Log (of failed PODs):
Kubelet log:
  Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: Mounting command: systemd-run
  Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/0c57c025-96b5-4139-a271-30f8a065f611/volumes/kubernetes.io~azure-file/pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786 --scope -- mount -t cifs -o gid=1500,mfsymlinks,uid=1500,username=f8810d9cbb651421da713f2,password=1FsNHo+17jC/TMrpYBNOQ++elqZgUIw7pJF8Wouw7Tb48CAch2UG1SAXb8I/CaGlsTAt4rNodPIK0xkZwxQVXA==,file_mode=0777,dir_mode=0777,vers=3.0 //f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786 /var/lib/kubelet/pods/0c57c025-96b5-4139-a271-30f8a065f611/volumes/kubernetes.io~azure-file/pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786
  Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: Output: Running scope as unit run-72149.scope.
  Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: mount: wrong fs type, bad option, bad superblock on //f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786,
  Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: missing codepage or helper program, or other error
  Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: (for several filesystems (e.g. nfs, cifs) you might
  Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: need a /sbin/mount.<type> helper program)
  Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: In some cases useful info is found in syslog - try
  Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: dmesg | tail or so.
  Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: E0426 04:26:31.835127    1641 nestedpendingoperations.go:270] Operation for "\"kubernetes.io/azure-file/0c57c025-96b5-4139-a271-30f8a065f611-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786\" (\"0c57c025-96b5-4139-a271-30f8a065f611\")" failed. No retries permitted until 2020-04-26 04:26:32.335048171 +0000 UTC m=+1642.921325586 (durationBeforeRetry 500ms). Error: "MountVolume.SetUp failed for volume \"pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786\" (UniqueName: \"kubernetes.io/azure-file/0c57c025-96b5-4139-a271-30f8a065f611-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786\") pod \"azurefile-deployment-02-5dfcc6fffd-t6vbx\" (UID: \"0c57c025-96b5-4139-a271-30f8a065f611\") : mount failed: exit status 32\nMounting command: systemd-run\nMounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/0c57c025-96b5-4139-a271-30f8a065f611/volumes/kubernetes.io~azure-file/pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786 --scope -- mount -t cifs -o gid=1500,mfsymlinks,uid=1500,username=f8810d9cbb651421da713f2,password=1FsNHo+17jC/TMrpYBNOQ++elqZgUIw7pJF8Wouw7Tb48CAch2UG1SAXb8I/CaGlsTAt4rNodPIK0xkZwxQVXA==,file_mode=0777,dir_mode=0777,vers=3.0 //f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786 /var/lib/kubelet/pods/0c57c025-96b5-4139-a271-30f8a065f611/volumes/kubernetes.io~azure-file/pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786\nOutput: Running scope as unit run-72149.scope.\nmount: wrong fs type, bad option, bad superblock on //f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786,\n       missing codepage or helper program, or other error\n       (for several filesystems (e.g. nfs, cifs) you might\n       need a /sbin/mount.<type> helper program)\n\n       In some cases useful info is found in syslog - try\n       dmesg | tail or so.\n"
  Apr 26 04:26:31 wduan425-rcbmd-rhel-1 hyperkube[1641]: I0426 04:26:31.835151    1641 event.go:281] Event(v1.ObjectReference{Kind:"Pod", Namespace:"wduan", Name:"azurefile-deployment-02-5dfcc6fffd-t6vbx", UID:"0c57c025-96b5-4139-a271-30f8a065f611", APIVersion:"v1", ResourceVersion:"276140", FieldPath:""}): type: 'Warning' reason: 'FailedMount' MountVolume.SetUp failed for volume "pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786" : mount failed: exit status 32

dmesg (not sure whether this is related):
[ 1655.712829] Key type cifs.spnego registered
[ 1655.715851] Key type cifs.idmap registered
[ 1655.719186] Unable to determine destination address.
[ 1656.321542] Unable to determine destination address.
[ 1657.433242] Unable to determine destination address.


PV Dump:
[wduan@MINT 01_general]$ oc get pv pvc-bf6ae94e-987a-43bc-88b1-7c3843112962 -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubernetes.io/createdby: azure-file-dynamic-provisioner
    pv.kubernetes.io/bound-by-controller: "yes"
    pv.kubernetes.io/provisioned-by: kubernetes.io/azure-file
  creationTimestamp: "2020-04-26T04:55:54Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-bf6ae94e-987a-43bc-88b1-7c3843112962
  resourceVersion: "284981"
  selfLink: /api/v1/persistentvolumes/pvc-bf6ae94e-987a-43bc-88b1-7c3843112962
  uid: 8ad5d160-e57a-49f0-a1cf-c701ad29a60d
spec:
  accessModes:
  - ReadWriteMany
  azureFile:
    secretName: azure-storage-account-f8810d9cbb651421da713f2-secret
    secretNamespace: wduan
    shareName: wduan425-rcbmd-dynamic-pvc-bf6ae94e-987a-43bc-88b1-7c3843112962
  capacity:
    storage: 1Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: azurefile-ocp-01
    namespace: wduan
    resourceVersion: "284966"
    uid: bf6ae94e-987a-43bc-88b1-7c3843112962
  mountOptions:
  - uid=1500
  - gid=1500
  - mfsymlinks
  persistentVolumeReclaimPolicy: Delete
  storageClassName: azurefile-ocp
  volumeMode: Filesystem
status:
  phase: Bound


PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):
[wduan@MINT 01_general]$ oc get sc azurefile-ocp -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2020-04-26T04:26:04Z"
  name: azurefile-ocp
  resourceVersion: "275981"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/azurefile-ocp
  uid: d30d3dbd-c5f3-4c81-9182-aa778c3eecd9
mountOptions:
- uid=1500
- gid=1500
- mfsymlinks
parameters:
  skuName: Standard_LRS
provisioner: kubernetes.io/azure-file
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

Additional info:
I tested the same mount command manually on both a RHEL node and a CoreOS node. It succeeds on CoreOS, but on RHEL it fails with the same error that the kubelet reports.
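
The mount error in the kubelet log hints at a missing /sbin/mount.<type> helper program. A minimal sketch of a check for that helper follows. The function name find_mount_helper and its default search directories are assumptions made for illustration; they are not part of OpenShift or util-linux.

```shell
#!/usr/bin/env bash
# Hypothetical helper: look for the mount.<fstype> program that mount(8)
# needs for filesystems such as cifs or nfs.
# Prints the first match and returns 0; returns 1 if no helper is found.
find_mount_helper() {
  local fstype=$1
  shift
  local dir
  # The default search path is an assumption; pass directories to override it.
  [ "$#" -gt 0 ] || set -- /sbin /usr/sbin
  for dir in "$@"; do
    if [ -x "$dir/mount.$fstype" ]; then
      printf '%s\n' "$dir/mount.$fstype"
      return 0
    fi
  done
  return 1
}
```

On a node, this could be called as: find_mount_helper cifs || echo "mount.cifs not found"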

Comment 1 Chao Yang 2020-04-26 07:05:20 UTC
After installing cifs-utils, the Azure file share can be mounted manually.
sh-4.2# mount -t cifs -o gid=1500,mfsymlinks,uid=1500,username=f8810d9cbb651421da713f2,password=1FsNHo+17jC/TMrpYBNOQ++elqZgUIw7pJF8Wouw7Tb48CAch2UG1SAXb8I/CaGlsTAt4rNodPIK0xkZwxQVXA==,file_mode=0777,dir_mode=0777,vers=3.0 //f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786 /mnt
sh-4.2# mount | grep mnt
/dev/sdb1 on /mnt type ext4 (rw,relatime,seclabel,data=ordered)
//f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786 on /mnt type cifs (rw,relatime,vers=3.0,cache=strict,username=f8810d9cbb651421da713f2,domain=X,uid=1500,forceuid,gid=1500,forcegid,addr=52.165.136.44,file_mode=0777,dir_mode=0777,soft,persistenthandles,nounix,serverino,mapposix,mfsymlinks,rsize=1048576,wsize=1048576,echo_interval=60,actimeo=1)

Comment 2 Chao Yang 2020-04-26 07:39:03 UTC
The Azure file share cannot be mounted because the cifs-utils package is not installed on the RHEL worker.
That is why we see the error message 'mount: wrong fs type, bad option, bad superblock on //f8810d9cbb651421da713f2.file.core.windows.net/wduan425-rcbmd-dynamic-pvc-3d3b140b-79fa-4f39-b322-4a278e1f8786,'

After installing the cifs-utils package, the issue is fixed.
So please install cifs-utils on RHEL workers during installation.

Comment 3 Scott Dodson 2020-04-26 13:35:45 UTC
Workaround: install cifs-utils on RHEL7 workers.
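
The workaround could be scripted roughly as below. This is a sketch only: the function name ensure_package and the DRY_RUN switch are hypothetical, and it assumes a root shell on the RHEL 7 worker with a yum repository that provides cifs-utils.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: install an RPM only if it is not already installed.
# Set DRY_RUN=1 to print the install command instead of running it.
ensure_package() {
  local pkg=$1
  if rpm -q "$pkg" >/dev/null 2>&1; then
    echo "$pkg is already installed"
  else
    # Assumes root privileges; yum is the package manager on RHEL 7.
    ${DRY_RUN:+echo} yum install -y "$pkg"
  fi
}
```

For example, run ensure_package cifs-utils on each RHEL 7 worker, or DRY_RUN=1 ensure_package cifs-utils to preview the command first.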

Comment 6 Wei Duan 2020-05-09 09:26:02 UTC
Verified: pass.

[wduan@MINT config]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-05-08-015855   True        False         93m     Cluster version is 4.5.0-0.nightly-2020-05-08-015855

sh-4.2# more /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.8 (Maipo)

sh-4.2# rpm -qa | grep -i cifs-utils 
cifs-utils-6.2-10.el7.x86_64
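
The same rpm check could be repeated across several workers from a client machine. The sketch below assumes a logged-in oc client that is allowed to start debug pods; the function name check_cifs_utils is hypothetical, and the node names are passed in by the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether cifs-utils is installed on each named node.
# Returns 1 if the package is missing on at least one node.
check_cifs_utils() {
  local node rc=0
  for node in "$@"; do
    # rpm -q prints "cifs-utils-<version>..." when the package is installed.
    if oc debug "node/$node" -- chroot /host rpm -q cifs-utils 2>/dev/null |
        grep -q '^cifs-utils-[0-9]'; then
      echo "$node: cifs-utils installed"
    else
      echo "$node: cifs-utils MISSING"
      rc=1
    fi
  done
  return "$rc"
}
```

With the RHEL workers from this report, that would be: check_cifs_utils wduan425-rcbmd-rhel-0 wduan425-rcbmd-rhel-1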

Comment 7 errata-xmlrpc 2020-07-13 17:31:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

