+++ This bug was initially created as a clone of Bug #1965155 +++

Description of problem:
Mounting an XFS volume restored from a snapshot fails when the pod is scheduled to the same node as the source volume.

Upstream issue here: https://github.com/container-storage-interface/spec/issues/482

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-05-25-223219

How reproducible:
Always

Steps to Reproduce:
1. Create a PVC/pod using the storageclass below:

oc get sc foo -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
  creationTimestamp: "2021-05-27T01:55:45Z"
  name: foo
  resourceVersion: "536907"
  uid: 8d77991b-6b96-4995-b044-8b232168712e
parameters:
  fsType: xfs
provisioner: ebs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

2. Create a volumesnapshot:

oc get volumesnapshot
NAME                  READYTOUSE   SOURCEPVC   SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS   SNAPSHOTCONTENT                                    CREATIONTIME   AGE
new-snapshot-test-1   true         pvc1                                1Gi           gcp-snap-2      snapcontent-55a1ee74-6791-4506-aae7-50a093b8cb20   45m            45m

3. Create a restored PVC/pod. Make sure the pod is scheduled to the same node.

Events:
  Type     Reason                  Age              From                     Message
  ----     ------                  ----             ----                     -------
  Normal   Scheduled               13s              default-scheduler        Successfully assigned test/pod2 to ip-10-0-169-208.us-east-2.compute.internal
  Normal   SuccessfulAttachVolume  11s              attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-b9076058-77c4-4496-a9be-22b7bfcddae7"
  Warning  FailedMount             2s (x4 over 5s)  kubelet                  MountVolume.MountDevice failed for volume "pvc-b9076058-77c4-4496-a9be-22b7bfcddae7" : rpc error: code = Internal desc = could not format "/dev/nvme3n1" and mount it at "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-b9076058-77c4-4496-a9be-22b7bfcddae7/globalmount": mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t xfs -o defaults /dev/nvme3n1 /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-b9076058-77c4-4496-a9be-22b7bfcddae7/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-b9076058-77c4-4496-a9be-22b7bfcddae7/globalmount: wrong fs type, bad option, bad superblock on /dev/nvme3n1, missing codepage or helper program, or other error.

Actual results:
The restored pod fails to start because the XFS volume cannot be mounted.

Expected results:
The restored pod should be running.

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

--- Additional comment from Jan Safranek on 2021-06-03 14:13:15 UTC ---

This also affects cloned volumes. Since the root cause (and the fix) are the same, let's track them in a single bug.

--- Additional comment from Jan Safranek on 2021-06-03 14:14:11 UTC ---
The AWS EBS CSI driver has been fixed in upstream release 1.2.0: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/pull/913
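For context, the root cause tracked in the upstream issue above is XFS's duplicate-UUID check: a volume restored from a snapshot (or cloned) carries the same filesystem UUID as its source, and the kernel refuses to mount it while the source is mounted on the same node. A minimal illustration of the failure and of the nouuid workaround that the upstream fix applies (device names, mount points, and the UUID value here are hypothetical, not taken from this bug):

# Source and restored volume report an identical filesystem UUID
# after a snapshot restore:
$ blkid /dev/nvme2n1 /dev/nvme3n1
/dev/nvme2n1: UUID="<same-uuid>" TYPE="xfs"
/dev/nvme3n1: UUID="<same-uuid>" TYPE="xfs"

# While the source is mounted on the node, a plain mount of the
# restored volume fails (dmesg should show "Filesystem has duplicate UUID"):
$ mount -t xfs -o defaults /dev/nvme3n1 /mnt/restored
mount: /mnt/restored: wrong fs type, bad option, bad superblock on /dev/nvme3n1, ...

# Mounting with nouuid skips the UUID check and succeeds; this is the
# option visible in the verification output below:
$ mount -t xfs -o defaults,nouuid /dev/nvme3n1 /mnt/restored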
Verified pass on 4.9.0-0.nightly-2021-08-19-184748

$ oc get pod -o wide
NAME          READY   STATUS    RESTARTS   AGE     IP            NODE                                         NOMINATED NODE   READINESS GATES
mypod-ori     1/1     Running   0          20m     10.129.2.22   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
mypod-res     1/1     Running   0          6m36s   10.129.2.29   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
mypod-res-1   1/1     Running   0          4m45s   10.129.2.33   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>
mypod-res-2   1/1     Running   0          4m44s   10.129.2.34   ip-10-0-217-234.us-east-2.compute.internal   <none>           <none>

$ oc rsh mypod-res-2
sh-4.4# /mnt/local/hello
Hello OpenShift Storage
sh-4.4# mount | grep local
/dev/nvme4n1 on /mnt/local type xfs (rw,relatime,seclabel,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota)
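For anyone re-running this verification, the restored PVC is created with a dataSource pointing at the VolumeSnapshot. A minimal sketch follows; the storage class name, snapshot name, and size are taken from the reproducer above, while the PVC name is illustrative:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-restore
spec:
  storageClassName: foo
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: new-snapshot-test-1

Scheduling the consuming pod onto the same node as the source pod (e.g. via spec.nodeName or node affinity) is what triggers the original failure on unfixed drivers.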
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759