Bug 1963885 - [Clone] Mount cloned xfs-type volume failed when original and cloned pod are scheduled to the same node due to the duplicated filesystem UUID
Keywords:
Status: CLOSED DUPLICATE of bug 1965155
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: aos-storage-staff@redhat.com
QA Contact: Wei Duan
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-05-24 10:13 UTC by Wei Duan
Modified: 2021-06-03 14:14 UTC
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-03 14:14:11 UTC
Target Upstream Version:
Embargoed:


Attachments: none

Description Wei Duan 2021-05-24 10:13:51 UTC
Description of problem:
When an xfs-type volume is cloned and the original and cloned pods are scheduled to the same node, mounting the cloned volume fails.

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-05-21-233425

How reproducible:
Always, when the original and cloned pods land on the same node

Steps to Reproduce:
1. Create a StorageClass with fstype: xfs
$ oc get sc standard-csi-test -o yaml
...
parameters:
  csi.storage.k8s.io/fstype: xfs
provisioner: cinder.csi.openstack.org
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

2. Create the original PVC/pod. Once the original pod is running, create the cloned PVC/pod from it; the cloned pod never reaches Running.
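
The cloned PVC was created with a spec along these lines (a minimal sketch; the names, StorageClass, and size are taken from the outputs below, everything else is assumed):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc-clone
spec:
  storageClassName: standard-csi-test
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  dataSource:
    kind: PersistentVolumeClaim
    name: mypvc-ori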
$ oc get pvc
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
mypvc-clone   Bound    pvc-4eadf8a2-9401-42c4-a1e0-2d47ed7bd8d7   1Gi        RWO            standard-csi-test   3h34m
mypvc-ori     Bound    pvc-0c304986-56f3-408f-97dc-301d7b46be62   1Gi        RWO            standard-csi-test   3h36m

$ oc get pv pvc-4eadf8a2-9401-42c4-a1e0-2d47ed7bd8d7 -o json | jq .spec.csi.fsType
"xfs"

$ oc get pv pvc-0c304986-56f3-408f-97dc-301d7b46be62 -o json | jq .spec.csi.fsType
"xfs"

$ oc get pod -o wide
NAME          READY   STATUS              RESTARTS   AGE     IP             NODE                                 NOMINATED NODE   READINESS GATES
mypod-clone   0/1     ContainerCreating   0          3h36m   <none>         wduan-0524a-o-7hknw-worker-0-vr7qr   <none>           <none>
mypod-ori     1/1     Running             0          3h38m   10.129.2.157   wduan-0524a-o-7hknw-worker-0-vr7qr   <none>           <none>

$ oc describe pod mypod-clone
Events:
  Type     Reason       Age                      From     Message
  ----     ------       ----                     ----     -------
  Warning  FailedMount  7m55s (x109 over 3h33m)  kubelet  MountVolume.MountDevice failed for volume "pvc-4eadf8a2-9401-42c4-a1e0-2d47ed7bd8d7" : rpc error: code = Internal desc = mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t xfs -o defaults /dev/disk/by-id/virtio-3628e70b-86b1-495f-a /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4eadf8a2-9401-42c4-a1e0-2d47ed7bd8d7/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4eadf8a2-9401-42c4-a1e0-2d47ed7bd8d7/globalmount: wrong fs type, bad option, bad superblock on /dev/vde, missing codepage or helper program, or other error.
  Warning  FailedMount  3m2s (x27 over 3h15m)       kubelet  Unable to attach or mount volumes: unmounted volumes=[local], unattached volumes=[default-token-j6lq4 local]: timed out waiting for the condition
  Warning  FailedMount  <invalid> (x68 over 3h31m)  kubelet  Unable to attach or mount volumes: unmounted volumes=[local], unattached volumes=[local default-token-j6lq4]: timed out waiting for the condition

3. Check on the node:
sh-4.4# lsblk 
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
...
vdb    252:16   0    1G  0 disk /var/lib/kubelet/pods/cafbb0d0-1c39-47ac-82b6-efaaf7746525/volumes/kubernetes.io~csi/pvc-42bfa345-9c34-44ea-861b-80c1fbef2452/mount
vdc    252:32   0    1G  0 disk /var/lib/kubelet/pods/1aa0be7f-decf-49bc-8700-be22b473415b/volumes/kubernetes.io~cinder/pvc-13777038-a9d3-489e-b4f9-a84f3d70a79d
vdd    252:48   0    1G  0 disk /var/lib/kubelet/pods/583ff008-270c-4dde-8290-108dcd64feb9/volumes/kubernetes.io~csi/pvc-0c304986-56f3-408f-97dc-301d7b46be62/mount
vde    252:64   0    1G  0 disk 

sh-4.4# dmesg
[44435.408560] XFS (vde): Filesystem has duplicate UUID fadf19ab-bbcc-4f40-8d4f-44550e822db1 - can't mount
[44557.612032] XFS (vde): Filesystem has duplicate UUID fadf19ab-bbcc-4f40-8d4f-44550e822db1 - can't mount
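
A possible manual workaround on the node (a hedged sketch; per the lsblk output above, /dev/vde is the unmounted clone and /dev/vdd holds the original volume) is to regenerate the clone's filesystem UUID, or to mount it with the nouuid option:

sh-4.4# blkid /dev/vdd /dev/vde                # both report the same filesystem UUID
sh-4.4# xfs_admin -U generate /dev/vde         # write a new random UUID to the clone
sh-4.4# mount -t xfs -o nouuid /dev/vde /mnt   # or: skip XFS's duplicate-UUID check at mount time

Note this is manual; the CSI driver or kubelet would need to do something equivalent for cloned volumes to mount automatically.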

Actual results:
mypod-clone is not in Running status.

Expected results:
mypod-clone is in Running status.

Master Log:

Node Log (of failed PODs):

PV Dump:
  
PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 2 Jan Safranek 2021-05-26 10:24:56 UTC
I filed an issue against the CSI spec about who should handle XFS UUIDs: https://github.com/container-storage-interface/spec/issues/482

Comment 3 Mike Fedosin 2021-06-02 15:29:58 UTC
Assigning this bug to Jan, because it looks like a common issue across all CSI drivers and filesystems, and it should be fixed outside of the Cinder CSI driver.

Comment 4 Jan Safranek 2021-06-03 14:14:11 UTC
It's the same bug as bug #1965155, which was originally about snapshots. I updated it to cover both clones and restored snapshots.

*** This bug has been marked as a duplicate of bug 1965155 ***

