Description of problem:
GCP CSI driver provisioned a volume with rox; when checked from the worker, the mount options are rw,relatime,seclabel.

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-06-03-221810

How reproducible:
Always

Steps to Reproduce:
1. oc describe pvc/pvc3
Name:          pvc3
Namespace:     openshift-cluster-csi-drivers
StorageClass:  standard-csi
Status:        Bound
Volume:        pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
               volume.kubernetes.io/selected-node: chaoyang64-flgbm-worker-a-xn27m.c.openshift-qe.internal
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      2Gi
Access Modes:  ROX
VolumeMode:    Filesystem
Used By:       pod3
Events:
  Type    Reason                 Age                From                         Message
  ----    ------                 ----               ----                         -------
  Normal  WaitForFirstConsumer   19m (x3 over 20m)  persistentvolume-controller  waiting for first consumer to be created before binding
  Normal  ExternalProvisioning   19m (x2 over 19m)  persistentvolume-controller  waiting for a volume to be created, either by external provisioner "pd.csi.storage.gke.io" or manually created by system administrator
  Normal  Provisioning           19m                pd.csi.storage.gke.io_chaoyang64-flgbm-master-0.c.openshift-qe.internal_80871dfe-86ba-4881-9584-63c36274a831  External provisioner is provisioning volume for claim "openshift-cluster-csi-drivers/pvc3"
  Normal  ProvisioningSucceeded  19m                pd.csi.storage.gke.io_chaoyang64-flgbm-master-0.c.openshift-qe.internal_80871dfe-86ba-4881-9584-63c36274a831  Successfully provisioned volume pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d

2. oc describe pv/pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d
Name:            pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: pd.csi.storage.gke.io
Finalizers:      [kubernetes.io/pv-protection external-attacher/pd-csi-storage-gke-io]
StorageClass:    standard-csi
Status:          Bound
Claim:           openshift-cluster-csi-drivers/pvc3
Reclaim Policy:  Delete
Access Modes:    ROX
VolumeMode:      Filesystem
Capacity:        2Gi
Node Affinity:
  Required Terms:
    Term 0:  topology.gke.io/zone in [us-central1-a]
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            pd.csi.storage.gke.io
    FSType:            ext4
    VolumeHandle:      projects/openshift-qe/zones/us-central1-a/disks/pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d
    ReadOnly:          false
    VolumeAttributes:  storage.kubernetes.io/csiProvisionerIdentity=1622779448600-8081-pd.csi.storage.gke.io
Events:                <none>

3. mount | grep pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d
/dev/sdf on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d/globalmount type ext4 (rw,relatime,seclabel)
/dev/sdf on /var/lib/kubelet/pods/c50dc036-5634-4548-bebe-3e9f89598d26/volumes/kubernetes.io~csi/pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d/mount type ext4 (rw,relatime,seclabel)

Actual results:
GCP CSI driver provisioned a volume with rox, and it is mounted rw.

Expected results:
GCP CSI driver should not provision a volume with rox.

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:
The user asked for an empty ReadOnlyMany volume and got it :-). It's not very useful, but the user may e.g. restore a snapshot there. @Chao, can you check whether the volume is really read-only? The rw mount option is odd, but the disk can still be attached as read-only. If it's writable, we need to fix it.
Hi @jsafrane,
We can write data to this volume:

oc get pvc
NAME      STATUS  VOLUME                                    CAPACITY  ACCESS MODES  STORAGECLASS  AGE
myclaim   Bound   pvc-d39ff6af-d4b8-4ad1-a63c-ba307ae2ec5b  2Gi       ROX           standard-csi  9m17s

oc exec pod4 -ti -- bash
[root@pod4 /]# ls /tmp1
lost+found  test

ls -lrt /var/lib/kubelet/pods/14e28cf2-8e11-4ea7-8595-4a3abb21c7e1/volumes/kubernetes.io~csi/vc-d39ff6af-d4b8-4ad1-a63c-ba307ae2ec5b/mount
total 16
drwx------. 2 root root 16384 Jun  9 08:16 lost+found
-rw-r--r--. 1 root root     0 Jun  9 08:16 test
Something in the cluster (kubelet? GCP CSI driver?) "forgets" to mount the ReadOnlyMany volume as read-only.

Mustafa, please reproduce the issue and check the logs of the CSI driver - how were NodeStage/NodePublish called? Their VolumeCapability.AccessMode should be MULTI_NODE_READER_ONLY, and the driver should then, in theory, mount the volume read-only.
https://github.com/container-storage-interface/spec/blob/486e6bdb2d5d814befb1d11744c39a33842af15f/csi.proto#L427

In addition, when you dynamically provision an empty ReadOnlyMany volume, the CSI driver should not even format the volume with ext4; it should be really read-only and mounting it should fail. It should succeed when you restore a snapshot of an already formatted volume as a new PVC.
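To inspect how NodeStage/NodePublish were called, the CSI node plugin logs can be grepped on the worker that mounted the volume. A rough sketch follows; the daemonset label selector and the container name `csi-driver` are assumptions about the usual GCP PD driver layout in openshift-cluster-csi-drivers and may differ on a given cluster:

```shell
# Assumptions: node plugin pods live in openshift-cluster-csi-drivers and
# the gRPC server container is named csi-driver; adjust to your cluster.
NS=openshift-cluster-csi-drivers
NODE=chaoyang64-flgbm-worker-a-xn27m.c.openshift-qe.internal

# Find the node-plugin pod scheduled on the worker that mounted the volume.
POD=$(oc -n "$NS" get pods -o wide --field-selector spec.nodeName="$NODE" \
      | awk '/node/ {print $1; exit}')

# Look for NodeStageVolume / NodePublishVolume requests and check whether
# the access_mode in the logged VolumeCapability is MULTI_NODE_READER_ONLY.
oc -n "$NS" logs "$POD" -c csi-driver | grep -E 'NodeStage|NodePublish'
```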
There is an upstream issue and a PR about this:
Issue: https://github.com/kubernetes/kubernetes/issues/70505
PR: https://github.com/kubernetes-csi/external-provisioner/pull/469
This should have been fixed by the rebase of the external CSI provisioner in OCP to version 3.0.0; moving manually to MODIFIED.
Failed on:

oc get clusterversion
NAME      VERSION                             AVAILABLE  PROGRESSING  SINCE  STATUS
version   4.10.0-0.nightly-2021-09-23-210724  True       False        3h5m   Cluster version is 4.10.0-0.nightly-2021-09-23-210724

oc describe pv
Name:            pvc-97960186-529f-44a7-b887-ee13703f4395
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: pd.csi.storage.gke.io
Finalizers:      [kubernetes.io/pv-protection external-attacher/pd-csi-storage-gke-io]
StorageClass:    standard-csi
Status:          Bound
Claim:           default/myclaim1
Reclaim Policy:  Delete
Access Modes:    ROX
VolumeMode:      Filesystem
Capacity:        2Gi
Node Affinity:
  Required Terms:
    Term 0:  topology.gke.io/zone in [us-central1-c]
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            pd.csi.storage.gke.io
    FSType:            ext4
    VolumeHandle:      projects/openshift-qe/zones/us-central1-c/disks/pvc-97960186-529f-44a7-b887-ee13703f4395
    ReadOnly:          false
    VolumeAttributes:  storage.kubernetes.io/csiProvisionerIdentity=1632713817264-8081-pd.csi.storage.gke.io
Events:                <none>

/dev/sdb on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-97960186-529f-44a7-b887-ee13703f4395/globalmount type ext4 (rw,relatime,seclabel)
/dev/sdb on /var/lib/kubelet/pods/e447caa8-bd4b-48ed-9e04-cfc022e0568d/volumes/kubernetes.io~csi/pvc-97960186-529f-44a7-b887-ee13703f4395/mount type ext4 (rw,relatime,seclabel)
---
It is correct when trying to provision and mount an empty ro volume - formatting fails as expected:

Warning  FailedMount  16s (x7 over 51s)  kubelet  MountVolume.MountDevice failed for volume "pvc-e4269d3a-3a2d-4920-8917-a30c0a0773e7" : rpc error: code = Internal desc = Failed to format and mount device from ("/dev/disk/by-id/google-pvc-e4269d3a-3a2d-4920-8917-a30c0a0773e7") to ("/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e4269d3a-3a2d-4920-8917-a30c0a0773e7/globalmount") with fstype ("ext4") and options ([]): format of disk "/dev/disk/by-id/google-pvc-e4269d3a-3a2d-4920-8917-a30c0a0773e7" failed: type:("ext4") target:("/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e4269d3a-3a2d-4920-8917-a30c0a0773e7/globalmount") options:("defaults") errcode:(exit status 1) output:(mke2fs 1.45.6 (20-Mar-2020) /dev/disk/by-id/google-pvc-e4269d3a-3a2d-4920-8917-a30c0a0773e7: Read-only file system while setting up superblock )

---
1. Create rwo pvc/pod
2. Create snapshotclass
3. Create volumesnapshot

oc get volumesnapshot
NAME                 READYTOUSE  SOURCEPVC  SOURCESNAPSHOTCONTENT  RESTORESIZE  SNAPSHOTCLASS  SNAPSHOTCONTENT                                   CREATIONTIME  AGE
new-snapshot-test-1  true        myclaim1                          1Gi          gcp-snap-2     snapcontent-320c7f7e-d651-47e7-a448-faaafa88b60b  3h10m         3h10m

4. Create restore pvc with rox

oc get pvc/pvc1-restore -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
    volume.kubernetes.io/selected-node: qe-chao-bug-4hclb-worker-c-gzlvv.c.openshift-qe.internal
  creationTimestamp: "2021-11-16T06:42:24Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: pvc1-restore
  namespace: test1
  resourceVersion: "86025"
  uid: d06b7fe6-87ef-4d5f-8866-841416c66c3e
spec:
  accessModes:
  - ReadOnlyMany
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: new-snapshot-test-1
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard-csi
  volumeMode: Filesystem
  volumeName: pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e
status:
  accessModes:
  - ReadOnlyMany
  capacity:
    storage: 1Gi
  phase: Bound

5. oc get pods
NAME         READY  STATUS                RESTARTS  AGE
pod-restore  0/1    CreateContainerError  0         177m
pod1         1/1    Running               0         3h10m

oc describe pods/pod-restore
Warning  FailedMount  3m18s               kubelet  Unable to attach or mount volumes: unmounted volumes=[aws1], unattached volumes=[aws1 kube-api-access-rj8fz]: timed out waiting for the condition
Warning  FailedMount  64s (x2 over 5m34s)  kubelet  Unable to attach or mount volumes: unmounted volumes=[aws1], unattached volumes=[kube-api-access-rj8fz aws1]: timed out waiting for the condition
Warning  FailedMount  63s (x11 over 7m21s)  kubelet  MountVolume.MountDevice failed for volume "pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e" : rpc error: code = Internal desc = Failed to format and mount device from ("/dev/disk/by-id/google-pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e") to ("/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e/globalmount") with fstype ("ext4") and options ([]): mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t ext4 -o defaults /dev/disk/by-id/google-pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e/globalmount: cannot mount /dev/sdd read-only.

6. Tried on the node with `noload`; it can be mounted there:
mount -o ro,noload /dev/disk/by-id/google-pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e /mnt/test/
ls /mnt/test/
lost+found  test

oc get clusterversion
NAME      VERSION                             AVAILABLE  PROGRESSING  SINCE  STATUS
version   4.10.0-0.nightly-2021-11-12-161948  True       False        6h29m  Cluster version is 4.10.0-0.nightly-2021-11-12-161948
I will try to reproduce it: do you have a spec for the pod-restore? I think it also needs to request a read-only mount for this to work.
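For reference, a minimal sketch of a pod spec that explicitly requests a read-only mount; the pod and volume names are hypothetical, and the key parts are the `readOnly: true` flags on the volumeMount and on the persistentVolumeClaim volume source:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-restore
spec:
  containers:
  - name: busybox
    image: gcr.io/google_containers/busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: restored-vol
      mountPath: /tmp1
      readOnly: true        # read-only inside the container
  volumes:
  - name: restored-vol
    persistentVolumeClaim:
      claimName: pvc1-restore
      readOnly: true        # ask kubelet/CSI for a read-only attach/mount
```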
I'm not sure the CSI-to-Kubernetes volume mode mapping is complete and correct: I also filed https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/issues/872 upstream and will keep tinkering...
This requires a rebase to upstream driver v1.4.0; we have 1.3.4 in 4.10.
It is ok now that an empty rox volume could not be provisioned:

Warning  ProvisioningFailed  2s (x5 over 17s)  pd.csi.storage.gke.io_qe-chaoyang66-gvpz2-master-0.c.openshift-qe.internal_9e3b8511-f5ed-4d40-b7d3-4cc18a0140ab  failed to provision volume with StorageClass "standard-csi": rpc error: code = InvalidArgument desc = VolumeContentSource must be provided when AccessMode is set to read only
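Per this error, a ReadOnlyMany claim is only accepted when it carries a content source. A minimal sketch of a valid request, restoring from an existing VolumeSnapshot (the claim and snapshot names are illustrative, taken from the reproduction steps in this thread):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-restore-rox
spec:
  accessModes:
  - ReadOnlyMany            # ROX is only allowed together with a dataSource
  storageClassName: standard-csi
  dataSource:               # the VolumeContentSource the driver requires
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: new-snapshot-test-1
  resources:
    requests:
      storage: 2Gi
```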
1. Create pvc/pod
2. Write some data into the mounted volume

oc exec pod1 -- ls -lrt /tmp1
total 4
-r--r--r--. 1 root root 13 Jun  8 11:10 test
oc exec pod1 -- ls -lrt / | grep tmp1
dr--r--r--. 2 root root 4096 Jun  8 11:11 tmp1

3. Create volumesnapshot
4. Create restored pvc

oc get pvc pvc2-restore -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
    volume.kubernetes.io/selected-node: evakhoni-85461-2r9t4-worker-a-jmqtm.c.openshift-qe.internal
    volume.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
  creationTimestamp: "2022-06-08T11:15:22Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: pvc2-restore
  namespace: default
  resourceVersion: "101081"
  uid: 02762e9f-58a2-41c2-925c-478b933884a7
spec:
  accessModes:
  - ReadOnlyMany
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: new-snapshot-test-1
  resources:
    requests:
      storage: 2Gi
  storageClassName: standard-csi
  volumeMode: Filesystem
  volumeName: pvc-02762e9f-58a2-41c2-925c-478b933884a7
status:
  accessModes:
  - ReadOnlyMany
  capacity:
    storage: 2Gi
  phase: Bound

5. Create pod, but the container is in error:
pod2  0/1  CreateContainerError  0  7m11s

oc describe pod2
Warning  Failed  5m13s (x12 over 7m13s)  kubelet  Error: relabel failed /var/lib/kubelet/pods/a101adb3-4505-4375-835d-17b178ef7a01/volumes/kubernetes.io~csi/pvc-02762e9f-58a2-41c2-925c-478b933884a7/mount: lsetxattr /var/lib/kubelet/pods/a101adb3-4505-4375-835d-17b178ef7a01/volumes/kubernetes.io~csi/pvc-02762e9f-58a2-41c2-925c-478b933884a7/mount: read-only file system

@tsmetana can you help to check it?
Hello. This is what I got on 4.11.0-0.ci-2022-06-06-185917:

Restored PVC:
$ oc get pvc -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    annotations:
      pv.kubernetes.io/bind-completed: "yes"
      pv.kubernetes.io/bound-by-controller: "yes"
      volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
      volume.kubernetes.io/selected-node: ci-ln-xkgf6w2-72292-j2hkc-worker-a-z94nq
      volume.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
    creationTimestamp: "2022-06-08T13:44:20Z"
    finalizers:
    - kubernetes.io/pvc-protection
    name: pvc1-restore
    namespace: default
    resourceVersion: "33700"
    uid: 915ba939-95b7-4f0c-970e-a4487068113a
  spec:
    accessModes:
    - ReadOnlyMany
    dataSource:
      apiGroup: snapshot.storage.k8s.io
      kind: VolumeSnapshot
      name: mysnap-1
    dataSourceRef:
      apiGroup: snapshot.storage.k8s.io
      kind: VolumeSnapshot
      name: mysnap-1
    resources:
      requests:
        storage: 1Gi
    storageClassName: standard-csi
    volumeMode: Filesystem
    volumeName: pvc-915ba939-95b7-4f0c-970e-a4487068113a
  status:
    accessModes:
    - ReadOnlyMany
    capacity:
      storage: 1Gi
    phase: Bound
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

The events from the pod using the PVC:
$ oc describe pod pod-restore
...
Events:
  Type     Reason                  Age  From                     Message
  ----     ------                  ---- ----                     -------
  Normal   Scheduled               16s  default-scheduler        Successfully assigned default/pod-restore to ci-ln-xkgf6w2-72292-j2hkc-worker-a-z94nq by ci-ln-xkgf6w2-72292-j2hkc-master-0
  Normal   SuccessfulAttachVolume  7s   attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-915ba939-95b7-4f0c-970e-a4487068113a"
  Warning  FileSystemResizeFailed  6s   kubelet                  MountVolume.NodeExpandVolume failed for volume "pvc-915ba939-95b7-4f0c-970e-a4487068113a" requested read-only file system
  Normal   AddedInterface          4s   multus                   Add eth0 [10.131.0.20/23] from openshift-sdn
  Normal   Pulling                 4s   kubelet                  Pulling image "gcr.io/google_containers/busybox"
  Normal   Pulled                  3s   kubelet                  Successfully pulled image "gcr.io/google_containers/busybox" in 225.537279ms
  Normal   Created                 3s   kubelet                  Created container busybox
  Normal   Started                 3s   kubelet                  Started container busybox

The pod started just fine, it seems. It's true that I can't do anything with the volume mounted in the pod ("Permission denied"), possibly because the relabeling did not happen, so even though the original bug looks to be fixed, the ROX feature is still somewhat useless in the general case.

Your PVC is missing the dataSourceRef in spec, which looks suspicious. Was the VolumeSnapshot ReadyToUse when you tried to create the volume from it and use it in the pod?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069