Bug 1968253 - GCP CSI driver can provision volume with access mode ROX
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.11.0
Assignee: Tomas Smetana
QA Contact: Chao Yang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-06-07 03:19 UTC by Chao Yang
Modified: 2022-08-10 10:36 UTC (History)

Fixed In Version:
Doc Type: Release Note
Doc Text:
The provisioner sidecar now has an argument, `controller-publish-readonly`, which sets the CSI PV spec `readOnly` field based on the PVC access mode. If this flag is set to `true` and the PVC access modes contain only `ROX`, the controller automatically sets the `PersistentVolume.spec.CSIPersistentVolumeSource.readOnly` field to `true`.
Clone Of:
Environment:
Last Closed: 2022-08-10 10:36:25 UTC
Target Upstream Version:
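
The rule in the Doc Text above can be sketched as a small decision function. This is only an illustration of the wording in the release note, not the external-provisioner code; the function name and the plain-string access modes are assumptions made for the example.

```python
from typing import Sequence

READ_ONLY_MANY = "ReadOnlyMany"


def csi_pv_read_only(access_modes: Sequence[str],
                     controller_publish_readonly: bool) -> bool:
    """Return the readOnly value a PV would get under the Doc Text rule.

    Hypothetical helper: it mirrors the release-note wording only.
    """
    if not controller_publish_readonly:
        # Flag not set: the field is left at its default (false).
        return False
    # "Only contains ROX": at least one mode, and every mode is ReadOnlyMany.
    return len(access_modes) > 0 and all(
        mode == READ_ONLY_MANY for mode in access_modes
    )
```

With the flag enabled, a PVC requesting both `ReadOnlyMany` and `ReadWriteOnce` would still yield `False` under this reading, because ROX is not the only mode.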




Links
System ID Private Priority Status Summary Last Updated
Github openshift gcp-pd-csi-driver-operator pull 36 0 None open Bug 1968253: Start provisioner with controller-publish-readonly option 2021-11-04 11:43:26 UTC
Red Hat Product Errata RHSA-2022:5069 0 None None None 2022-08-10 10:36:35 UTC

Description Chao Yang 2021-06-07 03:19:09 UTC
Description of problem:
The GCP CSI driver provisioned a volume with access mode ROX. When checked from the worker node, the volume is mounted with the options rw,relatime,seclabel, that is, read-write.

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-06-03-221810

How reproducible:
Always

Steps to Reproduce:
1.oc describe pvc/pvc3
Name:          pvc3
Namespace:     openshift-cluster-csi-drivers
StorageClass:  standard-csi
Status:        Bound
Volume:        pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
               volume.kubernetes.io/selected-node: chaoyang64-flgbm-worker-a-xn27m.c.openshift-qe.internal
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      2Gi
Access Modes:  ROX
VolumeMode:    Filesystem
Used By:       pod3
Events:
  Type    Reason                 Age                From                                                                                                          Message
  ----    ------                 ----               ----                                                                                                          -------
  Normal  WaitForFirstConsumer   19m (x3 over 20m)  persistentvolume-controller                                                                                   waiting for first consumer to be created before binding
  Normal  ExternalProvisioning   19m (x2 over 19m)  persistentvolume-controller                                                                                   waiting for a volume to be created, either by external provisioner "pd.csi.storage.gke.io" or manually created by system administrator
  Normal  Provisioning           19m                pd.csi.storage.gke.io_chaoyang64-flgbm-master-0.c.openshift-qe.internal_80871dfe-86ba-4881-9584-63c36274a831  External provisioner is provisioning volume for claim "openshift-cluster-csi-drivers/pvc3"
  Normal  ProvisioningSucceeded  19m                pd.csi.storage.gke.io_chaoyang64-flgbm-master-0.c.openshift-qe.internal_80871dfe-86ba-4881-9584-63c36274a831  Successfully provisioned volume pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d

2.oc describe pv/pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d
Name:              pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d
Labels:            <none>
Annotations:       pv.kubernetes.io/provisioned-by: pd.csi.storage.gke.io
Finalizers:        [kubernetes.io/pv-protection external-attacher/pd-csi-storage-gke-io]
StorageClass:      standard-csi
Status:            Bound
Claim:             openshift-cluster-csi-drivers/pvc3
Reclaim Policy:    Delete
Access Modes:      ROX
VolumeMode:        Filesystem
Capacity:          2Gi
Node Affinity:     
  Required Terms:  
    Term 0:        topology.gke.io/zone in [us-central1-a]
Message:           
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            pd.csi.storage.gke.io
    FSType:            ext4
    VolumeHandle:      projects/openshift-qe/zones/us-central1-a/disks/pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d
    ReadOnly:          false
    VolumeAttributes:      storage.kubernetes.io/csiProvisionerIdentity=1622779448600-8081-pd.csi.storage.gke.io
Events:                <none>


3.mount | grep pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d
/dev/sdf on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d/globalmount type ext4 (rw,relatime,seclabel)
/dev/sdf on /var/lib/kubelet/pods/c50dc036-5634-4548-bebe-3e9f89598d26/volumes/kubernetes.io~csi/pvc-e75afa13-25d0-4bc1-9fe1-93260cc7c20d/mount type ext4 (rw,relatime,seclabel)
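
The check in step 3 was done by eye. A minimal sketch of the same check in Python, assuming the line layout that `mount` prints above (`<device> on <target> type <fstype> (<options>)`); the function names are made up for this example.

```python
import re
from typing import List

# Layout assumed from the `mount` output shown in step 3.
MOUNT_LINE = re.compile(
    r"^(?P<device>\S+) on (?P<target>\S+) type (?P<fstype>\S+) \((?P<opts>[^)]*)\)$"
)


def mount_options(line: str) -> List[str]:
    """Extract the comma-separated option list from one line of `mount` output."""
    match = MOUNT_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized mount line: %r" % line)
    return match.group("opts").split(",")


def is_read_only_mount(line: str) -> bool:
    """True if the option list contains the exact option `ro`."""
    return "ro" in mount_options(line)
```

Applied to either line from step 3, `is_read_only_mount` returns `False`, which matches the observation that the ROX volume is mounted read-write.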

Actual results:
The GCP CSI driver provisioned a volume with access mode ROX.

Expected results:
The GCP CSI driver should not provision a volume with access mode ROX.

Master Log:

Node Log (of failed PODs):

PV Dump:

PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 1 Jan Safranek 2021-06-08 14:23:01 UTC
User asked for empty ReadOnlyMany volume and user got it :-). It's not very useful, but user may e.g. restore a snapshot there.

@Chao, can you check the volume is really read-only? rw mount option is odd, but it can be still attached as read only. If it's writable we need to fix it.

Comment 2 Chao Yang 2021-06-09 08:29:50 UTC
Hi @jsafrane,
We can write data to this volume 
oc get pvc
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
myclaim   Bound    pvc-d39ff6af-d4b8-4ad1-a63c-ba307ae2ec5b   2Gi        ROX            standard-csi   9m17s

oc exec pod4 -ti -- bash
[root@pod4 /]# ls /tmp1
lost+found  test

ls -lrt /var/lib/kubelet/pods/14e28cf2-8e11-4ea7-8595-4a3abb21c7e1/volumes/kubernetes.io~csi/pvc-d39ff6af-d4b8-4ad1-a63c-ba307ae2ec5b/mount
total 16
drwx------. 2 root root 16384 Jun  9 08:16 lost+found
-rw-r--r--. 1 root root     0 Jun  9 08:16 test

Comment 3 Jan Safranek 2021-06-15 11:25:53 UTC
Something in the cluster (kubelet? GCP CSI driver?) "forgets" to mount ReadOnlyMany volume as read only.
Mustafa, reproduce the issue, and check logs of the CSI driver - how was NodeStage/NodePublish called? Their VolumeCapability.AccessMode should be MULTI_NODE_READER_ONLY and then the driver should mount the volume as read-only, in theory.
https://github.com/container-storage-interface/spec/blob/486e6bdb2d5d814befb1d11744c39a33842af15f/csi.proto#L427

In addition, if you dynamically provision an empty ReadOnlyMany volume, the CSI driver should not even format the volume with ext4, it should be really read only and fail mounting it. It should succeed when you restore a snapshot of already formatted volume as a new PVC.
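
The expectation in this comment can be sketched as follows: a ReadOnlyMany volume reaches the driver as MULTI_NODE_READER_ONLY, and a reader-only mode should result in a read-only mount. The mapping table is a simplified assumption for illustration (real mappings can depend on the full set of requested modes and on the Kubernetes version), and `node_mount_flags` is a hypothetical helper, not a function of the GCP PD driver.

```python
from typing import Iterable, List

# Simplified, assumed mapping from Kubernetes access modes to CSI
# VolumeCapability.AccessMode names.
K8S_TO_CSI_ACCESS_MODE = {
    "ReadWriteOnce": "SINGLE_NODE_WRITER",
    "ReadOnlyMany": "MULTI_NODE_READER_ONLY",
    "ReadWriteMany": "MULTI_NODE_MULTI_WRITER",
}

# CSI access modes that only permit reading.
READ_ONLY_CSI_MODES = frozenset(
    {"SINGLE_NODE_READER_ONLY", "MULTI_NODE_READER_ONLY"}
)


def node_mount_flags(k8s_access_mode: str,
                     extra_flags: Iterable[str] = ()) -> List[str]:
    """Return mount flags, adding `ro` when the CSI mode is reader-only."""
    csi_mode = K8S_TO_CSI_ACCESS_MODE[k8s_access_mode]
    flags = list(extra_flags)
    if csi_mode in READ_ONLY_CSI_MODES and "ro" not in flags:
        flags.append("ro")
    return flags
```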

Comment 4 melbeher 2021-06-21 08:21:48 UTC
There is an issue and a PR upstream regarding this:

issue : https://github.com/kubernetes/kubernetes/issues/70505
 PR   : https://github.com/kubernetes-csi/external-provisioner/pull/469

Comment 5 Tomas Smetana 2021-09-10 13:59:07 UTC
This should have been fixed with rebase of the external CSI provisioner in OCP to version 3.0.0: moving manually to MODIFIED.

Comment 7 Chao Yang 2021-09-27 06:58:47 UTC
Failed on 
oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-09-23-210724   True        False         3h5m    Cluster version is 4.10.0-0.nightly-2021-09-23-210724

oc describe pv
Name:              pvc-97960186-529f-44a7-b887-ee13703f4395
Labels:            <none>
Annotations:       pv.kubernetes.io/provisioned-by: pd.csi.storage.gke.io
Finalizers:        [kubernetes.io/pv-protection external-attacher/pd-csi-storage-gke-io]
StorageClass:      standard-csi
Status:            Bound
Claim:             default/myclaim1
Reclaim Policy:    Delete
Access Modes:      ROX
VolumeMode:        Filesystem
Capacity:          2Gi
Node Affinity:     
  Required Terms:  
    Term 0:        topology.gke.io/zone in [us-central1-c]
Message:           
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            pd.csi.storage.gke.io
    FSType:            ext4
    VolumeHandle:      projects/openshift-qe/zones/us-central1-c/disks/pvc-97960186-529f-44a7-b887-ee13703f4395
    ReadOnly:          false
    VolumeAttributes:      storage.kubernetes.io/csiProvisionerIdentity=1632713817264-8081-pd.csi.storage.gke.io
Events:                <none>


/dev/sdb on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-97960186-529f-44a7-b887-ee13703f4395/globalmount type ext4 (rw,relatime,seclabel)
/dev/sdb on /var/lib/kubelet/pods/e447caa8-bd4b-48ed-9e04-cfc022e0568d/volumes/kubernetes.io~csi/pvc-97960186-529f-44a7-b887-ee13703f4395/mount type ext4 (rw,relatime,seclabel)
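
The failing state above (access mode ROX, yet `ReadOnly: false` in the CSI source) can be detected from the PV object itself. A rough sketch, assuming the PV has been loaded as a dict in the shape of `oc get pv <name> -o json`, where access modes appear under their full names; the helper name is made up.

```python
def rox_pv_not_read_only(pv: dict) -> bool:
    """True if a CSI PV requests only ReadOnlyMany but is not marked readOnly."""
    spec = pv.get("spec", {})
    csi = spec.get("csi")
    if csi is None:
        # Not a CSI volume; this check does not apply.
        return False
    modes = spec.get("accessModes", [])
    only_rox = bool(modes) and all(mode == "ReadOnlyMany" for mode in modes)
    # An absent readOnly field is treated as false.
    return only_rox and not csi.get("readOnly", False)
```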

Comment 10 Chao Yang 2021-11-16 09:45:50 UTC
---
Provisioning and mounting an empty read-only volume now fails, which is correct:
  Warning  FailedMount             16s (x7 over 51s)  kubelet                  MountVolume.MountDevice failed for volume "pvc-e4269d3a-3a2d-4920-8917-a30c0a0773e7" : rpc error: code = Internal desc = Failed to format and mount device from ("/dev/disk/by-id/google-pvc-e4269d3a-3a2d-4920-8917-a30c0a0773e7") to ("/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e4269d3a-3a2d-4920-8917-a30c0a0773e7/globalmount") with fstype ("ext4") and options ([]): format of disk "/dev/disk/by-id/google-pvc-e4269d3a-3a2d-4920-8917-a30c0a0773e7" failed: type:("ext4") target:("/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e4269d3a-3a2d-4920-8917-a30c0a0773e7/globalmount") options:("defaults") errcode:(exit status 1) output:(mke2fs 1.45.6 (20-Mar-2020)
/dev/disk/by-id/google-pvc-e4269d3a-3a2d-4920-8917-a30c0a0773e7: Read-only file system while setting up superblock
)
---
1.Create rwo pvc/pod
2.Create snapshotclass
3.Create volumesnapshot
oc get volumesnapshot
NAME                  READYTOUSE   SOURCEPVC   SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS   SNAPSHOTCONTENT                                    CREATIONTIME   AGE
new-snapshot-test-1   true         myclaim1                            1Gi           gcp-snap-2      snapcontent-320c7f7e-d651-47e7-a448-faaafa88b60b   3h10m          3h10m
4.Create restore pvc with rox
oc get pvc/pvc1-restore -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
    volume.kubernetes.io/selected-node: qe-chao-bug-4hclb-worker-c-gzlvv.c.openshift-qe.internal
  creationTimestamp: "2021-11-16T06:42:24Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: pvc1-restore
  namespace: test1
  resourceVersion: "86025"
  uid: d06b7fe6-87ef-4d5f-8866-841416c66c3e
spec:
  accessModes:
  - ReadOnlyMany
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: new-snapshot-test-1
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard-csi
  volumeMode: Filesystem
  volumeName: pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e
status:
  accessModes:
  - ReadOnlyMany
  capacity:
    storage: 1Gi
  phase: Bound
5.oc get pods
NAME                                                          READY   STATUS                 RESTARTS   AGE
pod-restore                                                   0/1     CreateContainerError   0          177m
pod1                                                          1/1     Running                0          3h10m

oc describe pods/pod-restore
  Warning  FailedMount             3m18s                 kubelet                  Unable to attach or mount volumes: unmounted volumes=[aws1], unattached volumes=[aws1 kube-api-access-rj8fz]: timed out waiting for the condition
  Warning  FailedMount             64s (x2 over 5m34s)   kubelet                  Unable to attach or mount volumes: unmounted volumes=[aws1], unattached volumes=[kube-api-access-rj8fz aws1]: timed out waiting for the condition
  Warning  FailedMount             63s (x11 over 7m21s)  kubelet                  MountVolume.MountDevice failed for volume "pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e" : rpc error: code = Internal desc = Failed to format and mount device from ("/dev/disk/by-id/google-pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e") to ("/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e/globalmount") with fstype ("ext4") and options ([]): mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t ext4 -o defaults /dev/disk/by-id/google-pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e/globalmount
Output: mount: /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e/globalmount: cannot mount /dev/sdd read-only.
6.Tried mounting manually on the node with `ro,noload`; the volume appears to mount fine.
mount -o ro,noload  /dev/disk/by-id/google-pvc-d06b7fe6-87ef-4d5f-8866-841416c66c3e /mnt/test/
ls /mnt/test/
lost+found  test

 oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-11-12-161948   True        False         6h29m   Cluster version is 4.10.0-0.nightly-2021-11-12-161948
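
The difference between the failing driver mount in step 5 (`-o defaults`) and the manual mount in step 6 (`-o ro,noload`) can be sketched as below. This only reproduces what was tried by hand; whether the driver should pass `noload` is not settled in this report, and `mount_command` is a hypothetical helper.

```python
from typing import List


def mount_command(device: str, target: str, fstype: str,
                  read_only: bool) -> List[str]:
    """Build a mount(8) argument list for the two cases seen in this comment."""
    options: List[str] = []
    if read_only:
        options.append("ro")
        if fstype in ("ext3", "ext4"):
            # noload skips loading the journal on mount, so the mount does
            # not need to write to a read-only device (as in step 6).
            options.append("noload")
    if not options:
        # Matches the "-o defaults" seen in the driver's failing mount.
        options.append("defaults")
    return ["mount", "-t", fstype, "-o", ",".join(options), device, target]
```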

Comment 11 Tomas Smetana 2021-11-16 12:19:49 UTC
I will try to reproduce: do you have a spec for the pod-restore? I think it also needs to request a read-only mount for this thing to work.

Comment 14 Tomas Smetana 2021-11-30 11:58:54 UTC
I'm not sure the CSI to Kubernetes volume mode mapping is complete and correct: I also filed https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver/issues/872 upstream and will keep tinkering...

Comment 16 Tomas Smetana 2022-02-17 14:21:26 UTC
This requires rebase to upstream driver v1.4.0, we have 1.3.4 in 4.10.

Comment 20 Chao Yang 2022-06-08 07:14:29 UTC
Provisioning an empty ROX volume now fails, which is the expected behavior.

  Warning  ProvisioningFailed    2s (x5 over 17s)      pd.csi.storage.gke.io_qe-chaoyang66-gvpz2-master-0.c.openshift-qe.internal_9e3b8511-f5ed-4d40-b7d3-4cc18a0140ab  failed to provision volume with StorageClass "standard-csi": rpc error: code = InvalidArgument desc = VolumeContentSource must be provided when AccessMode is set to read only
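
The behavior behind the event above can be sketched as a validation step: a read-only access mode is rejected unless the request carries a content source such as a snapshot. This is a guess at the shape of the check based only on the error message; the function name, the use of `any`, and the string mode names are assumptions, not the driver's code.

```python
from typing import Optional, Sequence

# CSI access modes that only permit reading.
READ_ONLY_CSI_MODES = frozenset(
    {"SINGLE_NODE_READER_ONLY", "MULTI_NODE_READER_ONLY"}
)


def check_create_volume(access_modes: Sequence[str],
                        has_content_source: bool) -> Optional[str]:
    """Return an error message for an invalid request, or None if acceptable."""
    wants_read_only = any(mode in READ_ONLY_CSI_MODES for mode in access_modes)
    if wants_read_only and not has_content_source:
        # Message text copied from the ProvisioningFailed event above.
        return ("VolumeContentSource must be provided when AccessMode "
                "is set to read only")
    return None
```

Under this sketch, an empty ROX claim is refused, while a ROX claim restored from a snapshot passes the check.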

Comment 21 Chao Yang 2022-06-08 11:27:12 UTC
1.Create pvc/pod
2.Write some data into mounted volume
oc exec pod1 -- ls -lrt /tmp1
total 4
-r--r--r--. 1 root root 13 Jun  8 11:10 test

oc exec pod1 -- ls -lrt / | grep tmp1
dr--r--r--.   2 root root    4096 Jun  8 11:11 tmp1

3.Create volumesnapshot
4.Create restored pvc
oc get pvc pvc2-restore -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
    volume.kubernetes.io/selected-node: evakhoni-85461-2r9t4-worker-a-jmqtm.c.openshift-qe.internal
    volume.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
  creationTimestamp: "2022-06-08T11:15:22Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: pvc2-restore
  namespace: default
  resourceVersion: "101081"
  uid: 02762e9f-58a2-41c2-925c-478b933884a7
spec:
  accessModes:
  - ReadOnlyMany
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: new-snapshot-test-1
  resources:
    requests:
      storage: 2Gi
  storageClassName: standard-csi
  volumeMode: Filesystem
  volumeName: pvc-02762e9f-58a2-41c2-925c-478b933884a7
status:
  accessModes:
  - ReadOnlyMany
  capacity:
    storage: 2Gi
  phase: Bound
5.Create pod; the container fails to start
pod2   0/1     CreateContainerError   0          7m11s

oc describe pod pod2
  Warning  Failed                  5m13s (x12 over 7m13s)  kubelet                  Error: relabel failed /var/lib/kubelet/pods/a101adb3-4505-4375-835d-17b178ef7a01/volumes/kubernetes.io~csi/pvc-02762e9f-58a2-41c2-925c-478b933884a7/mount: lsetxattr /var/lib/kubelet/pods/a101adb3-4505-4375-835d-17b178ef7a01/volumes/kubernetes.io~csi/pvc-02762e9f-58a2-41c2-925c-478b933884a7/mount: read-only file system

@tsmetana can you help to check it?

Comment 22 Tomas Smetana 2022-06-08 14:26:46 UTC
Hello. This is what I got on 4.11.0-0.ci-2022-06-06-185917:

Restored PVC:
$ oc get pvc -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    annotations:
      pv.kubernetes.io/bind-completed: "yes"
      pv.kubernetes.io/bound-by-controller: "yes"
      volume.beta.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
      volume.kubernetes.io/selected-node: ci-ln-xkgf6w2-72292-j2hkc-worker-a-z94nq
      volume.kubernetes.io/storage-provisioner: pd.csi.storage.gke.io
    creationTimestamp: "2022-06-08T13:44:20Z"
    finalizers:
    - kubernetes.io/pvc-protection
    name: pvc1-restore
    namespace: default
    resourceVersion: "33700"
    uid: 915ba939-95b7-4f0c-970e-a4487068113a
  spec:
    accessModes:
    - ReadOnlyMany
    dataSource:
      apiGroup: snapshot.storage.k8s.io
      kind: VolumeSnapshot
      name: mysnap-1
    dataSourceRef:
      apiGroup: snapshot.storage.k8s.io
      kind: VolumeSnapshot
      name: mysnap-1
    resources:
      requests:
        storage: 1Gi
    storageClassName: standard-csi
    volumeMode: Filesystem
    volumeName: pvc-915ba939-95b7-4f0c-970e-a4487068113a
  status:
    accessModes:
    - ReadOnlyMany
    capacity:
      storage: 1Gi
    phase: Bound
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

The events from the pod using the PVC:

$ oc describe pod pod-restore 
...
Events:                                                                                                                                                                                                                               
  Type     Reason                  Age   From                     Message                                                                                                                                                             
  ----     ------                  ----  ----                     -------                                                                                                                                                             
  Normal   Scheduled               16s   default-scheduler        Successfully assigned default/pod-restore to ci-ln-xkgf6w2-72292-j2hkc-worker-a-z94nq by ci-ln-xkgf6w2-72292-j2hkc-master-0                                         
  Normal   SuccessfulAttachVolume  7s    attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-915ba939-95b7-4f0c-970e-a4487068113a"                                                                                 
  Warning  FileSystemResizeFailed  6s    kubelet                  MountVolume.NodeExpandVolume failed for volume "pvc-915ba939-95b7-4f0c-970e-a4487068113a" requested read-only file system                                           
  Normal   AddedInterface          4s    multus                   Add eth0 [10.131.0.20/23] from openshift-sdn                                                                                                                        
  Normal   Pulling                 4s    kubelet                  Pulling image "gcr.io/google_containers/busybox"                                                                                                                    
  Normal   Pulled                  3s    kubelet                  Successfully pulled image "gcr.io/google_containers/busybox" in 225.537279ms                                                                                        
  Normal   Created                 3s    kubelet                  Created container busybox                                                                                                                                           
  Normal   Started                 3s    kubelet                  Started container busybox

The pod started just fine, it seems. It's true that I can't do anything with the volume mounted to the pod ("Permission denied"), possibly because the relabeling did not happen, so even though the original bug looks to be fixed, the ROX feature is still somewhat useless in the general case.

Your PVC is missing the dataSourceRef in spec, which looks suspicious. Was the VolumeSnapshot ReadyToUse when you tried to create the volume from it and use it in the pod?
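
The two questions in this comment can be turned into a small pre-flight check. A minimal sketch, assuming the PVC and the VolumeSnapshot are available as dicts in the shape of `oc get ... -o json`; the helper name and the message strings are made up, and a missing `dataSourceRef` is only reported because it was called suspicious here, not because it is known to be the cause.

```python
from typing import List


def restore_preflight(pvc: dict, snapshot: dict) -> List[str]:
    """List the suspicious findings for a PVC restored from a VolumeSnapshot."""
    findings: List[str] = []
    spec = pvc.get("spec", {})
    if "dataSource" not in spec:
        findings.append("PVC spec has no dataSource")
    if "dataSourceRef" not in spec:
        findings.append("PVC spec has no dataSourceRef")
    # VolumeSnapshot reports readiness in status.readyToUse.
    if not snapshot.get("status", {}).get("readyToUse", False):
        findings.append("VolumeSnapshot is not ReadyToUse")
    return findings
```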

Comment 24 errata-xmlrpc 2022-08-10 10:36:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

