Bug 2099214

Summary: Permission denied error while writing IO to ceph-rbd - fs - RWO based PVC
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Anant Malhotra <anamalho>
Component: csi-driver
Assignee: Humble Chirammal <hchiramm>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: krishnaram Karthick <kramdoss>
Severity: high
Priority: unspecified
Version: 4.10
CC: hchiramm, madam, muagarwa, ocs-bugs, odf-bz-bot, tdesala, ypadia
Target Milestone: ---
Flags: ndevos: needinfo? (tdesala)
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-10-04 02:22:09 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Embargoed:

Description Anant Malhotra 2022-06-20 10:29:41 UTC
Description of problem (please be as detailed as possible and provide log snippets):

The issue arose during longevity testing while running the script 'tests/e2e/longevity/test_stage4.py' from this PR: https://github.com/red-hat-storage/ocs-ci/pull/5943

A Permission denied error is returned while writing IO (or while creating any file) on the PVC of storage class - cephrbd, access mode - RWO and volume mode - FS.

```
~ $ fio --name=fio-rand-readwrite --filename=/mnt/fio_25 --readwrite=randrw --bs=4K --direct=0 --numjobs=1 --time_based=1 --runtime=20 --size=500M --iodepth=4 --invalidate=1 --fsync_on_close=1 --rwmixread=75 --ioengine=libaio --rate=1m --rate_process=poisson --end_fsync=1
fio-rand-readwrite: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=4
fio-3.28
Starting 1 process
fio-rand-readwrite: Laying out IO file (1 file / 500MiB)
fio: pid=0, err=13/file:filesetup.c:174, func=open, error=Permission denied


Run status group 0 (all jobs):
```

This issue is not limited to FIO; the same error occurs when simply creating a file.
```
~ $ touch /mnt/abc.txt
touch: /mnt/abc.txt: Permission denied
~ $ touch abc.txt
touch: abc.txt: Permission denied
```

```
~ $ ls -ld /mnt
drwxr-xr-x    3 root     root          4096 Jun 20 09:29 /mnt
```
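
For context on the failure, a minimal diagnostic sketch (it assumes the pod name perf-pod and the /mnt mount path from the pod spec below): the container runs as UID 1000, while the freshly formatted filesystem root of the RBD volume is owned by root:root with mode 0755 (as the ls -ld output above shows), so a non-root UID has no permission to create files there.

```
# Identity the workload actually runs as, and ownership of the volume root
# (pod name perf-pod and mount path /mnt are taken from the pod spec below)
oc rsh -n default perf-pod id
oc rsh -n default perf-pod ls -ld /mnt
# Without fsGroup, the volume root stays root:root 0755, so a UID 1000 process
# gets EACCES from both fio and touch.
```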

The IO operation works fine on all the other PVC types:
CephFS - (RWO, RWX)
Ceph RBD - (RWO-Block, RWX-Block)

The pod on which FIO is run was created using the following YAML:
---
apiVersion: v1
kind: Pod
metadata:
  name: perf-pod
  namespace: default
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
  containers:
   - name: performance
     image: quay.io/ocsci/perf:latest
     imagePullPolicy: IfNotPresent
     command: ['/bin/sh']
     stdin: true
     tty: true
     volumeMounts:
       - name: mypvc
         mountPath: /mnt
     securityContext:
       allowPrivilegeEscalation: false
       runAsNonRoot: true
       runAsUser: 1000
       capabilities:
         drop:
           - ALL
       seccompProfile:
         type: RuntimeDefault

  volumes:
   - name: mypvc
     persistentVolumeClaim:
       claimName: pvc
       readOnly: false



Version of all relevant components (if applicable):
OCP-4.10.15, ODF-4.10.3
OCP-4.11.0, ODF 4.11.0


Does this issue impact your ability to continue to work with the product
(please explain in detail what the user impact is)?
Not able to perform IO on the PVC of storage class - cephrbd, access mode - RWO and volume mode - FS.


Is there any workaround available to the best of your knowledge?
No


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1


Is this issue reproducible?
Always

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Create a PVC of storage class - cephrbd, access mode - RWO and volume mode - FS (a minimal PVC manifest sketch follows these steps)
2. Create a POD using the yaml provided in the description and attach to this PVC 
3. Run IO on the POD.
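
A minimal PVC manifest matching step 1 (a sketch only; the claim name pvc comes from the pod spec in the description, while the storage class name ocs-storagecluster-ceph-rbd and the 1Gi size are assumptions based on the default ODF RBD class):

```
# Sketch of the PVC from step 1; storage class name and size are assumptions
cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  storageClassName: ocs-storagecluster-ceph-rbd
  resources:
    requests:
      storage: 1Gi
EOF
```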


Actual results:
Permission denied error while running IO on the PVC of storage class - cephrbd, access mode - RWO and volume mode - FS. The details of the error can be found in the description above.


Expected results:
IO should run to completion without any errors on the PVC of storage class - cephrbd, access mode - RWO and volume mode - FS.

Additional info:
must-gather : http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/tdesala-long-testd/tdesala-long-testd_20220525T080711/logs/failed_testcase_ocs_logs_1655465100/test_longevity_stage4_ocs_logs/ocs_must_gather/

Comment 2 Humble Chirammal 2022-06-21 04:51:38 UTC
It looks like the extra changes you are making in the pod spec below (i.e. trying to use non-privileged/completely restricted pod execution) are causing this:

https://github.com/red-hat-storage/ocs-ci/pull/5943/files#diff-c3679703e785f8a8ae14abbe4b97f354fc00aab50332d5592a9750e929d51d55R8
https://github.com/red-hat-storage/ocs-ci/pull/5943/files#diff-c3679703e785f8a8ae14abbe4b97f354fc00aab50332d5592a9750e929d51d55R24

If you are running the pod as restricted or non-privileged in the OCP setup, the SCC etc. has to be configured correctly.
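
One way to confirm which SCC the pod was actually admitted under (a sketch, assuming the perf-pod name from the description; OpenShift records the admitted SCC in a pod annotation):

```
# The openshift.io/scc annotation shows the SCC the pod was admitted under
oc get pod perf-pod -n default \
  -o jsonpath='{.metadata.annotations.openshift\.io/scc}{"\n"}'
```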

Can you try the longevity test without those changes in the pod yaml?

Also, as a second thing, please add the fsGroup* settings in the pod YAML and give it a try; an example snippet can be found here:

https://bugzilla.redhat.com/show_bug.cgi?id=1988284#c2

The important/required part is the addition of `fsGroup` and `fsGroupChangePolicy` matching the `runAsUser`. You can leave out `selinuxOptions` though.

[...]

```
securityContext:
  fsGroup: 1000510000
  fsGroupChangePolicy: OnRootMismatch
  runAsUser: 1000510000
```

Comment 3 Mudit Agarwal 2022-06-21 07:30:25 UTC
Looks like a CI issue, moving to 4.12 while we work on the RCA.

Comment 4 Anant Malhotra 2022-06-21 12:35:08 UTC
After updating the YAML with the fsGroup and fsGroupChangePolicy parameters, IO runs to completion without any errors on the PVC of storage class cephrbd, access mode RWO and volume mode FS. IO also runs fine on all the other PVCs.

Updated YAML:
---
apiVersion: v1
kind: Pod
metadata:
  name: perf-pod
  namespace: default
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
    fsGroupChangePolicy: OnRootMismatch
  containers:
   - name: performance
     image: quay.io/ocsci/perf:latest
     imagePullPolicy: IfNotPresent
     command: ['/bin/sh']
     stdin: true
     tty: true
     volumeMounts:
       - name: mypvc
         mountPath: /mnt
     securityContext:
       allowPrivilegeEscalation: false
       runAsNonRoot: true
       runAsUser: 1000
       capabilities:
         drop:
           - ALL
       seccompProfile:
         type: RuntimeDefault

  volumes:
   - name: mypvc
     persistentVolumeClaim:
       claimName: pvc
       readOnly: false
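
As a quick follow-up check (a sketch, assuming the same perf-pod name): with fsGroup set, the kubelet re-owns the volume root to that group when the volume is mounted, which can be confirmed from the group ownership of /mnt:

```
# After the fsGroup fix the group on /mnt should be 1000 with group write permission
# (exact mode bits vary by kubelet version and fsGroupChangePolicy)
oc rsh -n default perf-pod ls -ld /mnt
```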

Comment 5 Prasad Desala 2022-06-21 12:47:42 UTC
@Humble,

Even without the fsGroup entries below in the securityContext, we are able to write IO to all other supported PVC types successfully without any issues/errors. The permission denied error is observed only with this specific volume type: ceph-rbd RWO. What could be the reason we are seeing this issue only on ceph-rbd RWO?

[...]

```
securityContext:
  fsGroup: 1000510000
  fsGroupChangePolicy: OnRootMismatch
  runAsUser: 1000510000
```

Comment 6 Niels de Vos 2022-08-30 10:24:53 UTC
(In reply to Prasad Desala from comment #5)
> @Humble,
> 
> Even without the below fsgroup entries in the securityContext, we are able
> to write IO on all other supported PVC types successfully without any
> issues/errors. The permission denied error is observed only with this
> specific volume: ceph-rbd-RWO. What could be the reason why we are seeing
> this issue only on ceph-rbd-RWO?

Because this RBD volume has a filesystem on top, it is necessary to check for issues with the filesystem as well. If the volume (or the RBD connection to Ceph) had problems, that could cause the filesystem to become read-only. You would need to inspect the kernel logs from the time the issue occurred. Moving the Pod to another node may also show details about a corrupt filesystem (a mkfs execution in the csi-rbdplugin logs on the new node).
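
A minimal sketch of the checks described above (node and pod names are placeholders; on an ODF cluster the csi-rbdplugin pods run in the openshift-storage namespace):

```
# Kernel logs on the node that ran the pod -- look for the filesystem being remounted read-only
oc debug node/<node-name> -- chroot /host dmesg | grep -iE 'ext4|xfs|read-only|I/O error'

# csi-rbdplugin logs on the node the pod was rescheduled to -- an unexpected mkfs run
# would point to a corrupted filesystem on the RBD image
oc logs -n openshift-storage <csi-rbdplugin-pod> -c csi-rbdplugin | grep -i mkfs
```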

Logs of the node where the problem happened do not seem to be available, or at least I am not able to find them linked in this BZ.

Steps to reproduce this (getting a volume into this error state) in another environment or with another volume would help.

Comment 7 Mudit Agarwal 2022-10-04 02:22:09 UTC
Please reopen when we have enough data to move ahead.