Bug 1745773

Summary: Delay in kubelet detecting a successful CSI mount, or incorrect reporting of a failed mount, for Ceph CSI-based volumes (due to the recursive fsGroup setting)
Product: OpenShift Container Platform
Reporter: Shyamsundar <srangana>
Component: Storage
Assignee: Fabio Bertinatto <fbertina>
Status: CLOSED ERRATA
QA Contact: Chao Yang <chaoyang>
Severity: urgent
Docs Contact:
Priority: high
Version: 4.2.0
CC: aos-bugs, aos-storage-staff, asakala, bbennett, chaoyang, fbertina, hchiramm, jstrunk, mloriedo, mrajanna, ykaul
Target Milestone: ---
Keywords: Performance
Target Release: 4.3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: ocs-monkey
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1762658 (view as bug list)
Environment:
Last Closed: 2020-01-23 11:05:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1762658

Description Shyamsundar 2019-08-26 21:23:05 UTC
Description of problem:

There is a significant delay (from the application's perspective) between the completion of an attach and mount request by the ceph-csi nodeplugin driver and the kubelet recognizing that the operation is complete.

The delay between the events is about 2 minutes, causing the workload to report pod startup failures, whereas the ceph-csi RBD nodeplugin completes the request in about 3 seconds.

The intent of this bug is to understand the cause of the delay and, if it is due to something other than a failing mount, why the events do not reflect that.

Version-Release number of selected component (if applicable):

$ oc version
Client Version: version.Info{Major:"", Minor:"", GitVersion:"v4.2.0-alpha.0-27-gb833744", GitCommit:"b833744", GitTreeState:"clean", BuildDate:"2019-08-23T21:44:02Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14+", GitVersion:"v1.14.0+be7ae76", GitCommit:"be7ae76", GitTreeState:"clean", BuildDate:"2019-08-26T09:22:11Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
OpenShift Version: 4.2.0-0.ci-2019-08-26-110123

How reproducible:

It happens about 1 in 5 times the ocs-monkey workload is executed.

Steps to Reproduce:
1. Set up an OCP+OCS cluster with Rook+Ceph and a storage class created for RBD (named csi-rbd)
2. Run the ocs-monkey tool (https://github.com/JohnStrunk/ocs-monkey) against this setup with the following CLI options:
  - ./workload_runner.py -l ./ -m ReadWriteOnce -n testws --oc /path/to/oc --ocs-namespace <rook-ceph namespace> -s csi-rbd
3. In about 2-10 minutes the workload runner will report a failure and start gathering OCS and OCP must-gather logs
4. The failure will be along the lines of "Unhealthy: testws/osio-worker-524728073", which means the pod did not report as "Running" within a certain timeout

Actual results:

The workload stops running because the pod is not detected in the "Running" state within the specified time limit (around 30 seconds).

Expected results:

The workload should run without any such failures.

Master Log:

See the logs to be attached (under directory "./ocs-monkey-82240754/must-gather-2/")

Node Log (of failed PODs):

See the logs to be attached (under ./ocs-monkey-82240754/ip-10-0*.node.log)

PVC Dump:

$ oc get -n testws pvc/pvc-524728073 -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: rook-ceph.rbd.csi.ceph.com
  creationTimestamp: "2019-08-26T15:43:52Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: pvc-524728073
  namespace: testws
  resourceVersion: "24221"
  selfLink: /api/v1/namespaces/testws/persistentvolumeclaims/pvc-524728073
  uid: 4df1f65f-c818-11e9-9022-02f7b213f410
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  storageClassName: csi-rbd
  volumeMode: Filesystem
  volumeName: pvc-4df1f65f-c818-11e9-9022-02f7b213f410
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 3Gi
  phase: Bound

PV Dump:

$ oc get pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410 -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: rook-ceph.rbd.csi.ceph.com
  creationTimestamp: "2019-08-26T15:43:55Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-4df1f65f-c818-11e9-9022-02f7b213f410
  resourceVersion: "24217"
  selfLink: /api/v1/persistentvolumes/pvc-4df1f65f-c818-11e9-9022-02f7b213f410
  uid: 4fa6d706-c818-11e9-b67a-0e9b9a4cfc94
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 3Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: pvc-524728073
    namespace: testws
    resourceVersion: "24151"
    uid: 4df1f65f-c818-11e9-9022-02f7b213f410
  csi:
    driver: rook-ceph.rbd.csi.ceph.com
    fsType: xfs
    nodeStageSecretRef:
      name: rook-ceph-csi
      namespace: rook-ceph
    volumeAttributes:
      clusterID: rook-ceph
      pool: replicapool
      storage.kubernetes.io/csiProvisionerIdentity: 1566833504844-8081-rook-ceph.rbd.csi.ceph.com
    volumeHandle: 0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010
  persistentVolumeReclaimPolicy: Delete
  storageClassName: csi-rbd
  volumeMode: Filesystem
status:
  phase: Bound

StorageClass Dump (if StorageClass used by PV/PVC):

$ oc get sc/csi-rbd -o yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"csi-rbd"},"parameters":{"clusterID":"rook-ceph","csi.storage.k8s.io/fstype":"xfs","csi.storage.k8s.io/node-stage-secret-name":"rook-ceph-csi","csi.storage.k8s.io/node-stage-secret-namespace":"rook-ceph","csi.storage.k8s.io/provisioner-secret-name":"rook-ceph-csi","csi.storage.k8s.io/provisioner-secret-namespace":"rook-ceph","pool":"replicapool"},"provisioner":"rook-ceph.rbd.csi.ceph.com","reclaimPolicy":"Delete"}
  creationTimestamp: "2019-08-26T15:29:29Z"
  name: csi-rbd
  resourceVersion: "15434"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/csi-rbd
  uid: 4b8e1065-c816-11e9-b67a-0e9b9a4cfc94
parameters:
  clusterID: rook-ceph
  csi.storage.k8s.io/fstype: xfs
  csi.storage.k8s.io/node-stage-secret-name: rook-ceph-csi
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/provisioner-secret-name: rook-ceph-csi
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  pool: replicapool
provisioner: rook-ceph.rbd.csi.ceph.com
reclaimPolicy: Delete
volumeBindingMode: Immediate

Additional info:

Pod that timed out waiting for volume mount (with timestamp): testws/osio-worker-524728073
2019-08-26 15:48:25,706 osio - ERROR - Unhealthy: testws/osio-worker-524728073

Node on which pod was scheduled: ip-10-0-159-243.ec2.internal/10.0.159.243

ceph-csi node plugin pod on the above node:
csi-rbdplugin-n7ddn                           3/3     Running     0          95m   10.0.159.243   ip-10-0-159-243.ec2.internal   <none>           <none>

Logs from the CSI nodeplugin for that PV, demonstrating that the mount takes about 3 seconds:
--------------------------------
2019-08-26T15:47:30.89587171Z I0826 15:47:30.887132   20545 utils.go:119] ID: 279 GRPC call: /csi.v1.Node/NodeStageVolume
I0826 15:47:30.887152   20545 utils.go:120] ID: 279 GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"xfs"}},"access_mode":{"mode":1}},"volume_context":{"clusterID":"rook-ceph","pool":"replicapool","storage.kubernetes.io/csiProvisionerIdentity":"1566833504844-8081-rook-ceph.rbd.csi.ceph.com"},"volume_id":"0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010"}

I0826 15:47:33.580600   20545 nodeserver.go:161] ID: 279 rbd: successfully mounted volume 0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010 to stagingTargetPath /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/globalmount/0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010
2019-08-26T15:47:33.583885329Z I0826 15:47:33.580652   20545 utils.go:125] ID: 279 GRPC response: {}

2019-08-26T15:47:33.601456621Z I0826 15:47:33.600735   20545 utils.go:119] ID: 281 GRPC call: /csi.v1.Node/NodePublishVolume
I0826 15:47:33.600757   20545 utils.go:120] ID: 281 GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/globalmount","target_path":"/var/lib/kubelet/pods/bad60295-c818-11e9-9022-02f7b213f410/volumes/kubernetes.io~csi/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/mount","volume_capability":{"AccessType":{"Mount":{"fs_type":"xfs"}},"access_mode":{"mode":1}},"volume_context":{"clusterID":"rook-ceph","pool":"replicapool","storage.kubernetes.io/csiProvisionerIdentity":"1566833504844-8081-rook-ceph.rbd.csi.ceph.com"},"volume_id":"0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010"}

I0826 15:47:33.714462   20545 nodeserver.go:261] ID: 281 rbd: successfully mounted stagingPath /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/globalmount/0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010 to targetPath /var/lib/kubelet/pods/bad60295-c818-11e9-9022-02f7b213f410/volumes/kubernetes.io~csi/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/mount
2019-08-26T15:47:33.715244475Z I0826 15:47:33.714488   20545 utils.go:125] ID: 281 GRPC response: {}
--------------------------------

Describe of failing pod:
--------------------------------
oc -n testws describe pod/osio-worker-524728073-848b9c5dd7-hf2h7
Name:               osio-worker-524728073-848b9c5dd7-hf2h7
Namespace:          testws
Priority:           0
PriorityClassName:  <none>
Node:               ip-10-0-159-243.ec2.internal/10.0.159.243
Start Time:         Mon, 26 Aug 2019 11:46:55 -0400
Labels:             deployment-id=524728073
                    pod-template-hash=848b9c5dd7
Annotations:        openshift.io/scc: restricted
Status:             Running
IP:                 10.128.2.52
Controlled By:      ReplicaSet/osio-worker-524728073-848b9c5dd7
Containers:
  osio-workload:
    Container ID:  cri-o://78dd7e7b08eaa61d6ecaadd4804203990d6310d68cf045612758daed781c4fc9
    Image:         quay.io/johnstrunk/osio-workload
    Image ID:      quay.io/johnstrunk/osio-workload@sha256:165563b41314f5bcd216c1973f3112d1013852fab085459fae3856b5e2d70b7e
    Port:          <none>
    Host Port:     <none>
    Args:
      --untar-rate
      10
      --rm-rate
      10
      --kernel-slots
      3
    State:          Running
      Started:      Mon, 26 Aug 2019 11:49:41 -0400
    Ready:          True
    Restart Count:  0
    Readiness:      exec [/health.sh] delay=5s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /data from data (rw)
      /status from temp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-mtrzt (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  temp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  <unset>
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pvc-524728073
    ReadOnly:   false
  default-token-mtrzt:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-mtrzt
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                  From                                   Message
  ----     ------                  ----                 ----                                   -------
  Normal   Scheduled               78m                  default-scheduler                      Successfully assigned testws/osio-worker-524728073-848b9c5dd7-hf2h7 to ip-10-0-159-243.ec2.internal
  Warning  FailedAttachVolume      78m                  attachdetach-controller                Multi-Attach error for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" Volume is already used by pod(s) osio-worker-524728073-848b9c5dd7-7nwrq
  Normal   SuccessfulAttachVolume  77m                  attachdetach-controller                AttachVolume.Attach succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410"
  Warning  FailedMount             76m                  kubelet, ip-10-0-159-243.ec2.internal  Unable to mount volumes for pod "osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410)": timeout expired waiting for volumes to attach or mount for pod "testws"/"osio-worker-524728073-848b9c5dd7-hf2h7". list of unmounted volumes=[data]. list of unattached volumes=[temp data default-token-mtrzt]
  Normal   Pulling                 75m                  kubelet, ip-10-0-159-243.ec2.internal  Pulling image "quay.io/johnstrunk/osio-workload"
  Normal   Pulled                  75m                  kubelet, ip-10-0-159-243.ec2.internal  Successfully pulled image "quay.io/johnstrunk/osio-workload"
  Normal   Created                 75m                  kubelet, ip-10-0-159-243.ec2.internal  Created container osio-workload
  Normal   Started                 75m                  kubelet, ip-10-0-159-243.ec2.internal  Started container osio-workload
  Warning  Unhealthy               74s (x382 over 72m)  kubelet, ip-10-0-159-243.ec2.internal  Readiness probe errored: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
--------------------------------

Node logs around the time of the event:
--------------------------------
$ cat ip-10-0-159-243.ec2.internal.node.log | grep 4df1f65f-c818-11e9-9022-02f7b213f410
Aug 26 15:46:57 ip-10-0-159-243 hyperkube[1171]: I0826 15:46:57.534628    1171 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")
Aug 26 15:46:57 ip-10-0-159-243 hyperkube[1171]: E0826 15:46:57.534779    1171 nestedpendingoperations.go:278] Operation for "\"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010\"" failed. No retries permitted until 2019-08-26 15:46:58.034734213 +0000 UTC m=+1549.181040282 (durationBeforeRetry 500ms). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-4df1f65f-c818-11e9-9022-02f7b213f410\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010\") pod \"osio-worker-524728073-848b9c5dd7-hf2h7\" (UID: \"bad60295-c818-11e9-9022-02f7b213f410\") "
Aug 26 15:46:58 ip-10-0-159-243 hyperkube[1171]: I0826 15:46:58.046471    1171 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")
Aug 26 15:46:58 ip-10-0-159-243 hyperkube[1171]: E0826 15:46:58.046720    1171 nestedpendingoperations.go:278] Operation for "\"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010\"" failed. No retries permitted until 2019-08-26 15:46:59.046675192 +0000 UTC m=+1550.192981258 (durationBeforeRetry 1s). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-4df1f65f-c818-11e9-9022-02f7b213f410\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010\") pod \"osio-worker-524728073-848b9c5dd7-hf2h7\" (UID: \"bad60295-c818-11e9-9022-02f7b213f410\") "
Aug 26 15:46:59 ip-10-0-159-243 hyperkube[1171]: I0826 15:46:59.128252    1171 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")
Aug 26 15:47:00 ip-10-0-159-243 hyperkube[1171]: E0826 15:47:00.511586    1171 nestedpendingoperations.go:278] Operation for "\"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010\"" failed. No retries permitted until 2019-08-26 15:47:02.510765314 +0000 UTC m=+1553.657071479 (durationBeforeRetry 2s). Error: "Volume not attached according to node status for volume \"pvc-4df1f65f-c818-11e9-9022-02f7b213f410\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010\") pod \"osio-worker-524728073-848b9c5dd7-hf2h7\" (UID: \"bad60295-c818-11e9-9022-02f7b213f410\") "
Aug 26 15:47:02 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:02.569563    1171 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")
Aug 26 15:47:02 ip-10-0-159-243 hyperkube[1171]: E0826 15:47:02.574972    1171 nestedpendingoperations.go:278] Operation for "\"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010\"" failed. No retries permitted until 2019-08-26 15:47:06.574925711 +0000 UTC m=+1557.721231769 (durationBeforeRetry 4s). Error: "Volume not attached according to node status for volume \"pvc-4df1f65f-c818-11e9-9022-02f7b213f410\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010\") pod \"osio-worker-524728073-848b9c5dd7-hf2h7\" (UID: \"bad60295-c818-11e9-9022-02f7b213f410\") "
Aug 26 15:47:06 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:06.604896    1171 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")
Aug 26 15:47:06 ip-10-0-159-243 hyperkube[1171]: E0826 15:47:06.610498    1171 nestedpendingoperations.go:278] Operation for "\"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010\"" failed. No retries permitted until 2019-08-26 15:47:14.610450018 +0000 UTC m=+1565.756756129 (durationBeforeRetry 8s). Error: "Volume not attached according to node status for volume \"pvc-4df1f65f-c818-11e9-9022-02f7b213f410\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010\") pod \"osio-worker-524728073-848b9c5dd7-hf2h7\" (UID: \"bad60295-c818-11e9-9022-02f7b213f410\") "
Aug 26 15:47:14 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:14.695811    1171 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")
Aug 26 15:47:14 ip-10-0-159-243 hyperkube[1171]: E0826 15:47:14.701241    1171 nestedpendingoperations.go:278] Operation for "\"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010\"" failed. No retries permitted until 2019-08-26 15:47:30.701195236 +0000 UTC m=+1581.847501293 (durationBeforeRetry 16s). Error: "Volume not attached according to node status for volume \"pvc-4df1f65f-c818-11e9-9022-02f7b213f410\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010\") pod \"osio-worker-524728073-848b9c5dd7-hf2h7\" (UID: \"bad60295-c818-11e9-9022-02f7b213f410\") "
Aug 26 15:47:30 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:30.763967    1171 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")
Aug 26 15:47:30 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:30.772337    1171 operation_generator.go:1339] Controller attach succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410") device path: "csi-c1f9dd4c3e2e023972e28a995d315e242a9176fb0c13617f6e24cf62551d3dc5"
Aug 26 15:47:30 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:30.864573    1171 reconciler.go:252] operationExecutor.MountVolume started for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")
Aug 26 15:47:30 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:30.865133    1171 operation_generator.go:587] MountVolume.WaitForAttach entering for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410") DevicePath "csi-c1f9dd4c3e2e023972e28a995d315e242a9176fb0c13617f6e24cf62551d3dc5"
Aug 26 15:47:30 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:30.870098    1171 operation_generator.go:596] MountVolume.WaitForAttach succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410") DevicePath "csi-c1f9dd4c3e2e023972e28a995d315e242a9176fb0c13617f6e24cf62551d3dc5"
Aug 26 15:47:32 ip-10-0-159-243 systemd[1]: Started Kubernetes transient mount for /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/globalmount/0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010.
Aug 26 15:47:33 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:33.584277    1171 operation_generator.go:623] MountVolume.MountDevice succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410") device mount path "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/globalmount"
Aug 26 15:49:36 ip-10-0-159-243 hyperkube[1171]: I0826 15:49:36.607093    1171 operation_generator.go:667] MountVolume.SetUp succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")
--------------------------------
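As an aside, the durationBeforeRetry values in the log above (500ms, 1s, 2s, 4s, 8s, 16s) are kubelet's exponential backoff for failed volume operations. A minimal sketch of that doubling schedule, with constants read off this log rather than taken from kubelet source:

```python
def backoff_schedule(initial: float, factor: float, count: int) -> list:
    """Return the successive retry delays, multiplying by `factor` each time."""
    delays, delay = [], initial
    for _ in range(count):
        delays.append(delay)
        delay *= factor
    return delays

# Matches the durationBeforeRetry sequence seen in the node log above:
# 0.5, 1.0, 2.0, 4.0, 8.0, 16.0 seconds
```

The backoff explains why the pod does not retry the attach check immediately once the controller reports success; it only shifts the mount start by seconds, not the 2 minutes at issue here.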
Further node logs from the same node, filtered by the pod:

Aug 26 15:47:30 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:30.763967    1171 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")
Aug 26 15:47:30 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:30.772337    1171 operation_generator.go:1339] Controller attach succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410") device path: "csi-c1f9dd4c3e2e023972e28a995d315e242a9176fb0c13617f6e24cf62551d3dc5"
Aug 26 15:47:30 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:30.864573    1171 reconciler.go:252] operationExecutor.MountVolume started for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")
Aug 26 15:47:30 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:30.865133    1171 operation_generator.go:587] MountVolume.WaitForAttach entering for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410") DevicePath "csi-c1f9dd4c3e2e023972e28a995d315e242a9176fb0c13617f6e24cf62551d3dc5"
Aug 26 15:47:30 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:30.870098    1171 operation_generator.go:596] MountVolume.WaitForAttach succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410") DevicePath "csi-c1f9dd4c3e2e023972e28a995d315e242a9176fb0c13617f6e24cf62551d3dc5"
Aug 26 15:47:33 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:33.584277    1171 operation_generator.go:623] MountVolume.MountDevice succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410") device mount path "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/globalmount"
Aug 26 15:47:39 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:39.911681    1171 worker.go:179] Probe target container not found: osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410) - osio-workload
Aug 26 15:47:49 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:49.900863    1171 worker.go:179] Probe target container not found: osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410) - osio-workload
Aug 26 15:47:59 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:59.940155    1171 worker.go:179] Probe target container not found: osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410) - osio-workload
Aug 26 15:48:09 ip-10-0-159-243 hyperkube[1171]: I0826 15:48:09.902925    1171 worker.go:179] Probe target container not found: osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410) - osio-workload
Aug 26 15:48:19 ip-10-0-159-243 hyperkube[1171]: I0826 15:48:19.904842    1171 worker.go:179] Probe target container not found: osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410) - osio-workload
Aug 26 15:48:29 ip-10-0-159-243 hyperkube[1171]: I0826 15:48:29.899891    1171 worker.go:179] Probe target container not found: osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410) - osio-workload
Aug 26 15:48:39 ip-10-0-159-243 hyperkube[1171]: I0826 15:48:39.897517    1171 worker.go:179] Probe target container not found: osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410) - osio-workload
Aug 26 15:48:49 ip-10-0-159-243 hyperkube[1171]: I0826 15:48:49.910544    1171 worker.go:179] Probe target container not found: osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410) - osio-workload
Aug 26 15:48:58 ip-10-0-159-243 hyperkube[1171]: E0826 15:48:58.142687    1171 kubelet.go:1666] Unable to mount volumes for pod "osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410)": timeout expired waiting for volumes to attach or mount for pod "testws"/"osio-worker-524728073-848b9c5dd7-hf2h7". list of unmounted volumes=[data]. list of unattached volumes=[temp data default-token-mtrzt]; skipping pod

-----------------------------------

Comment 2 Fabio Bertinatto 2019-08-28 17:56:15 UTC
It seems like this is where it's taking the most time:

Aug 26 15:47:33 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:33.584277    1171 operation_generator.go:623] MountVolume.MountDevice succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410") device mount path "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/globalmount"
Aug 26 15:49:36 ip-10-0-159-243 hyperkube[1171]: I0826 15:49:36.607093    1171 operation_generator.go:667] MountVolume.SetUp succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")

There's a 2-minute delay during the MountVolume.SetUp call.

Assuming the NodePublishVolume call returns quickly (as confirmed by the logs), the delay seems to occur while applying volume ownership to the files:

https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/csi/csi_mounter.go#L267

It might happen that a pod starts and creates many files in the volume. Then a second pod starts and tries to change the ownership of those files, which takes some time to finish.

Does the application create many files in the volume?
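For context, when a pod specifies an fsGroup, kubelet recursively walks the volume and adjusts group ownership and permissions on every entry. The sketch below is a simplified, hypothetical rendering of that walk (it is not kubelet's actual SetVolumeOwnership code); it shows why the duration grows linearly with the number of files:

```python
import os
import stat

def set_volume_ownership(volume_path: str, fs_group: int) -> int:
    """Recursively apply fs_group to every entry under volume_path.

    Returns the number of filesystem entries touched; the operation's
    duration grows linearly with that count.
    """
    touched = 0
    for root, dirs, files in os.walk(volume_path):
        for name in dirs + files:
            path = os.path.join(root, name)
            info = os.lstat(path)
            # change group without following symlinks
            os.lchown(path, info.st_uid, fs_group)
            if not stat.S_ISLNK(info.st_mode):
                # grant the group read/write, roughly mirroring the fsGroup fixup
                os.chmod(path, info.st_mode | stat.S_IRGRP | stat.S_IWGRP)
            touched += 1
    return touched
```

With a few untarred kernel source trees on the volume (tens of thousands of entries each), a walk like this can plausibly account for the 2-minute gap between MountDevice and SetUp.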

Comment 3 Shyamsundar 2019-08-29 01:33:45 UTC
(In reply to Fabio Bertinatto from comment #2)
> There's a 2-minute delay during the MountVolume.SetUp call.
> 
> Assuming the NodePublishVolume call returns quickly (as confirmed by the
> logs), the delay seems to be happening on applying the volume ownership to
> the files:
> 
> https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/csi/
> csi_mounter.go#L267
> 
> I might happen that a pod starts and creates many files in the volume. Then
> a second pod starts and tries to change the ownership of those files, taking
> some time to finish.
> 
> The application creates many files in the volume?

The application untars a linux kernel source code, maintaining a max of 4 untar copies on the volume at any given time. It also periodically removes an entire untarred directory. This is a mock application to test functionality of a particular workload. More details here: https://github.com/JohnStrunk/ocs-monkey/blob/master/osio-workload/workload.py

So I would assume at any given time there may be 1-4 copies of the linux source code, and that can be a lot of files in the volume.

Are there any prescribed ways to avoid walking the tree and setting the fsGroup? Or, ways in which this issue can be circumvented?

Comment 4 Humble Chirammal 2019-08-29 03:54:37 UTC
(In reply to Fabio Bertinatto from comment #2)
> It seems like this is where it's taking the most time:
> 
> [...]
> 
> There's a 2-minute delay during the MountVolume.SetUp call.
> 
> Assuming the NodePublishVolume call returns quickly (as confirmed by the
> logs), the delay seems to be happening on applying the volume ownership to
> the files:
> 
> https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/csi/
> csi_mounter.go#L267
> 
> I might happen that a pod starts and creates many files in the volume. Then
> a second pod starts and tries to change the ownership of those files, taking
> some time to finish.
> 
> The application creates many files in the volume?

Nice catch, Fabio! I do think this is where the delay is introduced.

Looking at the fsGroup apply process and this workload case, I see that more than one pod accesses this volume and that a good amount of volume/file operations happens in this workload test.

...
		accessModes := c.spec.PersistentVolume.Spec.AccessModes
		if c.spec.PersistentVolume.Spec.AccessModes == nil {
			klog.V(4).Info(log("mounter.SetupAt WARNING: skipping fsGroup, access modes not provided"))
			return nil
		}
		if !hasReadWriteOnce(accessModes) {
			klog.V(4).Info(log("mounter.SetupAt WARNING: skipping fsGroup, only support ReadWriteOnce access mode"))
			return nil
		}

...


This PVC is defined as "ReadWriteOnce" in this setup



--snip--

  volumeMode: Filesystem
  volumeName: pvc-4df1f65f-c818-11e9-9022-02f7b213f410
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 3Gi
  phase: Bound

PV Dump:

$ oc get pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410 -o yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: rook-ceph.rbd.csi.ceph.com
  creationTimestamp: "2019-08-26T15:43:55Z"
  finalizers:
  - kubernetes.io/pv-protection
  name: pvc-4df1f65f-c818-11e9-9022-02f7b213f410
  resourceVersion: "24217"
  selfLink: /api/v1/persistentvolumes/pvc-4df1f65f-c818-11e9-9022-02f7b213f410
  uid: 4fa6d706-c818-11e9-b67a-0e9b9a4cfc94
spec:
  accessModes:
  - ReadWriteOnce


--/snip--


Considering this is in any case a multi-access volume, can we define this PVC as "RWX" and rerun the test? That mode change would skip the volume ownership pass and could work around/fix this issue.

Fabio, any thoughts on my suggestion?
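If the experiment is attempted, a sketch of the PVC change (the claim name and storage class below are hypothetical, not taken from this cluster, and this is only valid if the driver actually honors RWX for this volume mode):

```shell
# Write out an RWX variant of the claim; names are assumptions.
cat > /tmp/osio-worker-pvc-rwx.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: osio-worker-data
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: 3Gi
EOF
# oc apply -f /tmp/osio-worker-pvc-rwx.yaml
```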

Comment 5 Humble Chirammal 2019-08-29 03:56:43 UTC
(In reply to Shyamsundar from comment #3)
> (In reply to Fabio Bertinatto from comment #2)
> > [...]
> 
> The application untars a linux kernel source code, maintaining a max of 4
> untar copies on the volume at any given time. It also periodically removes
> an entire untarred directory. This is a mock application to test
> functionality of a particular workload. More details here:
> https://github.com/JohnStrunk/ocs-monkey/blob/master/osio-workload/workload.
> py
> 
> So I would assume at any given time there maybe 1-4 copies of the linux
> source code, and that can be a lot of files in the volume.
> 
> Are there any prescribed ways to avoid walking the tree and setting the
> fsGroup? Or, ways in which this issue can be circumvented?

Yes, this is indeed a heavy operation, and one fix/workaround or isolation I can think of here is mentioned at https://bugzilla.redhat.com/show_bug.cgi?id=1745773#c4

Can we give it a try?

Comment 6 Fabio Bertinatto 2019-08-29 08:49:59 UTC
(In reply to Humble Chirammal from comment #4)
> [...]
> Considering this is any way multi access volume, can we define this PVC as
> "RWX" and perform this test?  this mode change could avoid volume ownership
> process and could workaround/Fix this issue. 
> 
> Fabio, any thoughts on my suggestion ?


Yes, using a different access mode would circumvent this. Does the CSI driver support RWX? It seems like RBD doesn't support it:

https://kubernetes.io/docs/concepts/storage/persistent-volumes/

Also, there's an on-going effort in upstream to set file ownership only after formatting:

https://github.com/kubernetes/kubernetes/issues/69699
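For reference, that upstream effort later landed as the pod-level `fsGroupChangePolicy` field, which is not available in the Kubernetes version under test here; a sketch of how it is expressed, with a hypothetical pod name:

```shell
# Future-facing sketch: skip the recursive walk when the volume root's
# ownership/permissions already match. Pod name and fsGroup are assumed.
cat > /tmp/osio-worker-pod-snippet.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: osio-worker
spec:
  securityContext:
    fsGroup: 1000
    fsGroupChangePolicy: OnRootMismatch
  containers:
  - name: osio-workload
    image: quay.io/johnstrunk/osio-workload
EOF
```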

Comment 7 Fabio Bertinatto 2019-08-29 09:01:07 UTC
Other than that, the pod is reported unhealthy because the readiness probe is failing.

The volume is eventually mounted and the pod starts, even with the 2-minute delay mounting the volume.

From the node logs:

Aug 26 15:52:14 ip-10-0-159-243 hyperkube[1171]: I0826 15:52:14.608187    1171 prober.go:125] Readiness probe for "osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410):osio-workload" succeeded
Aug 26 15:52:20 ip-10-0-159-243 hyperkube[1171]: I0826 15:52:20.053897    1171 prober.go:125] Readiness probe for "osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410):osio-workload" succeeded
Aug 26 15:52:33 ip-10-0-159-243 hyperkube[1171]: I0826 15:52:33.263416    1171 prober.go:107] Readiness probe for "osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410):osio-workload" errored: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
Aug 26 15:52:44 ip-10-0-159-243 hyperkube[1171]: I0826 15:52:44.262419    1171 prober.go:107] Readiness probe for "osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410):osio-workload" errored: rpc error: code = Unknown desc = command error: command timed out, stdout: , stderr: , exit code -1
Aug 26 15:52:50 ip-10-0-159-243 hyperkube[1171]: I0826 15:52:50.001484    1171 prober.go:125] Readiness probe for "osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410):osio-workload" succeeded
Aug 26 15:53:01 ip-10-0-159-243 hyperkube[1171]: I0826 15:53:01.969155    1171 prober.go:125] Readiness probe for "osio-worker-524728073-848b9c5dd7-hf2h7_testws(bad60295-c818-11e9-9022-02f7b213f410):osio-workload" succeeded

This is the readiness command:

docker run --rm -it --entrypoint=/bin/sh quay.io/johnstrunk/osio-workload 
sh-4.2$ cat /health.sh 
#! /bin/bash

touch /data/writable && \
stat /status/running

Another option is to re-evaluate if this is the best way to assess the pod readiness in this situation.
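One hedged variant of the probe, sketched with the paths turned into parameters (the real health.sh hardcodes /data and /status): bounding the probe's own runtime makes a slow volume show up as a clean non-zero exit instead of the kubelet-side "command timed out" rpc error.

```shell
# probe DATA_DIR STATUS_FILE
# Returns non-zero if the volume write does not complete within 2s or
# the status file is missing; coreutils `timeout` enforces the bound.
probe() {
  local data_dir=$1 status_file=$2
  timeout 2 touch "$data_dir/writable" && stat "$status_file" >/dev/null
}
```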

Comment 8 Humble Chirammal 2019-08-29 09:04:12 UTC
> > [...]
> > Considering this is any way multi access volume, can we define this PVC as
> > "RWX" and perform this test?  this mode change could avoid volume ownership
> > process and could workaround/Fix this issue. 
> > 
> > Fabio, any thoughts on my suggestion ?
> 
> 
> Yes, using a different access mode would circumvent this. Does the CSI
> driver support RWX? It seems like RBD doesn't support it:
> 
> https://kubernetes.io/docs/concepts/storage/persistent-volumes/

We support RWX mode for RBD when operating in 'filesystem mode', which is the case here:

https://github.com/ceph/ceph-csi/tree/master/examples#how-to-test-rbd-multi_node_multi_writer-block-feature

> 
> Also, there's an on-going effort in upstream to set file ownership only
> after formatting:
> 
> https://github.com/kubernetes/kubernetes/issues/69699

Thanks for the pointer, raising the priority there too.

Comment 9 Shyamsundar 2019-08-29 11:36:09 UTC
(In reply to Humble Chirammal from comment #4)
> (In reply to Fabio Bertinatto from comment #2)
> > [...]

> Considering this is any way multi access volume, can we define this PVC as
> "RWX" and perform this test?  this mode change could avoid volume ownership
> process and could workaround/Fix this issue. 

This is not a multi-access volume, at least as I understand the term. This volume is used by a single running application pod.

Further, RBD use is what is desired here, and RBD's RWX mode is only for cases where the application can coordinate access to the device; it is not meant for a non-clustered filesystem on top of the image, and hence is not an option.

Comment 10 Fabio Bertinatto 2019-08-29 13:38:41 UTC
> This is not a multi access volume, or at least the way I understand the term. This volume is used by a single running application pod.

I'm not sure that is the case here: I found two different pods from two different nodes trying to mount the volume at roughly the same time.


$ ag "MountVolume\.[M|S].+pvc-4df1f65f-c818-11e9-9022-02f7b213f410" ip-10-0-162-206.ec2.internal.node.log ip-10-0-159-243.ec2.internal.node.log

ip-10-0-162-206.ec2.internal.node.log
18198:Aug 26 15:44:45 ip-10-0-162-206 hyperkube[1156]: I0826 15:44:45.645142    1156 operation_generator.go:623] MountVolume.MountDevice succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-9n8kw" (UID: "5f7de71a-c818-11e9-9022-02f7b213f410") device mount path "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/globalmount"
18214:Aug 26 15:44:45 ip-10-0-162-206 hyperkube[1156]: I0826 15:44:45.688119    1156 operation_generator.go:667] MountVolume.SetUp succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-9n8kw" (UID: "5f7de71a-c818-11e9-9022-02f7b213f410")
19798:Aug 26 15:45:49 ip-10-0-162-206 hyperkube[1156]: I0826 15:45:49.153107    1156 operation_generator.go:623] MountVolume.MountDevice succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-7nwrq" (UID: "936c319e-c818-11e9-9022-02f7b213f410") device mount path "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/globalmount"
19814:Aug 26 15:45:49 ip-10-0-162-206 hyperkube[1156]: I0826 15:45:49.206486    1156 operation_generator.go:667] MountVolume.SetUp succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-7nwrq" (UID: "936c319e-c818-11e9-9022-02f7b213f410")

ip-10-0-159-243.ec2.internal.node.log
23020:Aug 26 15:47:33 ip-10-0-159-243 hyperkube[1171]: I0826 15:47:33.584277    1171 operation_generator.go:623] MountVolume.MountDevice succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410") device mount path "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-4df1f65f-c818-11e9-9022-02f7b213f410/globalmount"
25577:Aug 26 15:49:36 ip-10-0-159-243 hyperkube[1171]: I0826 15:49:36.607093    1171 operation_generator.go:667] MountVolume.SetUp succeeded for volume "pvc-4df1f65f-c818-11e9-9022-02f7b213f410" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-4e253305-c818-11e9-be8f-0a580a830010") pod "osio-worker-524728073-848b9c5dd7-hf2h7" (UID: "bad60295-c818-11e9-9022-02f7b213f410")

Comment 12 Shyamsundar 2019-08-29 14:47:50 UTC
*** Bug 1744958 has been marked as a duplicate of this bug. ***

Comment 13 John Strunk 2019-08-29 17:34:30 UTC
Reading through the comments, I think I need to clarify a few points:

1) This is a single access volume (RWO). The workload generator frequently scales the pod's Deployment 0 <=> 1, causing the logs seen in c#10. This is by design to stress exactly the type of issues this BZ and the other couple BZs have been exposing. Moving to RWX would imply switching to cephfs instead of rbd... at this time, the requirement is for rbd-based volumes.

2) Regarding readiness from c#7... This checks (1) whether the PV is writable and (2) whether the workload script in the pod is active (I don't check via `ps` for other reasons). The specific requirement is for pod startup to succeed within a fixed time as the application will have user interaction components, making startup delay directly visible as an unresponsive UI. Based on that, the workload generator has a timeout to achieve READY starting from the Deployment scale being set to 1.
I understand that there are 2 components to this: (1) CSI doing the ownership change and (2) that ownership change taking too long in RBD given the number of files. An open question is whether (2) can be fixed or whether we need to wait for (1).

Comment 14 Shyamsundar 2019-08-29 17:54:08 UTC
(In reply to John Strunk from comment #13)
> 1) This is a single access volume (RWO). The workload generator frequently
> scales the pod's Deployment 0 <=> 1, causing the logs seen in c#10. This is
> by design to stress exactly the type of issues this BZ and the other couple
> BZs have been exposing. Moving to RWX would imply switching to cephfs
> instead of rbd... at this time, the requirement is for rbd-based volumes.

This should clarify the NEEDINFO on comment #10. @fabio do get back if there are still any gotchas that I missed.

Comment 15 Humble Chirammal 2019-08-30 05:25:20 UTC
(In reply to Fabio Bertinatto from comment #10)
> > This is not a multi access volume, or at least the way I understand the term. This volume is used by a single running application pod.
> 
> I'm not sure if this is the case here, I found two different pods from two
> different nodes trying to mount the volume roughly at the same time.
> 
> 
> [...]

I believe the above (mount calls on 2 nodes at the same time) has to be root-caused or clarified. Considering this is an RWO volume, this shouldn't have happened.
Again, from the above it looks to me that it is actually the 'staging' operation that happens on both nodes in this kind of scenario, and not really "publish".
If that's the case, isn't it the same case reported at https://bugzilla.redhat.com/show_bug.cgi?id=1745776 ?

Comment 16 Humble Chirammal 2019-08-30 09:57:46 UTC
I would like to point out a recent analysis at
https://bugzilla.redhat.com/show_bug.cgi?id=1745776#c4 which could be
the reason for an RWO pod being 'staged' on more than one node at a time.

Comment 17 Shyamsundar 2019-09-06 14:23:35 UTC
I removed other noise from the test case (i.e., pod rescheduling etc.) and ran a simple pod restart, on an otherwise idle cluster, with a PVC filled with data as before.

The time taken for CSI Stage and Publish to complete was longer than last time due to an internal failure: it took 2:06 minutes.

The time spent after the CSI operations was still significant, at 5:06 minutes.

There is a new log message in the code that helps demarcate the CSI operations from the post-CSI operations, which helped produce better timings.
Message: "operation_generator.go:623] MountVolume.MountDevice succeeded for volume ..."
Version: "OpenShift Version: 4.2.0-0.ci-2019-09-05-105207"

So the issue does happen due to the recursive fsGroup setting as diagnosed earlier.

Following is the log analysis, interspersed with CSI logs as well to show a possible linear event history.

<relevant kube journal from the host>
Sep 06 13:41:00 ip-10-0-131-197 hyperkube[1634]: I0906 13:41:00.057906    1634 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84")
Sep 06 13:41:00 ip-10-0-131-197 hyperkube[1634]: E0906 13:41:00.058043    1634 nestedpendingoperations.go:278] Operation for "\"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230\"" failed. No retries permitted until 2019-09-06 13:41:00.557995296 +0000 UTC m=+62487.070977791 (durationBeforeRetry 500ms). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230\") pod \"osio-worker-1\" (UID: \"f6286169-d0ab-11e9-8b8c-0a41e590bc84\") "
Sep 06 13:41:00 ip-10-0-131-197 hyperkube[1634]: I0906 13:41:00.565264    1634 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84")
Sep 06 13:41:00 ip-10-0-131-197 hyperkube[1634]: E0906 13:41:00.565444    1634 nestedpendingoperations.go:278] Operation for "\"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230\"" failed. No retries permitted until 2019-09-06 13:41:01.565388905 +0000 UTC m=+62488.078371474 (durationBeforeRetry 1s). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230\") pod \"osio-worker-1\" (UID: \"f6286169-d0ab-11e9-8b8c-0a41e590bc84\") "
Sep 06 13:41:01 ip-10-0-131-197 hyperkube[1634]: I0906 13:41:01.572642    1634 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84")
Sep 06 13:41:01 ip-10-0-131-197 hyperkube[1634]: E0906 13:41:01.572805    1634 nestedpendingoperations.go:278] Operation for "\"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230\"" failed. No retries permitted until 2019-09-06 13:41:03.572756716 +0000 UTC m=+62490.085739296 (durationBeforeRetry 2s). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230\") pod \"osio-worker-1\" (UID: \"f6286169-d0ab-11e9-8b8c-0a41e590bc84\") "
Sep 06 13:41:03 ip-10-0-131-197 hyperkube[1634]: I0906 13:41:03.586890    1634 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84")
Sep 06 13:41:03 ip-10-0-131-197 hyperkube[1634]: E0906 13:41:03.587053    1634 nestedpendingoperations.go:278] Operation for "\"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230\"" failed. No retries permitted until 2019-09-06 13:41:07.587001727 +0000 UTC m=+62494.099984522 (durationBeforeRetry 4s). Error: "Volume has not been added to the list of VolumesInUse in the node's volume status for volume \"pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230\") pod \"osio-worker-1\" (UID: \"f6286169-d0ab-11e9-8b8c-0a41e590bc84\") "
Sep 06 13:41:07 ip-10-0-131-197 hyperkube[1634]: I0906 13:41:07.616236    1634 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84")
Sep 06 13:41:07 ip-10-0-131-197 hyperkube[1634]: I0906 13:41:07.623086    1634 operation_generator.go:1339] Controller attach succeeded for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84") device path: "csi-6d4834d91c3e9daa7486f6b0c1c11f4028eb932be56d946c78c87723718d6965"
Sep 06 13:41:07 ip-10-0-131-197 hyperkube[1634]: I0906 13:41:07.717075    1634 reconciler.go:252] operationExecutor.MountVolume started for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84")
Sep 06 13:41:07 ip-10-0-131-197 hyperkube[1634]: I0906 13:41:07.719954    1634 operation_generator.go:587] MountVolume.WaitForAttach entering for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84") DevicePath "csi-6d4834d91c3e9daa7486f6b0c1c11f4028eb932be56d946c78c87723718d6965"
Sep 06 13:41:07 ip-10-0-131-197 hyperkube[1634]: I0906 13:41:07.725615    1634 operation_generator.go:596] MountVolume.WaitForAttach succeeded for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84") DevicePath "csi-6d4834d91c3e9daa7486f6b0c1c11f4028eb932be56d946c78c87723718d6965"

---> <start> interspersed CSI call logs <---
===> Request to stage the volume (this request times out and also fails within ceph-csi, but that failure is not the concern of this bug; track the ID: 23 response below for clarity)
I0906 13:41:07.745318   99360 utils.go:119] ID: 23 GRPC call: /csi.v1.Node/NodeStageVolume
I0906 13:41:07.745342   99360 utils.go:120] ID: 23 GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"xfs"}},"access_mode":{"mode":1}},"volume_context":{"clusterID":"rook-ceph","pool":"replicapool","storage.kubernetes.io/csiProvisionerIdentity":"1567714858772-8081-rook-ceph.rbd.csi.ceph.com"},"volume_id":"0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230"}
---> <end> interspersed CSI calls <---

===> Back to the kubelet logs, where the first request fails
Sep 06 13:43:07 ip-10-0-131-197 hyperkube[1634]: E0906 13:43:07.729137    1634 csi_mounter.go:433] kubernetes.io/csi: failed to remove dir [/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a/globalmount]: remove /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a/globalmount: directory not empty
Sep 06 13:43:07 ip-10-0-131-197 hyperkube[1634]: E0906 13:43:07.729153    1634 csi_attacher.go:322] kubernetes.io/csi: attacher.MountDevice failed to remove mount dir after errir [/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a/globalmount]: remove /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a/globalmount: directory not empty
Sep 06 13:43:07 ip-10-0-131-197 hyperkube[1634]: E0906 13:43:07.729285    1634 nestedpendingoperations.go:278] Operation for "\"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230\"" failed. No retries permitted until 2019-09-06 13:43:08.229236489 +0000 UTC m=+62614.742219063 (durationBeforeRetry 500ms). Error: "MountVolume.MountDevice failed for volume \"pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a\" (UniqueName: \"kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230\") pod \"osio-worker-1\" (UID: \"f6286169-d0ab-11e9-8b8c-0a41e590bc84\") : rpc error: code = DeadlineExceeded desc = context deadline exceeded"
Sep 06 13:43:07 ip-10-0-131-197 hyperkube[1634]: I0906 13:43:07.729736    1634 event.go:209] Event(v1.ObjectReference{Kind:"Pod", Namespace:"ocs-monkey", Name:"osio-worker-1", UID:"f6286169-d0ab-11e9-8b8c-0a41e590bc84", APIVersion:"v1", ResourceVersion:"562895", FieldPath:""}): type: 'Warning' reason: 'FailedMount' MountVolume.MountDevice failed for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" : rpc error: code = DeadlineExceeded desc = context deadline exceeded

===> Second request starting from kube
Sep 06 13:43:08 ip-10-0-131-197 hyperkube[1634]: I0906 13:43:08.321644    1634 reconciler.go:252] operationExecutor.MountVolume started for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84")
Sep 06 13:43:08 ip-10-0-131-197 hyperkube[1634]: I0906 13:43:08.322177    1634 operation_generator.go:587] MountVolume.WaitForAttach entering for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84") DevicePath "csi-6d4834d91c3e9daa7486f6b0c1c11f4028eb932be56d946c78c87723718d6965"
Sep 06 13:43:08 ip-10-0-131-197 hyperkube[1634]: I0906 13:43:08.326371    1634 operation_generator.go:596] MountVolume.WaitForAttach succeeded for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84") DevicePath "csi-6d4834d91c3e9daa7486f6b0c1c11f4028eb932be56d946c78c87723718d6965"

---> <start> intersperced CSI call logs <---
I0906 13:43:08.337859   99360 utils.go:119] ID: 29 GRPC call: /csi.v1.Node/NodeStageVolume
I0906 13:43:08.337882   99360 utils.go:120] ID: 29 GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"xfs"}},"access_mode":{"mode":1}},"volume_context":{"clusterID":"rook-ceph","pool":"replicapool","storage.kubernetes.io/csiProvisionerIdentity":"1567714858772-8081-rook-ceph.rbd.csi.ceph.com"},"volume_id":"0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230"}
===> Blocked on the per-volume lock still held by the in-flight ID: 23 request

E0906 13:43:13.283242   99360 utils.go:123] ID: 23 GRPC error: rpc error: code = Internal desc = rbd: map failed exit status 110, rbd output: rbd: wait failed
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (110) Connection timed out
===> ID: 23 request finally fails and releases the lock
===> ID: 29 succeeds
I0906 13:43:13.628974   99360 utils.go:125] ID: 29 GRPC response: {}
I0906 13:43:13.642323   99360 utils.go:119] ID: 31 GRPC call: /csi.v1.Node/NodePublishVolume
I0906 13:43:13.642341   99360 utils.go:120] ID: 31 GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a/globalmount","target_path":"/var/lib/kubelet/pods/f6286169-d0ab-11e9-8b8c-0a41e590bc84/volumes/kubernetes.io~csi/pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a/mount","volume_capability":{"AccessType":{"Mount":{"fs_type":"xfs"}},"access_mode":{"mode":1}},"volume_context":{"clusterID":"rook-ceph","pool":"replicapool","storage.kubernetes.io/csiProvisionerIdentity":"1567714858772-8081-rook-ceph.rbd.csi.ceph.com"},"volume_id":"0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230"}
I0906 13:43:13.678720   99360 utils.go:125] ID: 31 GRPC response: {}
---> <end> interspersed CSI calls <---
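The serialization seen above (ID: 29 blocking until ID: 23 fails and releases the lock) is the behavior of a per-volume-ID lock in the nodeplugin. A minimal sketch of the pattern, assuming a simple map of mutexes keyed by volume ID (ceph-csi's actual implementation differs in detail):

```go
package main

import (
	"fmt"
	"sync"
)

// VolumeLocks serializes operations on the same volume ID while
// letting operations on different volumes proceed concurrently.
type VolumeLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func NewVolumeLocks() *VolumeLocks {
	return &VolumeLocks{locks: make(map[string]*sync.Mutex)}
}

// Lock blocks if another request already holds the lock for volumeID.
func (v *VolumeLocks) Lock(volumeID string) {
	v.mu.Lock()
	l, ok := v.locks[volumeID]
	if !ok {
		l = &sync.Mutex{}
		v.locks[volumeID] = l
	}
	v.mu.Unlock()
	l.Lock()
}

func (v *VolumeLocks) Unlock(volumeID string) {
	v.mu.Lock()
	l := v.locks[volumeID]
	v.mu.Unlock()
	l.Unlock()
}

func main() {
	vl := NewVolumeLocks()
	vl.Lock("vol-1") // first NodeStageVolume (ID: 23) holds the lock
	done := make(chan struct{})
	go func() {
		vl.Lock("vol-1") // retried request (ID: 29) blocks here
		vl.Unlock("vol-1")
		close(done)
	}()
	vl.Unlock("vol-1") // ID: 23 fails and releases the lock
	<-done             // only now can ID: 29 proceed
	fmt.Println("second request proceeded after the first released the lock")
}
```

With this pattern, a retried NodeStageVolume cannot make progress until the earlier, still-running attempt for the same volume finishes, which is exactly the window seen between 13:43:08 and 13:43:13 in the logs.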

===> So CSI took 2 minutes 6 seconds to complete its operation (due to an internal failure, which is irrelevant to this bug)

Sep 06 13:43:13 ip-10-0-131-197 systemd[1]: Started Kubernetes transient mount for /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a/globalmount/0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230.

===> kubelet detects the successful mount (this message is new since the last version under test; just an observation)
Sep 06 13:43:13 ip-10-0-131-197 hyperkube[1634]: I0906 13:43:13.629848    1634 operation_generator.go:623] MountVolume.MountDevice succeeded for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84") device mount path "/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a/globalmount"

===> The final success message below arrives 5 minutes 6 seconds later, possibly (from code reading) due to the recursive fsGroup setting
Sep 06 13:48:19 ip-10-0-131-197 hyperkube[1634]: I0906 13:48:19.525830    1634 operation_generator.go:667] MountVolume.SetUp succeeded for volume "pvc-929d9440-d01c-11e9-84fe-02b79d8eaf4a" (UniqueName: "kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com^0001-0009-rook-ceph-0000000000000001-92b4f24c-d01c-11e9-9996-0a580a810230") pod "osio-worker-1" (UID: "f6286169-d0ab-11e9-8b8c-0a41e590bc84")

Comment 29 Chao Yang 2019-11-06 02:12:27 UTC
Thanks @srangana
Updating the bug to Verified per https://bugzilla.redhat.com/show_bug.cgi?id=1745773#c28

Comment 31 errata-xmlrpc 2020-01-23 11:05:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062