Bug 1762658 - Delay in detecting a successful CSI mount or incorrect report of failed mount, by kubelet for Ceph CSI based volume (due to the recursive fsGroup setting)
Summary: Delay in detecting a successful CSI mount or incorrect report of failed mount...
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.2.z
Assignee: Fabio Bertinatto
QA Contact: Chao Yang
Whiteboard: ocs-monkey
Depends On: 1745773
Reported: 2019-10-17 07:59 UTC by Fabio Bertinatto
Modified: 2019-12-11 22:36 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1745773
Last Closed: 2019-12-11 22:36:06 UTC
Target Upstream Version:


System ID Priority Status Summary Last Updated
Github openshift origin pull 24006 'None' closed [release-4.2] Bug 1762658: UPSTREAM: 83747: Improve efficiency of csiMountMgr.GetAttributes 2020-05-27 13:18:31 UTC
Red Hat Product Errata RHBA-2019:4093 None None None 2019-12-11 22:36:16 UTC

Comment 9 Shyamsundar 2019-12-03 15:24:52 UTC
With the fix, the time taken to attach still seems higher than, or on par with, the time taken without it.

The times taken before and after the fix are as follows:
CSI Provider  File count  Pre-fix time  Post-fix time
csi-ceph-rbd  9500        83s           78s
csi-ceph-rbd  37300       116s          146s
csi-ebs-gp2   9500        85s           79s
csi-ebs-gp2   37300       111s          131s

### Tested with the following versions:

$ oc version
Client Version: v4.2.0-alpha.0-90-g420c3d6
Server Version: 4.2.0-0.ci-2019-11-30-132200
Kubernetes Version: v1.14.6+a9e953f

### On checking the page https://openshift-release.svc.ci.openshift.org/releasestream/4.2.0-0.ci/release/4.2.0-0.ci-2019-11-30-132200 I see the bug listed under the list of fixes.

I had previously tested the patch by building my own hyperkube with the patch applied and running it in my test cluster. That build showed improvements similar to the data in https://bugzilla.redhat.com/show_bug.cgi?id=1745773#c28

The patch was applied on top of the following commit:
commit ff885f256566a93dc2e42cc40b34bb9a9ca0ffa8
Author: Fabio Bertinatto <fbertina@redhat.com>
Date:   Thu Nov 7 10:22:18 2019 +0100

    UPSTREAM: 83747: Improve efficiency of csiMountMgr.GetAttributes

commit ccf0c2733f7f5cc76e220c1fe9bea909593512f1
Merge: 0517d42 99af4b7
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Mon Nov 4 06:23:03 2019 +0100

    Merge pull request #23910 from p0lyn0mial/fix-ns-conditions-integration-test-4-2
    Bug 1766365: TestNamespaceCondition integration test fails


I may try again today or tomorrow, picking up the latest CI builds and rerunning my tests, but any clarification on which build first contains the fix would also help in validating it.

Comment 10 Shyamsundar 2019-12-03 17:33:19 UTC
Retested with the following versions; the expected performance gains were observed.

$ oc version
Client Version: v4.2.0-alpha.0-92-g57a203c
Server Version: 4.2.0-0.ci-2019-12-03-064353
Kubernetes Version: v1.14.6+a9e953f

Attach speed-up information:
CSI Provider  File count  Pre-fix time  Post-fix time
csi-ceph-rbd  9500        83s           12s
csi-ceph-rbd  37300       116s          31s
csi-ebs-gp2   9500        85s           13s
csi-ebs-gp2   37300       111s          19s
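
To make the gains easier to compare, the timings in the table above can be turned into speed-up factors. The following is only a minimal sketch: the `timings` dictionary and the `speedup` helper are illustrative names, not part of any test tooling used for this bug, and the numbers are copied from the table as reported.

```python
# Attach timings in seconds, copied from the table in this comment.
# Keys are (CSI provider, file count); values are (pre-fix, post-fix).
timings = {
    ("csi-ceph-rbd", 9500): (83, 12),
    ("csi-ceph-rbd", 37300): (116, 31),
    ("csi-ebs-gp2", 9500): (85, 13),
    ("csi-ebs-gp2", 37300): (111, 19),
}


def speedup(pre_s, post_s):
    """Return how many times faster the post-fix attach is (hypothetical helper)."""
    return pre_s / post_s


for (provider, files), (pre_s, post_s) in timings.items():
    factor = speedup(pre_s, post_s)
    print(f"{provider:<13} {files:>6} files: {pre_s}s -> {post_s}s ({factor:.1f}x)")
```

On these figures the post-fix attach is roughly 3.7x to 6.9x faster, depending on the provider and file count.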

Comment 12 Chao Yang 2019-12-06 01:40:11 UTC
Updating the status based on the above comments.

Comment 14 errata-xmlrpc 2019-12-11 22:36:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

