Bug 2053156 - Avoid worldwide permission mode setting at time of nodestage of CephFS share
Summary: Avoid worldwide permission mode setting at time of nodestage of CephFS share
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: csi-driver
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ODF 4.10.0
Assignee: Humble Chirammal
QA Contact: Rachael
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-02-10 15:53 UTC by Humble Chirammal
Modified: 2023-08-09 16:37 UTC (History)
8 users

Fixed In Version: 4.10.0-160
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-13 18:53:05 UTC
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph-csi pull 2847 0 None Merged cephfs: dont set explicit permissions on the volume 2022-02-10 15:55:23 UTC
Github red-hat-storage ceph-csi pull 79 0 None open Bug 2053156: cephfs: dont set explicit permissions on the volume 2022-02-14 06:23:50 UTC
Red Hat Product Errata RHSA-2022:1372 0 None None None 2022-04-13 18:53:18 UTC

Description Humble Chirammal 2022-02-10 15:53:43 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

At present, Ceph CSI sets 0777 permissions at the time of staging a CephFS share, which is not the correct thing to do. CSI should leave permission validation and adjustment to the CO/kubelet, based on the fsGroupChangePolicy in place.
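For illustration, a minimal Pod sketch (all names here are hypothetical) of the mechanism the driver should defer to: kubelet applies the pod's fsGroup according to fsGroupChangePolicy, instead of the driver forcing 0777 at NodeStage:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cephfs-demo                # hypothetical name
spec:
  securityContext:
    fsGroup: 1000
    # Kubelet recursively adjusts ownership/permissions of the volume only
    # when the volume root does not already match fsGroup.
    fsGroupChangePolicy: OnRootMismatch
  containers:
    - name: app
      image: registry.example.com/app:latest   # hypothetical image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: cephfs-pvc                  # hypothetical PVC name
```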


Version of all relevant components (if applicable):

ODF 4.10

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1.
2.
3.


Actual results:


Expected results:


Additional info:

Comment 2 Humble Chirammal 2022-02-11 08:15:23 UTC
Karthick, Rachael, can you help us get the QE ack on this?

Comment 3 krishnaram Karthick 2022-02-11 09:26:45 UTC
Humble - could you please update us on what additional tests would be needed to cover the validation?

Comment 4 Humble Chirammal 2022-02-11 11:33:49 UTC
As far as verification goes, I believe we are good as long as all the existing operations and CI tests pass.

Comment 7 Humble Chirammal 2022-02-11 11:44:08 UTC
Rook PR to change the default policy in light of the above bug fix in Ceph CSI: https://github.com/rook/rook/pull/9729
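For context, the kind of setting that Rook PR adjusts can be sketched as a CSIDriver object (the driver name shown is an assumption and varies per install). With fsGroupPolicy: File, kubelet is allowed to manage the volume's permissions per the pod's fsGroupChangePolicy:

```yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: openshift-storage.cephfs.csi.ceph.com   # assumed name; set by Rook/ODF
spec:
  # "File" tells kubelet it may change permissions and ownership of the
  # volume to match the pod's fsGroup, honoring fsGroupChangePolicy.
  fsGroupPolicy: File
  attachRequired: false
  podInfoOnMount: false
```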

Comment 11 Humble Chirammal 2022-03-07 07:49:13 UTC
It looks like even in release 4.9 the default was not changed to None and has been kept as "OnRootMismatch". If that's the case, no extra verification is required from this bugzilla report's point of view.

Comment 12 Humble Chirammal 2022-03-07 07:50:36 UTC
(In reply to Humble Chirammal from comment #11)
> It looks like even in release 4.9 the default was not changed to None and
> has been kept as "OnRootMismatch". If that's the case, no extra
> verification is required from this bugzilla report's point of view.

Please discard the above; it was meant for https://bugzilla.redhat.com/show_bug.cgi?id=2059248.

Comment 13 Humble Chirammal 2022-03-10 11:53:12 UTC
Looking further into this (on Rachael++'s test setup), it seems that even though the NodeStage chmod to 777 has been avoided, we are still calling go-ceph with 777 mode when subvolumes are created. With that in place, the NodeStage change becomes a no-op, and it needs to be corrected too to complete the fix.

https://github.com/ceph/ceph-csi/blob/devel/internal/cephfs/core/volume.go#L234
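To illustrate why the subvolume mode matters even with the NodeStage chmod removed: a forced 0777 really does land on disk, since chmod is not filtered by the process umask. A minimal, runnable Go sketch (the path is a hypothetical stand-in for the staging target; this is not the driver code):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// forceWorldWritable mimics what the CephFS node plugin used to do at
// NodeStage: chmod the staging directory to 0777. Chmod is not masked by
// the process umask, so the directory ends up fully world-writable.
func forceWorldWritable() (os.FileMode, error) {
	stagingPath := filepath.Join(os.TempDir(), "cephfs-staging-demo")
	if err := os.MkdirAll(stagingPath, 0o755); err != nil {
		return 0, err
	}
	defer os.RemoveAll(stagingPath)

	if err := os.Chmod(stagingPath, 0o777); err != nil {
		return 0, err
	}
	fi, err := os.Stat(stagingPath)
	if err != nil {
		return 0, err
	}
	return fi.Mode().Perm(), nil
}

func main() {
	perm, err := forceWorldWritable()
	if err != nil {
		panic(err)
	}
	fmt.Printf("forced mode: %v\n", perm) // prints: forced mode: -rwxrwxrwx
}
```

The fix referenced above instead leaves the mode to the cluster defaults so kubelet's fsGroup handling stays meaningful.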

Comment 14 Humble Chirammal 2022-03-10 14:54:04 UTC
(In reply to Humble Chirammal from comment #13)
> Looking further into this (on Rachael++'s test setup), it seems that even
> though the NodeStage chmod to 777 has been avoided, we are still calling
> go-ceph with 777 mode when subvolumes are created. With that in place, the
> NodeStage change becomes a no-op, and it needs to be corrected too to
> complete the fix.
>
> https://github.com/ceph/ceph-csi/blob/devel/internal/cephfs/core/volume.go#L234

This bug report is about the Ceph CSI driver's interception at the time of node staging to change permissions, and that has already been fixed.
The scenario mentioned in my comment above is a bit different, and it may not be a good idea to mix it with this bug.
With that in mind, I am flipping the status back to ON_QA.

Comment 16 Humble Chirammal 2022-03-15 07:49:56 UTC
Thanks a lot, Rachael, for following up with the tests across the different ODF clusters based on the discussions!

Comment 18 errata-xmlrpc 2022-04-13 18:53:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1372

