Description of problem (please be as detailed as possible and provide log snippets):

At present, Ceph CSI sets 0777 permissions when staging a CephFS share, which is not the correct thing to do. CSI should leave the validation and adjustment to the CO/kubelet, based on the FSGroup change policy in place.

Version of all relevant components (if applicable):
ODF 4.10

Does this issue impact your ability to continue to work with the product (please explain in detail what the user impact is)?

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Is this issue reproducible?

Can this issue be reproduced from the UI?

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
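For reference, the kind of check the description wants left to the kubelet can be sketched roughly as follows. This is a simplified illustration in Python, not the kubelet's actual code: the function name `root_needs_change` and the exact permission bits it checks are assumptions made for the example, loosely modelled on the documented OnRootMismatch behaviour, where the recursive ownership change is skipped when the root of the volume already matches.

```python
import stat

# Group read/write/execute bits (0o070). Which bits the kubelet really
# requires is an assumption in this sketch.
GROUP_RWX = stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP


def root_needs_change(root_gid: int, root_mode: int, fs_group: int) -> bool:
    """Return True if the volume root does not already match fs_group.

    Hypothetical helper: it only looks at the root directory, which is
    the idea behind an "on root mismatch" policy.
    """
    if root_gid != fs_group:
        return True  # root is owned by a different group
    if root_mode & GROUP_RWX != GROUP_RWX:
        return True  # group is missing some of rwx
    if not root_mode & stat.S_ISGID:
        return True  # setgid bit is not set on the root
    return False
```

With a check like this on the CO side, a driver that forces 0777 during staging is doing work that the policy was meant to decide.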
Karthick, Racheal, can you help to get QE ack on this?
Humble - could you please update us on what additional tests would be needed to cover the validation?
As far as verification goes, I believe we are good as long as all the existing operations and CI tests pass.
Rook PR to change the default policy in light of the above bug fix in Ceph CSI: https://github.com/rook/rook/pull/9729
It looks like even in release 4.9 the default was not changed to NONE and has been kept as "onrootmismatch". If that's the case, no extra verification is required from this Bugzilla report's point of view.
(In reply to Humble Chirammal from comment #11)
> It looks like even in release 4.9 the default was not changed to NONE and
> has been kept as "onrootmismatch". If that's the case, no extra verification
> is required from this Bugzilla report's point of view.

Discard the above; it was meant for https://bugzilla.redhat.com/show_bug.cgi?id=2059248.
Looking further into this (on Racheal++'s test setup), it seems that even though the chmod to 777 at node stage has been removed, we are still passing mode 777 to go-ceph when subvolumes are created. With that in place, the node stage change becomes a no-op, and that needs to be corrected too for the fix to be complete. https://github.com/ceph/ceph-csi/blob/devel/internal/cephfs/core/volume.go#L234
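To illustrate why the node stage change alone ends up as a no-op, here is a minimal local-filesystem sketch in Python. It does not use CephFS or go-ceph; `provision` and `stage` are hypothetical stand-ins for subvolume creation and node staging, and the directory is an ordinary temporary directory.

```python
import os
import stat
import tempfile


def provision(path: str, mode: int) -> None:
    """Stand-in for subvolume creation with an explicit mode."""
    os.mkdir(path)
    os.chmod(path, mode)  # explicit chmod so the umask does not interfere


def stage(path: str, force_mode=None) -> int:
    """Stand-in for node staging; returns the resulting permission bits.

    force_mode=None models the fixed behaviour (no chmod at staging).
    """
    if force_mode is not None:
        os.chmod(path, force_mode)
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        subvol = os.path.join(tmp, "subvol")
        provision(subvol, 0o777)   # creation still asks for 777
        print(oct(stage(subvol)))  # staging no longer chmods
```

Even with the chmod removed from staging, the directory keeps the mode it was created with, so both places would need to change for the permissions to differ.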
(In reply to Humble Chirammal from comment #13)
> Looking further into this (on Racheal++'s test setup), it seems that even
> though the chmod to 777 at node stage has been removed, we are still passing
> mode 777 to go-ceph when subvolumes are created. With that in place, the
> node stage change becomes a no-op, and that needs to be corrected too for
> the fix to be complete.
>
> https://github.com/ceph/ceph-csi/blob/devel/internal/cephfs/core/volume.go#L234

This bug report is about the Ceph CSI driver intervening at node staging time to change permissions, and that has already been fixed. The scenario mentioned in my comment above is a bit different, and it may not be a good idea to mix it with this bug. With that in mind, I am flipping the status back to ON_QA.
Thanks a lot, Rachael, for following up on the tests based on the discussions across different ODF clusters!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.10.0 enhancement, security & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:1372