Bug 1969216

Summary: Rook may recreate a file system with existing pools
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Patrick Donnelly <pdonnell>
Component: rook
Assignee: Subham Rai <srai>
Status: CLOSED ERRATA
QA Contact: Avi Liani <alayani>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.6
CC: assingh, dfuller, kbg, madam, muagarwa, ocs-bugs, odf-bz-bot, rperiyas, shan, sostapov, srai, tdesala, vumrao
Target Milestone: ---
Keywords: AutomationBackLog
Target Release: ODF 4.9.0
Hardware: All
OS: All
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
.Multiple file systems are not created with existing pools
With this update, after you create the `filesystem.yaml`, multiple file systems with the existing pool are not created even if you delete or recreate the `filesystem.yaml`. This avoids data loss.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-12-13 17:44:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2011326

Comment 2 Sébastien Han 2021-06-09 15:36:42 UTC
Unless we make it a blocker this will be addressed in 4.9 since 4.8 freeze is tomorrow and we won't have time to address this.

Comment 3 Travis Nielsen 2021-06-09 16:16:16 UTC
There is a related discussion in the upstream issue here: https://github.com/rook/rook/issues/8059#issuecomment-856951862

Basically, there is already a fix in 4.7 that will preserve the filesystem by default when the filesystem CR is deleted, which will help avoid the fundamental problem. The follow-up fix still needed is only to stop calling --force, which should be a no-op anyway since 4.7. Fixing this in 4.9 should be sufficient.

Comment 5 Subham Rai 2021-08-16 09:03:51 UTC
After the Ceph cluster is created, creating the Ceph filesystem more than once no longer creates another filesystem once the first one exists. The Rook operator logs show something like `ceph-file-controller: filesystem "myfs" already exists`.

Comment 12 Avi Liani 2021-10-20 06:50:01 UTC
@pdonnell Can you provide a reproduction procedure so we can verify this BZ?
From the explanation above, I cannot tell how to verify it.

Comment 13 Ramakrishnan Periyasamy 2021-10-25 12:10:06 UTC
@srai Could you add the steps to verify this bz, due to lack of verification steps this bz verification is getting delayed.

Comment 14 Subham Rai 2021-10-25 12:45:11 UTC
(In reply to Ramakrishnan Periyasamy from comment #13)
> @srai Could you add the steps to verify this bz, due to lack of
> verification steps this bz verification is getting delayed.
Earlier, there was a chance that Rook would create the filesystem multiple times. To verify, delete the `cephfilesystem` and try to create it again; Rook should not re-create the `filesystem`. If you check the Rook logs, they will say `ceph-file-controller: filesystem "myfs" already exists`.

I already mentioned the same in comment 5: https://bugzilla.redhat.com/show_bug.cgi?id=1969216#c5

Let me know if that makes sense.
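
The check described in this comment could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, and the `openshift-storage` namespace and the `rook-ceph-operator` deployment name are assumptions about a typical ODF install, not something stated in this bug. The operator reconciles asynchronously, so the log line may take a moment to appear.

```shell
# Sketch only. Assumptions: namespace "openshift-storage" and an operator
# deployment named "rook-ceph-operator"; adjust both for your cluster.
NS="${NS:-openshift-storage}"

# Return 0 if the log text on stdin reports that filesystem $1 already exists.
fs_already_exists_logged() {
  grep -qF "filesystem \"$1\" already exists"
}

# Delete the CephFilesystem CR, create it again from the manifest,
# then scan the operator log for the "already exists" message.
verify_no_recreate() {
  local name="$1" manifest="$2"
  oc delete cephfilesystem "$name" -n "$NS"
  oc create -f "$manifest"
  oc logs deployment/rook-ceph-operator -n "$NS" | fs_already_exists_logged "$name"
}
```

For example, `verify_no_recreate myfs CephFileSystem.yaml` returns success only if the operator log contains the message for `myfs`.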

Comment 15 Avi Liani 2021-10-26 11:38:35 UTC
verified on versions:

        OCP versions
        ==============

                NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
                version   4.9.0-0.nightly-2021-10-26-041726   True        False         28m     Cluster version is 4.9.0-0.nightly-2021-10-26-041726
                
        OCS versions
        ==============

                NAME                     DISPLAY                       VERSION   REPLACES   PHASE
                noobaa-operator.v4.9.0   NooBaa Operator               4.9.0                Succeeded
                ocs-operator.v4.9.0      OpenShift Container Storage   4.9.0                Succeeded
                odf-operator.v4.9.0      OpenShift Data Foundation     4.9.0                Succeeded
                
        Rook versions
        ===============

                2021-10-26 11:22:50.703585 I | op-flags: failed to set flag "logtostderr". no such flag -logtostderr
                rook: 4.9-208.3e7cb20.release_4.9
                go: go1.16.6
                
        Ceph versions
        ===============

                ceph version 16.2.0-140.el8cp (747f7a0286d51abc59b3a3a1a7cb17ec7a35754e) pacific (stable)
                

Create the CephFilesystem a few times using:

---
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: openshift-storage
spec:
  metadataPool:
    replicated:
      size: 3
  dataPools:
    - failureDomain: host
      replicated:
        size: 3
  metadataServer:
    activeCount: 1
    activeStandby: true

The first creation succeeded as expected. Every later attempt failed, also as expected, with the error: error when creating "CephFileSystem.yaml": cephfilesystems.ceph.rook.io "myfs" already exists

(ocs-ci) avili@vm-11-87 [DC27CL] (master) # oc create -f CephFileSystem.yaml
cephfilesystem.ceph.rook.io/myfs created
(ocs-ci) avili@vm-11-87 [DC27CL] (master) # oc get CephFilesystem -n openshift-storage
NAME                                ACTIVEMDS   AGE   PHASE
myfs                                1           36s   Ready
ocs-storagecluster-cephfilesystem   1           30m   Ready
(ocs-ci) avili@vm-11-87 [DC27CL] (master) # oc create -f CephFileSystem.yaml
Error from server (AlreadyExists): error when creating "CephFileSystem.yaml": cephfilesystems.ceph.rook.io "myfs" already exists
(ocs-ci) avili@vm-11-87 [DC27CL] (master) # oc create -f CephFileSystem.yaml
Error from server (AlreadyExists): error when creating "CephFileSystem.yaml": cephfilesystems.ceph.rook.io "myfs" already exists

Comment 16 Patrick Donnelly 2021-10-26 17:48:01 UTC
I think to reproduce this with fidelity, you should delete the file system using `ceph fs rm ...` in the toolbox container and then try to recreate the filesystem through rook using `CephFileSystem.yaml`.
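
A rough sketch of that sequence follows. The function name is hypothetical, and the `rook-ceph-tools` deployment name and the `openshift-storage` namespace are assumptions about the cluster. The arguments to `ceph fs rm` are left for the caller to supply, since the comment does not spell them out.

```shell
# Sketch only. Assumptions: the toolbox runs as deployment "rook-ceph-tools"
# in namespace "openshift-storage"; adjust both for your cluster.
NS="${NS:-openshift-storage}"

# Usage: recreate_after_fs_rm MANIFEST FS_RM_ARGS...
# Runs `ceph fs rm` in the toolbox with the caller-supplied arguments,
# then tries to create the filesystem again through Rook.
recreate_after_fs_rm() {
  local manifest="$1"
  shift
  oc exec -n "$NS" deployment/rook-ceph-tools -- ceph fs rm "$@" || return 1
  oc create -f "$manifest"
}
```

If the `CephFilesystem` CR still exists at that point, `oc create` fails with `AlreadyExists`, as in comment 15, so the CR would have to be removed first for the create step to reach the operator.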

Comment 17 Mudit Agarwal 2021-11-16 13:27:21 UTC
Please add doc text

Comment 18 Subham Rai 2021-11-17 04:43:42 UTC
(In reply to Mudit Agarwal from comment #17)
> Please add doc text

Doc text added. Thanks!

Comment 22 errata-xmlrpc 2021-12-13 17:44:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Data Foundation 4.9.0 enhancement, security, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:5086