Created attachment 2015969 [details]
Two StorageSystems

Description of problem (please be detailed as possible and provide log snippets):

ODF is still able to provision PVCs and otherwise appears to be operating normally. However, the customer hit a mirroring issue in their disconnected environment, and on further inspection they believed the root cause was that the cluster had two storagesystems (one of them deleting, held up by a NooBaa finalizer). After some tests, ODF Support confirmed that this was not the cause and that ODF is functioning, but the situation still needs to be remedied.

ODF Support is opening this Bugzilla for two reasons. First, the customer believes this to be a bug: they state firmly that the creation of the second storagesystem and the deletion of the initial storagesystem were automated and were not executed by their staff. ODF Support's research disagrees with this assertion, but it is worth pursuing if their statements are true. Second, the first/initial storagesystem will need to be reconciled/deleted/uninstalled. ODF Support is comfortable patching the finalizers to delete/purge the storagesystem stuck in deletion; however, there are unknowns as to whether this will cause data loss, so we are seeking Engineering's guidance to facilitate the process and green-light Support's steps.

For a more detailed analysis of the problem along with log snippets, ODF Support's conclusions from the must-gathers/logs will be in a private comment on this BZ. The same analysis was also given to the customer as a case comment.
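For reference, the finalizer-patching approach Support describes would look roughly like the sketch below. The storagesystem name and namespace are assumptions (substitute the object actually stuck in Terminating), and the `oc` commands are only echoed here, not executed; nothing should be run against the cluster until Engineering confirms there is no data-loss risk.

```shell
#!/bin/sh
# Sketch of the finalizer-removal step ODF Support proposes.
# NAME is hypothetical; use the storagesystem stuck in Terminating.
set -eu

NS=openshift-storage
NAME=ocs-storagecluster-storagesystem   # hypothetical name
# A merge patch setting finalizers to null clears them, letting the
# pending delete complete:
PATCH='{"metadata":{"finalizers":null}}'

# First inspect which finalizers are holding the delete, then clear them:
echo "oc get storagesystem $NAME -n $NS -o jsonpath='{.metadata.finalizers}'"
echo "oc patch storagesystem $NAME -n $NS --type=merge -p '$PATCH'"
```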
Version of all relevant components (if applicable):

OCP:
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.12.27   True        False         137d    Cluster version is 4.12.27

ODF:
NAME                                    DISPLAY                       VERSION        REPLACES                                PHASE
mcg-operator.v4.12.7-rhodf              NooBaa Operator               4.12.7-rhodf   mcg-operator.v4.12.6-rhodf              Succeeded
ocs-operator.v4.12.7-rhodf              OpenShift Container Storage   4.12.7-rhodf   ocs-operator.v4.12.6-rhodf              Succeeded
odf-csi-addons-operator.v4.12.7-rhodf   CSI Addons                    4.12.7-rhodf   odf-csi-addons-operator.v4.12.6-rhodf   Succeeded
odf-operator.v4.12.7-rhodf              OpenShift Data Foundation     4.12.7-rhodf   odf-operator.v4.12.6-rhodf              Succeeded
quay-operator.v3.8.11                   Red Hat Quay                  3.8.11         quay-operator.v3.8.10                   Succeeded

Ceph:
{
    "mon": {
        "ceph version 16.2.10-187.el8cp (5d6355e2bccd18b5c6457a34cb666d773f21823d) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.10-187.el8cp (5d6355e2bccd18b5c6457a34cb666d773f21823d) pacific (stable)": 1
    },
    "osd": {
        "ceph version 16.2.10-187.el8cp (5d6355e2bccd18b5c6457a34cb666d773f21823d) pacific (stable)": 6
    },
    "mds": {
        "ceph version 16.2.10-187.el8cp (5d6355e2bccd18b5c6457a34cb666d773f21823d) pacific (stable)": 2
    },
    "overall": {
        "ceph version 16.2.10-187.el8cp (5d6355e2bccd18b5c6457a34cb666d773f21823d) pacific (stable)": 12
    }
}

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
There does not appear to be any impact; the only concern is possible data loss when removing the first storagesystem.

Is there any workaround available to the best of your knowledge?
Deleting the storagesystem that is in the "Terminating" state. Unsure whether this will cause data loss.

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
3

Can this issue be reproducible?
Yes

Can this issue reproduce from the UI?
No

Steps to Reproduce:
1. With the first storagesystem deployed and healthy, create a second storagesystem using the steps outlined in: https://access.redhat.com/articles/5692201#create-cluster-11
2. Note: this CR needs to have the EXACT configuration as the first storagesystem, but a different name.
3. Create the storagesystem with: $ oc create -f storagecluster.yaml
4. The newly created storagesystem is placed in a progressing/creating state, as it cannot be created while a storagesystem/storagecluster already exists.
5. Delete the first/original storagesystem.
6. The first/original storagesystem will most likely get stuck in a "Terminating/Deleting" state, held up by a finalizer/admission webhook (e.g. NooBaa).
7. Because the first/original storagesystem is stuck in "Terminating", ODF then allows the second/newly created storagesystem to transition from "Progressing/Creating" to provisioned.
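The reproduction steps above can be sketched as a script. The CR shape and all names below are assumptions, not taken from this cluster (per the linked article, the second CR must match the first exactly except for metadata.name), and the `oc` commands are echoed rather than executed so the flow can be reviewed before touching a live cluster.

```shell
#!/bin/sh
# Sketch of the reproduction flow. All resource names are hypothetical.
set -eu

# Step 2: a duplicate StorageSystem CR, identical spec, different name.
cat > second-storagesystem.yaml <<'EOF'
apiVersion: odf.openshift.io/v1alpha1
kind: StorageSystem
metadata:
  name: ocs-storagecluster-storagesystem-2   # hypothetical second name
  namespace: openshift-storage
spec:
  kind: storagecluster.ocs.openshift.io/v1
  name: ocs-storagecluster
  namespace: openshift-storage
EOF

# Steps 3 and 5: create the duplicate, then delete the original.
echo "oc create -f second-storagesystem.yaml"
echo "oc delete storagesystem ocs-storagecluster-storagesystem -n openshift-storage"
# Steps 6-7: watch the original hang in Terminating on a finalizer while
# the duplicate moves from Progressing to provisioned.
echo "oc get storagesystem -n openshift-storage"
```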