Bug 2207619

Summary: StorageSystem status for IBM FlashSystem stuck on "Progressing" even though FSC is in "Ready" phase
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Alon Firestein <alon.firestein>
Component: odf-operator
Assignee: Nitin Goyal <nigoyal>
Status: MODIFIED ---
QA Contact: Elad <ebenahar>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.13
CC: bvered, muagarwa, odf-bz-bot
Target Milestone: ---
Flags: nigoyal: needinfo? (alon.firestein)
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description: ZIP file containing screenshots to explain bug
Flags: none

Description Alon Firestein 2023-05-16 11:06:32 UTC
Created attachment 1964895 [details]
ZIP file containing screenshots to explain bug

Description of problem (please be as detailed as possible and provide log
snippets):
After installing ODF and selecting IBM FlashSystem Storage as the platform, the StorageSystem created for the FlashSystem storage occasionally has its status stuck on "Progressing" (attached image 1), even though the FlashSystemCluster object has finished successfully and is in the "Ready" phase (attached image 2).
This happens because of minor differences in installation timing: occasionally the FlashSystemCluster CRD is not found, and the "VendorSystemPresent" condition of the StorageSystem then enters an error status with the following message: no matches for Kind "FlashSystemCluster" in version "odf.ibm.com/v1alpha1" (attached image 3).
The error persists even though the CRD does exist.
We found that deleting the odf-operator-controller-manager pod fixes the stuck status: once the pod is recreated, it recognizes the "FlashSystemCluster" kind and correctly changes the StorageSystem status to "Available".
Without deleting that pod, the status never changes to "Available" as it should.
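
For anyone trying to confirm they are hitting the same symptom, here is a minimal sketch that inspects a StorageSystem object dumped as JSON (for example from `oc get storagesystem <name> -o json`) and checks whether the "VendorSystemPresent" condition carries the error message quoted above. The sample object, the "False" status value on that condition, and the helper names `find_condition` and `looks_stuck` are assumptions made for illustration; they are not taken from the attached screenshots.

```python
import json

# Error text quoted in this report; matched case-insensitively.
STALE_KIND_MARKER = 'no matches for kind "flashsystemcluster"'


def find_condition(storage_system, cond_type):
    """Return the status condition dict with the given type, or None."""
    conditions = storage_system.get("status", {}).get("conditions", [])
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond
    return None


def looks_stuck(storage_system):
    """True if VendorSystemPresent carries the 'no matches for Kind' message."""
    cond = find_condition(storage_system, "VendorSystemPresent")
    if cond is None:
        return False
    return STALE_KIND_MARKER in cond.get("message", "").lower()


# Hypothetical sample, shaped like standard Kubernetes status conditions.
SAMPLE = json.loads("""
{
  "kind": "StorageSystem",
  "status": {
    "conditions": [
      {"type": "Available", "status": "False"},
      {"type": "Progressing", "status": "True"},
      {"type": "VendorSystemPresent", "status": "False",
       "message": "no matches for Kind \\"FlashSystemCluster\\" in version \\"odf.ibm.com/v1alpha1\\""}
    ]
  }
}
""")

if __name__ == "__main__":
    print(looks_stuck(SAMPLE))
```

If `looks_stuck` returns True while the FlashSystemCluster itself reports "Ready", the state matches what is described in this report.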


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?
Yes. As explained above, deleting the odf-operator-controller-manager pod allows the StorageSystem to recognize the "FlashSystemCluster" kind and change its status to "Available".
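
The workaround can be scripted roughly as follows. This is only a sketch: it assumes the operator runs in the `openshift-storage` namespace and that a logged-in `oc` client is on the PATH, and the helper names (`pods_to_restart`, `delete_command`, `restart_operator`) are hypothetical. The pod is matched by name prefix because its full name includes a generated suffix.

```python
import subprocess

POD_PREFIX = "odf-operator-controller-manager"
NAMESPACE = "openshift-storage"  # assumption: namespace where ODF is installed


def pods_to_restart(pod_names, prefix=POD_PREFIX):
    """Keep only the pod names that belong to the odf-operator controller manager."""
    return [name for name in pod_names if name.startswith(prefix)]


def delete_command(pod_name, namespace=NAMESPACE):
    """Build the `oc delete pod` command line for one pod."""
    return ["oc", "delete", "pod", pod_name, "-n", namespace]


def restart_operator(namespace=NAMESPACE):
    """List pods in the namespace and delete the operator pod(s).

    Requires a logged-in `oc`; the Deployment then recreates the pod.
    """
    out = subprocess.run(
        ["oc", "get", "pods", "-n", namespace, "-o", "name"],
        check=True, capture_output=True, text=True,
    ).stdout
    # `-o name` prints entries such as "pod/<name>"; strip the resource prefix.
    names = [line.split("/", 1)[-1] for line in out.split()]
    for name in pods_to_restart(names, POD_PREFIX):
        subprocess.run(delete_command(name, namespace), check=True)
```

Calling `restart_operator()` on an affected cluster performs the same manual step described above; the StorageSystem status should then be re-evaluated by the new pod.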

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install ODF and choose IBM FlashSystem Storage as the Storage Platform.


Actual results:
After the installation completes, the status of the FlashSystem StorageSystem that was created is occasionally stuck on "Progressing".

Expected results:
The status should be "Available", because the FlashSystemCluster resource finished successfully and is in the "Ready" phase (attached image 2).

Additional info:

Comment 2 Mudit Agarwal 2023-05-23 06:18:54 UTC
Not a 4.13 blocker; we have a workaround. Will look into it shortly.

Comment 3 Nitin Goyal 2023-05-29 03:54:10 UTC
Can I get a must gather?

Comment 4 Vered Berenstein Paz 2023-05-30 07:21:08 UTC
Must gather logs were uploaded to box - https://ibm.box.com/s/mmolbcrf003u6cv6ulfo787k7r6u6swu

Comment 6 Nitin Goyal 2023-08-08 08:34:27 UTC
Hello, we updated the client APIs, so the problem should be fixed now. Moving it to MODIFIED.