Bug 1705400

Summary: [RFE] upgrade playbook should not update already upgraded glusterfs pods.
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Nitin Goyal <nigoyal>
Component: cns-ansible
Assignee: John Mulligan <jmulligan>
Status: CLOSED WONTFIX
QA Contact: Nitin Goyal <nigoyal>
Severity: high
Docs Contact:
Priority: unspecified
Version: ocs-3.11
CC: asambast, hchiramm, jarrpa, knarra, kramdoss, madam, rhs-bugs, rtalur, sarumuga
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-09-06 16:22:08 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1703695

Comment 4 Yaniv Kaul 2019-05-02 13:32:34 UTC
Why is the severity urgent? What's the issue?

Comment 6 RamaKasturi 2019-05-03 06:18:53 UTC
(In reply to Yaniv Kaul from comment #4)
> Why is the severity urgent? What's the issue?

Hello Yaniv,

Whenever the upgrade playbook is rerun, we see that the glusterfs pods are upgraded again even though they have already been upgraded. Instead, the playbook should simply check whether a pod is already on the latest version and, if so, not try to upgrade it again.

Because of this, the upgrade procedure can take much longer to complete if it fails partway through and has to be rerun, which customers really do not want to happen, IMO.

Thanks
kasturi
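
A rough sketch of the kind of skip logic being asked for (illustrative only, not the actual cns-ansible code; the glusterfs=storage-pod label, the glusterfs_namespace and glusterfs_target_image variables, and the glusterfs_upgrade.yml task file are placeholders invented for this example):

# Illustrative: compare the image each glusterfs pod is currently running
# against the target image and only run the rolling upgrade when needed.
- name: Collect the images the glusterfs pods are running
  command: >
    oc get pods -n {{ glusterfs_namespace }} -l glusterfs=storage-pod
    -o jsonpath='{.items[*].spec.containers[0].image}'
  register: current_images
  changed_when: false

- name: Work out whether any pod still needs the new image
  set_fact:
    glusterfs_upgrade_needed: "{{ current_images.stdout.split() | difference([glusterfs_target_image]) | length > 0 }}"

- name: Run the existing rolling upgrade only when needed
  include_tasks: glusterfs_upgrade.yml   # hypothetical file standing in for the current upgrade steps
  when: glusterfs_upgrade_needed | bool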

Comment 7 Yaniv Kaul 2019-05-03 07:17:14 UTC
Thanks, certainly not Urgent severity then.

Comment 8 Jose A. Rivera 2019-05-03 19:15:21 UTC
After some discussion, I have come to understand that this is definitely undesirable behavior. However, there is no danger to data integrity by just restarting the same pods over and over, as the playbook takes care to wait for the cluster to heal before proceeding. This does introduce considerable delay to the process, though, so it should definitely be addressed at some point.

Marking this as an RFE for the next release.
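
For reference, the "wait for the cluster to heal" step mentioned above amounts to something like the following (a sketch only; {{ glusterfs_namespace }}, {{ glusterfs_pod }} and {{ glusterfs_volumes }} are placeholders, and the real playbook's heal-info parsing may differ):

# Illustrative: poll heal info for every volume and only move on to the
# next glusterfs pod once no volume reports pending heal entries.
- name: Wait for self-heal to finish before restarting the next pod
  shell: >
    oc rsh -n {{ glusterfs_namespace }} {{ glusterfs_pod }}
    gluster volume heal {{ item }} info
    | awk '/Number of entries/ {sum += $NF} END {print sum+0}'
  register: pending_heals
  until: pending_heals.stdout | int == 0
  retries: 60
  delay: 10
  changed_when: false
  loop: "{{ glusterfs_volumes }}"

Because this heal wait is what makes repeated restarts safe, skipping already-upgraded pods would only save time; it would not change the data-safety picture described above.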

Comment 9 Ashmitha Ambastha 2019-06-26 09:31:02 UTC
Hi Jose, 

If the upgrade playbook fails while upgrading the glusterfs pods, re-running the playbook will result in it failing again.

The upgrade playbook is supposed to check the state of the cluster before starting the upgrade. That is, it should verify that all the OCS pods (glusterfs, gluster-block provisioner and heketi) are in the 1/1 Running state and that all the nodes are Ready.

Hence, if the playbook fails during the glusterfs pod upgrades on the first attempt, re-running it will fail during the pre-checks themselves, because the heketi and glusterblock-provisioner pods are not available.

We'll need to state that, in such cases, the admin has to bring the cluster back to the state it was in before the first upgrade attempt and only then re-run the playbook.
This is important; we need to decide how we will handle such scenarios.
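
For clarity, the pre-checks being described would look roughly like this (illustrative only; the column positions come from plain "oc get ... --no-headers" output and {{ glusterfs_namespace }} is a placeholder):

# Illustrative pre-checks: fail early if any pod in the OCS namespace is
# not 1/1 Running, or if any node is not Ready.
- name: Verify every OCS pod is 1/1 Running
  shell: >
    oc get pods -n {{ glusterfs_namespace }} --no-headers
    | awk '$2 != "1/1" || $3 != "Running"'
  register: unhealthy_pods
  changed_when: false
  failed_when: unhealthy_pods.stdout | length > 0

- name: Verify every node is Ready
  shell: oc get nodes --no-headers | awk '$2 !~ /^Ready/'
  register: not_ready_nodes
  changed_when: false
  failed_when: not_ready_nodes.stdout | length > 0

With the pre-checks written against the whole namespace like this, a failed first attempt (missing heketi / glusterblock-provisioner pods) blocks the re-run, which is exactly the problem described above.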

Comment 10 Jose A. Rivera 2019-06-26 13:42:03 UTC
Have you tested to make sure this is the actual behavior that's currently implemented? As best I can tell, we ONLY check the health of the GlusterFS pods. If the heketi or glusterblock-provisioner pods are not present, we should just proceed as normal and recreate them.
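
If the health gate really is meant to cover only the GlusterFS pods, the check would be scoped with a pod selector along these lines (again just a sketch; the glusterfs=storage-pod label and {{ glusterfs_namespace }} are assumptions, not the actual playbook values):

# Illustrative: gate only on the glusterfs pods, so a missing heketi or
# glusterblock-provisioner pod does not block a re-run of the playbook.
- name: Verify the glusterfs pods (only) are 1/1 Running
  shell: >
    oc get pods -n {{ glusterfs_namespace }} -l glusterfs=storage-pod --no-headers
    | awk '$2 != "1/1" || $3 != "Running"'
  register: unhealthy_glusterfs_pods
  changed_when: false
  failed_when: unhealthy_glusterfs_pods.stdout | length > 0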