Bug 1417349 - [RFE] support for Upgrade with Instance HA running
Summary: [RFE] support for Upgrade with Instance HA running
Keywords:
Status: CLOSED DUPLICATE of bug 1264181
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Michele Baldessari
QA Contact: Arik Chernetsky
URL:
Whiteboard:
Depends On: 1264181
Blocks: 1419948 1458798
 
Reported: 2017-01-28 03:20 UTC by arkady kanevsky
Modified: 2019-09-09 13:17 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-18 19:55:44 UTC
Target Upstream Version:
Embargoed:



Description arkady kanevsky 2017-01-28 03:20:40 UTC
Description of problem:
Need support for upgrade and update with Instance HA turned on and customer VMs running that rely on it.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 arkady kanevsky 2017-01-28 03:24:52 UTC
Umbrella BZ to track all instance HA upgrade work.

Sean,
not sure which component it should be under.
Hoping to land it for JS-7.0, so it would be backported to OSP10 once done upstream.
Hopefully in Ocata not Pike.

Comment 2 Andrew Beekhof 2017-01-30 00:22:55 UTC
Since (unfortunately) everyone deploys IHA slightly differently, we cannot create a single script for everybody.  However this is an outline of the basic procedure that JT is adapting for JS.

1. Remove resources targeted for compute nodes
    Starting point for finding compute resources:
    # pcs resource show | grep compute | grep -v -e Stopped: -e Started: -e disabled -e remote | awk '{print $3}'

    For each resource run:
    # pcs resource cleanup ${resource}
    # pcs --force resource delete ${resource}

2. Remove the evacuation resource
    Starting point for finding the evacuation resource:
    # pcs resource show | grep NovaEvacuate | awk '{print $1}'

    # pcs resource cleanup nova-evacuate
    # pcs --force resource delete nova-evacuate

3. Remove the remote resources that represent the compute nodes
    Starting point for finding remote compute resources:
    # pcs resource show | grep :remote | awk '{print $1}'

    For each node/resource run:
    # pcs resource cleanup ${resource}
    # pcs --force resource delete ${resource}

4. Erase the status entries corresponding to the compute nodes
    Starting point for finding compute nodes:
     # cibadmin -Q | grep "<node_state" | grep -v -e control

    For each compute node
    # cibadmin --delete --xml-text "<node id='${node}'/>"
    # cibadmin --delete --xml-text "<node_state id='${node}'/>"

5. Re-enable the systemd versions of the cluster services we deleted in step 1

6. Now it is safe to update/upgrade the installation

Once the installation is complete, re-apply the following steps from
the IHA installation procedure:
   1, 4, 6-8, 13, 17, 18
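
For illustration only, steps 1-4 above could be wrapped in a few shell functions along the following lines. This is a minimal sketch, not the script JT is adapting: the function names and the RUN variable are hypothetical, and only the pcs, cibadmin, grep and awk invocations are taken from the outline. By default it prints the commands instead of running them, since resource names differ between deployments. Step 5 is not covered because the outline does not name the systemd services.

```shell
#!/bin/sh
# Hypothetical helpers for steps 1-4 of the outline above.
# Dry run by default: commands are printed, not executed.
# Set RUN=1 (an assumption of this sketch) to execute them.

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Step 1 filter: reads `pcs resource show` output on stdin and prints
# candidate compute resource names (same pipeline as in the outline).
compute_resources() {
    grep compute | grep -v -e Stopped: -e Started: -e disabled -e remote | awk '{print $3}'
}

# Step 3 filter: reads `pcs resource show` output on stdin and prints
# the remote resources that represent compute nodes.
remote_resources() {
    grep :remote | awk '{print $1}'
}

# Steps 1-3: clean up and delete a single resource.
remove_resource() {
    run pcs resource cleanup "$1"
    run pcs --force resource delete "$1"
}

# Step 4: erase the node and status entries for one compute node.
erase_node_state() {
    run cibadmin --delete --xml-text "<node id='$1'/>"
    run cibadmin --delete --xml-text "<node_state id='$1'/>"
}

# Example use (review the dry-run output before setting RUN=1):
#   pcs resource show | compute_resources | while read -r r; do remove_resource "$r"; done
#   remove_resource nova-evacuate
#   pcs resource show | remote_resources | while read -r r; do remove_resource "$r"; done
```

Printing first and executing only on request keeps the site-specific part (which resources and nodes actually get matched) visible before anything is deleted.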

Comment 3 Sean Merrow 2017-01-30 20:29:30 UTC
Setting component to Nova as I think that is where Instance HA would belong. 

I'm not sure what Andrew's comments mean for this RFE. Does it mean that the RFE won't happen, or should we consult with PM?

@Arkady, I welcome your thoughts on Andrew's comments as well.

Thanks folks!

Comment 4 Sean Merrow 2017-01-30 20:50:58 UTC
Looks like this has been requested for a long time and by several partners now, and is a duplicate of the following BZ, which is currently targeted for OSP 12:

https://bugzilla.redhat.com/show_bug.cgi?id=1264181

Comment 5 Stephen Gordon 2017-01-30 21:01:56 UTC
(In reply to Sean Merrow from comment #3)
> Setting component to Nova as I think that is where Instance HA would belong. 

The openstack-nova component refers specifically to the Nova services themselves as packaged in the openstack-nova-* RPMs. Nova provides all the API calls required for Instance HA (since RHOSP 8 IIRC) but the requirement is that TripleO deploy and configure both Nova and the supporting pacemaker infrastructure per the existing KBase for manual deployment to enable Instance HA out of the box.

> I'm not sure what Andrew's comments mean for this RFE. Does it mean that the
> RFE won't happen, or should we consult with PM?

The RFE for this to be integrated into Director is BZ#1264181. Andrew's comment refers to the manual steps required to manage the upgrade in lieu of Director being able to do it; those steps would need to be orchestrated elsewhere as a workaround.

Comment 6 arkady kanevsky 2017-06-05 17:45:45 UTC
Sorry for the late reply.
I think JT and Andrew worked through the steps, and JT created a script that one needs to run before the upgrade to turn off IHA and then turn it back on after the upgrade.
This is acceptable short term but does not really deliver upgrade with IHA running.

Not sure if we need a separate BZ for it.

Comment 7 Sean Merrow 2018-05-18 19:55:44 UTC
Closing as a dup of 1264181. That RFE is on-track currently for OSP 13.

*** This bug has been marked as a duplicate of bug 1264181 ***

