Bug 1182956

Summary: [RFE] Support for High Availability on Red Hat OpenStack Platform (RHEL 8)
Product: Red Hat Enterprise Linux 8
Reporter: Paul Needle <pneedle>
Component: pacemaker
Assignee: Chris Feist <cfeist>
Status: CLOSED CURRENTRELEASE
QA Contact: cluster-qe <cluster-qe>
Severity: high
Docs Contact: Steven J. Levine <slevine>
Priority: high
Version: 8.3
CC: aherr, astupnik, berrange, bperkins, cfeist, chjones, chrisbro, cluster-maint, cswanson, dasmith, djansa, ealcaniz, eglynn, fdinitto, hbiswas, kchamart, kgaillot, kshawcro, lmiccini, lyarwood, markmc, mfuruta, mschuppe, nwahl, rbryant, rfreire, sbauza, sbradley, sgordon, shardy, slevine, sputhenp, srevivo, vcojot, vromanso
Target Milestone: rc
Keywords: FutureFeature, Reopened, TestOnly, Tracking, Triaged
Target Release: 8.5
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
.Support for High Availability on Red Hat OpenStack Platform
You can now configure a high availability cluster on the Red Hat OpenStack Platform. In support of this feature, Red Hat provides the following new cluster agents:

* `fence_openstack`: fencing agent for HA clusters on OpenStack
* `openstack-info`: resource agent to configure the `openstack-info` cloned resource, which is required for an HA cluster on OpenStack
* `openstack-virtual-ip`: resource agent to configure a virtual IP address resource
* `openstack-floating-ip`: resource agent to configure a floating IP address resource
* `openstack-cinder-volume`: resource agent to configure a block storage resource
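
For illustration, a minimal `pcs` sketch of wiring these agents together. This assumes a clouds.yaml profile named "ha-example"; the resource names, node name, instance UUID, and IP address below are placeholders, not values taken from this bug:

# Cloned openstack-info resource (required before the other OpenStack resources):
pcs resource create cloud-info ocf:heartbeat:openstack-info \
    cloud="ha-example" clone

# Fencing through the OpenStack API, mapping a cluster node to its instance UUID:
pcs stonith create fenceopenstack fence_openstack \
    cloud="ha-example" \
    pcmk_host_map="node01:11111111-1111-1111-1111-111111111111"

# Virtual IP address managed on the OpenStack network:
pcs resource create ClusterIP ocf:heartbeat:openstack-virtual-ip \
    cloud="ha-example" ip="172.16.0.119"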
Story Points: ---
Clone Of:
Clones: 2121838 (view as bug list)
Environment:
Last Closed: 2022-11-09 15:37:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1264181, 1886074, 1908146, 1908147, 1908148, 1949114    
Bug Blocks: 1891054    

Description Paul Needle 2015-01-16 10:23:18 UTC
This Bugzilla tracks a feature request for full end-to-end high availability for Red Hat OpenStack Platform guests.

Comment 5 Perry Myers 2015-02-18 14:25:22 UTC
Note, the current approach we're considering for Compute Node HA and guest/VM HA is roughly described here:
http://blog.russellbryant.net/2014/10/15/openstack-instance-ha-proposal

Comment 6 Fabio Massimo Di Nitto 2015-02-18 16:07:55 UTC

*** This bug has been marked as a duplicate of bug 1185030 ***

Comment 7 Russell Bryant 2015-02-19 19:14:47 UTC
I believe this was closed as a duplicate, though the requests are not quite the same.  The work in bug 1185030 is about failures of compute nodes themselves, while this request is more about responding to failures of applications in guests.

Comment 8 Andrew Beekhof 2015-02-19 20:05:18 UTC
The hurdle for HA inside the guest is always people's willingness to have another daemon running there. *cough* matahari *cough*

pacemaker-remoted can certainly fill this role; however, this option is currently incompatible with using pacemaker-remoted for the compute node (nested pacemaker-remoted is not currently on the roadmap).  There is also the issue of where the guest's HA configuration should live.

UNLESS

- guest HA was limited to services inside a single guest (no cross-talk or co-ordination required between guests)
- the guests' HA configuration lived inside the guest (in a yet to be determined form)

If both of those held, we could use pacemaker-remoted inside the guest in stand-alone mode (no communication with the outside world).
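
For contrast, a hedged sketch of the conventional guest-node integration this would differ from: the cluster manages the VM as a VirtualDomain resource, and the remote-node meta attribute makes the guest (running pacemaker-remoted) a node in its own right. Names and paths are placeholders:

# The VM itself is a cluster resource; remote-node tells Pacemaker to also
# treat the guest as a node it can run resources on (via pacemaker-remoted).
pcs resource create guest1-vm ocf:heartbeat:VirtualDomain \
    hypervisor="qemu:///system" \
    config="/etc/libvirt/qemu/guest1.xml" \
    meta remote-node="guest1"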


Other options:

1. We implement the method for scaling corosync as discussed pre-DevConf this year.
  This would allow the compute nodes to be full members of the cluster and pacemaker-remoted to be used for guests.

  - Drawback: the configuration will get quite large, as it will contain all compute nodes and all HA service configuration for guests.

2. We allow crosstalk between guests on a single compute node by leaving each as a single-node cluster and including the guest service configuration in the compute node's cluster.

  - Dubious benefit here: failure of the compute node would take down the entire virtual cluster.

3. For pure monitoring of guest services, we could make use of nagios agents (which normally run from outside the target host). This is already supported; see the sketch below.
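
A rough sketch of what option 3 could look like with Pacemaker's nagios resource class; the plugin name and parameter names are illustrative assumptions, not verified against the nagios metadata packages:

# Monitoring-only resource for a service inside a guest, run from outside it.
# Requires the nagios plugins and their Pacemaker metadata; check_tcp and its
# hostname/port parameter names here are assumptions.
pcs resource create guest-web-mon nagios:check_tcp \
    hostname="192.168.122.10" port="80" \
    op monitor interval="30s"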

Comment 11 Stephen Gordon 2016-06-09 18:52:18 UTC
Bulk update to reflect that this issue is not in the scope of Red Hat OpenStack Platform 9 (No pm_ack+).

Comment 13 Andrew Beekhof 2017-08-04 01:28:50 UTC
Bumping to 7.6 for now, but this may become a priority for OSP13 depending on how the Ceph folks implement their per-pool NFS servers.

Comment 30 Brandon Perkins 2020-12-16 15:42:26 UTC
*** Bug 1908099 has been marked as a duplicate of this bug. ***

Comment 38 Chris Feist 2022-11-09 15:37:59 UTC
Closing this as CURRENTRELEASE, since this is now supported as of RHEL 8.7.

Comment 39 Red Hat Bugzilla 2023-09-18 00:11:04 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.