Bug 1259599 - [RFE] [Undercloud] Undercloud High Availability
Summary: [RFE] [Undercloud] Undercloud High Availability
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: unspecified
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Angus Thomas
QA Contact: Udi Shkalim
URL:
Whiteboard:
Duplicates: 1337935
Depends On:
Blocks: 1188000 1476902 1592486
Reported: 2015-09-03 07:07 UTC by Takashi Aosawa
Modified: 2024-03-25 14:55 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-18 17:40:24 UTC
Target Upstream Version:
Embargoed:


Links
System                             ID         Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker              OSP-23633  0        None      None    None     2023-03-24 13:34:57 UTC
Red Hat Knowledge Base (Solution)  2470691    0        None      None    None     2016-11-23 06:42:36 UTC

Description Takashi Aosawa 2015-09-03 07:07:14 UTC
1. Bug Overview:
a) Description of bug report: 

[RHEL-OSP 7.0]: Undercloud high availability

b) Bug Description:

Currently, RHEL OSP Director does not provide a way
to make the Undercloud highly available.

Version-Release number of selected component:
 RHEL OSP Director 7.0 GA
 
2. Bug Details:

I would like to know how to deploy the Undercloud in a
high-availability configuration.

The Undercloud is an OpenStack instance.  Therefore, the following
documents may be helpful as references.

  OpenStack High Availability Guide
  http://docs.openstack.org/high-availability-guide/content/index.html

  High availability with Red Hat Enterprise Linux OpenStack Platform 4
  http://www.redhat.com/ja/resources/high-availability-red-hat-enterprise-linux-openstack-platform-4

However, I am not sure whether these are sufficient for the Undercloud
or whether we can recommend them to our customers.

According to the Director roadmap, Red Hat seems to be planning
a highly available Undercloud.  I infer this because the diagrams
show 3 Undercloud nodes.  (See p.32, p.39)

  RHEL OpenStack Platform director Overview and Roadmap
  http://videos.cdn.redhat.com/summit2015/presentations/13790_red-hat-enterprise-linux-openstack-platform-deployment-tool-roadmap.pdf

If Red Hat has such a plan and it will be delivered in the short term,
we can recommend that customers wait for the Director update.

If it will take time for the Director to support Undercloud
high availability, an alternative should be presented to customers.

  
3. Business Justification

Generally speaking, production environments always require
a high-availability configuration to remove any SPOF.
The Undercloud node can be a SPOF.


4. Primary contact at Red Hat, email, phone (chat)
  ykawada (Yo kawada)

5. Primary contact at Partner, email, phone (chat)
  t-aosawa.nec.com (Takashi Aosawa)

Comment 6 Mike Burns 2016-04-07 20:50:54 UTC
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 10 Fabio Massimo Di Nitto 2016-10-12 04:02:55 UTC
*** Bug 1337935 has been marked as a duplicate of this bug. ***

Comment 15 Yoshiki Ohmura 2017-02-03 01:07:08 UTC
Hi Aosawa-san,

Other partners also need to track this feature and have requested that it be made visible to them.
I believe this BZ doesn't contain confidential information, but I'd like to ask whether you can make this RFE public or make it visible to other partners.

Best Regards,
Yoshiki

Comment 17 Yoshiki Ohmura 2017-02-03 08:30:27 UTC
Hi Aosawa-san,

Thank you for your comment via email.
I have now made this a public RFE.

Best Regards,
Yoshiki

Comment 19 Red Hat Bugzilla Rules Engine 2017-02-06 19:31:14 UTC
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.

Comment 23 Dan Prince 2017-07-14 14:50:44 UTC
Agree with Fabio. This task most likely fits with the Deployment framework DFG.

----

Some general thoughts here on the approach we take to implement this. I would not recommend using instack-undercloud as a basis. I would much rather see us leverage some of the new containers deployment bits with Ansible to drive a multi-node undercloud installer.

The new tripleo-heat-templates undercloud gets us part of the way here. This, along with deployed-server and some Ansible orchestration around it, could do the trick quite nicely, I think. There are some up-front network considerations that need to be worked out for VIPs, etc. (and perhaps we will consult PIDONE for these details), but in general I think most of this is deployment architecture work.

Comment 26 Jaromir Coufal 2017-08-10 19:18:25 UTC
Moving out of OSP13. Delivering a containerized undercloud has higher priority.

Comment 29 Jaromir Coufal 2018-02-22 20:19:58 UTC
This feature depends heavily on undercloud containerization and will be delayed until the undercloud is containerized. At the moment, containerization is targeted for RHOSP 14. Once we are sure this dependency is satisfied and the final architecture lands, we will be able to review the scope of this feature and update the target release.

Customers requesting this feature, please get in touch with me (jcoufal) so we can discuss specific requirements and possible mitigations further.

Best regards
-- Jarda

Comment 30 Sudeep Batra 2019-02-04 03:46:03 UTC
I suggest we use a Kubernetes cluster to build the undercloud setup and also the overcloud setup.
This will enable us to use the rich features of Kubernetes, such as auto-scaling and high availability.

The containerized undercloud and overcloud components will then become seamlessly resilient, and we will have a resolution to this 5-year-old requirement?

Comment 32 Jaromir Coufal 2019-05-15 12:22:24 UTC
This request is from the days when the undercloud had the metering service on it, possibly the only service which would justify an extra 2 nodes for HA capabilities. Since this service has not been on the undercloud node for quite some time, there is no critical service which would require HA of the management node for environment performance (therefore the undercloud is not a single point of failure). Given also how easy and fast it is to back up and restore the node, we did not receive enough requests from our customers & partners.

Given the above, this feature does not have as high a priority as other important enhancements to the undercloud node. Therefore it has been pushed further out; it is not on the list for OSP16 at the moment, and we don't have a target release for its implementation.

If this is a critical feature, please provide more details of the use case and we will be happy to revisit.

The suggestion of Kubernetes-driven deployments is not so simple. There are several services in OpenStack which are not even written in a cloud-native fashion; the autoscaling features of K8s become less relevant at that point, which calls the benefits brought by K8s into question. There are backwards-compatibility reasons as well: we are protecting our customers & partners from highly disruptive changes which would break their environments and require a hard migration unless the benefits are justifiable.

Comment 33 Kota Akatsuka 2019-05-17 01:00:30 UTC
> If this is a critical feature, please provide more details of the use case and we will be happy to revisit.

There are several routine tasks which require the undercloud after deployment.

 1. Minor updating Overcloud
 2. Scaling Overcloud Compute
 3. Updating configuration of Overcloud

Currently, if the undercloud breaks during one of these tasks, the Overcloud might be left in an unstable state.
In addition, it is hard to recover the undercloud from a backup, since the backup was taken before these tasks and does not contain the latest state of the Overcloud.
Therefore, recovering the undercloud might take a long time.
In the worst case, we might not be able to recover the undercloud to match the current Overcloud.

How does Red Hat plan to handle these situations?

Furthermore, the backup and restore document contains the following recommendation.

  https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html-single/back_up_and_restore_the_director_undercloud/index

  If you cannot have data loss at all, you should include high availability in your deployment strategy, in addition to using backups. 

Does Red Hat think Backup/Restore is truly enough?

Comment 40 Kevin Carter 2020-11-18 17:40:24 UTC
The undercloud is expected to be a single node with no high availability, which can be pseudo ephemeral. This feature is not a current priority and is not expected to be on our road map at this time.

Comment 41 Red Hat Bugzilla 2023-09-18 00:11:48 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

