Bug 1127824 - [Docs] [Installer] [RFE] Need documentation for replacing a dead HA controller node
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 5.0 (RHEL 7)
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: z2
Target Release: 5.0 (RHEL 7)
Assignee: Dan Macpherson
QA Contact: RHOS Documentation Team
URL:
Whiteboard:
Depends On:
Blocks: 1163445
 
Reported: 2014-08-07 15:40 UTC by Mike Burns
Modified: 2016-03-09 05:29 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of: 1123513
Environment:
Last Closed: 2016-03-09 05:29:26 UTC
Target Upstream Version:
Embargoed:



Description Mike Burns 2014-08-07 15:40:42 UTC
+++ This bug was initially created as a clone of Bug #1123513 +++

Description of problem:

For node replacement, the Installer should provide a way to specify which node in the compute cluster should be replaced; that node may already be down or may still be active.
The new node should attain the identity of the replaced node, including its network connections and IP addresses. For this RFE, assume that there are no user VMs or containers on the node (in the future we should add an RFE for replacing nodes that have user VMs or containers). Start with the KVM hypervisor first.

We need a manual process documented prior to incorporating it into an automated script.

Node replacement is not allowed for non-HA deployments.
For HA, it is analogous to taking a node down and adding a new node that assumes its personality. This avoids having to update configuration information on the other nodes of the controller cluster and on the Foreman node.


--- Additional comment from arkady kanevsky on 2014-07-26 17:34:19 EDT ---

For node replacement, do not forget to populate the new node with the Ceph keyring if Ceph is used as the block storage backend.

Comment 3 Andrew Dahms 2014-09-22 10:27:22 UTC
Hi Mike,

What are the key steps involved in taking a node down and adding a new node that assumes the personality of that node? Is it enough for users to remove that node from the environment and provision a new node all by itself using the same parameters as the previous node?

Kind regards,

Andrew

Comment 4 Mike Burns 2014-10-02 18:50:02 UTC
Fabio, do you have this information? Or have someone who can provide it?

Comment 5 Fabio Massimo Di Nitto 2014-10-03 07:02:34 UTC
(In reply to Mike Burns from comment #4)
> Fabio, do you have this information? Or have someone who can provide it?

I can't speak for Foreman here, but from a Pacemaker cluster perspective, as long as:

- the old node is shut down
- the new node has the same host name / IP address

it's enough to re-exchange pcs certificates (pcs cluster auth ...), run pcs cluster setup on the local node only (without --all), or alternatively copy /etc/corosync.conf from a running node to the new one, and then start pacemaker.

Clearly all OpenStack services have to be pre-configured, but I would expect that to happen from the Puppet deployment.
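
For illustration only, the Pacemaker steps above could be sketched as a small dry-run planner. This is a hedged sketch, not a tested or supported procedure: it assumes pcs 0.9 syntax as shipped with RHEL 7, takes the "copy /etc/corosync.conf from a running node" route rather than pcs cluster setup, and uses hypothetical host names (ctrl1, ctrl2, ctrl3) and a hypothetical helper, rejoin_plan, that do not come from this bug. It only builds and prints the commands; it does not run them.

```python
import shlex
from typing import List


def rejoin_plan(new_node: str, cluster_nodes: List[str], healthy_peer: str) -> List[List[str]]:
    """Return, in order, the commands to run as root on the replacement node.

    Assumptions (not from the bug report): the old node is already shut down,
    the replacement has the same host name and IP address, pcsd is running on
    every node, and OpenStack services were already configured by Puppet.
    """
    if new_node not in cluster_nodes:
        raise ValueError("the replacement must reuse a host name from the cluster")
    if healthy_peer == new_node or healthy_peer not in cluster_nodes:
        raise ValueError("healthy_peer must be a different, running cluster member")

    return [
        # Re-exchange pcs authentication with all cluster nodes
        # (prompts for the cluster user's credentials).
        ["pcs", "cluster", "auth", *cluster_nodes],
        # Copy the corosync configuration from a running node.
        ["scp", f"{healthy_peer}:/etc/corosync.conf", "/etc/corosync.conf"],
        # Start cluster services on the local node only (no --all).
        ["pcs", "cluster", "start"],
    ]


if __name__ == "__main__":
    # Hypothetical three-node controller cluster; ctrl3 is the replacement.
    for cmd in rejoin_plan("ctrl3", ["ctrl1", "ctrl2", "ctrl3"], "ctrl1"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the plan rather than executing it keeps the sketch safe to read and adapt; the commands would still need to be reviewed against the installed pcs version before being run on a real cluster.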

Comment 6 Mike Burns 2014-10-22 16:04:28 UTC
Andrew, can we get the content from Comment 5 into the docs for A2?

Comment 7 Andrew Dahms 2014-10-22 22:51:51 UTC
Hi Mike,

I'll work on some content and see if I can get a draft to you soon.

Kind regards,

Andrew

Comment 8 Dan Macpherson 2015-04-20 03:20:03 UTC
Hi Andrew,

Just wanted to follow up on comment #7. Did you manage to put together any draft materials on this item?

I'm thinking of putting together an article on this item, partially based on this scaling process in the Installer Guide: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/6/html/Installer_and_Foreman_Guide/chap-Scaling_the_Environment.html

Comment 9 Andrew Dahms 2015-04-20 04:07:31 UTC
Hi Dan,

This one was right around the handover time, so I am afraid I do not. My apologies for not adding a comment to the bug at the time to clarify this.

With regards to the article - sounds good. BZ#1172289 also runs along a similar theme. I'll assign that over as well for review.

Kind regards,

Andrew

Comment 10 Dan Macpherson 2016-03-09 05:29:26 UTC
So this content now exists and is supported in OSP 7:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html/Director_Installation_and_Usage/sect-Scaling_the_Overcloud.html#Replacing_Controller_Nodes

Spoke with adahms. Agreed that we should focus efforts toward OSP 7 and 8. As a result, closing this BZ.

