Bug 1351753 - Controller node replacement with same hostname and ips does not work
Summary: Controller node replacement with same hostname and ips does not work
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 8.0 (Liberty)
Assignee: Angus Thomas
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-06-30 18:04 UTC by Ben Nemec
Modified: 2020-06-11 12:57 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-21 11:55:49 UTC
Target Upstream Version:
Embargoed:


Attachments
journal entries from failed galera node (6.12 KB, text/plain)
2016-06-30 18:04 UTC, Ben Nemec
mariadb log showing restart loop (7.92 KB, text/plain)
2016-06-30 18:05 UTC, Ben Nemec

Description Ben Nemec 2016-06-30 18:04:26 UTC
Created attachment 1174703
journal entries from failed galera node

Description of problem: Following the documented procedure to replace a controller node in an HA setup fails when predictable hostnames and IPs are in use and the replacement node reuses the same hostname and IP as the node it replaces.


Version-Release number of selected component (if applicable):


How reproducible: Both times I have attempted this, it failed in essentially the same way.


Steps to Reproduce:
1. Deploy an HA overcloud with predictable hostnames and IPs.
2. Attempt to replace one of the controllers with another controller that uses the same hostname and IPs.

Actual results: Galera does not start properly on the new node. MariaDB appears to be stuck in a restart loop, and the journal shows messages from Galera like:

ERROR: MySQL is not running
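
As a rough triage aid, the journal message quoted above can be counted in a saved journal dump to confirm that it repeats, which would be consistent with a restart loop. This is only a minimal sketch: the function names, the sample lines, and the threshold of 3 are illustrative assumptions, not part of any Red Hat tooling or of the attached logs.

```python
import re

# The message text is the one quoted in this report. Everything else
# here (function names, threshold, sample lines) is hypothetical.
ERROR_PATTERN = re.compile(r"ERROR: MySQL is not running")


def count_not_running(journal_text: str) -> int:
    """Count journal lines that contain the 'MySQL is not running' error."""
    return sum(
        1 for line in journal_text.splitlines() if ERROR_PATTERN.search(line)
    )


def looks_like_restart_loop(journal_text: str, threshold: int = 3) -> bool:
    """Heuristic only: treat repeated errors as a sign of a restart loop.

    The default threshold is an arbitrary assumption.
    """
    return count_not_running(journal_text) >= threshold


if __name__ == "__main__":
    # Hypothetical sample; real journal lines will look different.
    sample = "\n".join(
        [
            "galera(galera)[1234]: ERROR: MySQL is not running",
            "systemd[1]: some unrelated message",
            "galera(galera)[1301]: ERROR: MySQL is not running",
            "galera(galera)[1377]: ERROR: MySQL is not running",
        ]
    )
    print(count_not_running(sample))
    print(looks_like_restart_loop(sample))
```

The text to scan could come from a saved copy of the journal on the affected controller; a repeated error on its own does not identify the cause.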


Expected results: Successful controller node replacement.


Additional info:

Comment 2 Ben Nemec 2016-06-30 18:05:36 UTC
Created attachment 1174704
mariadb log showing restart loop

Comment 3 Emilien Macchi 2016-12-22 19:02:13 UTC
Moving to Pidone since it affects HA/Galera.

Comment 4 Damien Ciabrini 2016-12-23 14:16:48 UTC
Could it be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1326507 ?

Before https://bugzilla.redhat.com/show_bug.cgi?id=1338623 was fixed, the old controller node replacement procedure would fail because an attribute never showed up in Pacemaker, and the resource agent would keep retrying to recover the last sequence number.

Any chance you could retry the latest replacement procedure with up-to-date packages to confirm that this is already fixed?

Comment 5 Angela Soni 2017-02-10 18:25:30 UTC
Following up to see whether there are any updates. Could someone verify the comment above by testing this with the latest packages?

Comment 6 Angela Soni 2017-03-08 17:53:33 UTC
I have gone through https://bugzilla.redhat.com/show_bug.cgi?id=1326507 and this bug could be a duplicate, but someone from QA needs to verify whether assigning the same hostname and IP as the previous controller node works. Replacing a controller node is not an issue; replacing it with the same hostname and IP is. Can QA test this with the latest packages and see whether it works?

- Angela

Comment 7 Ben Nemec 2017-04-27 16:13:43 UTC
I was finally able to complete a node replacement with the same hostname and IPs, so it appears this is indeed fixed. I did find one additional issue with the documented procedure for OSP 8 and 9, but it is unrelated to reusing the same hostname and IP, so I've opened a new bug for it: https://bugzilla.redhat.com/show_bug.cgi?id=1446307

