Bug 1313529 - rhel-osp-director: [Doc] Replacing Controller Nodes for 8.0 is missing.
Product: Red Hat OpenStack
Classification: Red Hat
Component: documentation
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Target Release: 8.0 (Liberty)
Assigned To: Dan Macpherson
QA Contact: Alexander Chuzhoy
Keywords: Documentation
Duplicates: 1313528
Depends On:
Blocks: 1286302
Reported: 2016-03-01 14:37 EST by Alexander Chuzhoy
Modified: 2016-06-16 00:41 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2016-06-16 00:41:11 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Description Alexander Chuzhoy 2016-03-01 14:37:25 EST
rhel-osp-director: [Doc] Replacing Controller Nodes for 8.0 is missing.

Tried to follow the steps for 7:

The pcs cluster was in maintenance mode, so I couldn't even start keystone.

After manually disabling maintenance mode on the pcs cluster, I still wasn't able to complete the re-deployment of the overcloud as described in step 7 of the guide above.
Comment 2 Mike Burns 2016-03-01 14:43:38 EST
*** Bug 1313528 has been marked as a duplicate of this bug. ***
Comment 4 Dan Macpherson 2016-04-04 13:23:53 EDT
Sorry this took so long. I've only just been able to successfully deploy an HA Overcloud on OSP 8.

Sasha, I ran into the same error you did, but I think I managed to figure it out. Essentially, there are still some rogue entries for overcloud-controller-1 in Pacemaker/Corosync, which cause pcs to auth in a loop. Once I cleared them away, the deployment seemed to progress fine. I'll update the procedure tomorrow so that you should be able to test it out.
Comment 5 Dan Macpherson 2016-04-05 15:23:24 EDT
Okay, tested it out. It seems to be the failed node entry in corosync.conf that is stopping the deployment from continuing. I've got that step closer to the end, but I'll bump it up to before step 6 (restarting pacemaker and corosync). Based on my testing, this should work.
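
To make the corosync.conf point concrete, here is a minimal Python sketch of what "clearing the failed node entry" amounts to: dropping the `node { ... }` block for overcloud-controller-1 from the `nodelist` section. This is an illustration only, not the documented procedure, which would normally go through pcs rather than editing the file by hand. The sample file contents, the other node names, the node IDs, and the `remove_node` helper are all assumptions made up for the example.

```python
# Illustrative sketch only: shows the effect of removing a stale node entry
# from a corosync.conf nodelist. The sample config below is hypothetical.

SAMPLE = """\
totem {
    version: 2
    cluster_name: example_cluster
    transport: udpu
}

nodelist {
    node {
        ring0_addr: overcloud-controller-0
        nodeid: 1
    }
    node {
        ring0_addr: overcloud-controller-1
        nodeid: 2
    }
    node {
        ring0_addr: overcloud-controller-2
        nodeid: 3
    }
}
"""


def _ring0_addr(block):
    """Return the ring0_addr value found in a node block, or None."""
    for entry in block:
        key, _, value = entry.partition(":")
        if key.strip() == "ring0_addr":
            return value.strip()
    return None


def remove_node(conf_text, node_name):
    """Return conf_text without the node block whose ring0_addr is node_name."""
    kept = []
    block = None  # lines of the node block currently being read, if any
    for line in conf_text.splitlines():
        stripped = line.strip()
        if block is None:
            if stripped == "node {":
                block = [line]  # start buffering a node block
            else:
                kept.append(line)
            continue
        block.append(line)
        if stripped == "}":
            # Node blocks contain no nested braces, so this closes the block.
            if _ring0_addr(block) != node_name:
                kept.extend(block)
            block = None
    if block is not None:
        kept.extend(block)  # unterminated block: leave it untouched
    return "\n".join(kept) + "\n"


cleaned = remove_node(SAMPLE, "overcloud-controller-1")
print(cleaned)
```

On a real cluster the file is /etc/corosync/corosync.conf on each remaining controller, and corosync has to pick up the change before the redeployment is retried.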
Comment 7 Dan Macpherson 2016-05-12 01:51:30 EDT
I think this might be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1327701

In any case, I'll switch this over to ON_QA too. As mentioned in the other BZ, here's the latest draft:


Sasha, is there anything else needed for this procedure?
Comment 8 Alexander Chuzhoy 2016-05-12 09:09:47 EDT

This section of the doc looks good.
Comment 9 Dan Macpherson 2016-06-16 00:41:11 EDT
Changes now live on the customer portal.
