1564218 – [osp13][controller replacement] Cannot add new node to pcs cluster - Error: Error connecting to controller-3 (HTTP error: 408)

Keywords:
Status:	CLOSED DUPLICATE of bug 1600169
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	rhosp-director
Sub Component:
Version:	13.0 (Queens)
Hardware:	x86_64
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	ga
Target Release:	---
Assignee:	Damien Ciabrini
QA Contact:	Artem Hrechanychenko
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported:	2018-04-05 17:11 UTC by Artem Hrechanychenko
Modified:	2018-11-21 20:57 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2018-11-21 20:57:26 UTC
Target Upstream Version:
Embargoed:

Description Artem Hrechanychenko 2018-04-05 17:11:19 UTC

Description of problem:

During the controller replacement procedure, the new controller has to be added to the pcs cluster:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/12/html-single/director_installation_and_usage/#sect-Replacing_Controller_Nodes

10.4.3.6: "Add the new node to the cluster"

When I run that step, it sometimes fails with HTTP error 408, while a later attempt succeeds:

[heat-admin@controller-0 ~]$ sudo pcs cluster node add controller-3 --wait=90
Disabling SBD service...
controller-3: sbd disabled
Sending 'corosync authkey' to 'controller-3'
controller-3: successful distribution of the file 'corosync authkey'
Sending remote node configuration files to 'controller-3'
Error: Error connecting to controller-3 (HTTP error: 408)
Error: Errors have occurred, therefore pcs is unable to continue

[heat-admin@controller-0 ~]$ sudo pcs cluster node add controller-3 --wait=500
Disabling SBD service...
controller-3: sbd disabled
Sending 'corosync authkey' to 'controller-3'
controller-3: successful distribution of the file 'corosync authkey'
Sending remote node configuration files to 'controller-3'
controller-3: successful distribution of the file 'pacemaker_remote authkey'
controller-0: Corosync updated
controller-2: Corosync updated
Setting up corosync...
controller-3: Succeeded
Synchronizing pcsd certificates on nodes controller-3...
controller-3: Success
Restarting pcsd on the nodes in order to reload the certificates...
controller-3: Success
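
Since the 408 is intermittent and a second invocation went through in the transcript above, one possible stopgap is to wrap the command in a retry loop. The following is only a minimal sketch: the helper name `retry`, the attempt count and the delay are hypothetical, and it assumes that rerunning `pcs cluster node add` after a failed attempt is safe, which the transcript suggests but does not prove.

```shell
#!/bin/bash
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: reruns COMMAND until it exits 0 or attempts run out.
retry() {
    local max_attempts=$1 delay=$2
    shift 2
    local attempt=1
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        echo "attempt ${attempt} failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Example invocation (left commented out so the snippet can be sourced safely;
# 3 attempts and a 30 second pause are arbitrary illustration values):
# retry 3 30 sudo pcs cluster node add controller-3 --wait=90
```

This only masks the timeout; it does not address whatever makes pcsd on controller-3 slow to answer.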

Version-Release number of selected component (if applicable):
OSP13
[heat-admin@controller-0 ~]$ sudo rpm -qa "*pacemaker*"
pacemaker-cli-1.1.18.notifyfix-11.el7.x86_64
ansible-pacemaker-1.0.4-0.20180220234310.0e4d7c0.el7ost.noarch
pacemaker-cluster-libs-1.1.18.notifyfix-11.el7.x86_64
pacemaker-libs-1.1.18.notifyfix-11.el7.x86_64
pacemaker-1.1.18.notifyfix-11.el7.x86_64
puppet-pacemaker-0.7.2-0.20180301221314.2d2d877.el7ost.noarch
pacemaker-nagios-plugins-metadata-1.1.18.notifyfix-11.el7.x86_64
pacemaker-remote-1.1.18.notifyfix-11.el7.x86_64

[heat-admin@controller-0 ~]$ sudo rpm -qa "*pcs*"
pcs-0.9.162-5.el7.x86_64

How reproducible:
In 70% of test runs

Steps to Reproduce:
1. Deploy OSP13
2. Try to replace a controller using the official documentation


Actual results:
Error: Error connecting to controller-3 (HTTP error: 408)
Error: Errors have occurred, therefore pcs is unable to continue

Expected results:
controller-3: successful distribution of the file 'corosync authkey'
Sending remote node configuration files to 'controller-3'
controller-3: successful distribution of the file 'pacemaker_remote authkey'
controller-0: Corosync updated
controller-2: Corosync updated
Setting up corosync...
controller-3: Succeeded
Synchronizing pcsd certificates on nodes controller-3...
controller-3: Success
Restarting pcsd on the nodes in order to reload the certificates...
controller-3: Success


Additional info:

Comment 3 Damien Ciabrini 2018-04-18 12:24:22 UTC

Hey Artem, we lack sosreports for this one. Could you please rerun the controller replacement procedure and let us know whether we can access the env or where to get the sosreports?

Thanks!

Comment 4 Artem Hrechanychenko 2018-05-01 14:54:32 UTC

Was not reproduced in the last two attempts.

Comment 5 Artem Hrechanychenko 2018-05-03 13:07:57 UTC

Works in the passed_phase2 puddle 2018-04-26.3.

Comment 6 Artem Hrechanychenko 2018-10-25 16:33:37 UTC

Reproduced with OSP13 puddle 2018-10-18.1.

The reports should be available here: http://rhos-release.virt.bos.redhat.com/log/bz1564218


[heat-admin@controller-0 ~]$ sudo pcs cluster node add controller-3
Disabling SBD service...
controller-3: sbd disabled
Sending 'corosync authkey' to 'controller-3'
controller-3: successful distribution of the file 'corosync authkey'
Sending remote node configuration files to 'controller-3'
Error: Error connecting to controller-3 (HTTP error: 408)
Error: Errors have occurred, therefore pcs is unable to continue

Comment 7 Michele Baldessari 2018-11-21 20:57:26 UTC

This has been fixed in pcs-0.9.165-2.el7.

*** This bug has been marked as a duplicate of bug 1600169 ***
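
To check whether a node still runs a pcs build older than the fixed one, the installed version-release can be compared against 0.9.165-2.el7. This is a rough sketch, not an authoritative check: the helper name `version_lt` is hypothetical, and it relies on GNU `sort -V`, which orders these particular strings correctly but is not guaranteed to match rpm's own version comparison in every case.

```shell
#!/bin/bash
# version_lt A B: succeeds when A sorts strictly before B under GNU `sort -V`.
# Hypothetical helper; an approximation of rpm version ordering.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

fixed="0.9.165-2.el7"
# On a real node the installed value could be read with:
#   installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}' pcs)
installed="0.9.162-5.el7"   # version-release reported in this bug

if version_lt "$installed" "$fixed"; then
    echo "pcs $installed predates $fixed"
else
    echo "pcs $installed is at or past $fixed"
fi
```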
