1420206 – Regarding CFME 4.2 HA doc

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat CloudForms Management Engine
Classification:	Red Hat
Component:	Documentation
Sub Component:
Version:	5.7.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	GA
Target Release:	5.7.2
Assignee:	Dayle Parker
QA Contact:	Chris Budzilowicz
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1430409
Depends On:
Blocks:

Reported:	2017-02-08 07:02 UTC by tachoi
Modified:	2020-04-15 15:14 UTC
CC List:	9 users
Fixed In Version:
Doc Type:
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-03-16 23:29:36 UTC
Category:	---
Cloudforms Team:	---
Target Upstream Version:
Embargoed:

Description tachoi 2017-02-08 07:02:29 UTC

Document URL:
https://access.redhat.com/documentation/en/red-hat-cloudforms/4.2/single/configuring-high-availability/

Section Number and Name:
5.1 Reintroducing the failed node

Describe the issue:
1. It would be better to show the $ prompt for commands run as the postgres user, since # normally indicates root.
For example:
# su - postgres
$ pg_rewind -D ...
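
If the postgres-only steps are wrapped in a script, a guard against running them from the wrong account may help. The following is a minimal sketch; require_user is a hypothetical helper name, not a CFME or PostgreSQL command.

```shell
# require_user NAME: succeed only when the current effective user is NAME.
# (require_user is a hypothetical helper, not a CFME or PostgreSQL command.)
require_user() {
    expected=$1
    current=$(id -un)
    if [ "$current" != "$expected" ]; then
        echo "must be run as $expected (currently $current)" >&2
        return 1
    fi
    return 0
}
```

Calling "require_user postgres" before the pg_rewind step would stop the script early when it is still running as root.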

2. "pg_rewind" returns an error if the time is not synchronized between the master and standby nodes, so an NTP synchronization step is needed on all related database nodes (and on the other appliance nodes as well) before running the pg_rewind command.
Edit /etc/ntp.conf with valid NTP server information, then run:
# systemctl disable chronyd.service
# systemctl stop chronyd.service
# systemctl enable ntpd.service
# systemctl start ntpd.service
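
After switching to ntpd, it may be worth confirming that the nodes actually agree on the time before running pg_rewind. The sketch below only compares two epoch timestamps. The name abs_skew and the host name in the usage note are hypothetical, and no particular threshold is implied by the documentation.

```shell
# abs_skew A B: print the absolute difference, in seconds, between two
# epoch timestamps. (abs_skew is a hypothetical helper for illustration.)
abs_skew() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    echo "$diff"
}
```

For example, abs_skew "$(date +%s)" "$(ssh standby.example.com date +%s)" would print the offset between the local node and a hypothetical standby host. The result includes the SSH round-trip time, so it is only a rough check.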

3. Step 4, "copy over /var/lib/pgsql/.pgpass": the file also needs to be owned by the postgres user, and its permissions must be 600.
# chown postgres:postgres /var/lib/pgsql/.pgpass
# chmod 600 /var/lib/pgsql/.pgpass
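
The ownership and mode can be verified after copying the file. This is a minimal sketch that assumes GNU stat (as shipped on RHEL); check_pgpass is a hypothetical helper name, and it checks only the owning user and the permission bits, not the group.

```shell
# check_pgpass FILE [OWNER]: succeed only if FILE is owned by OWNER
# (default: postgres) and has mode 600.
# (check_pgpass is a hypothetical helper; it relies on GNU stat.)
check_pgpass() {
    file=$1
    owner=${2:-postgres}
    # GNU stat format: %U = owner user name, %a = octal permission bits
    actual_owner=$(stat -c '%U' "$file") || return 2
    actual_mode=$(stat -c '%a' "$file") || return 2
    if [ "$actual_owner" != "$owner" ]; then
        echo "$file: owner is $actual_owner, expected $owner" >&2
        return 1
    fi
    if [ "$actual_mode" != "600" ]; then
        echo "$file: mode is $actual_mode, expected 600" >&2
        return 1
    fi
    return 0
}
```

Running "check_pgpass /var/lib/pgsql/.pgpass" on the reintroduced node would report a wrong owner or mode before the next step fails on it.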

4. Step 5, NOTE: "If the follow command times out and ... to re-add the node"
=> However, this operation fails because the new master server already has a record (node ID) for the failed node (the previous master node).
=> A procedure is needed to clean up the previous node ID, or to forcibly add the failed node with the same node ID.

The error message looks like this:
########################
"Configuring Replication Standby Server...
[2017-02-07 23:51:41] [ERROR] Unable to create node record
ERROR: duplicate key value violates unique constraint "repl_nodes_pkey"
DETAIL: Key (id)=(1) already exists."
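
When triaging this failure, it can help to pull the conflicting node ID out of the captured output. The sketch below only parses text in the format shown above; conflicting_node_id is a hypothetical helper name, and it does not touch the database or clean up the record.

```shell
# conflicting_node_id: read command output on stdin and print the node ID
# from any "Key (id)=(N) already exists" line. Prints nothing if no such
# line is present. (Hypothetical helper for log triage only.)
conflicting_node_id() {
    sed -n 's/.*Key (id)=(\([0-9][0-9]*\)) already exists.*/\1/p'
}
```

Piping the saved output of the failed configuration step into conflicting_node_id would print the ID of the stale record on the new master.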

Suggestions for improvement:

Additional information:

Comment 2 Andrew Dahms 2017-02-28 05:48:26 UTC

Assigning to Dayle for review.

Dayle - would you be able to take a look at the above and incorporate this feedback?

Comment 6 Dayle Parker 2017-03-10 08:01:14 UTC

*** Bug 1430409 has been marked as a duplicate of this bug. ***

Comment 17 Dayle Parker 2017-03-16 23:29:36 UTC

These changes are now live on the Customer Portal:

https://access.redhat.com/documentation/en-us/red_hat_cloudforms/4.2/html-single/configuring_high_availability/

Thanks to Taeho, Nick, and Suyog for your reviews on this one!