Bug 1284404

Summary: make restarting pcsd a synchronous operation
Product: Red Hat Enterprise Linux 7 Reporter: Tomas Jelinek <tojeline>
Component: pcs Assignee: Tomas Jelinek <tojeline>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.2 CC: cfeist, cluster-maint, idevat, omular, rsteiger, tlavigne, tojeline
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pcs-0.9.158-6.el7 Doc Type: Bug Fix
Doc Text:
Cause: Pcs restarts pcsd after distributing SSL certificates to the cluster nodes in order to reload the certificates. Consequence: Pcs does not wait for the restart to finish. Subsequent pcs commands may exit with an error if they run while pcsd is still being restarted. Fix: Make restarting pcsd on the nodes a synchronous operation. Result: Pcs waits for pcsd on the nodes to fully start.
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-01 18:22:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description            Flags
proposed fix           none
proposed fix (part2)   none

Description Tomas Jelinek 2015-11-23 09:20:19 UTC
We need to make restarting pcsd a synchronous operation so that scripts calling pcs can reliably wait for the restart to finish before continuing their execution. Otherwise a script may run a command while pcsd is being restarted, which may cause the command to fail.
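
Until pcs itself waits for the restart, a calling script could approximate the wait by polling. The following is only a minimal sketch of that idea and is not taken from the proposed fix. It assumes pcsd listens on TCP port 2224 on each node; the helper names (port_is_open, wait_until, wait_for_pcsd) are hypothetical. An accepted TCP connection is only a rough signal: it does not prove that pcsd has finished starting, and the old process may still be listening just before it is stopped.

```python
import socket
import time


def port_is_open(host, port, connect_timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        return False


def wait_until(predicate, timeout=30.0, interval=0.5):
    """Poll predicate() until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def wait_for_pcsd(nodes, port=2224, timeout=30.0):
    """Wait for each node to accept TCP connections on the assumed pcsd port.

    Returns the list of nodes that did not accept a connection in time.
    """
    return [
        node
        for node in nodes
        # Bind node as a default argument so each check targets its own node.
        if not wait_until(lambda n=node: port_is_open(n, port), timeout=timeout)
    ]
```

A script would call wait_for_pcsd(["rh73-node1", "rh73-node2"]) after the command that triggers the restart and continue only if the returned list is empty. Nodes are checked one after another, so the worst-case wait is the timeout multiplied by the number of nodes.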

Comment 5 Tomas Jelinek 2017-01-12 15:36:01 UTC
Created attachment 1239988
proposed fix

Test:

[root@rh73-node1:~]# pcs cluster setup --name test rh73-node1 rh73-node2 && pcs cluster start --all --wait
Destroying cluster on nodes: rh73-node1, rh73-node2...
rh73-node1: Stopping Cluster (pacemaker)...
rh73-node2: Stopping Cluster (pacemaker)...
rh73-node2: Successfully destroyed cluster
rh73-node1: Successfully destroyed cluster

Sending cluster config files to the nodes...
rh73-node1: Succeeded
rh73-node2: Succeeded

Synchronizing pcsd certificates on nodes rh73-node1, rh73-node2...
rh73-node1: Success
rh73-node2: Success
Restarting pcsd on the nodes in order to reload the certificates...
rh73-node1: Success
rh73-node2: Success
rh73-node1: Starting Cluster...
rh73-node2: Starting Cluster...
Waiting for node(s) to start...
rh73-node2: Started
rh73-node1: Started

'pcs cluster start --all --wait' does not crash and successfully waits for the nodes to start.

Comment 7 Ivan Devat 2017-02-20 08:30:56 UTC
After Fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.156-1.el7.x86_64

[vm-rhel72-1 ~] $ pcs cluster setup --name=devcluster vm-rhel72-1 vm-rhel72-3 && pcs cluster start --all --wait
Destroying cluster on nodes: vm-rhel72-1, vm-rhel72-3...
vm-rhel72-1: Stopping Cluster (pacemaker)...
vm-rhel72-3: Stopping Cluster (pacemaker)...
vm-rhel72-1: Successfully destroyed cluster
vm-rhel72-3: Successfully destroyed cluster

Sending cluster config files to the nodes...
vm-rhel72-1: Succeeded
vm-rhel72-3: Succeeded

Synchronizing pcsd certificates on nodes vm-rhel72-1, vm-rhel72-3...
vm-rhel72-3: Success
vm-rhel72-1: Success
Restarting pcsd on the nodes in order to reload the certificates...
vm-rhel72-3: Success
vm-rhel72-1: Success
vm-rhel72-1: Starting Cluster...
vm-rhel72-3: Starting Cluster...
Waiting for node(s) to start...
vm-rhel72-1: Started
vm-rhel72-3: Started

Comment 11 Tomas Jelinek 2017-05-30 08:12:48 UTC
Creating a new cluster from GUI fails with an error.

The GUI runs 'pcs cluster setup' over the network on one of the nodes. The command restarts the pcsd daemon on all nodes in the new cluster. Therefore the pcsd daemon that was asked by the GUI to run the command gets restarted itself. This returns HTTP 400 to the pcsd daemon on the GUI node. The GUI daemon then returns an error to the GUI and does not add the new cluster to the list of clusters.

Comment 12 Ivan Devat 2017-05-31 12:45:38 UTC
Created attachment 1283769
proposed fix (part2)

Comment 13 Ivan Devat 2017-05-31 12:46:55 UTC
After Fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.158-3.el7.x86_64

Open the GUI. Create a new cluster. Wait. The new cluster is added to the list of clusters.

Comment 20 errata-xmlrpc 2017-08-01 18:22:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1958