Bug 1284404

Summary:

make restarting pcsd a synchronous operation

Product:

Red Hat Enterprise Linux 7

Reporter:

Tomas Jelinek <tojeline>

Component:

pcs

Assignee:

Tomas Jelinek <tojeline>

Status:

CLOSED ERRATA

QA Contact:

cluster-qe <cluster-qe>

Severity:

medium

Docs Contact:

Priority:

medium

Version:

7.2

CC:

cfeist, cluster-maint, idevat, omular, rsteiger, tlavigne, tojeline

Target Milestone:

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

pcs-0.9.158-6.el7

Doc Type:

Bug Fix

Doc Text:

Cause: Pcs restarts pcsd after distributing SSL certificates to the cluster nodes in order to reload the certificates. Consequence: Pcs does not wait for the restart to finish. Subsequent pcs commands may exit with an error if they run at the moment pcsd is still being restarted. Fix: Make restarting pcsd on the nodes a synchronous operation. Result: Pcs waits for pcsd on the nodes to fully start.

Story Points:

---

Clone Of:

Environment:

Last Closed:

2017-08-01 18:22:57 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
proposed fix	none
proposed fix (part2)	none

Description Tomas Jelinek 2015-11-23 09:20:19 UTC

We need to make restarting pcsd a synchronous operation so that scripts calling pcs can reliably wait for the restart to finish before they continue. Otherwise, a script may run a command while pcsd is being restarted, which may cause the command to fail.
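
To illustrate what "synchronous" means for a caller, here is a minimal sketch of one way to wait for a daemon restart to finish: record an identifier of the running instance, trigger the restart, then poll until the daemon answers with a different identifier or a timeout expires. This is not taken from the attached patches. The name wait_for_restart, the get_instance_id probe, and the idea of an instance identifier are all assumptions made for the example.

```python
import time
from typing import Callable, Optional


def wait_for_restart(
    get_instance_id: Callable[[], Optional[str]],  # hypothetical probe, not a pcs API
    old_instance_id: str,
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once the daemon reports a new instance id, False on timeout."""
    deadline = clock() + timeout
    while True:
        try:
            current = get_instance_id()
        except OSError:
            # The daemon is unreachable while it restarts; keep polling.
            current = None
        if current is not None and current != old_instance_id:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would read the identifier before requesting the restart and pass it as old_instance_id. Comparing identifiers, rather than only checking that the daemon responds, avoids mistaking the old instance that has not yet shut down for the restarted one.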

Comment 5 Tomas Jelinek 2017-01-12 15:36:01 UTC

Created attachment 1239988
proposed fix

Test:

[root@rh73-node1:~]# pcs cluster setup --name test rh73-node1 rh73-node2 && pcs cluster start --all --wait
Destroying cluster on nodes: rh73-node1, rh73-node2...
rh73-node1: Stopping Cluster (pacemaker)...
rh73-node2: Stopping Cluster (pacemaker)...
rh73-node2: Successfully destroyed cluster
rh73-node1: Successfully destroyed cluster

Sending cluster config files to the nodes...
rh73-node1: Succeeded
rh73-node2: Succeeded

Synchronizing pcsd certificates on nodes rh73-node1, rh73-node2...
rh73-node1: Success
rh73-node2: Success
Restarting pcsd on the nodes in order to reload the certificates...
rh73-node1: Success
rh73-node2: Success
rh73-node1: Starting Cluster...
rh73-node2: Starting Cluster...
Waiting for node(s) to start...
rh73-node2: Started
rh73-node1: Started

'pcs cluster start --all --wait' does not crash and successfully waits for the nodes to start.

Comment 7 Ivan Devat 2017-02-20 08:30:56 UTC

After Fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.156-1.el7.x86_64

[vm-rhel72-1 ~] $ pcs cluster setup --name=devcluster vm-rhel72-1 vm-rhel72-3 && pcs cluster start --all --wait
Destroying cluster on nodes: vm-rhel72-1, vm-rhel72-3...
vm-rhel72-1: Stopping Cluster (pacemaker)...
vm-rhel72-3: Stopping Cluster (pacemaker)...
vm-rhel72-1: Successfully destroyed cluster
vm-rhel72-3: Successfully destroyed cluster

Sending cluster config files to the nodes...
vm-rhel72-1: Succeeded
vm-rhel72-3: Succeeded

Synchronizing pcsd certificates on nodes vm-rhel72-1, vm-rhel72-3...
vm-rhel72-3: Success
vm-rhel72-1: Success
Restarting pcsd on the nodes in order to reload the certificates...
vm-rhel72-3: Success
vm-rhel72-1: Success
vm-rhel72-1: Starting Cluster...
vm-rhel72-3: Starting Cluster...
Waiting for node(s) to start...
vm-rhel72-1: Started
vm-rhel72-3: Started

Comment 11 Tomas Jelinek 2017-05-30 08:12:48 UTC

Creating a new cluster from the GUI fails with an error.

The GUI runs pcs cluster setup over the network on one of the nodes. That command restarts the pcsd daemon on all nodes in the new cluster, so the pcsd daemon that the GUI asked to run the command is itself restarted. This returns HTTP 400 to the pcsd daemon on the GUI node, which then returns an error to the GUI and does not add the new cluster to the list of clusters.
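
One way to keep a daemon from cutting off its own in-flight request is to restart the other nodes first and defer the restart of the node handling the request until after it has responded. The sketch below only illustrates that ordering. It is an assumption for the sake of example, not a description of what the attached fix does, and the function name plan_restarts and its parameters are hypothetical.

```python
from typing import Iterable, List, Tuple


def plan_restarts(
    nodes: Iterable[str], handling_node: str
) -> Tuple[List[str], List[str]]:
    """Split nodes into (restart now, restart after the response is sent)."""
    restart_now: List[str] = []
    restart_later: List[str] = []
    for node in nodes:
        if node == handling_node:
            # Restarting this node immediately would kill the request in progress.
            restart_later.append(node)
        else:
            restart_now.append(node)
    return restart_now, restart_later
```

For the two-node cluster in the earlier comments, with vm-rhel72-1 handling the request, this would restart vm-rhel72-3 first and leave vm-rhel72-1 for after the reply.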

Comment 12 Ivan Devat 2017-05-31 12:45:38 UTC

Created attachment 1283769
proposed fix (part2)

Comment 13 Ivan Devat 2017-05-31 12:46:55 UTC

After Fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.158-3.el7.x86_64

Open the GUI. Create a new cluster. Wait. The new cluster is added to the list of clusters.

Comment 16 Ondrej Mular 2017-06-13 14:36:57 UTC

Additional fix:
https://github.com/ClusterLabs/pcs/commit/4028b0b50c17d44359c7d5ddbcddc6f417

Comment 20 errata-xmlrpc 2017-08-01 18:22:57 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1958