Bug 831722
| Summary: | corosync and pacemaker start should have a delay in between | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Jaroslav Kortus <jkortus> |
| Component: | pacemaker | Assignee: | Andrew Beekhof <abeekhof> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 7.0 | CC: | abeekhof |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | pacemaker-1.1.8-3.el7 | Doc Type: | Bug Fix |
Doc Text:

- Cause: The logic for reattempting the initial connection to the cluster was incorrect.
- Consequence: Pacemaker gave up immediately if the initial connection attempt failed.
- Fix: Correctly interpret the connection result.
- Result: Pacemaker waits for corosync to become fully active.
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-06-16 06:36:19 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description

Jaroslav Kortus 2012-06-13 16:06:21 UTC

I got to the bottom of this one a few days ago. Our retry logic was borked:

Andrew Beekhof (10 days ago) 7276f67: High: mcp: Correctly retry the connection to corosync on failure
https://github.com/beekhof/pacemaker/commit/7276f67

I'll include this in a pacemaker build next week.

Actually, having said that, there are rare occasions when corosync simply fails to start ("service corosync start" fails and corosync is automatically stopped). Not much pcs or pacemaker can do in that situation though.

Re-assigning to abeekhof for verification, since it sounds like this was fixed a few months back in pacemaker.