Bug 1193499
Summary: | member weirdness when adding/removing nodes | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Radek Steiger <rsteiger> |
Component: | pacemaker | Assignee: | Andrew Beekhof <abeekhof> |
Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 6.7 | CC: | abeekhof, cfeist, cluster-maint, cluster-qe, fdinitto, jkortus, kgaillot, kwenning |
Target Milestone: | rc | ||
Target Release: | 6.8 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | pacemaker-1.1.14-1.1.el6 | Doc Type: | Bug Fix |
Doc Text: |
Cause: Removed nodes were not consistently purged from all Pacemaker components' peer caches.
Consequence: Removing and adding nodes can cause a node ID to be recycled. This should be harmless, but it caused daemon crashes because conflicting information about the former node was not purged from the peer cache.
Fix: Peer cache management has been overhauled so that the libcluster library handles node reaping itself rather than relying on the individual components to do it correctly.
Result: Recycling node IDs should no longer cause any problems.
|
Story Points: | --- |
Clone Of: | 1162727 | Environment: | |
Last Closed: | 2016-05-10 23:51:13 UTC | Type: | Bug |
Bug Depends On: | 1162727 | ||
Bug Blocks: |
Description
Radek Steiger
2015-02-17 13:25:44 UTC
> Version-Release number of selected component (if applicable):
pacemaker-1.1.12-4.el6.x86_64
pcs-0.9.138-1.el6.x86_64
corosync-1.4.7-1.el6.x86_64
cman-3.0.12.1-68.el6.x86_64
Bug #1162727 has the list of patches required here.

Patches:
0eb41da: Fix: attrd: Remove offline nodes from node cache for "peer-remove" requests
ba8d3cd: Fix: membership: Prevent use-after-free in reap_crm_member()
68d5738: Fix: cluster: Remove unknown offline nodes with conflicting unames from node cache
c97575b: Fix: crmd: Remove state of unknown nodes with conflicting unames from CIB
50ffa21: Fix: crmd: Remove unknown nodes with conflicting unames from CIB
ddccf97: Fix: Membership: Detect and resolve nodes that change their ID
371e79c: Fix: attrd: Clean out the node cache when requested by the admin
b658b2b: Fix: attrd: Simplify how node deletions happen
bf15d36: Fix: cib: Avoid nodeid conflicts we don't care about
30a1ba9: Fix: fencing: Allow nodes to be purged from the member cache
c8b413f: Fix: crm_node: Correctly remove nodes from the CIB by nodeid
0b98ef1: Fix: stonith-ng: Correctly track node state
72b3a9a: Fix: stonith-ng: No reply is needed for CRM_OP_RM_NODE_CACHE
e48a7a0: Fix: cib: Correctly track node state
f51c05d: Fix: cluster: Invoke crm_remove_conflicting_peer() only when the new node's uname is being assigned in the node cache
and the lib/cluster portion of:
8727a4f: Feature: Allow fail-counts to be removed en-mass when the new attrd is in operation

A fix for upstream is pending testing but will not make it in time for 6.7.

Fixed upstream as of commit 49fd91f.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2016-0856.html
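
The failure mode in the Doc Text (a recycled node ID colliding with stale peer cache data) can be illustrated with a toy model. The sketch below is not Pacemaker code and does not mirror its C API; the names `Peer`, `PeerCache`, `remove_conflicting`, and `reap` are hypothetical and chosen only for illustration. It assumes a cache keyed by node ID in which the cache itself purges offline entries that conflict with a newly assigned ID or uname, rather than leaving that to each caller.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Peer:
    node_id: int
    uname: str
    online: bool = True


class PeerCache:
    """Toy peer cache keyed by node ID (hypothetical, not Pacemaker's API)."""

    def __init__(self) -> None:
        self._peers: Dict[int, Peer] = {}

    def remove_conflicting(self, node_id: int, uname: str) -> int:
        """Drop offline entries that clash with (node_id, uname).

        A clash is an entry with the same uname under a different ID,
        or the same ID under a different uname. Online entries are
        left alone. Returns the number of entries removed.
        """
        stale = [
            p.node_id
            for p in self._peers.values()
            if not p.online
            and (
                (p.uname == uname and p.node_id != node_id)
                or (p.node_id == node_id and p.uname != uname)
            )
        ]
        for nid in stale:
            del self._peers[nid]
        return len(stale)

    def add(self, node_id: int, uname: str) -> Peer:
        """Record a peer, purging stale conflicting entries first."""
        self.remove_conflicting(node_id, uname)
        peer = Peer(node_id, uname)
        self._peers[node_id] = peer
        return peer

    def mark_offline(self, node_id: int) -> None:
        if node_id in self._peers:
            self._peers[node_id].online = False

    def reap(self, node_id: int) -> bool:
        """Explicitly remove a node; True if an entry was present."""
        return self._peers.pop(node_id, None) is not None

    def get(self, node_id: int) -> Optional[Peer]:
        return self._peers.get(node_id)


cache = PeerCache()
cache.add(1, "node-a")
cache.mark_offline(1)
cache.add(4, "node-a")   # same uname, new ID: the old entry for ID 1 is purged
cache.add(3, "node-c")
cache.mark_offline(3)
cache.add(3, "node-d")   # recycled ID 3 now refers to a different host
```

Centralising the purge in `add` reflects the idea in the Fix note that the library, not each daemon, is responsible for reaping; the real implementation differs in detail.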