Bug 1186692
Summary: | cluster node removal should verify possible loss of quorum | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Radek Steiger <rsteiger> | ||||
Component: | pcs | Assignee: | Tomas Jelinek <tojeline> | ||||
Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> | ||||
Severity: | unspecified | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 7.1 | CC: | cfeist, cluster-maint, tojeline | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | Unspecified | ||||||
OS: | Unspecified | ||||||
Whiteboard: | |||||||
Fixed In Version: | pcs-0.9.140-1.el7 | Doc Type: | Bug Fix | ||||
Doc Text: |
Cause:
A user removes a node from a cluster in which some nodes are not running.
Consequence:
The cluster loses quorum.
Fix:
pcs detects whether removing a node would cause a loss of quorum and, if so, does not remove the node.
Result:
The user is informed that removing the node would cause the cluster to lose quorum. To remove the node anyway, the user has to run the command with the --force flag.
|
Story Points: | --- | ||||
Clone Of: | Environment: | ||||||
Last Closed: | 2015-11-19 09:34:26 UTC | Type: | Bug | ||||
Bug Depends On: | 1180506 | ||||||
Bug Blocks: | |||||||
Attachments: |
|
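The check described in the Doc Text can be illustrated with a minimal sketch. It is not the code from the attached fix or from pcs itself. It assumes a plain majority quorum with one vote per node and ignores corosync votequorum options such as two_node, auto_tie_breaker or last_man_standing. The function names `quorum_threshold` and `removal_loses_quorum` are hypothetical.

```python
def quorum_threshold(total_votes):
    """Votes needed for quorum under a simple majority rule (assumption)."""
    return total_votes // 2 + 1


def removal_loses_quorum(all_nodes, online_nodes, node_to_remove):
    """Return True if removing node_to_remove would leave the cluster inquorate.

    Assumes every node carries exactly one vote (hypothetical simplification).
    """
    # Nodes that stay in the cluster configuration after the removal.
    remaining = set(all_nodes) - {node_to_remove}
    # Of those, the ones that are currently running and can vote.
    remaining_online = set(online_nodes) & remaining
    return len(remaining_online) < quorum_threshold(len(remaining))


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3"]
    # node2 is stopped; removing node3 leaves 1 online vote out of 2 expected.
    print(removal_loses_quorum(nodes, ["node1", "node3"], "node3"))
    # All nodes running; removing node3 leaves 2 online votes out of 2 expected.
    print(removal_loses_quorum(nodes, nodes, "node3"))
```

Under these assumptions, the first call corresponds to the scenario in this report: with one of three nodes stopped, removing another leaves a single running node in a two-node configuration, which is below the majority of two, so the removal should be refused unless forced.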
Description
Radek Steiger
2015-01-28 10:22:25 UTC
Created attachment 997107
proposed fix
Test:

[root@rh70-node1:~]# pcs status nodes both
Corosync Nodes:
Online: rh70-node1 rh70-node2 rh70-node3
Offline:
Pacemaker Nodes:
Online: rh70-node1 rh70-node2 rh70-node3
Standby:
Offline:
[root@rh70-node1:~]# pcs cluster stop rh70-node2
rh70-node2: Stopping Cluster (pacemaker)...
rh70-node2: Stopping Cluster (corosync)...
[root@rh70-node1:~]# pcs cluster node remove rh70-node3
Error: Removing the node will cause a loss of the quorum, use --force to override
[root@rh70-node1:~]# echo $?
1
[root@rh70-node1:~]# pcs status nodes both
Corosync Nodes:
Online: rh70-node1 rh70-node3
Offline: rh70-node2
Pacemaker Nodes:
Online: rh70-node1 rh70-node3
Standby:
Offline: rh70-node2
[root@rh70-node1:~]# pcs cluster node remove rh70-node3 --force
rh70-node3: Stopping Cluster (pacemaker)...
rh70-node3: Successfully destroyed cluster
rh70-node1: Corosync updated
rh70-node2: Corosync updated
[root@rh70-node1:~]# echo $?
0
[root@rh70-node1:~]# pcs status nodes both
Corosync Nodes:
Online: rh70-node1
Offline: rh70-node2
Pacemaker Nodes:
Online: rh70-node1
Standby:
Offline: rh70-node2

Before Fix:

[root@rh71-node1 ~]# rpm -q pcs
pcs-0.9.137-13.el7_1.2.x86_64
[root@rh71-node1:~]# pcs status nodes both
Corosync Nodes:
Online: rh71-node1 rh71-node2 rh71-node3
Offline:
Pacemaker Nodes:
Online: rh71-node1 rh71-node2 rh71-node3
Standby:
Offline:
[root@rh71-node1:~]# pcs cluster stop rh71-node2
rh71-node2: Stopping Cluster (pacemaker)...
rh71-node2: Stopping Cluster (corosync)...
[root@rh71-node1:~]# pcs cluster node remove rh71-node3
rh71-node3: Stopping Cluster (pacemaker)...
rh71-node3: Successfully destroyed cluster
rh71-node1: Corosync updated
rh71-node2: Corosync updated

After Fix:

[root@rh71-node1:~]# rpm -q pcs
pcs-0.9.140-1.el6.x86_64
[root@rh71-node1:~]# pcs status nodes both
Corosync Nodes:
Online: rh71-node1 rh71-node2 rh71-node3
Offline:
Pacemaker Nodes:
Online: rh71-node1 rh71-node2 rh71-node3
Standby:
Offline:
[root@rh71-node1:~]# pcs cluster stop rh71-node2
rh71-node2: Stopping Cluster (pacemaker)...
rh71-node2: Stopping Cluster (corosync)...
[root@rh71-node1:~]# pcs cluster node remove rh71-node3
Error: Removing the node will cause a loss of the quorum, use --force to override
[root@rh71-node1:~]# echo $?
1
[root@rh71-node1:~]# pcs status nodes both
Corosync Nodes:
Online: rh71-node1 rh71-node3
Offline: rh71-node2
Pacemaker Nodes:
Online: rh71-node1 rh71-node3
Standby:
Offline: rh71-node2
[root@rh71-node1:~]# pcs cluster node remove rh71-node3 --force
rh71-node3: Stopping Cluster (pacemaker)...
rh71-node3: Successfully destroyed cluster
rh71-node1: Corosync updated
rh71-node2: Corosync updated
[root@rh71-node1:~]# echo $?
0
[root@rh71-node1:~]# pcs status nodes both
Corosync Nodes:
Online: rh71-node1
Offline: rh71-node2
Pacemaker Nodes:
Online: rh71-node1
Standby:
Offline: rh71-node2

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-2290.html