Description of problem:
It is not possible to remove a node from a cluster if pcsd is not running on the node or the node itself is not running.

Version-Release number of selected component (if applicable):
pcs-0.9.137-15.el7

How reproducible:
always

Steps to Reproduce:
1. create a cluster
2. shut down a node
3. try to remove the node from the cluster using 'pcs cluster node remove <nodename>'

Actual results:
Error: pcsd is not running on <nodename>

Expected results:
The node is removed from the cluster. We probably want to warn the user first and allow removal of the node only when the --force switch is used.

Additional info:
Workaround:
1. run 'pcs cluster localnode remove <nodename>' on all remaining nodes
2. run 'pcs cluster reload corosync' on one node
3. run 'crm_node -R <nodename> --force' on one node
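For reference, the workaround as plain commands (a sketch only; <nodename> is a placeholder for the node to be removed, and step 1 must be run on every remaining node):

# step 1: on every remaining cluster node
pcs cluster localnode remove <nodename>

# step 2: on any one remaining node
pcs cluster reload corosync

# step 3: on any one remaining node
crm_node -R <nodename> --force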
Hi, I also hit this bug. Thanks for the workaround. Regards,
Created attachment 1181676 [details]
proposed fix

Test:

> Let's have a three node cluster
[root@rh72-node1:~]# pcs status nodes both
Corosync Nodes:
 Online: rh72-node1 rh72-node2 rh72-node3
 Offline:
Pacemaker Nodes:
 Online: rh72-node1 rh72-node2 rh72-node3
 Standby:
 Maintenance:
 Offline:
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline:

> Power off one node ...
[root@rh72-node1:~]# pcs status nodes both
Corosync Nodes:
 Online: rh72-node1 rh72-node2
 Offline: rh72-node3
Pacemaker Nodes:
 Online: rh72-node1 rh72-node2
 Standby:
 Maintenance:
 Offline: rh72-node3
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline:

> ... and remove it from the cluster
[root@rh72-node1:~]# pcs cluster node remove rh72-node3
Error: pcsd is not running on rh72-node3, use --force to override
[root@rh72-node1:~]# pcs cluster node remove rh72-node3 --force
rh72-node3: Unable to connect to rh72-node3 ([Errno 113] No route to host)
rh72-node3: Unable to connect to rh72-node3 ([Errno 113] No route to host)
Warning: unable to destroy cluster
rh72-node3: Unable to connect to rh72-node3 ([Errno 113] No route to host)
rh72-node2: Corosync updated
rh72-node1: Corosync updated
[root@rh72-node1:~]# pcs status nodes both
Corosync Nodes:
 Online: rh72-node1 rh72-node2
 Offline:
Pacemaker Nodes:
 Online: rh72-node1 rh72-node2
 Standby:
 Maintenance:
 Offline:
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline:
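(A side note, not part of the test above: whether pcsd is reachable on a node can be checked in advance with the pcsd-status subcommand. This assumes pcs 0.9.x; the exact output wording may differ.)

[root@rh72-node1:~]# pcs cluster pcsd-status rh72-node3
rh72-node3: Offline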
Created attachment 1181958 [details]
proposed fix: web UI

fix for web UI
Setup:

[vm-rhel72-1 ~] $ pcs status nodes both
Corosync Nodes:
 Online: vm-rhel72-1 vm-rhel72-2 vm-rhel72-3
 Offline:
Pacemaker Nodes:
 Online: vm-rhel72-1 vm-rhel72-2 vm-rhel72-3
 Standby:
 Maintenance:
 Offline:
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline:

> Power off one node ...
[vm-rhel72-1 ~] $ pcs status nodes both
Corosync Nodes:
 Online: vm-rhel72-1 vm-rhel72-3
 Offline: vm-rhel72-2
Pacemaker Nodes:
 Online: vm-rhel72-1 vm-rhel72-3
 Standby:
 Maintenance:
 Offline: vm-rhel72-2
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline:

Before Fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.152-4.el7.x86_64
[vm-rhel72-1 ~] $ pcs cluster node remove vm-rhel72-2
Error: pcsd is not running on vm-rhel72-2

After Fix:

[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.152-5.el7.x86_64
[vm-rhel72-1 ~] $ pcs cluster node remove vm-rhel72-2
Error: pcsd is not running on vm-rhel72-2, use --force to override
[vm-rhel72-1 ~] $ pcs cluster node remove vm-rhel72-2 --force
vm-rhel72-2: Unable to connect to vm-rhel72-2 ([Errno 111] Connection refused)
vm-rhel72-2: Unable to connect to vm-rhel72-2 ([Errno 111] Connection refused)
Warning: unable to destroy cluster
vm-rhel72-2: Unable to connect to vm-rhel72-2 ([Errno 111] Connection refused)
vm-rhel72-1: Corosync updated
vm-rhel72-3: Corosync updated
[vm-rhel72-1 ~] $ pcs status nodes both
Corosync Nodes:
 Online: vm-rhel72-1 vm-rhel72-3
 Offline:
Pacemaker Nodes:
 Online: vm-rhel72-1 vm-rhel72-3
 Standby:
 Maintenance:
 Offline:
Pacemaker Remote Nodes:
 Online:
 Standby:
 Maintenance:
 Offline:
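(Not part of the original verification, just an extra sanity check: the removed node should also be gone from the corosync nodelist on the remaining nodes. This assumes the default RHEL 7 corosync.conf layout with ring0_addr entries.)

[vm-rhel72-1 ~] $ grep ring0_addr /etc/corosync/corosync.conf
  ring0_addr: vm-rhel72-1
  ring0_addr: vm-rhel72-3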
*** Bug 1376209 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2596.html