+++ This bug was initially created as a clone of Bug #1376556 +++

(This is supported for cluster nodes as of 7.5; this bz is to request the same support for Pacemaker Remote nodes.)

Description of problem:

There are many instances where you might know that you don't want Pacemaker to manage any resources when you start the cluster, and would instead prefer to have the node you're starting come up in standby mode immediately.

I encounter this often when I discover I've horribly broken the configuration for some resource: rather than trying to untangle the web of what is messed up, I might prefer to reboot and start fresh. But when the node finishes booting, I still have to deal with the problem that as soon as I start it, it's just going to start trying to manage resources again.

I've also encountered this with customers who may be in a maintenance window and bringing nodes back online, but who want to control when they start bringing applications up.

Having the cluster start with either maintenance-mode or some/all nodes in standby would be ideal, but there's no straightforward way to do that other than hoping to get a standby request processed quickly enough to avoid any other work happening.

So, it'd be nice if pcs could offer some mechanism to start a node in standby mode automatically, and/or possibly to start with the cluster in maintenance-mode.

Version-Release number of selected component (if applicable):
All releases of pacemaker

How reproducible:

Steps to Reproduce:
1. Want to bring up the cluster on one or all nodes without managing any resources

Actual results:
Can't do it

Expected results:
Have the ability to run a single command so that a node joins the cluster in standby mode.

Additional info:

--- Additional comment from Ken Gaillot on 2017-02-20 12:06:47 EST ---

FYI, partial support has been merged upstream:
https://github.com/ClusterLabs/pacemaker/pull/1141

Currently, only cluster nodes are supported. We should be able to get that into 7.4, though we probably shouldn't advertise or support it until remote node support is added (which is planned, but no time frame is available yet).

--- Additional comment from Ken Gaillot on 2017-10-16 13:18:00 EDT ---

This will be supported in 7.5 for cluster nodes (only). I will clone this bz to request remote node support in a future release.

QA: To test, create a cluster, then stop one node, or prepare a machine to be added as a new node. Add PCMK_node_start_state to the node's /etc/sysconfig/pacemaker with one of these values:

* "default" (unsurprisingly, the default) will use the current value of the node's "standby" node attribute (the only behavior supported by previous releases).
* "online" will force the node to join the cluster in online mode, even if it was put into standby mode before being stopped.
* "standby" will force the node to join the cluster in standby mode.
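For cluster nodes, the test flow described above boils down to a few commands. A minimal sketch, assuming a node named node1 (hypothetical) and the same sysconfig file used in the transcripts below:

# From any cluster node: stop cluster services on the node under test.
pcs cluster stop node1

# On node1 itself: set the desired start state (appends a new line;
# assumes PCMK_node_start_state is not already present in the file).
echo 'PCMK_node_start_state="standby"' >> /etc/sysconfig/pacemaker

# From any cluster node: start it again and check how it joined.
pcs cluster start node1
pcs status    # node1 should be listed under "Node List" as standby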
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.
This is still a goal, but it will be tracked via the upstream bz.
Support for remote nodes was added to the upstream main branch as of commit 76bd508cc
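For remote nodes, the equivalent flow goes through the remote daemon and the remote connection resource rather than cluster services. A hedged sketch (the resource name remote1 is hypothetical; the unit name follows the upstream pacemaker_remote.service unit file, while the QA transcripts below use "pacemaker-remote"):

# On the remote node itself: set the start state and restart the remote daemon.
echo 'PCMK_node_start_state="standby"' >> /etc/sysconfig/pacemaker
systemctl restart pacemaker_remote

# On a cluster node: bounce the connection resource so the new start
# state is read when the connection is re-established.
pcs resource disable remote1
pcs resource enable remote1
pcs status    # expect "RemoteNode remote1: standby" in the node list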
Version of pacemaker:

> [root@virt-248:~]# rpm -q pacemaker
> pacemaker-2.1.6-3.el8.x86_64

Setting of 5-node cluster -> 2 cluster nodes and 3 remote nodes:

> [root@virt-248:~]# pcs status
> Cluster name: STSRHTS8954
> Cluster Summary:
>   * Stack: corosync (Pacemaker is running)
>   * Current DC: virt-249 (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum
>   * Last updated: Mon Jul 17 14:35:21 2023 on virt-248
>   * Last change: Mon Jul 17 14:35:04 2023 by root via cibadmin on virt-248
>   * 5 nodes configured
>   * 8 resource instances configured
>
> Node List:
>   * Online: [ virt-248 virt-249 ]
>   * RemoteOnline: [ virt-256 virt-257 virt-261 ]
>
> Full List of Resources:
>   * fence-virt-248 (stonith:fence_xvm): Started virt-249
>   * fence-virt-249 (stonith:fence_xvm): Started virt-248
>   * fence-virt-256 (stonith:fence_xvm): Started virt-249
>   * fence-virt-257 (stonith:fence_xvm): Started virt-249
>   * fence-virt-261 (stonith:fence_xvm): Started virt-248
>   * virt-256 (ocf::pacemaker:remote): Started virt-248
>   * virt-257 (ocf::pacemaker:remote): Started virt-249
>   * virt-261 (ocf::pacemaker:remote): Started virt-248
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

___________________________________________________________________________________________________________

OPTION 1:

remote node to standby   change PCMK_node_start_state   expected status after
before disable?          on remote node?                enabling remote node
======================   ============================   ======================
yes                      no                             connect as standby

Setting standby state with pcs node standby:

> [root@virt-248:~]# pcs node standby virt-256 virt-257 virt-261
> [root@virt-248:~]# pcs status
> Cluster name: STSRHTS8954
> Cluster Summary:
>   * Stack: corosync (Pacemaker is running)
>   * Current DC: virt-249 (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum
>   * Last updated: Mon Jul 17 22:06:12 2023 on virt-248
>   * Last change: Mon Jul 17 22:06:07 2023 by root via cibadmin on virt-248
>   * 5 nodes configured
>   * 8 resource instances configured
>
> Node List:
>   * RemoteNode virt-256: standby
>   * RemoteNode virt-257: standby
>   * RemoteNode virt-261: standby
>   * Online: [ virt-248 virt-249 ]

Disabling remote nodes:

> [root@virt-248:~]# pcs resource disable virt-256
> [root@virt-248:~]# pcs resource disable virt-257
> [root@virt-248:~]# pcs resource disable virt-261
> [root@virt-248:~]# pcs status
> ...
> Node List:
>   * Online: [ virt-248 virt-249 ]
>   * RemoteOFFLINE: [ virt-256 virt-257 virt-261 ]
> ...

Enabling remote nodes:

> [root@virt-248:~]# pcs resource enable virt-257
> [root@virt-248:~]# pcs resource enable virt-256
> [root@virt-248:~]# pcs resource enable virt-261
> [root@virt-248:~]# pcs status
> Cluster name: STSRHTS8954
> Cluster Summary:
>   * Stack: corosync (Pacemaker is running)
>   * Current DC: virt-249 (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum
>   * Last updated: Mon Jul 17 22:10:55 2023 on virt-248
>   * Last change: Mon Jul 17 22:10:50 2023 by root via cibadmin on virt-248
>   * 5 nodes configured
>   * 8 resource instances configured
>
> Node List:
>   * RemoteNode virt-256: standby
>   * RemoteNode virt-257: standby
>   * RemoteNode virt-261: standby
>   * Online: [ virt-248 virt-249 ]

RESULT: Works as expected for this option -> all remote nodes come back in standby state.
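The standby state exercised by OPTION 1 is stored as a permanent node attribute in the CIB, which is why it survives the disable/enable cycle. A sketch of how to inspect it directly (standard pcs/pacemaker tools; node name taken from the transcript):

# List node attributes; the standby nodes should show standby=on.
pcs node attribute

# Or query the permanent "standby" attribute for one remote node.
crm_attribute --node virt-256 --name standby --query --lifetime forever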
___________________________________________________________________________________________________________

OPTION 2:

remote node to standby   change PCMK_node_start_state   expected status after
before disable?          on remote node?                enabling remote node
======================   ============================   ======================
yes                      yes, to "online"               connect as online
yes                      yes, to "standby"              connect as standby
yes                      yes, to "default"              connect as standby

Setting standby state with pcs node standby:

> [root@virt-248:~]# pcs node standby virt-256 virt-257 virt-261
> [root@virt-248:~]# pcs status
> Cluster name: STSRHTS8954
> Cluster Summary:
>   * Stack: corosync (Pacemaker is running)
>   * Current DC: virt-249 (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum
>   * Last updated: Mon Jul 17 23:29:17 2023 on virt-248
>   * Last change: Mon Jul 17 23:29:13 2023 by root via cibadmin on virt-248
>   * 5 nodes configured
>   * 8 resource instances configured
>
> Node List:
>   * RemoteNode virt-256: standby
>   * RemoteNode virt-257: standby
>   * RemoteNode virt-261: standby
>   * Online: [ virt-248 virt-249 ]

Disabling remote nodes:

> [root@virt-248:~]# pcs resource disable virt-261
> [root@virt-248:~]# pcs resource disable virt-257
> [root@virt-248:~]# pcs resource disable virt-256
> [root@virt-248:~]# pcs status
> ...
> Node List:
>   * Online: [ virt-248 virt-249 ]
>   * RemoteOFFLINE: [ virt-256 virt-257 virt-261 ]
> ...

Changing PCMK_node_start_state in /etc/sysconfig/pacemaker on each remote node to a different value - "default", "online", and "standby":

> [root@virt-261 ~]# vim /etc/sysconfig/pacemaker
> VALGRIND_OPTS="--leak-check=full --trace-children=no --vgdb=no --num-callers=25"
> VALGRIND_OPTS="$VALGRIND_OPTS --log-file=/var/lib/pacemaker/valgrind-%p"
> VALGRIND_OPTS="$VALGRIND_OPTS --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions"
> VALGRIND_OPTS="$VALGRIND_OPTS --gen-suppressions=all"
> PCMK_node_start_state="default"
> [root@virt-261 ~]# systemctl restart pacemaker-remote

> [root@virt-257 ~]# vim /etc/sysconfig/pacemaker
> VALGRIND_OPTS="--leak-check=full --trace-children=no --vgdb=no --num-callers=25"
> VALGRIND_OPTS="$VALGRIND_OPTS --log-file=/var/lib/pacemaker/valgrind-%p"
> VALGRIND_OPTS="$VALGRIND_OPTS --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions"
> VALGRIND_OPTS="$VALGRIND_OPTS --gen-suppressions=all"
> PCMK_node_start_state="online"
> [root@virt-257 ~]# systemctl restart pacemaker-remote

> [root@virt-256 ~]# vim /etc/sysconfig/pacemaker
> VALGRIND_OPTS="--leak-check=full --trace-children=no --vgdb=no --num-callers=25"
> VALGRIND_OPTS="$VALGRIND_OPTS --log-file=/var/lib/pacemaker/valgrind-%p"
> VALGRIND_OPTS="$VALGRIND_OPTS --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions"
> VALGRIND_OPTS="$VALGRIND_OPTS --gen-suppressions=all"
> PCMK_node_start_state="standby"
> [root@virt-256 ~]# systemctl restart pacemaker-remote

Enabling remote nodes:

> [root@virt-248:~]# pcs resource enable virt-261
> [root@virt-248:~]# pcs resource enable virt-256
> [root@virt-248:~]# pcs resource enable virt-257
> [root@virt-248:~]# pcs status
> Cluster name: STSRHTS8954
> Cluster Summary:
>   * Stack: corosync (Pacemaker is running)
>   * Current DC: virt-249 (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum
>   * Last updated: Mon Jul 17 23:33:05 2023 on virt-248
>   * Last change: Mon Jul 17 23:33:01 2023 by root via cibadmin on virt-248
>   * 5 nodes configured
>   * 8 resource instances configured
>
> Node List:
>   * RemoteNode virt-256: standby
>   * RemoteNode virt-261: standby
>   * Online: [ virt-248 virt-249 ]
>   * RemoteOnline: [ virt-257 ]

RESULT: Works as expected for this option -> 2 remote nodes have standby state (the RNs with "default" and "standby") and 1 remote node is online (the RN with "online").
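The three per-node edits above could also be scripted from one host. A sketch, assuming root ssh access to the remote nodes and that PCMK_node_start_state is not already set in the file (states match the transcript; unit name per the upstream unit file):

# Assign a different start state to each remote node, then restart the
# remote daemon so the new value is picked up at the next connection.
declare -A state=( [virt-256]=standby [virt-257]=online [virt-261]=default )
for host in "${!state[@]}"; do
  ssh "root@$host" "echo 'PCMK_node_start_state=\"${state[$host]}\"' >> /etc/sysconfig/pacemaker &&
                    systemctl restart pacemaker_remote"
done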
"online"). ___________________________________________________________________________________________________________ OPTION 3: remote node to standby change PCMK_node_start_state expected status after before disable? on remote node? enable remote node ===================== ============================ ======================= no no connect as online Setting of cluster -> all remote nodes = online: > [root@virt-248:~]# pcs status > Cluster name: STSRHTS8954 > Cluster Summary: > * Stack: corosync (Pacemaker is running) > * Current DC: virt-249 (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum > * Last updated: Mon Jul 17 22:37:24 2023 on virt-248 > * Last change: Mon Jul 17 22:35:11 2023 by root via cibadmin on virt-248 > * 5 nodes configured > * 8 resource instances configured > > Node List: > * online: [ virt-248 virt-249 ] > * Remoteonline: [ virt-256 virt-257 virt-261 ] Disabling remote nodes: > [root@virt-248:~]# pcs resource disable virt-261 > [root@virt-248:~]# pcs resource disable virt-257 > [root@virt-248:~]# pcs resource disable virt-256 > [root@virt-248:~]# pcs status > ... > Node List: > * online: [ virt-248 virt-249 ] > * RemoteOFFLIno: [ virt-256 virt-257 virt-261 ] > ... Enabling remote nodes: > [root@virt-248:~]# pcs resource enable virt-256 > [root@virt-248:~]# pcs resource enable virt-257 > [root@virt-248:~]# pcs resource enable virt-261 > [root@virt-248:~]# pcs status > Cluster name: STSRHTS8954 > Cluster Summary: > * Stack: corosync (Pacemaker is running) > * Current DC: virt-249 (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum > * Last updated: Mon Jul 17 22:39:03 2023 on virt-248 > * Last change: Mon Jul 17 22:39:00 2023 by root via cibadmin on virt-248 > * 5 nodes configured > * 8 resource instances configured > > Node List: > * online: [ virt-248 virt-249 ] > * Remoteonline: [ virt-256 virt-257 virt-261 ] RESULT: It is working well for this option -> all remote nodes are online. ___________________________________________________________________________________________________________ OPTION 4: remote node to standby change PCMK_node_start_state expected status after before disable? on remote node? enable remote node ===================== ============================ ======================= no yes, to "online" connect as online no yes, to "standby" connect as standby no yes, to "default" connect as online Setting of cluster -> all remote nodes = online: > [root@virt-248:~]# pcs status > Cluster name: STSRHTS8954 > Cluster Summary: > * Stack: corosync (Pacemaker is running) > * Current DC: virt-249 (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum > * Last updated: Mon Jul 17 22:39:03 2023 on virt-248 > * Last change: Mon Jul 17 22:39:00 2023 by root via cibadmin on virt-248 > * 5 nodes configured > * 8 resource instances configured > > Node List: > * online: [ virt-248 virt-249 ] > * Remoteonline: [ virt-256 virt-257 virt-261 ] Disabling remote nodes: > [root@virt-248:~]# pcs resource disable virt-256 > [root@virt-248:~]# pcs resource disable virt-257 > [root@virt-248:~]# pcs resource disable virt-261 > [root@virt-248:~]# pcs status > ... > Node List: > * online: [ virt-248 virt-249 ] > * RemoteOFFLIno: [ virt-256 virt-257 virt-261 ] > ... 
Changing PCMK_node_start_state in /etc/sysconfig/pacemaker on each remote node to a different value - "default", "online", and "standby":

> [root@virt-261 ~]# vim /etc/sysconfig/pacemaker
> VALGRIND_OPTS="--leak-check=full --trace-children=no --vgdb=no --num-callers=25"
> VALGRIND_OPTS="$VALGRIND_OPTS --log-file=/var/lib/pacemaker/valgrind-%p"
> VALGRIND_OPTS="$VALGRIND_OPTS --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions"
> VALGRIND_OPTS="$VALGRIND_OPTS --gen-suppressions=all"
> PCMK_node_start_state="default"
> [root@virt-261 ~]# systemctl restart pacemaker-remote

> [root@virt-257 ~]# vim /etc/sysconfig/pacemaker
> VALGRIND_OPTS="--leak-check=full --trace-children=no --vgdb=no --num-callers=25"
> VALGRIND_OPTS="$VALGRIND_OPTS --log-file=/var/lib/pacemaker/valgrind-%p"
> VALGRIND_OPTS="$VALGRIND_OPTS --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions"
> VALGRIND_OPTS="$VALGRIND_OPTS --gen-suppressions=all"
> PCMK_node_start_state="online"
> [root@virt-257 ~]# systemctl restart pacemaker-remote

> [root@virt-256 ~]# vim /etc/sysconfig/pacemaker
> VALGRIND_OPTS="--leak-check=full --trace-children=no --vgdb=no --num-callers=25"
> VALGRIND_OPTS="$VALGRIND_OPTS --log-file=/var/lib/pacemaker/valgrind-%p"
> VALGRIND_OPTS="$VALGRIND_OPTS --suppressions=/usr/share/pacemaker/tests/valgrind-pcmk.suppressions"
> VALGRIND_OPTS="$VALGRIND_OPTS --gen-suppressions=all"
> PCMK_node_start_state="standby"
> [root@virt-256 ~]# systemctl restart pacemaker-remote

Enabling remote nodes:

> [root@virt-248:~]# pcs resource enable virt-256
> [root@virt-248:~]# pcs resource enable virt-261
> [root@virt-248:~]# pcs resource enable virt-257
> [root@virt-248:~]# pcs status
> Cluster name: STSRHTS8954
> Cluster Summary:
>   * Stack: corosync (Pacemaker is running)
>   * Current DC: virt-249 (version 2.1.6-3.el8-6fdc9deea29) - partition with quorum
>   * Last updated: Tue Jul 18 00:10:48 2023 on virt-248
>   * Last change: Tue Jul 18 00:10:42 2023 by hacluster via crmd on virt-248
>   * 5 nodes configured
>   * 8 resource instances configured
>
> Node List:
>   * RemoteNode virt-256: standby
>   * Online: [ virt-248 virt-249 ]
>   * RemoteOnline: [ virt-257 virt-261 ]

RESULT: Works as expected for this option -> 2 remote nodes are online (the RNs with "default" and "online") and 1 remote node has standby state (the RN with "standby").
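After the test, a cleanup sketch to return the cluster to its initial state (standard pcs commands; the sed expression assumes the variable was added as a standalone line, as in the transcripts above):

# Clear the persistent standby attribute left over from OPTIONs 1 and 2.
pcs node unstandby virt-256 virt-257 virt-261

# Drop the start-state override so future restarts behave as "default".
for host in virt-256 virt-257 virt-261; do
  ssh "root@$host" "sed -i '/^PCMK_node_start_state=/d' /etc/sysconfig/pacemaker"
done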