Bug 1315992

Summary: pcs node maintenance / standby should edit CIB directly instead of running crm tools
Product: Red Hat Enterprise Linux 7 Reporter: Tomas Jelinek <tojeline>
Component: pcs Assignee: Tomas Jelinek <tojeline>
Status: CLOSED ERRATA QA Contact: cluster-qe <cluster-qe>
Severity: unspecified Docs Contact:
Priority: medium    
Version: 7.2 CC: cfeist, cluster-maint, idevat, mlisik, omular, rsteiger, tlavigne, tojeline
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pcs-0.9.158-2.el7 Doc Type: Bug Fix
Doc Text:
Cause: A user wants to enable standby or maintenance mode for several nodes.
Consequence: The mode was changed one node at a time, causing unnecessary load as Pacemaker moved resources from node to node.
Fix: Add support for changing the mode of multiple nodes at once.
Result: Reduced load on cluster nodes and an improved user experience (the command no longer needs to be run once per node).
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-01 18:22:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1420101    
Attachments:
Description     Flags
proposed fix    none
fix --help      none

Description Tomas Jelinek 2016-03-09 08:21:42 UTC
When putting nodes into, or taking them out of, maintenance or standby mode, pcs runs the crm_attribute or crm_standby tool. If the user specifies more than one node, pcs runs these tools in a loop, modifying one node per iteration, because the tools can work with only one node at a time.

We want pcs to edit the CIB directly instead of running these commands. This makes changing the mode of multiple nodes an atomic operation and avoids needlessly moving resources from node to node (for example, when resources are moved to a node that is itself going to be put into standby in the next iteration).
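
To illustrate the idea (this is not the attached patch), here is a minimal Python sketch that sets a node attribute on several nodes in one in-memory CIB document, so the whole change could be pushed to the cluster as a single update. The trimmed CIB, the node names, the nvpair id scheme, and the helper names set_node_attribute and nodes_with are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed CIB; a real CIB carries many more sections.
CIB_XML = """
<cib>
  <configuration>
    <nodes>
      <node id="1" uname="node1"/>
      <node id="2" uname="node2"/>
      <node id="3" uname="node3"/>
    </nodes>
  </configuration>
</cib>
"""


def set_node_attribute(cib, node_names, name, value):
    """Set name=value on every listed node of the in-memory CIB tree."""
    nodes = {n.get("uname"): n for n in cib.iterfind("./configuration/nodes/node")}
    missing = [n for n in node_names if n not in nodes]
    if missing:
        # Fail before touching anything so the change stays all-or-nothing.
        raise ValueError("unknown node(s): " + ", ".join(missing))
    for uname in node_names:
        node = nodes[uname]
        attrs = node.find("instance_attributes")
        if attrs is None:
            # The id scheme here is illustrative only.
            attrs = ET.SubElement(
                node, "instance_attributes", id="nodes-" + node.get("id")
            )
        nvpair = None
        for candidate in attrs.iterfind("nvpair"):
            if candidate.get("name") == name:
                nvpair = candidate
                break
        if nvpair is None:
            nvpair = ET.SubElement(
                attrs, "nvpair", id=attrs.get("id") + "-" + name, name=name
            )
        nvpair.set("value", value)
    return cib


def nodes_with(cib, name, value):
    """Return unames of nodes whose attribute `name` equals `value`."""
    found = []
    for node in cib.iterfind("./configuration/nodes/node"):
        for nvpair in node.iterfind("instance_attributes/nvpair"):
            if nvpair.get("name") == name and nvpair.get("value") == value:
                found.append(node.get("uname"))
    return found


cib = ET.fromstring(CIB_XML)
set_node_attribute(cib, ["node2", "node3"], "standby", "on")
print(nodes_with(cib, "standby", "on"))
```

Because both nodes are modified in the same document before anything is sent to the cluster, Pacemaker would see one configuration change rather than two consecutive ones. How pcs actually pushes the modified CIB is outside the scope of this sketch.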

Comment 2 Tomas Jelinek 2016-11-25 11:08:06 UTC
Created attachment 1224213 [details]
proposed fix

Tests are in the patch.

Basically we need to test that standby and maintenance commands work as before and additionally that it is now possible to specify more nodes in the commands.

Comment 4 Ivan Devat 2017-02-20 07:43:25 UTC
After Fix:
[vm-rhel72-1 ~] $ rpm -q pcs
pcs-0.9.156-1.el7.x86_64

[vm-rhel72-1 ~] $ pcs status|grep ^Online:
Online: [ vm-rhel72-1 vm-rhel72-2 vm-rhel72-3 ]

[vm-rhel72-1 ~] $ pcs node standby vm-rhel72-2 vm-rhel72-3 --wait
[vm-rhel72-1 ~] $ pcs status|grep "^\(Online:\|Node vm-rhel\)"
Node vm-rhel72-2: standby
Node vm-rhel72-3: standby
Online: [ vm-rhel72-1 ]

[vm-rhel72-1 ~] $ pcs node unstandby vm-rhel72-2 vm-rhel72-3 --wait
[vm-rhel72-1 ~] $ pcs status|grep ^Online:
Online: [ vm-rhel72-1 vm-rhel72-2 vm-rhel72-3 ]
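
The manual check in the transcript above could be scripted. The following Python sketch parses only the two line shapes that appear in the grep output shown here ("Online: [ ... ]" and "Node <name>: standby"). The helper name parse_node_states is an assumption, and real pcs status output contains other line types that this sketch ignores.

```python
import re


def parse_node_states(status_text):
    """Return (online, standby) node name lists from pcs status text."""
    online, standby = [], []
    for line in status_text.splitlines():
        match = re.match(r"^Online: \[ (.*) \]$", line)
        if match:
            online = match.group(1).split()
            continue
        match = re.match(r"^Node (\S+): standby$", line)
        if match:
            standby.append(match.group(1))
    return online, standby


# Sample copied from the transcript above.
sample = """Node vm-rhel72-2: standby
Node vm-rhel72-3: standby
Online: [ vm-rhel72-1 ]"""

online, standby = parse_node_states(sample)
print(online, standby)
```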

> Basically we need to test that standby and maintenance commands work as before and additionally that it is now possible to specify more nodes in the commands.

Comment 8 Tomas Jelinek 2017-05-03 16:08:28 UTC
Created attachment 1275934 [details]
fix --help

--help fixed for the following commands:
* pcs booth
* pcs node
* pcs qdevice
* pcs quorum
* pcs resource
* pcs stonith

Comment 9 Tomas Jelinek 2017-05-26 10:58:47 UTC
After fix:

[root@rh73-node1:~]# rpm -q pcs
pcs-0.9.158-2.el7.x86_64
[root@rh73-node1:~]# for i in attribute maintenance unmaintenance standby unstandby utilization; do pcs node $i --help; done

Usage: pcs node <command>
    attribute [[<node>] [--name <name>] | <node> <name>=<value> ...]
        Manage node attributes.  If no parameters are specified, show attributes
        of all nodes.  If one parameter is specified, show attributes
        of specified node.  If --name is specified, show specified attribute's
        value from all nodes.  If more parameters are specified, set attributes
        of specified node.  Attributes can be removed by setting an attribute
        without a value.


Usage: pcs node <command>
    maintenance [--all | <node>...] [--wait[=n]]
        Put specified node(s) into maintenance mode, if no nodes or options are
        specified the current node will be put into maintenance mode, if --all
        is specified all nodes will be put into maintenance mode.
        If --wait is specified, pcs will wait up to 'n' seconds for the node(s)
        to be put into maintenance mode and then return 0 on success or 1 if
        the operation not succeeded yet. If 'n' is not specified it defaults
        to 60 minutes.


Usage: pcs node <command>
    unmaintenance [--all | <node>...] [--wait[=n]]
        Remove node(s) from maintenance mode, if no nodes or options are
        specified the current node will be removed from maintenance mode,
        if --all is specified all nodes will be removed from maintenance mode.
        If --wait is specified, pcs will wait up to 'n' seconds for the node(s)
        to be removed from maintenance mode and then return 0 on success or 1 if
        the operation not succeeded yet. If 'n' is not specified it defaults
        to 60 minutes.


Usage: pcs node <command>
    standby [--all | <node>...] [--wait[=n]]
        Put specified node(s) into standby mode (the node specified will no
        longer be able to host resources), if no nodes or options are specified
        the current node will be put into standby mode, if --all is specified
        all nodes will be put into standby mode.
        If --wait is specified, pcs will wait up to 'n' seconds for the node(s)
        to be put into standby mode and then return 0 on success or 1 if
        the operation not succeeded yet. If 'n' is not specified it defaults
        to 60 minutes.


Usage: pcs node <command>
    unstandby [--all | <node>...] [--wait[=n]]
        Remove node(s) from standby mode (the node specified will now be able to
        host resources), if no nodes or options are specified the current node
        will be removed from standby mode, if --all is specified all nodes will
        be removed from standby mode.
        If --wait is specified, pcs will wait up to 'n' seconds for the node(s)
        to be removed from standby mode and then return 0 on success or 1 if
        the operation not succeeded yet. If 'n' is not specified it defaults
        to 60 minutes.


Usage: pcs node <command>
    utilization [[<node>] [--name <name>] | <node> <name>=<value> ...]
        Add specified utilization options to specified node.  If node is not
        specified, shows utilization of all nodes.  If --name is specified,
        shows specified utilization value from all nodes. If utilization options
        are not specified, shows utilization of specified node.  Utilization
        option should be in format name=value, value has to be integer.  Options
        may be removed by setting an option without a value.
        Example: pcs node utilization node1 cpu=4 ram=
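
The node selection and --wait semantics described in the help text above can be summarized in a short Python sketch. This is a reading of the help text, not the pcs implementation. The function names, the polling approach, and the treatment of --all combined with explicit node names as an error are assumptions.

```python
import time

# Help text: if 'n' is not specified, --wait defaults to 60 minutes.
DEFAULT_WAIT_SECONDS = 60 * 60


def select_nodes(cluster_nodes, current_node, requested=(), use_all=False):
    """Pick target nodes following the usage line [--all | <node>...]."""
    if use_all and requested:
        # Assumption: the usage line presents these as alternatives.
        raise ValueError("--all cannot be combined with node names")
    if use_all:
        return list(cluster_nodes)
    if requested:
        return list(requested)
    # No nodes and no options: act on the current node.
    return [current_node]


def wait_exit_code(is_done, timeout=None, poll_interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Return 0 once is_done() is true, or 1 if the timeout passes first."""
    if timeout is None:
        timeout = DEFAULT_WAIT_SECONDS
    deadline = clock() + timeout
    while True:
        if is_done():
            return 0
        if clock() >= deadline:
            return 1
        sleep(poll_interval)


print(select_nodes(["node1", "node2", "node3"], "node1"))
print(select_nodes(["node1", "node2", "node3"], "node1", use_all=True))
print(wait_exit_code(lambda: True))
```

The clock and sleep parameters exist only so the timeout path can be exercised without real waiting.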

Comment 13 errata-xmlrpc 2017-08-01 18:22:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1958