Bug 902407 - Different results when moving Master/Slave resources
Summary: Different results when moving Master/Slave resources
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: pacemaker
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: rc
: ---
Assignee: Andrew Beekhof
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 987355
 
Reported: 2013-01-21 15:40 UTC by Jaroslav Kortus
Modified: 2013-11-21 12:08 UTC
6 users

Fixed In Version: pacemaker-1.1.10-12.el6
Doc Type: Bug Fix
Doc Text:
Cause: The 'crm_resource --move' command was designed for atomic resources, not clones and master/slave resources, which can be present on multiple nodes.
Consequence: crm_resource could not infer enough information to know what the admin intended, and so did nothing.
Fix: The --ban and --clear commands were added, allowing the admin to instruct the cluster unambiguously. See comment #10 for the new help text.
Result: Clones and master/slave resources can now be moved around the cluster with crm_resource.
Clone Of:
Environment:
Last Closed: 2013-11-21 12:08:27 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2013:1635 0 normal SHIPPED_LIVE Low: pacemaker security, bug fix, and enhancement update 2013-11-20 21:53:44 UTC

Description Jaroslav Kortus 2013-01-21 15:40:59 UTC
Description of problem:
When I set up a Master/Slave resource and issue pcs resource move <resource>, a different part of the set may be moved (or, in this case, shut down) each time.

After discussing this with David Vossel, it seems that the first resource found on the lowest node id wins and is always the one moved.

As a consequence, sometimes the master instance is moved and sometimes the slave instance. In the first case the master is re-elected; in both cases the slave instance is simply shut down on the node (due to the constraint created).

It would be good to have a consistent approach to what should happen from the user's point of view. My preference would be to force shutdown of the master instance and a new master election.

This bug is more a request for discussion about how this should really behave and whether the current behaviour is the best one.

Version-Release number of selected component (if applicable):
pacemaker-1.1.8-7.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. pcs resource create dummystateful ocf:pacemaker:Stateful
2. pcs resource master MasterResource dummystateful
3. pcs resource move MasterResource
  
Actual results:
The instance with the lowest node id is "moved", which may or may not be the master one.

Expected results:
I'd go for the master instance being moved at all times, or an error if there is no master yet.

Additional info:

Comment 2 RHEL Program Management 2013-01-25 06:48:00 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 3 Andrew Beekhof 2013-02-06 06:08:18 UTC
Chris, what crm_resource command does "pcs resource move MasterResource" expand to for m/s resources?


I'm thinking move for a master/slave resource doesn't make much sense (unless a node is specified and interpreted to mean the place where the resource can no longer run). 

We'd need to add support for that to crm_resource

Comment 4 Chris Feist 2013-04-10 03:56:34 UTC
If a node is specified, I use:
crm_resource --resource resource_id --move --node node_id 

If not, I use:
crm_resource --resource resource_id --move

I don't do anything special for different types of resources (master/slave, clones).

Should I prevent users from running that command with master/slave resources or clones, and just print an error message telling them that if they really want to do that they should create constraints?

Comment 5 Andrew Beekhof 2013-05-15 05:51:39 UTC
(In reply to comment #4)
> If a node is specified, I use:
> crm_resource --resource resource_id --move --node node_id 
> 
> If not, I use:
> crm_resource --resource resource_id --move
> 
> I don't do any special for different types of resources (master/slave,
> clones).
> 
> Should I prevent users from running that command with master/slave resources
> or clones and just print an error message telling them if they really want
> to do that they should be creating constraints?

Yep, I think that's what we want to do, but it should probably happen at the crm_resource level so you don't have to care.

Although if you detect it early the error message might be a little nicer :)

Jaroslav: Do you agree with the approach?

Comment 6 Jaroslav Kortus 2013-05-15 08:31:28 UTC
That would make sense. Can you please print the command creating the constraint to stderr? I think users would be glad to see the next command instead of having to look it up in the help.

Also, is there a way to move the master resource, i.e. force some other node to be the master?

Comment 7 Andrew Beekhof 2013-05-30 03:42:17 UTC
(In reply to Jaroslav Kortus from comment #6)
> That would make sense. Can you please print the command creating the
> constraint to stderr? I think users would be glad to see next command
> instead of having to look it up in help.

Makes sense.

> 
> Also, is there a way to move the master resource, i.e. force some other node
> to be the master?

Basically no :-(

How about:

1. Deprecate crm_resource --move when --host is not specified.
   --move is then just for pushing resources to a particular location, 
   not for "anywhere but here"
2. Implement --(un)ban and --(un)ban-master

# Stop a resource (including clones) from being allowed to run at a particular location
#
# If 'something' is a group or primitive:
#  - move it from its current location to 'somewhere'
# If 'something' is a m/s resource and there is only one master:
#  - move the master from its current location to another 'somewhere'
# Otherwise,
#  - error
crm_resource --move --resource something --host somewhere

# Stop a resource (including clones) from being allowed to run at a 
# particular location
#
# If --host is not specified and 'something' is a group or primitive:
#  - move it from its current location to another node
# If --host is not specified and 'something' is a m/s resource and there is
# only one master:
#  - move the master from its current location to another node
# Otherwise, if --host is not specified
#  - error
crm_resource --ban --resource something --host somewhere

# Re-allow a resource (including clones) to run at a particular location
#
# If --host is not specified, remove all generated constraints
crm_resource --unban --resource something --host somewhere

# Stop a m/s resource from being promoted at a particular location
# The resource can still be a slave on that node
# If --host is not specified and there is only one master:
#  - move the master from its current location to another node
# Otherwise, if --host is not specified
#  - error
crm_resource --ban-master --resource someclone --host somewhere

# Re-allow a m/s resource to be promoted at a particular location
# If --host is not specified, remove all generated constraints
crm_resource --unban-master --resource someclone --host somewhere


Examples:

====

Assume: 'www' group running on 'node1' 

node1 - 'www'
node2 - 
node3 - 

# crm_resource --ban www
# Creates --ban for node1

node1 -
node2 - 'www' 
node3 - 

# crm_resource --ban www --host node2

node1 -
node2 -
node3 - 'www' 

# crm_resource --move www --host node2
# Undoes --ban for node2
# Creates --ban for node3

node1 -
node2 - 'www'
node3 - 

# crm_resource --unban www --host node1
# Might move back to node1 based on stickiness

node1 - 'www'
node2 -
node3 - 

====

Assume: 
node1 - mysql:master
node2 - mysql:slave

# crm_resource --ban-master mysql

node1 - mysql:master
node2 - mysql:slave

# crm_resource --ban-master mysql --host node2

node1 - mysql:slave
node2 - mysql:slave

# crm_resource --ban mysql --host node1

node1 - 
node2 - mysql:slave

# crm_resource --unban mysql --host node2

node1 - 
node2 - mysql:master

# crm_resource --move mysql --host node1

node1 - mysql:master
node2 - mysql:slave

# crm_resource --move mysql

node1 - mysql:slave
node2 - mysql:master



Can anyone find something I missed?

Comment 8 Andrew Beekhof 2013-05-30 04:05:25 UTC
Follow-up... 

   --(un)ban and --(un)ban-master 

or

  --(un)ban [--master]

?

The latter might be less clunky

Comment 9 Jaroslav Kortus 2013-05-30 17:21:24 UTC
> How about:
> 
> 1. Deprecate crm_resource --move when --host is not specified.
>    --move is then just for pushing resources to a particular location, 
>    not for "anywhere but here"

I'd not do that; it's a convenient way of pushing resources around, isn't it? When I say "move" I mean "go away", and not having --host seems perfectly OK to me. So I'm -1 on this :).


> # Stop a resource (including clones) from being allow to run at a particular
> location
> #
> # If 'something' is a group or primitive:
> #  - move it from its current location to 'somewhere'
> # If 'something' is a m/s resource and there is only one master:
> #  - move the master from its current location to another 'somewhere'
> # Otherwise,
> #  - error
> crm_resource --move --resource something --host somewhere
 
> # Stop a resource (including clones) from being allow to run at a 
> # particular location
> #
> # If --host is not specified and 'something' is a group or primitive:
> #  - move it from its current location to another node
> # If --host is not specified and 'something' is a m/s resource and there is
> # only one master:
> #  - move the master from its current location to another node
> # Otherwise, if --host is not specified
> #  - error
> crm_resource --ban --resource something --host somewhere

Isn't this what --move would do? It sounds to me that --ban and --move would be one subset of the other (ban being the superset here).

> 
> # Re-allow a resource (including clones) to run at a particular location
> #
> # If --host is not specified, remove all generated constraints
> crm_resource --unban --resource something --host somewhere
> 
> # Stop a m/s resource from being promoted at a particular location
> # The resource can still be a slave on that node
> # If --host is not specified and there is only one master:
> #  - move the master from its current location to another node
> # Otherwise, if --host is not specified
> #  - error
> crm_resource --ban-master --resource someclone --host somewhere
> 
> # Re-allow a m/s resource to be promoted at a particular location
> # If --host is not specified, remove all generated constraints
> crm_resource --unban-master --resource someclone --host somewhere
> 
> 
> Examples:
> 
> ====
> 
> Assume: 'www' group running on 'node1' 
> 
> node1 - 'www'
> node2 - 
> node3 - 
> 
> # crm_resource --ban www
> # Creates --ban for node1
> 
> node1 -
> node2 - 'www' 
> node3 - 
> 
> # crm_resource --ban www --host node2
> 
> node1 -
> node2 -
> node3 - 'www' 
> 
> # crm_resource --move www --host node2
> # Undoes --ban for node2
> # Creates --ban for node3
> 
> node1 -
> node2 - 'www'
> node3 - 
> 
> # crm_resource --unban www --host node1
> # Might move back to node1 based on stickiness
> 
> node1 - 'www'
> node2 -
> node3 - 
> 
> ====
> 
> Assume: 
> node1 - mysql:master
> node2 - mysql:slave
> 
> # crm_resource --ban-master mysql
> 
> node1 - mysql:master
> node2 - mysql:slave
> 

shouldn't this be as follows?:
node1 - mysql:slave
node2 - mysql:master

> # crm_resource --ban-master mysql --host node2
> 
> node1 - mysql:slave
> node2 - mysql:slave
> 
> # crm_resource --ban mysql --host node1
> 
> node1 - 
> node2 - mysql:slave
> 

> # crm_resource --unban mysql --host node2
> 
> node1 - 
> node2 - mysql:master
This should probably be --unban-master here.


> 
> # crm_resource --move mysql --host node1
> 
> node1 - mysql:master
> node2 - mysql:slave
> 
> # crm_resource --move mysql
> 
> node1 - mysql:slave
> node2 - mysql:master
> 
> 
> 

This looks good. I just have a feeling that --move and --ban do the same thing most of the time, so we should probably have just one of these options so as not to confuse users.

Comment 10 Andrew Beekhof 2013-05-30 23:42:48 UTC
(In reply to Jaroslav Kortus from comment #9)
> > How about:
> > 
> > 1. Deprecate crm_resource --move when --host is not specified.
> >    --move is then just for pushing resources to a particular location, 
> >    not for "anywhere but here"
> 
> I'd not do that, it's a convenient way of pushing resources around, isn't
> it? When I say "move" I mean "go away", not having --host seems perfectly OK
> to me. So I'm -1 on this :).

It will still work that way, we'd just not advertise it.
Removing it completely would be a really good way for me to piss off almost everyone :)

The intention though, is that --ban (potentially with no --host) would be the "better way" to do the same thing. 

> 
> 
> > # Stop a resource (including clones) from being allow to run at a particular
> > location
> > #
> > # If 'something' is a group or primitive:
> > #  - move it from its current location to 'somewhere'
> > # If 'something' is a m/s resource and there is only one master:
> > #  - move the master from its current location to another 'somewhere'
> > # Otherwise,
> > #  - error
> > crm_resource --move --resource something --host somewhere
>  
> > # Stop a resource (including clones) from being allow to run at a 
> > # particular location
> > #
> > # If --host is not specified and 'something' is a group or primitive:
> > #  - move it from its current location to another node
> > # If --host is not specified and 'something' is a m/s resource and there is
> > # only one master:
> > #  - move the master from its current location to another node
> > # Otherwise, if --host is not specified
> > #  - error
> > crm_resource --ban --resource something --host somewhere
> 
> Isn't this what --move would do? It sounds to me that --ban and --move would
> be one subset of the other (ban being the superset here).

Almost.

The --host argument for --ban means "the node on which it can't run anymore".
For --move its "the node on which it should run now".

There is much overlap, but one can't completely replace the other.
E.g. there is no single --ban command that lets you move a resource to a specific node.

> > 
> > # Re-allow a resource (including clones) to run at a particular location
> > #
> > # If --host is not specified, remove all generated constraints
> > crm_resource --unban --resource something --host somewhere
> > 
> > # Stop a m/s resource from being promoted at a particular location
> > # The resource can still be a slave on that node
> > # If --host is not specified and there is only one master:
> > #  - move the master from its current location to another node
> > # Otherwise, if --host is not specified
> > #  - error
> > crm_resource --ban-master --resource someclone --host somewhere
> > 
> > # Re-allow a m/s resource to be promoted at a particular location
> > # If --host is not specified, remove all generated constraints
> > crm_resource --unban-master --resource someclone --host somewhere
> > 
> > 
> > Examples:
> > 
> > ====
> > 
> > Assume: 'www' group running on 'node1' 
> > 
> > node1 - 'www'
> > node2 - 
> > node3 - 
> > 
> > # crm_resource --ban www
> > # Creates --ban for node1
> > 
> > node1 -
> > node2 - 'www' 
> > node3 - 
> > 
> > # crm_resource --ban www --host node2
> > 
> > node1 -
> > node2 -
> > node3 - 'www' 
> > 
> > # crm_resource --move www --host node2
> > # Undoes --ban for node2
> > # Creates --ban for node3
> > 
> > node1 -
> > node2 - 'www'
> > node3 - 
> > 
> > # crm_resource --unban www --host node1
> > # Might move back to node1 based on stickiness
> > 
> > node1 - 'www'
> > node2 -
> > node3 - 
> > 
> > ====
> > 
> > Assume: 
> > node1 - mysql:master
> > node2 - mysql:slave
> > 
> > # crm_resource --ban-master mysql
> > 
> > node1 - mysql:master
> > node2 - mysql:slave
> > 
> 
> should not this be like this?:
> node1 - mysql:slave
> node2 - mysql:master

Yes.

> 
> > # crm_resource --ban-master mysql --host node2
> > 
> > node1 - mysql:slave
> > node2 - mysql:slave
> > 
> > # crm_resource --ban mysql --host node1
> > 
> > node1 - 
> > node2 - mysql:slave
> > 
> 
> > # crm_resource --unban mysql --host node2
> > 
> > node1 - 
> > node2 - mysql:master
> This should probably be --unban-master here.

Either would achieve the same result in my mind.

> > # crm_resource --move mysql --host node1
> > 
> > node1 - mysql:master
> > node2 - mysql:slave
> > 
> > # crm_resource --move mysql
> > 
> > node1 - mysql:slave
> > node2 - mysql:master
> > 
> > 
> > 
> 
> This looks good. I just have a feeling that --move and --ban are doing the
> same for most of the time, so we should probably have just one of these
> options not to confuse users.

I made a head start on this and ironed out some kinks.
This is the current help text, which makes the need for both more obvious:


Resource location:
 -M, --move 	Move a resource from its current location to the named destination.
  		Requires: --host. Optional: --lifetime, --master

		NOTE: This may prevent the resource from running on the previous 
		location node until the implicit constraints expire or are removed
		with --unban

 -B, --ban 	Prevent the named resource from running on the named --host.  
		Requires: --resource. Optional: --host, --lifetime, --master

		If --host is not specified, it defaults to:
		 * the current location for primitives and groups, or
		 * the current location of the master for m/s resources with master-max=1

		All other situations result in an error as there is no sane default.

		NOTE: This will prevent the resource from running on this node until
		the constraint expires or is removed with --clear

 -U, --clear 	Remove all constraints created by the --ban and/or --move commands.  
		Requires: --resource. Optional: --host, --master

		If --host is not specified, all constraints created by --ban and
		--move will be removed for the named resource.

 -u, --lifetime=value		
		Lifespan of constraints created by the --ban and --move commands

     --master 	Limit the scope of the --ban, --move and --clear  commands to the
		Master role.
		For --ban, the previous master can still remain active in the Slave role.
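The help text above translates into a workflow along these lines. This is an illustrative sketch only: the resource and node names are hypothetical, and the commands require a running Pacemaker cluster, so it is not runnable stand-alone.

```shell
# Sketch of the intended workflow (hypothetical resource/node names;
# assumes a running Pacemaker cluster).

# Move a group or primitive to a named node:
crm_resource --resource www --move --host node2

# Demote a master/slave resource on its current master
# (the slave role may keep running there):
crm_resource --resource ms0 --ban --master

# Forbid ms0 from running on node2 at all, for 30 minutes:
crm_resource --resource ms0 --ban --host node2 --lifetime PT30M

# Remove every constraint the commands above created:
crm_resource --resource ms0 --clear
```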

Comment 11 Andrew Beekhof 2013-05-31 02:35:24 UTC
Oh, and we need a QE ack please :)

Comment 12 Jaroslav Kortus 2013-05-31 08:47:12 UTC
> > > 1. Deprecate crm_resource --move when --host is not specified.
> > >    --move is then just for pushing resources to a particular location, 
> > >    not for "anywhere but here"
> > 
> > I'd not do that, it's a convenient way of pushing resources around, isn't
> > it? When I say "move" I mean "go away", not having --host seems perfectly OK
> > to me. So I'm -1 on this :).
> 
> It will still work that way, we'd just not advertise it.
> Removing it completely would be a really good way for me to piss off almost
> everyone :)

But you have added mandatory --host, so for me it's different. Or did I miss something?

> Resource location:
>  -M, --move 	Move a resource from its current location to the named
> destination.
>   		Requires: --host. Optional: --lifetime, --master
> 
> 		NOTE: This may prevent the resource from running on the previous 
> 		location node until the implicit constraints expire or are removed
> 		with --unban
> 
>  -B, --ban 	Prevent the named resource from running on the named --host.  
> 		Requires: --resource. Optional: --host, --lifetime, --master
> 
> 		If --host is not specified, it defaults to:
> 		 * the curent location for primitives and groups, or
> 		 * the curent location of the master for m/s resources with master-max=1
> 
> 		All other situations result in an error as there is no sane default.
> 
> 		NOTE: This will prevent the resource from running on this node until
> 		the constraint expires or is removed with --clear
> 
>  -U, --clear 	Remove all constraints created by the --ban and/or --move
> commands.  
> 		Requires: --resource. Optional: --host, --master
> 
> 		If --host is not specified, all constraints created by --ban and
> 		--move will be removed for the named resource.


So moving is for when we know where we want it to run, and ban is for when we know where it should not run, correct? :)

> 
>  -u, --lifetime=value		
> 		Lifespan of constraints created by the --ban and --move commands

Please add "value in seconds" or some example (like "1d30min" if supported).

Just a small recap if I got all the details correctly.
--move will move the resource (create -INF constraint on current node and +INF on the other if specified by --host).
--ban will create -INF constraint on the current node. If --host is specified, it will create the constraint for that node, not necessarily moving the resource.

If a master/slave resource is in the game, then all of its instances would be moved from the node. Only if --master is specified would they still be allowed to run in slave mode (i.e. no -INF constraint as in the previous case, just prohibiting them from being elected master).

move && --master && --host = always run master on this node
ban && --master && --host = never run master on this node

Is this correct?
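Under the hood, these constraints are plain rsc_location entries in the CIB. The fragment below is a sketch based on the XML that comment 26 later shows (resource 'ms0', node 'hex-14' come from that test): with --master the -INFINITY score is scoped to the Master role, without it the ban applies to all instances.

```xml
<!-- crm_resource --ban --master -r ms0: role-scoped, slave may still run -->
<rsc_location id="cli-ban-ms0-on-hex-14" rsc="ms0" role="Master" node="hex-14" score="-INFINITY"/>

<!-- crm_resource --ban -r ms0: no role, bans all instances on the node -->
<rsc_location id="cli-ban-ms0-on-hex-14" rsc="ms0" node="hex-14" score="-INFINITY"/>
```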

Comment 15 Andrew Beekhof 2013-06-19 04:16:46 UTC
(In reply to Jaroslav Kortus from comment #12)
> > > > 1. Deprecate crm_resource --move when --host is not specified.
> > > >    --move is then just for pushing resources to a particular location, 
> > > >    not for "anywhere but here"
> > > 
> > > I'd not do that, it's a convenient way of pushing resources around, isn't
> > > it? When I say "move" I mean "go away", not having --host seems perfectly OK
> > > to me. So I'm -1 on this :).
> > 
> > It will still work that way, we'd just not advertise it.
> > Removing it completely would be a really good way for me to piss off almost
> > everyone :)
> 
> But you have added mandatory --host, so for me it's different. Or did I miss
> something?

I've "said" that it is mandatory to simplify the documentation, but the code will not enforce this.


> So moving is where we know where do we want it to run, ban is when we know
> where it should not run, correct? :)

Exactly

> > 
> >  -u, --lifetime=value		
> > 		Lifespan of constraints created by the --ban and --move commands
> 
> Please add "value in seconds" or some example (like "1d30min" if supported).

Both are allowed.

> Just a small recap if I got all the details correctly.
> --move will move the resource (create -INF constraint on current node and
> +INF on the other if specified by --host).
> --ban will create -INF constraint on the current node. If --host is
> specified, it will create the constraint for that node, not necessarily
> moving the resource.

Correct

> 
> If master/slave resource is in the game, then all of it's instances would be
> moved from the node.

Correct

> Only if --master is specified, they would eventually be
> allowed to run in slave mode (i.e. no -INF constraint as in previous case,
> just prohibiting it to be elected master).

There would still be a -INF constraint, but the scope would be restricted to being the master role.

> move && --master && --host = always run master on this node
> ban && --master && --host = never run master on this node
> 
> Is this correct?

Yes.  Satisfactory?

Comment 16 Jaroslav Kortus 2013-06-24 09:31:17 UTC
Yes, thank you :).

Comment 17 Andrew Beekhof 2013-07-24 06:36:54 UTC
Can we get a QE ack for this please? :-)

Comment 22 Jaroslav Kortus 2013-08-21 14:44:27 UTC
[root@virt-073 ~]# pcs status
Cluster name: STSRHTS31819
Last updated: Wed Aug 21 16:23:19 2013
Last change: Wed Aug 21 16:22:12 2013 via cibadmin on virt-073.cluster-qe.lab.eng.brq.redhat.com
Stack: cman
Current DC: virt-071.cluster-qe.lab.eng.brq.redhat.com - partition with quorum
Version: 1.1.10-7.el6-368c726
3 Nodes configured
4 Resources configured


Online: [ virt-071.cluster-qe.lab.eng.brq.redhat.com virt-072.cluster-qe.lab.eng.brq.redhat.com virt-073.cluster-qe.lab.eng.brq.redhat.com ]

Full list of resources:

 virt-fencing	(stonith:fence_xvm):	Started virt-071.cluster-qe.lab.eng.brq.redhat.com 
 Master/Slave Set: MasterResource [dummystateful]
     Masters: [ virt-071.cluster-qe.lab.eng.brq.redhat.com ]
     Slaves: [ virt-072.cluster-qe.lab.eng.brq.redhat.com virt-073.cluster-qe.lab.eng.brq.redhat.com ]

[root@virt-073 ~]# crm_resource --resource MasterResource --ban 
Resource 'MasterResource' not moved: active in 3 locations.
You can prevent 'MasterResource' from running on a specific location with: --ban --host <name>
You can prevent 'MasterResource' from being promoted at its current location with: --ban --master
You can prevent 'MasterResource' from being promoted at a specific location with: --ban --master --host <name>
Error performing operation: Invalid argument



##### Here it should have been banned on virt-071, so the result would be master somewhere else and no slave running on virt-071


----------------------------------------------------------------------------

[root@virt-073 ~]# crm_resource --resource MasterResource --ban --master
WARNING: Creating rsc_location constraint 'cli-ban-MasterResource-on-virt-071.cluster-qe.lab.eng.brq.redhat.com' with a score of -INFINITY for resource MasterResource on virt-071.cluster-qe.lab.eng.brq.redhat.com.
	This will prevent MasterResource from running on virt-071.cluster-qe.lab.eng.brq.redhat.com until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
	This will be the case even if virt-071.cluster-qe.lab.eng.brq.redhat.com is the last node in the cluster
	This message can be disabled with --quiet

# Result:
 Master/Slave Set: MasterResource [dummystateful]
     Masters: [ virt-072.cluster-qe.lab.eng.brq.redhat.com ]
     Slaves: [ virt-073.cluster-qe.lab.eng.brq.redhat.com ]
     Stopped: [ virt-071.cluster-qe.lab.eng.brq.redhat.com ]


### Here it should have been banned on virt-071, while the slave instance should still be allowed to run there


------------------------------------------------------------------------------

[root@virt-073 ~]# crm_resource --resource MasterResource --ban
Resource 'MasterResource' not moved: active in 3 locations.
You can prevent 'MasterResource' from running on a specific location with: --ban --host <name>
You can prevent 'MasterResource' from being promoted at its current location with: --ban --master
You can prevent 'MasterResource' from being promoted at a specific location with: --ban --master --host <name>
Error performing operation: Invalid argument
[root@virt-073 ~]# cibadmin -Q | grep master-max
          <nvpair id="MasterResource-meta_attributes-master-max" name="master-max" value="1"/>

### According to man page this should have created a ban on master node of that resource, but it did not happen, despite the fact that master-max is set to "1" (default).

As this does not meet the expectations outlined in the comments above, I'm flipping it back to ASSIGNED.

Comment 23 Jaroslav Kortus 2013-08-21 14:45:35 UTC
Previous comment tested on:
pacemaker-cli-1.1.10-7.el6.x86_64
pacemaker-1.1.10-7.el6.x86_64

Comment 24 Andrew Beekhof 2013-08-23 03:29:39 UTC
A related patch has been committed upstream:
  https://github.com/beekhof/pacemaker/commit/7423c0b

with subject:

   Bug rhbz#902407 - crm_resource: Handle --ban for master/slave resources as advertised

Further details (if any):

Comment 25 Andrew Beekhof 2013-08-23 03:29:47 UTC
A related patch has been committed upstream:
  https://github.com/beekhof/pacemaker/commit/1bf33b4

with subject:

   Fix: xml: Location constraints are allowed to specify a role

Further details (if any):

Comment 26 Andrew Beekhof 2013-08-23 03:36:43 UTC
Well spotted, my apologies.
With the noted patches, the expected behaviour occurs: 

CIB_file=pengine/test10/master-promotion-constraint.xml tools/crm_resource --master  --ban -r ms0 ; git diff pengine/
WARNING: Creating rsc_location constraint 'cli-ban-ms0-on-hex-14' with a score of -INFINITY for resource ms0 on hex-14.
	This will prevent ms0 from being promoted on hex-14 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
	This will be the case even if hex-14 is the last node in the cluster
	This message can be disabled with --quiet
diff --git a/pengine/test10/master-promotion-constraint.xml b/pengine/test10/master-promotion-constraint.xml
index 3028aa1..0cf26ff2 100644
--- a/pengine/test10/master-promotion-constraint.xml
+++ b/pengine/test10/master-promotion-constraint.xml
...
@@ -55,6 +55,7 @@
     <constraints>
       <rsc_order first="g0" id="promote-after-g0" score="INFINITY" then="ms0" then-action="promote"/>
       <rsc_colocation id="master-with-g0" rsc="ms0" rsc-role="Master" score="INFINITY" with-rsc="g0"/>
+      <rsc_location id="cli-ban-ms0-on-hex-14" rsc="ms0" role="Master" node="hex-14" score="-INFINITY"/>
...


CIB_file=pengine/test10/master-promotion-constraint.xml tools/crm_resource  --ban -r ms0 ; git diff pengine/
WARNING: Creating rsc_location constraint 'cli-ban-ms0-on-hex-14' with a score of -INFINITY for resource ms0 on hex-14.
	This will prevent ms0 from running on hex-14 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
	This will be the case even if hex-14 is the last node in the cluster
	This message can be disabled with --quiet
diff --git a/pengine/test10/master-promotion-constraint.xml b/pengine/test10/master-promotion-constraint.xml
index 3028aa1..27b2595 100644
--- a/pengine/test10/master-promotion-constraint.xml
+++ b/pengine/test10/master-promotion-constraint.xml
...
@@ -55,6 +55,7 @@
     <constraints>
       <rsc_order first="g0" id="promote-after-g0" score="INFINITY" then="ms0" then-action="promote"/>
       <rsc_colocation id="master-with-g0" rsc="ms0" rsc-role="Master" score="INFINITY" with-rsc="g0"/>
+      <rsc_location id="cli-ban-ms0-on-hex-14" rsc="ms0" node="hex-14" score="-INFINITY"/>
...

Comment 27 Jaroslav Kortus 2013-09-18 15:52:32 UTC
Does the --lifetime work?
I used just out of curiosity:
crm_resource --ban --resource MasterResource --lifetime P60S
Migration will take effect until: 2013-09-18 15:49:08Z
[...]

But when the time passed, the constraint is still there and service stopped on the node.

And btw, "P60S" was really not intuitive to me. Is there any reason why this is specified this way instead of, for example, "60" or "60s" or "1d22h33s" ?

Comment 28 Andrew Beekhof 2013-09-23 07:55:07 UTC
(In reply to Jaroslav Kortus from comment #27)
> Does the --lifetime work?

Yep

> I used just out of curiosity:
> crm_resource --ban --resource MasterResource --lifetime P60S
> Migration will take effect until: 2013-09-18 15:49:08Z
> [...]
> 
> But when the time passed, the constraint is still there 

That's normal.

> and service stopped
> on the node.

You may need to wait up to 'cluster-recheck-interval' for the expiration to take effect.

       cluster-recheck-interval = time [15min]
           Polling interval for time based changes to options, resource parameters and constraints.

           The cluster is primarily event driven; however, the configuration can have elements that change based on time. To ensure these changes take effect, we can optionally poll the cluster's status for changes. Allowed values: zero disables polling; positive values are an
           interval in seconds (unless other SI units are specified, e.g. 5min).
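In other words, once a time-based constraint expires it can linger for up to one full recheck interval before the policy engine re-evaluates it (unless some other cluster event triggers a recalculation first). A back-of-the-envelope sketch, using the 15-minute default from the excerpt above and the expiry time from comment 27:

```python
# Rough sketch of the worst-case delay before an expired constraint is
# honoured, assuming no other cluster event forces a recalculation first.
from datetime import datetime, timedelta

RECHECK_INTERVAL = timedelta(minutes=15)  # cluster-recheck-interval default

def worst_case_effect(expiry):
    """Latest time at which an expired constraint is re-evaluated."""
    return expiry + RECHECK_INTERVAL

expiry = datetime(2013, 9, 18, 15, 49, 8)  # "Migration will take effect until"
print(worst_case_effect(expiry))  # up to 2013-09-18 16:04:08
```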



> 
> And btw, "P60S" was really not intuitive to me. Is there any reason why this
> is specified this way instead of, for example, "60" or "60s" or "1d22h33s" ?

We use ISO 8601 for specifying dates, durations, and time ranges.
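As a side note, strict ISO 8601 writes 60 seconds as "PT60S": the "T" separates the date part from the time part. A minimal, illustrative duration parser (not pacemaker's crm_time code; months and years are omitted because they have no fixed length in seconds) might look like:

```python
# Minimal illustrative ISO 8601 duration parser, not pacemaker's crm_time code.
import re

_DURATION = re.compile(
    r"^P(?:(?P<weeks>\d+)W)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)

def parse_duration(text):
    """Return the duration in seconds, or raise ValueError."""
    m = _DURATION.match(text)
    if not m or text == "P":
        raise ValueError("not an ISO 8601 duration: %r" % text)
    parts = {k: int(v or 0) for k, v in m.groupdict().items()}
    return (parts["weeks"] * 604800 + parts["days"] * 86400
            + parts["hours"] * 3600 + parts["minutes"] * 60
            + parts["seconds"])

print(parse_duration("PT60S"))   # 60
print(parse_duration("P1DT2H"))  # 93600
```

Note that "P60S" from comment 27 is rejected by a strict parser like this one, because the seconds component must come after the "T".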

Comment 29 Jaroslav Kortus 2013-09-25 16:05:38 UTC
Andrew,

there is currently no difference between "--move --host" and "--move --master --host"; both leave the old master as a slave and move only the master instance.

Wouldn't it make more sense if there was a -INF constraint on the old slave, as is the case with --move without --host?

Comment 30 Jaroslav Kortus 2013-09-25 16:39:54 UTC
Found something that I think is a bug. Consider the following situation:

 Master/Slave Set: ms [dummystateful]
     Slaves: [ virt-030.cluster-qe.lab.eng.brq.redhat.com ]
     Stopped: [ virt-028.cluster-qe.lab.eng.brq.redhat.com virt-029.cluster-qe.lab.eng.brq.redhat.com ]

The resource is banned on the stopped nodes and banned with role=Master on virt-030. Now, if I want to move the master there:

# crm_resource --resource ms --move --master --host virt-030.cluster-qe.lab.eng.brq.redhat.com
Error performing operation: ms is already active on virt-030.cluster-qe.lab.eng.brq.redhat.com
Error performing operation: Invalid argument

This happens only if the other two nodes are banned (stopped); in other cases it seems to work just fine.

Steps to reproduce:
# crm_resource --resource ms --move --master --host virt-030.cluster-qe.lab.eng.brq.redhat.com
# crm_resource --resource ms --move --master 
# crm_resource --resource ms --move 
# crm_resource --resource ms --move 
# crm_resource --resource ms --move --master --host virt-030.cluster-qe.lab.eng.brq.redhat.com
Error performing operation: ms is already active on virt-030.cluster-qe.lab.eng.brq.redhat.com
Error performing operation: Invalid argument


I'll flip this back to assigned.

Comment 32 Jaroslav Kortus 2013-09-26 16:36:51 UTC
Another corner case:

If all nodes are forced to slave mode (by banning the master until there is no node left), then the command "crm_resource --quiet --resource ms --ban --host node" succeeds, but the slave instance still keeps running there.

My expectation here would be that the slave instance is stopped on that node.

Andrew, is this somehow related to the previous comments, or shall I produce another crm_report? :)

Comment 33 Jaroslav Kortus 2013-09-30 14:10:26 UTC
Regarding comment 32:

The same scenario with banning only the master works as expected: the previous ban on all roles is replaced by one with role="Master", and the slave instance is started on the previously stopped node.

Comment 34 Andrew Beekhof 2013-10-01 05:41:05 UTC
(In reply to Jaroslav Kortus from comment #32)
> Another corner case:
> 
> If all nodes are forced to slave mode (by banning master until there is no
> node left), then command "crm_resource --quiet --resource  ms --ban --host
> node" succeeds, but the slave instance still keeps running there.
> 
> My expectation here would be that the slave instance is stopped on that node.

Mine too.
The existing constraint was being updated, but the new one didn't carry enough information to stop it from applying only to the Master role.

Patch contains:

     if(scope_master) {
         crm_xml_add(location, XML_RULE_ATTR_ROLE, RSC_ROLE_MASTER_S);
+    } else {
+        crm_xml_add(location, XML_RULE_ATTR_ROLE, RSC_ROLE_STARTED_S);
     }

to fix.
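The failure mode can be shown in miniature: when the regenerated ban carries no role attribute, the element being updated keeps its old role="Master", so the slave instance is never stopped. A hypothetical Python sketch of the before/after behaviour (pacemaker does the equivalent in C via crm_xml_add):

```python
# Hypothetical sketch of the bug and fix from comment 34; pacemaker's real
# code is C, this only mimics the role-attribute handling.
import xml.etree.ElementTree as ET

# An existing CLI ban, previously scoped to the Master role only.
existing = ET.Element("rsc_location", {
    "id": "cli-ban-ms-on-node", "rsc": "ms",
    "node": "node", "score": "-INFINITY", "role": "Master",
})

def update_constraint(elem, scope_master, fixed=True):
    """Mimic regenerating the ban; the fix always writes a role."""
    if scope_master:
        elem.set("role", "Master")
    elif fixed:
        elem.set("role", "Started")  # ban every role, not just Master
    # unfixed behaviour: role is left untouched, so "Master" survives

update_constraint(existing, scope_master=False, fixed=False)
print(existing.get("role"))  # Master: the old scope leaks through (the bug)

update_constraint(existing, scope_master=False, fixed=True)
print(existing.get("role"))  # Started: the ban now stops the slave too
```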

> 
> Andrew, is it somehow related to previous comments or shall I produce
> another crm_report? :)

The starting point is sufficiently similar that I was able to reproduce it easily enough :)

Comment 35 Andrew Beekhof 2013-10-01 05:45:25 UTC
(In reply to Jaroslav Kortus from comment #29)
> Andrew,
> 
> there is currently no difference in "--move --host" and "--move --master
> --host", they both leave the old master as slave and move the master
> instance only.

Well observed. Patch contains:

@@ -976,6 +978,11 @@ prefer_resource(const char *rsc_id, const char *host, cib_t * cib_conn)
     free(id);
 
     crm_xml_add(location, XML_COLOC_ATTR_SOURCE, rsc_id);
+    if(scope_master) {
+        crm_xml_add(location, XML_RULE_ATTR_ROLE, RSC_ROLE_MASTER_S);
+    } else {
+        crm_xml_add(location, XML_RULE_ATTR_ROLE, RSC_ROLE_STARTED_S);
+    }
 
to fix.

> 
> Wouldn't it make more sense if there was -INF constraint on the old slave as
> is it in case with --move without --host?

For --move --host, the -INF constraint is only added when --force is supplied.

Comment 36 Andrew Beekhof 2013-10-01 10:56:36 UTC
A related patch has been committed upstream:
  https://github.com/beekhof/pacemaker/commit/b4c34e0

with subject:

   Fix: PE: Location constraints with role=Started should prevent masters from running at all

Further details (if any):

Comment 37 Andrew Beekhof 2013-10-01 10:56:45 UTC
A related patch has been committed upstream:
  https://github.com/beekhof/pacemaker/commit/f36c32b

with subject:

   Fix: crm_resource: Observe --master modifier for --move

Further details (if any):

 Also ensure role=Master is overwritten for --ban
Do something sane for --move when called for a clone

Comment 38 Jaroslav Kortus 2013-10-02 14:53:44 UTC
Everything now looks correct with pacemaker-1.1.10-12.el6.x86_64. Marking as verified.

Comment 39 errata-xmlrpc 2013-11-21 12:08:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1635.html

