Bug 902407
| Summary: | Different results when moving Master/Slave resources | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Jaroslav Kortus <jkortus> |
| Component: | pacemaker | Assignee: | Andrew Beekhof <abeekhof> |
| Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | ||
| Version: | 6.4 | CC: | cfeist, cluster-maint, dvossel, fdinitto, jkortus, tlavigne |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | pacemaker-1.1.10-12.el6 | Doc Type: | Bug Fix |
| Doc Text: |
Cause:
The 'crm_resource --move' command was designed for atomic resources, not clones and master/slave resources which can be present on multiple nodes.
Consequence:
crm_resource could not infer sufficient information to know what the admin intended, and so did nothing.
Fix:
The --ban and --clear commands were added, allowing the admin to unambiguously instruct the cluster. See comment #10 for the new help text.
Result:
Clones and master/slave resources can now be moved around the cluster with crm_resource.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2013-11-21 12:08:27 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 987355 | ||
|
Description
Jaroslav Kortus
2013-01-21 15:40:59 UTC
This request was not resolved in time for the current release. Red Hat invites you to ask your support representative to propose this request, if still desired, for consideration in the next release of Red Hat Enterprise Linux.

Chris, what crm_resource command does "pcs resource move MasterResource" expand to for m/s resources?
I'm thinking move for a master/slave resource doesn't make much sense (unless a node is specified and interpreted to mean the place where the resource can no longer run). We'd need to add support for that to crm_resource.

If a node is specified, I use:
crm_resource --resource resource_id --move --node node_id

If not, I use:
crm_resource --resource resource_id --move

I don't do anything special for different types of resources (master/slave, clones).

Should I prevent users from running that command with master/slave resources or clones and just print an error message telling them that if they really want to do that they should be creating constraints?

(In reply to comment #4)
> If a node is specified, I use:
> crm_resource --resource resource_id --move --node node_id
>
> If not, I use:
> crm_resource --resource resource_id --move
>
> I don't do any special for different types of resources (master/slave,
> clones).
>
> Should I prevent users from running that command with master/slave resources
> or clones and just print an error message telling them if they really want
> to do that they should be creating constraints?

Yep, I think that's what we want to do, but it should probably happen at the crm_resource level so you don't have to care.
Although if you detect it early the error message might be a little nicer :)

Jaroslav: Do you agree with the approach?

That would make sense. Can you please print the command creating the constraint to stderr? I think users would be glad to see the next command instead of having to look it up in help.

Also, is there a way to move the master resource, i.e. force some other node to be the master?
(In reply to Jaroslav Kortus from comment #6)
> That would make sense. Can you please print the command creating the
> constraint to stderr? I think users would be glad to see next command
> instead of having to look it up in help.

Makes sense.

>
> Also, is there a way to move the master resource, i.e. force some other node
> to be the master?

Basically no :-(

How about:

1. Deprecate crm_resource --move when --host is not specified.
   --move is then just for pushing resources to a particular location,
   not for "anywhere but here"

2. Implement --(un)ban and --(un)ban-master

# Stop a resource (including clones) from being allowed to run at a particular location
#
# If 'something' is a group or primitive:
# - move it from its current location to 'somewhere'
# If 'something' is a m/s resource and there is only one master:
# - move the master from its current location to another 'somewhere'
# Otherwise,
# - error
crm_resource --move --resource something --host somewhere

# Stop a resource (including clones) from being allowed to run at a
# particular location
#
# If --host is not specified and 'something' is a group or primitive:
# - move it from its current location to another node
# If --host is not specified and 'something' is a m/s resource and there is
# only one master:
# - move the master from its current location to another node
# Otherwise, if --host is not specified
# - error
crm_resource --ban --resource something --host somewhere

# Re-allow a resource (including clones) to run at a particular location
#
# If --host is not specified, remove all generated constraints
crm_resource --unban --resource something --host somewhere

# Stop a m/s resource from being promoted at a particular location
# The resource can still be a slave on that node
# If --host is not specified and there is only one master:
# - move the master from its current location to another node
# Otherwise, if --host is not specified
# - error
crm_resource --ban-master --resource someclone --host somewhere

# Re-allow a m/s resource to be promoted at a particular location
# If --host is not specified, remove all generated constraints
crm_resource --unban-master --resource someclone --host somewhere


Examples:

====

Assume: 'www' group running on 'node1'

node1 - 'www'
node2 -
node3 -

# crm_resource --ban www
# Creates --ban for node1

node1 -
node2 - 'www'
node3 -

# crm_resource --ban www --host node2

node1 -
node2 -
node3 - 'www'

# crm_resource --move www --host node2
# Undoes --ban for node2
# Creates --ban for node3

node1 -
node2 - 'www'
node3 -

# crm_resource --unban www --host node1
# Might move back to node1 based on stickiness

node1 - 'www'
node2 -
node3 -

====

Assume:
node1 - mysql:master
node2 - mysql:slave

# crm_resource --ban-master mysql

node1 - mysql:master
node2 - mysql:slave

# crm_resource --ban-master mysql --host node2

node1 - mysql:slave
node2 - mysql:slave

# crm_resource --ban mysql --host node1

node1 -
node2 - mysql:slave

# crm_resource --unban mysql --host node2

node1 -
node2 - mysql:master

# crm_resource --move mysql --host node1

node1 - mysql:master
node2 - mysql:slave

# crm_resource --move mysql

node1 - mysql:slave
node2 - mysql:master


Can anyone find something I missed?

Follow-up...

--(un)ban and --(un)ban-master
or
--(un)ban [--master]
?

The latter might be less clunky.

> How about:
>
> 1. Deprecate crm_resource --move when --host is not specified.
> --move is then just for pushing resources to a particular location,
> not for "anywhere but here"

I'd not do that, it's a convenient way of pushing resources around, isn't it? When I say "move" I mean "go away", not having --host seems perfectly OK to me. So I'm -1 on this :).
> # Stop a resource (including clones) from being allowed to run at a particular
> location
> #
> # If 'something' is a group or primitive:
> # - move it from its current location to 'somewhere'
> # If 'something' is a m/s resource and there is only one master:
> # - move the master from its current location to another 'somewhere'
> # Otherwise,
> # - error
> crm_resource --move --resource something --host somewhere
> # Stop a resource (including clones) from being allowed to run at a
> # particular location
> #
> # If --host is not specified and 'something' is a group or primitive:
> # - move it from its current location to another node
> # If --host is not specified and 'something' is a m/s resource and there is
> # only one master:
> # - move the master from its current location to another node
> # Otherwise, if --host is not specified
> # - error
> crm_resource --ban --resource something --host somewhere

Isn't this what --move would do? It sounds to me that --ban and --move would be one subset of the other (ban being the superset here).
>
> # Re-allow a resource (including clones) to run at a particular location
> #
> # If --host is not specified, remove all generated constraints
> crm_resource --unban --resource something --host somewhere
>
> # Stop a m/s resource from being promoted at a particular location
> # The resource can still be a slave on that node
> # If --host is not specified and there is only one master:
> # - move the master from its current location to another node
> # Otherwise, if --host is not specified
> # - error
> crm_resource --ban-master --resource someclone --host somewhere
>
> # Re-allow a m/s resource to be promoted at a particular location
> # If --host is not specified, remove all generated constraints
> crm_resource --unban-master --resource someclone --host somewhere
>
>
> Examples:
>
> ====
>
> Assume: 'www' group running on 'node1'
>
> node1 - 'www'
> node2 -
> node3 -
>
> # crm_resource --ban www
> # Creates --ban for node1
>
> node1 -
> node2 - 'www'
> node3 -
>
> # crm_resource --ban www --host node2
>
> node1 -
> node2 -
> node3 - 'www'
>
> # crm_resource --move www --host node2
> # Undoes --ban for node2
> # Creates --ban for node3
>
> node1 -
> node2 - 'www'
> node3 -
>
> # crm_resource --unban www --host node1
> # Might move back to node1 based on stickiness
>
> node1 - 'www'
> node2 -
> node3 -
>
> ====
>
> Assume:
> node1 - mysql:master
> node2 - mysql:slave
>
> # crm_resource --ban-master mysql
>
> node1 - mysql:master
> node2 - mysql:slave
>

Shouldn't this be like this?:
node1 - mysql:slave
node2 - mysql:master

> # crm_resource --ban-master mysql --host node2
>
> node1 - mysql:slave
> node2 - mysql:slave
>
> # crm_resource --ban mysql --host node1
>
> node1 -
> node2 - mysql:slave
>
> # crm_resource --unban mysql --host node2
>
> node1 -
> node2 - mysql:master

This should probably be --unban-master here.
>
> # crm_resource --move mysql --host node1
>
> node1 - mysql:master
> node2 - mysql:slave
>
> # crm_resource --move mysql
>
> node1 - mysql:slave
> node2 - mysql:master
>
>

This looks good. I just have a feeling that --move and --ban are doing the same for most of the time, so we should probably have just one of these options not to confuse users.

(In reply to Jaroslav Kortus from comment #9)
> > How about:
> >
> > 1. Deprecate crm_resource --move when --host is not specified.
> > --move is then just for pushing resources to a particular location,
> > not for "anywhere but here"
>
> I'd not do that, it's a convenient way of pushing resources around, isn't
> it? When I say "move" I mean "go away", not having --host seems perfectly OK
> to me. So I'm -1 on this :).

It will still work that way, we'd just not advertise it.
Removing it completely would be a really good way for me to piss off almost everyone :)

The intention though, is that --ban (potentially with no --host) would be the "better way" to do the same thing.
>
> >
> > # Stop a resource (including clones) from being allowed to run at a particular
> > location
> > #
> > # If 'something' is a group or primitive:
> > # - move it from its current location to 'somewhere'
> > # If 'something' is a m/s resource and there is only one master:
> > # - move the master from its current location to another 'somewhere'
> > # Otherwise,
> > # - error
> > crm_resource --move --resource something --host somewhere
> >
> > # Stop a resource (including clones) from being allowed to run at a
> > # particular location
> > #
> > # If --host is not specified and 'something' is a group or primitive:
> > # - move it from its current location to another node
> > # If --host is not specified and 'something' is a m/s resource and there is
> > # only one master:
> > # - move the master from its current location to another node
> > # Otherwise, if --host is not specified
> > # - error
> > crm_resource --ban --resource something --host somewhere
>
> Isn't this what --move would do? It sounds to me that --ban and --move would
> be one subset of the other (ban being the superset here).

Almost.
The --host argument for --ban means "the node on which it can't run anymore".
For --move it's "the node on which it should run now".

There is much overlap, but one can't completely replace the other.
E.g. there is no single --ban command that lets you move a resource to a specific node.
>
> >
> > # Re-allow a resource (including clones) to run at a particular location
> > #
> > # If --host is not specified, remove all generated constraints
> > crm_resource --unban --resource something --host somewhere
> >
> > # Stop a m/s resource from being promoted at a particular location
> > # The resource can still be a slave on that node
> > # If --host is not specified and there is only one master:
> > # - move the master from its current location to another node
> > # Otherwise, if --host is not specified
> > # - error
> > crm_resource --ban-master --resource someclone --host somewhere
> >
> > # Re-allow a m/s resource to be promoted at a particular location
> > # If --host is not specified, remove all generated constraints
> > crm_resource --unban-master --resource someclone --host somewhere
> >
> >
> > Examples:
> >
> > ====
> >
> > Assume: 'www' group running on 'node1'
> >
> > node1 - 'www'
> > node2 -
> > node3 -
> >
> > # crm_resource --ban www
> > # Creates --ban for node1
> >
> > node1 -
> > node2 - 'www'
> > node3 -
> >
> > # crm_resource --ban www --host node2
> >
> > node1 -
> > node2 -
> > node3 - 'www'
> >
> > # crm_resource --move www --host node2
> > # Undoes --ban for node2
> > # Creates --ban for node3
> >
> > node1 -
> > node2 - 'www'
> > node3 -
> >
> > # crm_resource --unban www --host node1
> > # Might move back to node1 based on stickiness
> >
> > node1 - 'www'
> > node2 -
> > node3 -
> >
> > ====
> >
> > Assume:
> > node1 - mysql:master
> > node2 - mysql:slave
> >
> > # crm_resource --ban-master mysql
> >
> > node1 - mysql:master
> > node2 - mysql:slave
> >
>
> should not this be like this?:
> node1 - mysql:slave
> node2 - mysql:master

Yes.
>
> > # crm_resource --ban-master mysql --host node2
> >
> > node1 - mysql:slave
> > node2 - mysql:slave
> >
> > # crm_resource --ban mysql --host node1
> >
> > node1 -
> > node2 - mysql:slave
> >
>
> > # crm_resource --unban mysql --host node2
> >
> > node1 -
> > node2 - mysql:master
> This should probably be --unban-master here.

Either would achieve the same result in my mind.

> > # crm_resource --move mysql --host node1
> >
> > node1 - mysql:master
> > node2 - mysql:slave
> >
> > # crm_resource --move mysql
> >
> > node1 - mysql:slave
> > node2 - mysql:master
> >
> >
>
> This looks good. I just have a feeling that --move and --ban are doing the
> same for most of the time, so we should probably have just one of these
> options not to confuse users.

I made a head start on this and ironed out some kinks.
This is the current help text, which makes the need for both more obvious:

Resource location:
 -M, --move     Move a resource from its current location to the named destination.
                Requires: --host. Optional: --lifetime, --master

                NOTE: This may prevent the resource from running on the previous location node until the implicit constraints expire or are removed with --unban

 -B, --ban      Prevent the named resource from running on the named --host.
                Requires: --resource. Optional: --host, --lifetime, --master

                If --host is not specified, it defaults to:
                 * the current location for primitives and groups, or
                 * the current location of the master for m/s resources with master-max=1

                All other situations result in an error as there is no sane default.

                NOTE: This will prevent the resource from running on this node until the constraint expires or is removed with --clear

 -U, --clear    Remove all constraints created by the --ban and/or --move commands.
                Requires: --resource. Optional: --host, --master

                If --host is not specified, all constraints created by --ban and --move will be removed for the named resource.
 -u, --lifetime=value   Lifespan of constraints created by the --ban and --move commands
     --master           Limit the scope of the --ban, --move and --clear commands to the Master role.
                        For --ban, the previous master can still remain active in the Slave role.

Oh, and we need a QE ack please :)

> > > 1. Deprecate crm_resource --move when --host is not specified.
> > > --move is then just for pushing resources to a particular location,
> > > not for "anywhere but here"
> >
> > I'd not do that, it's a convenient way of pushing resources around, isn't
> > it? When I say "move" I mean "go away", not having --host seems perfectly OK
> > to me. So I'm -1 on this :).
>
> It will still work that way, we'd just not advertise it.
> Removing it completely would be a really good way for me to piss off almost
> everyone :)

But you have added mandatory --host, so for me it's different. Or did I miss something?

> Resource location:
> -M, --move Move a resource from its current location to the named
> destination.
> Requires: --host. Optional: --lifetime, --master
>
> NOTE: This may prevent the resource from running on the previous
> location node until the implicit constraints expire or are removed
> with --unban
>
> -B, --ban Prevent the named resource from running on the named --host.
> Requires: --resource. Optional: --host, --lifetime, --master
>
> If --host is not specified, it defaults to:
> * the current location for primitives and groups, or
> * the current location of the master for m/s resources with master-max=1
>
> All other situations result in an error as there is no sane default.
>
> NOTE: This will prevent the resource from running on this node until
> the constraint expires or is removed with --clear
>
> -U, --clear Remove all constraints created by the --ban and/or --move
> commands.
> Requires: --resource. Optional: --host, --master
>
> If --host is not specified, all constraints created by --ban and
> --move will be removed for the named resource.
So moving is when we know where we want it to run, ban is when we know where it should not run, correct? :)

>
> -u, --lifetime=value
> Lifespan of constraints created by the --ban and --move commands

Please add "value in seconds" or some example (like "1d30min" if supported).

Just a small recap to check that I got all the details correctly.
--move will move the resource (create -INF constraint on current node and +INF on the other if specified by --host).
--ban will create -INF constraint on the current node. If --host is specified, it will create the constraint for that node, not necessarily moving the resource.

If a master/slave resource is in the game, then all of its instances would be moved from the node. Only if --master is specified would they eventually be allowed to run in slave mode (i.e. no -INF constraint as in the previous case, just prohibiting it from being elected master).

move && --master && --host = always run master on this node
ban && --master && --host = never run master on this node

Is this correct?

(In reply to Jaroslav Kortus from comment #12)
> > > > 1. Deprecate crm_resource --move when --host is not specified.
> > > > --move is then just for pushing resources to a particular location,
> > > > not for "anywhere but here"
> > >
> > > I'd not do that, it's a convenient way of pushing resources around, isn't
> > > it? When I say "move" I mean "go away", not having --host seems perfectly OK
> > > to me. So I'm -1 on this :).
> >
> > It will still work that way, we'd just not advertise it.
> > Removing it completely would be a really good way for me to piss off almost
> > everyone :)
>
> But you have added mandatory --host, so for me it's different. Or did I miss
> something?

I've "said" that it is mandatory to simplify the documentation, but the code will not enforce this.

> So moving is where we know where do we want it to run, ban is when we know
> where it should not run, correct? :)

Exactly

> >
> > -u, --lifetime=value
> > Lifespan of constraints created by the --ban and --move commands
>
> Please add "value in seconds" or some example (like "1d30min" if supported).

Both are allowed.

> Just a small recap if I got all the details correctly.
> --move will move the resource (create -INF constraint on current node and
> +INF on the other if specified by --host).
> --ban will create -INF constraint on the current node. If --host is
> specified, it will create the constraint for that node, not necessarily
> moving the resource.

Correct

>
> If master/slave resource is in the game, then all of its instances would be
> moved from the node.

Correct

> Only if --master is specified, they would eventually be
> allowed to run in slave mode (i.e. no -INF constraint as in previous case,
> just prohibiting it to be elected master).

There would still be a -INF constraint, but the scope would be restricted to the master role.

> move && --master && --host = always run master on this node
> ban && --master && --host = never run master on this node
>
> Is this correct?

Yes.
Satisfactory?

Yes, thank you :).

Can we get a QE ack for this please? :-)

[root@virt-073 ~]# pcs status
Cluster name: STSRHTS31819
Last updated: Wed Aug 21 16:23:19 2013
Last change: Wed Aug 21 16:22:12 2013 via cibadmin on virt-073.cluster-qe.lab.eng.brq.redhat.com
Stack: cman
Current DC: virt-071.cluster-qe.lab.eng.brq.redhat.com - partition with quorum
Version: 1.1.10-7.el6-368c726
3 Nodes configured
4 Resources configured
Online: [ virt-071.cluster-qe.lab.eng.brq.redhat.com virt-072.cluster-qe.lab.eng.brq.redhat.com virt-073.cluster-qe.lab.eng.brq.redhat.com ]
Full list of resources:
virt-fencing (stonith:fence_xvm): Started virt-071.cluster-qe.lab.eng.brq.redhat.com
Master/Slave Set: MasterResource [dummystateful]
Masters: [ virt-071.cluster-qe.lab.eng.brq.redhat.com ]
Slaves: [ virt-072.cluster-qe.lab.eng.brq.redhat.com virt-073.cluster-qe.lab.eng.brq.redhat.com ]
[root@virt-073 ~]# crm_resource --resource MasterResource --ban
Resource 'MasterResource' not moved: active in 3 locations.
You can prevent 'MasterResource' from running on a specific location with: --ban --host <name>
You can prevent 'MasterResource' from being promoted at its current location with: --ban --master
You can prevent 'MasterResource' from being promoted at a specific location with: --ban --master --host <name>
Error performing operation: Invalid argument
##### Here it should have been banned on virt-071, so the result would be master somewhere else and no slave running on virt-071
----------------------------------------------------------------------------
[root@virt-073 ~]# crm_resource --resource MasterResource --ban --master
WARNING: Creating rsc_location constraint 'cli-ban-MasterResource-on-virt-071.cluster-qe.lab.eng.brq.redhat.com' with a score of -INFINITY for resource MasterResource on virt-071.cluster-qe.lab.eng.brq.redhat.com.
This will prevent MasterResource from running on virt-071.cluster-qe.lab.eng.brq.redhat.com until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
This will be the case even if virt-071.cluster-qe.lab.eng.brq.redhat.com is the last node in the cluster
This message can be disabled with --quiet
# Result:
Master/Slave Set: MasterResource [dummystateful]
Masters: [ virt-072.cluster-qe.lab.eng.brq.redhat.com ]
Slaves: [ virt-073.cluster-qe.lab.eng.brq.redhat.com ]
Stopped: [ virt-071.cluster-qe.lab.eng.brq.redhat.com ]
### Here it should have been banned on virt-071, while the slave instance should still be allowed to run there
------------------------------------------------------------------------------
[root@virt-073 ~]# crm_resource --resource MasterResource --ban
Resource 'MasterResource' not moved: active in 3 locations.
You can prevent 'MasterResource' from running on a specific location with: --ban --host <name>
You can prevent 'MasterResource' from being promoted at its current location with: --ban --master
You can prevent 'MasterResource' from being promoted at a specific location with: --ban --master --host <name>
Error performing operation: Invalid argument
[root@virt-073 ~]# cibadmin -Q | grep master-max
<nvpair id="MasterResource-meta_attributes-master-max" name="master-max" value="1"/>
### According to man page this should have created a ban on master node of that resource, but it did not happen, despite the fact that master-max is set to "1" (default).
As this does not meet our expectations outlined in the comments above, I'm flipping it back to assigned.
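The defaulting rule from the --ban help text quoted earlier (current location for primitives and groups, location of the single master for m/s resources with master-max=1, otherwise an error) can be sketched in Python. This is only a toy model under stated assumptions, not Pacemaker code: the Resource class, its fields, NoSaneDefault and default_ban_host are hypothetical names invented for illustration, and the "exactly one active location" check for primitives and groups is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class NoSaneDefault(ValueError):
    """Raised when --ban has no --host and no default node can be inferred."""


@dataclass
class Resource:
    # Hypothetical model of a resource; not a Pacemaker data structure.
    name: str
    kind: str  # "primitive", "group", "clone" or "master"
    active_on: List[str] = field(default_factory=list)
    masters_on: List[str] = field(default_factory=list)
    master_max: int = 1


def default_ban_host(rsc: Resource, host: Optional[str] = None) -> str:
    """Return the node a ban would apply to, following the help text rules."""
    if host is not None:
        # An explicit --host always wins.
        return host
    if rsc.kind in ("primitive", "group") and len(rsc.active_on) == 1:
        # Assumption: exactly one active location is needed for a default.
        return rsc.active_on[0]
    if rsc.kind == "master" and rsc.master_max == 1 and len(rsc.masters_on) == 1:
        # m/s resource with a single master: default to the master's node.
        return rsc.masters_on[0]
    raise NoSaneDefault(f"{rsc.name}: no sane default, specify --host")


# State from the QE report above: master on virt-071, slaves elsewhere.
ms = Resource(
    name="MasterResource",
    kind="master",
    active_on=["virt-071", "virt-072", "virt-073"],
    masters_on=["virt-071"],
)
print(default_ban_host(ms))
```

Under this model a plain --ban on MasterResource resolves to the node hosting the master, which is the behaviour the test comments above expected instead of the "active in 3 locations" error.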
Previous comment tested on:
pacemaker-cli-1.1.10-7.el6.x86_64
pacemaker-1.1.10-7.el6.x86_64

A related patch has been committed upstream:
https://github.com/beekhof/pacemaker/commit/7423c0b
with subject:
Bug rhbz#902407 - crm_resource: Handle --ban for master/slave resources as advertised
Further details (if any):

A related patch has been committed upstream:
https://github.com/beekhof/pacemaker/commit/1bf33b4
with subject:
Fix: xml: Location constraints are allowed to specify a role
Further details (if any):

Well spotted, my apologies.
With the noted patches, the expected behaviour occurs:
CIB_file=pengine/test10/master-promotion-constraint.xml tools/crm_resource --master --ban -r ms0 ; git diff pengine/
WARNING: Creating rsc_location constraint 'cli-ban-ms0-on-hex-14' with a score of -INFINITY for resource ms0 on hex-14.
This will prevent ms0 from being promoted on hex-14 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
This will be the case even if hex-14 is the last node in the cluster
This message can be disabled with --quiet
diff --git a/pengine/test10/master-promotion-constraint.xml b/pengine/test10/master-promotion-constraint.xml
index 3028aa1..0cf26ff2 100644
--- a/pengine/test10/master-promotion-constraint.xml
+++ b/pengine/test10/master-promotion-constraint.xml
...
@@ -55,6 +55,7 @@
<constraints>
<rsc_order first="g0" id="promote-after-g0" score="INFINITY" then="ms0" then-action="promote"/>
<rsc_colocation id="master-with-g0" rsc="ms0" rsc-role="Master" score="INFINITY" with-rsc="g0"/>
+ <rsc_location id="cli-ban-ms0-on-hex-14" rsc="ms0" role="Master" node="hex-14" score="-INFINITY"/>
...
CIB_file=pengine/test10/master-promotion-constraint.xml tools/crm_resource --ban -r ms0 ; git diff pengine/
WARNING: Creating rsc_location constraint 'cli-ban-ms0-on-hex-14' with a score of -INFINITY for resource ms0 on hex-14.
This will prevent ms0 from running on hex-14 until the constraint is removed using the 'crm_resource --clear' command or manually with cibadmin
This will be the case even if hex-14 is the last node in the cluster
This message can be disabled with --quiet
diff --git a/pengine/test10/master-promotion-constraint.xml b/pengine/test10/master-promotion-constraint.xml
index 3028aa1..27b2595 100644
--- a/pengine/test10/master-promotion-constraint.xml
+++ b/pengine/test10/master-promotion-constraint.xml
...
@@ -55,6 +55,7 @@
<constraints>
<rsc_order first="g0" id="promote-after-g0" score="INFINITY" then="ms0" then-action="promote"/>
<rsc_colocation id="master-with-g0" rsc="ms0" rsc-role="Master" score="INFINITY" with-rsc="g0"/>
+ <rsc_location id="cli-ban-ms0-on-hex-14" rsc="ms0" node="hex-14" score="-INFINITY"/>
...
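The only difference between the two constraints in the diffs above is the role="Master" attribute. A minimal Python sketch of that difference, using the standard library's xml.etree.ElementTree, is shown here; the function name ban_constraint and its parameters are hypothetical, and the sketch merely mirrors the XML seen in the diffs rather than how crm_resource builds it.

```python
import xml.etree.ElementTree as ET


def ban_constraint(rsc: str, node: str, master_only: bool = False) -> str:
    """Build an rsc_location element shaped like the ones in the diffs above."""
    attrs = {"id": f"cli-ban-{rsc}-on-{node}", "rsc": rsc}
    if master_only:
        # With --master the ban is scoped to the Master role only,
        # so a slave instance may still run on the node.
        attrs["role"] = "Master"
    attrs["node"] = node
    attrs["score"] = "-INFINITY"
    return ET.tostring(ET.Element("rsc_location", attrs), encoding="unicode")


print(ban_constraint("ms0", "hex-14", master_only=True))
print(ban_constraint("ms0", "hex-14"))
```

The first call corresponds to `--master --ban -r ms0`, the second to a plain `--ban -r ms0`.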
Does the --lifetime work?

I used, just out of curiosity:
crm_resource --ban --resource MasterResource --lifetime P60S
Migration will take effect until: 2013-09-18 15:49:08Z
[...]

But when the time passed, the constraint was still there and the service was still stopped on the node.

And btw, "P60S" was really not intuitive to me. Is there any reason why this is specified this way instead of, for example, "60" or "60s" or "1d22h33s"?

(In reply to Jaroslav Kortus from comment #27)
> Does the --lifetime work?

Yep

> I used just out of curiosity:
> crm_resource --ban --resource MasterResource --lifetime P60S
> Migration will take effect until: 2013-09-18 15:49:08Z
> [...]
>
> But when the time passed, the constraint is still there

That's normal

> and service stopped
> on the node.

You may need to wait up to 'cluster-recheck-interval' for the expiration to take effect.

cluster-recheck-interval = time [15min]
Polling interval for time based changes to options, resource parameters and constraints.

The Cluster is primarily event driven, however the configuration can have elements that change based on time. To ensure these changes take effect, we can optionally poll the cluster's status for changes. Allowed values: Zero disables polling. Positive values are an interval in seconds (unless other SI units are specified, e.g. 5min)

>
> And btw, "P60S" was really not intuitive to me. Is there any reason why this
> is specified this way instead of, for example, "60" or "60s" or "1d22h33s" ?

We use ISO-8601 for specifying dates, periods and time ranges.

Andrew,

there is currently no difference between "--move --host" and "--move --master --host"; they both leave the old master as slave and move the master instance only.

Wouldn't it make more sense if there was a -INF constraint on the old slave, as is the case with --move without --host?

Found something which I think is a bug. Consider the following situation:
Master/Slave Set: ms [dummystateful]
Slaves: [ virt-030.cluster-qe.lab.eng.brq.redhat.com ]
Stopped: [ virt-028.cluster-qe.lab.eng.brq.redhat.com virt-029.cluster-qe.lab.eng.brq.redhat.com ]
The resource is banned on the stopped nodes and banned for role=Master on the virt-030. Now if I want to move the master there:
# crm_resource --resource ms --move --master --host virt-030.cluster-qe.lab.eng.brq.redhat.com
Error performing operation: ms is already active on virt-030.cluster-qe.lab.eng.brq.redhat.com
Error performing operation: Invalid argument
This happens only if the other two are banned (stopped). In other cases it seems to work just fine.
Reproducing steps:
# crm_resource --resource ms --move --master --host virt-030.cluster-qe.lab.eng.brq.redhat.com
# crm_resource --resource ms --move --master
# crm_resource --resource ms --move
# crm_resource --resource ms --move
# crm_resource --resource ms --move --master --host virt-030.cluster-qe.lab.eng.brq.redhat.com
Error performing operation: ms is already active on virt-030.cluster-qe.lab.eng.brq.redhat.com
Error performing operation: Invalid argument
I'll flip this back to assigned.
Another corner case:

If all nodes are forced to slave mode (by banning master until there is no node left), then the command "crm_resource --quiet --resource ms --ban --host node" succeeds, but the slave instance still keeps running there.

My expectation here would be that the slave instance is stopped on that node.

Andrew, is it somehow related to previous comments or shall I produce another crm_report? :)

ad comment 32:
the same scenario with banning master only works as expected, where the previous ban on all roles is replaced by role="Master" and the slave instance is started on the previously stopped node.

(In reply to Jaroslav Kortus from comment #32)
> Another corner case:
>
> If all nodes are forced to slave mode (by banning master until there is no
> node left), then command "crm_resource --quiet --resource ms --ban --host
> node" succeeds, but the slave instance is still keeps running there.
>
> My expectation here would be that the slave instance is stopped on that node.

Mine too.
The existing constraint is being updated but there wasn't enough information in the new one to stop it from only applying to the Master role.

Patch contains:

     if(scope_master) {
         crm_xml_add(location, XML_RULE_ATTR_ROLE, RSC_ROLE_MASTER_S);
+    } else {
+        crm_xml_add(location, XML_RULE_ATTR_ROLE, RSC_ROLE_STARTED_S);
     }

to fix.

>
> Andrew, is it somehow related to previous comments or shall I produce
> another crm_report? :)

The starting point is sufficiently similar that I was able to reproduce it easily enough :)

(In reply to Jaroslav Kortus from comment #29)
> Andrew,
>
> there is currently no difference in "--move --host" and "--move --master
> --host", they both leave the old master as slave and move the master
> instance only.

Well observed.
Patch contains:

@@ -976,6 +978,11 @@ prefer_resource(const char *rsc_id, const char *host, cib_t * cib_conn)
     free(id);

     crm_xml_add(location, XML_COLOC_ATTR_SOURCE, rsc_id);
+    if(scope_master) {
+        crm_xml_add(location, XML_RULE_ATTR_ROLE, RSC_ROLE_MASTER_S);
+    } else {
+        crm_xml_add(location, XML_RULE_ATTR_ROLE, RSC_ROLE_STARTED_S);
+    }

to fix.

>
> Wouldn't it make more sense if there was -INF constraint on the old slave as
> is it in case with --move without --host?

For --move --host, the -INF constraint is only added when --force is supplied.

A related patch has been committed upstream:
https://github.com/beekhof/pacemaker/commit/b4c34e0
with subject:
Fix: PE: Location constraints with role=Started should prevent masters from running at all
Further details (if any):

A related patch has been committed upstream:
https://github.com/beekhof/pacemaker/commit/f36c32b
with subject:
Fix: crm_resource: Observe --master modifier for --move
Further details (if any):
Also ensure role=Master is overwritten for --ban
Do something sane for --move when called for a clone

Everything now looks correct with pacemaker-1.1.10-12.el6.x86_64. Marking as verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-1635.html
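The reason the patches above always write an explicit role can be illustrated with a toy Python model. It assumes, as the explanation in the thread suggests, that updating an existing constraint keeps any attribute the new version does not mention. The helper names ban_attrs and update_constraint, and the always_set_role switch, are hypothetical and exist only for this illustration; they are not Pacemaker functions.

```python
def ban_attrs(rsc, node, scope_master, always_set_role):
    """Attributes a ban would write; always_set_role models the patched code."""
    attrs = {
        "id": f"cli-ban-{rsc}-on-{node}",
        "rsc": rsc,
        "node": node,
        "score": "-INFINITY",
    }
    if scope_master:
        attrs["role"] = "Master"
    elif always_set_role:
        # Patched behaviour: state the role explicitly for a full ban.
        attrs["role"] = "Started"
    return attrs


def update_constraint(existing, new):
    """Assumed update semantics: attributes missing from `new` are kept."""
    merged = dict(existing)
    merged.update(new)
    return merged


# An earlier '--ban --master' left a Master-only constraint behind.
old = ban_attrs("ms", "node", scope_master=True, always_set_role=False)

# Unpatched full ban: no role attribute, so the stale role="Master" survives.
before = update_constraint(old, ban_attrs("ms", "node", False, False))
# Patched full ban: role="Started" overwrites the stale value.
after = update_constraint(old, ban_attrs("ms", "node", False, True))

print(before["role"], after["role"])
```

In this model the unpatched update leaves the ban scoped to the Master role, which matches the reported symptom of the slave instance continuing to run after a full --ban.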