Bug 1572116

Summary:	Provide a CLI tool to check if specified rules / rule constraints are expired
Product:	Red Hat Enterprise Linux 8	Reporter:	Tomas Jelinek <tojeline>
Component:	pacemaker	Assignee:	Chris Lumens <clumens>
Status:	CLOSED ERRATA	QA Contact:	cluster-qe <cluster-qe>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	8.0	CC:	abeekhof, ableisch, aherr, cfeist, cluster-maint, cluster-qe, idevat, jruemker, kgaillot, mnovacek, obenes, omular, phagara, pzimek, rbeyel, sbradley, tojeline
Target Milestone:	rc	Keywords:	FutureFeature
Target Release:	8.1
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	pacemaker-2.0.2-1.el8	Doc Type:	Enhancement
Doc Text:		Story Points:	---
Clone Of:	1442116
Clones:	1794850		Environment:
Last Closed:	2019-11-05 20:57:32 UTC	Type:	Bug
Bug Blocks:	1442116, 1546815, 1679810, 1759320, 1794850

Comment 1 Tomas Jelinek 2018-04-26 08:39:27 UTC

Pcs is requested to mark rule constraints not currently in effect. Pacemaker already has code to evaluate rules but it is not exposed through any CLI tool pcs could use. This bz requests the code to be exposed in a pacemaker CLI tool.

Running the tool for each constraint / rule separately may be quite slow. It would be great if the tool could evaluate more than one constraint / rule per invocation, something like this:
# decide <rule or constraint id>... [--date=<date>]
id_1 : in_effect
id_2 : not_in_effect
Rules could be loaded from the current CIB or from CIB_FILE path.

This is just a raw idea, ideas to improve the interface of the tool are more than welcome.

Comment 2 Patrik Hagara 2018-12-06 11:51:53 UTC

qa_ack+

New pacemaker CLI tool for viewing configured constraints and whether they are currently in effect (i.e. not expired). This tool MUST NOT have any ability to modify the cluster configuration.

Comment 3 Chris Lumens 2018-12-13 21:13:15 UTC

After digging into this a little bit, it looks fairly complicated to fully check out a rule to see if it's expired or not.  Rules can be nested and have boolean operators applied to their sub-rules.  Some of the expressions can involve checking for a specific node or other site-specific information.  In short, rules can be arbitrarily complicated and the code is difficult to unravel to fully simulate testing of a rule.  I'm not even sure how feasible it is given the node restriction possibility.

So what I'm wondering is:  would you be okay with at least the initial implementation of this taking just a date_expression ID and testing that single node to see if it's expired?  Alternately, I could perhaps make it to where it takes a rule ID, drills down to a single date_expression under that, and checks.  But I don't really see a good way of testing more complex constraints than that.  Perhaps afterwards, someone could improve upon this initial implementation.

Comment 4 Tomas Jelinek 2019-01-02 14:25:16 UTC

Sorry, I would not be ok with that.

The whole point of this bz being filed against pacemaker is to have a rule evaluation engine implemented in pacemaker only, where it is implemented already anyway. Evaluating just date expressions in pacemaker, leaving all the boolean operators and sub-rules to be evaluated in pcs, is the complete opposite of that. I understand that evaluating rules is complicated and that related code in pacemaker may need major changes in order to implement the requested functionality. But I don't see that as a justification for re-implementing the rule evaluation engine in pcs. Doing so would bring a new set of problems. It would require the two implementations to work exactly the same, which brings further issues considering there is no strict relation between pcs and pacemaker versions.

Comment 5 Ken Gaillot 2019-01-02 16:15:58 UTC

(In reply to Tomas Jelinek from comment #4)
> Sorry, I would not be ok with that.
> 
> The whole point of this bz being filed against pacemaker is to have a rule
> evaluation engine implemented in pacemaker only, where it is implemented
> already anyway. Evaluating just date expressions in pacemaker, leaving all
> the boolean operators and sub-rules to be evaluated in pcs, is the complete
> opposite of that. I understand that evaluating rules is complicated and that
> related code in pacemaker may need major changes in order to implement the
> requested functionality. But I don't see that as a justification for
> re-implementing the rule evaluation engine in pcs. Doing so would bring a new
> set of problems. It would require the two implementations to work exactly
> the same, which brings further issues considering there is no strict
> relation between pcs and pacemaker versions.

The problem is not the mechanism of evaluation, but a lack of clarity in the desired goal. Pacemaker can determine a general solution to "in effect now, or not", but not "expired, or not". For example, consider a rule that's in effect from 9am-5pm weekdays on nodes whose "datacenter" attribute is "building2". Also, when rules are used with cluster properties or resource instance attributes, there can be multiple attribute blocks each with a rule, so even evaluating a complete (single) rule might not be a complete picture in that case.

I think we are limited to these possibilities:

- Given a single rule, return whether the rule is in effect on a specified node (defaulting to the local node) at a specified time (defaulting to now) (note: it is not possible in this case to detect "expired")

- Given a single rule expression, return whether the expression is "in effect", "not in effect because expired", or "not in effect for some other reason"

- Given either a cluster property, or a resource plus resource attribute, as well as a specified node (default local) and time (default now), return the evaluated value (by comparison, crm_attribute will currently return multiple matches if there are rules for a cluster property)

... none of which directly supports checking whether a constraint is expired. However, pcs doesn't necessarily need a general solution. If all pcs needs is to hide expired constraints, pcs could check each constraint for whether it had a single rule with a single expression, and if so, pass that expression ID to the new tool as suggested. I.e. pcs could hide the simple cases, and pass through everything else.
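
A minimal sketch of the pcs-side filter suggested above, not actual pcs or pacemaker code. The function name `single_expression_id` and the sample constraint IDs are hypothetical; the element and attribute names (`rsc_location`, `rule`, `expression`, `date_expression`) are the ones used in the CIB excerpts elsewhere in this bug.

```python
import xml.etree.ElementTree as ET

# Element names that count as a rule expression in this sketch (assumption:
# anything else, such as a nested <rule>, makes the constraint "not simple").
EXPRESSION_TAGS = {"expression", "date_expression"}


def single_expression_id(constraint):
    """Return the ID of the only expression in the constraint's only rule, else None."""
    rules = constraint.findall("rule")
    if len(rules) != 1:
        return None
    children = list(rules[0])
    if len(children) != 1 or children[0].tag not in EXPRESSION_TAGS:
        return None  # nested rules or several expressions: pass through unchanged
    return children[0].get("id")


cib_fragment = """
<constraints>
  <rsc_location id="simple" rsc="dummy">
    <rule id="simple-rule" score="INFINITY">
      <date_expression id="simple-lifetime" operation="lt" end="2019-01-31"/>
    </rule>
  </rsc_location>
  <rsc_location id="complex" rsc="dummy">
    <rule id="complex-rule" score="-INFINITY" boolean-op="and">
      <expression id="complex-expr" attribute="#uname" operation="eq" value="virt-142"/>
      <date_expression id="complex-lifetime" operation="lt" end="2019-01-31"/>
    </rule>
  </rsc_location>
</constraints>
"""

for constraint in ET.fromstring(cib_fragment):
    print(constraint.get("id"), "->", single_expression_id(constraint))
```

Only the first constraint yields an expression ID. A rule that pairs a `#uname` expression with a lifetime, as in the second constraint, would be passed through as "not simple" under this reading.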

Comment 6 Tomas Jelinek 2019-01-04 10:03:46 UTC

OK, now I see what the issue is, thanks for explaining that, Ken.

The goal is that pcs displays expired constraints differently (mark them as expired or hide them, that does not really matter here). We want rule evaluation to be implemented in pacemaker only, not in pcs. Those are the main points. Perhaps the bz should have been named "... if specified rules / rule constraints are expired" instead. "Expired" and "in effect" are two very different statements, which was not obvious to me. Sorry for the confusion it created. At least it will be covered in this bz for future readers.

So, what does "expired" mean? I would say it means a rule will never be in effect in the future. Meaning there is a date expression in the rule saying when the rule will not apply any more.
According to this, a rule in effect from 9am-5pm weekdays on nodes whose "datacenter" attribute is "building2" never expires - there is no limiting date. Using cluster properties or resource instance attributes does not matter either, I think. It is perfectly possible that some day in the future properties or attributes will be set so that a rule using them would be in effect, i.e. such a rule also never expires.
On the other hand, rules like "from 2018-12-01 to 2019-01-31" or "until 2019-01-31" clearly expire on the ending date.
Now for combinations:
"from 9am-5pm weekdays and from 2018-12-01 to 2019-01-31" - expires on the ending date
"from 9am-5pm weekdays or from 2018-12-01 to 2019-01-31" - never expires, there will be plenty of weekdays after 2019-01-31
These are just simple combinations, actual rules may be much more complex than this. That, if I understand correctly, is the problem with deciding if they are expired.
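
The combination logic described above can be sketched as follows. This is an illustration of the proposed definition of "expired" only, not pacemaker's rule engine: the `Leaf` and `Rule` classes and the `expires_at` function are hypothetical, and the result is the latest moment at which the rule could still be in effect (the real last moment may be earlier, for example when a recurring part of the rule stops matching before the end date).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Union


@dataclass
class Leaf:
    """A single expression; `end` is its hard end date, or None if it has none."""
    label: str
    end: Optional[datetime] = None


@dataclass
class Rule:
    """A rule combining sub-rules or expressions with "and" or "or"."""
    boolean_op: str
    children: List[Union["Rule", Leaf]]


def expires_at(node):
    """Return when the node can no longer be in effect, or None for "never expires"."""
    if isinstance(node, Leaf):
        return node.end
    ends = [expires_at(child) for child in node.children]
    if node.boolean_op == "and":
        # One expired part is enough to expire the whole conjunction.
        finite = [end for end in ends if end is not None]
        return min(finite) if finite else None
    if node.boolean_op == "or":
        # Every part has to expire before the disjunction does.
        if not ends or any(end is None for end in ends):
            return None
        return max(ends)
    raise ValueError(f"unknown boolean-op: {node.boolean_op}")


weekdays = Leaf("9am-5pm weekdays")
dated = Leaf("2018-12-01 to 2019-01-31", end=datetime(2019, 1, 31))

print(expires_at(Rule("and", [weekdays, dated])))
print(expires_at(Rule("or", [weekdays, dated])))
```

With these inputs the "and" rule expires on the ending date and the "or" rule never expires, matching the two combinations listed above.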

Given that, I now get why you would like pcs to check what a rule looks like.

How about this:
* pcs would ask pacemaker if a rule is expired or not no matter what the rule looks like (i.e. pcs would always pass rule id to pacemaker)
* pacemaker would check whether it is capable of determining if the rule is expired based on its format / complexity
* for the simple cases, pacemaker would mark the rule as expired or not
* for other cases, pacemaker would mark the rule as not expired (or undecidable, perhaps)

This would allow keeping all the logic in pacemaker and potentially improving / expanding it in the future without any changes needed in pcs. Also, it would keep the decision logic in pacemaker as simple as you originally proposed - we can start with simple rules with one expression and everything else would be marked by pacemaker as "undecidable". I guess this is pretty much in line with what is being implemented for bz1658650, isn't it?

Deciding if a rule is in effect is a completely different task and I think there is no need to implement it now. Perhaps in the future, if someone requests it, there will be another tool for it. Or the tool for deciding expired rules could mark rules as (expired | not expired | don't know), (in effect | not in effect | don't know) - meaning it would give two distinct flags to each rule.

Comment 7 Ken Gaillot 2019-01-07 19:13:46 UTC

(In reply to Tomas Jelinek from comment #6)
> "from 9am-5pm weekdays and from 2018-12-01 to 2019-01-31" - expires on the ending date

Debatable; e.g. if the end date is a Saturday, does it expire at 11:59:59pm Saturday, or 5pm Friday? We even have that ridiculous "moon" option, so a rule could be e.g. "days with a full moon in 2019", and we'd have to figure out the last full moon of the year to know whether it's expired. Possible, but ugly.

> marked by pacemaker as "undecidable". I guess this is pretty much in line
> with what is being implemented for bz1658650, isn't it?

That bz is much simpler -- it only has to deal with rules created by crm_resource, which uses a well-defined, narrow subset of the full rule syntax.

A general solution is much more challenging. All existing code involves a simple pass/fail test of each rule component, so an "expired" determination would involve a significant rewrite.

Looking at this more closely, the only simple case is a rule containing a single date_expression with an operation that is not "date_spec". I don't think I'd be comfortable releasing something that covers such a narrow case, and returning "undecidable" for everything else, with the meaning of that possibly changing over time. That seems a bit too fuzzy for the pacemaker CLI, which should be a definitive interface.

I'm thinking this will have to drop in priority and get pushed to RHEL 8 only.

Comment 8 Ken Gaillot 2019-01-08 20:43:30 UTC

I think we can follow this plan, in the RHEL 8.1 time frame:

1. Create a new CLI tool that accepts options for a timestamp (default now), a node name (defaulting to the local node), and a rule ID.

2. The tool would have exit status codes (besides errors) for "rule is in effect" (success i.e. 0) and "rule is not in effect".

3. The tool would print messages to stdout about whether the rule is in effect, and why not if not, with an option to create XML output instead (for reliable parsing by pcs and such). The initial implementation would check for the narrow "expired" case (rule has a single date_expression with an operation that is not "date_spec"). This output would be declared experimental and unsupported. pcs could rely on the behavior of the version in RHEL, but upstream and RHEL end users would be discouraged from using it until fully implemented.

5. We could open a separate BZ for the full "why not" implementation, i.e. the ability to handle date_spec and multiple expressions, as well as reasons other than expired (e.g. in future, or node attribute does not match). The tool could also potentially be the home of a new option to fully evaluate any cluster property or resource parameter, a request that occasionally comes up -- current tools can only say what values are explicitly configured (possibly none or multiple), not what final value would be used at the moment (so, for example, there's currently no way to find out what the default of some property is).

Comment 10 Chris Lumens 2019-03-12 14:02:24 UTC

This is fixed by upstream commit fd91fd8e33b575810b03d2890de3ee23cb3b5e0e.

Comment 13 Patrik Hagara 2019-09-04 11:16:30 UTC

The new "crm_rule" tool interface:

> [root@virt-141 ~]# crm_rule --help
> crm_rule - Tool for querying the state of rules
> Usage: crm_rule [options]
> Options:
>  -?, --help		This text
>  -$, --version		Version information
>  -V, --verbose		Increase debug output
> 
> Modes (mutually exclusive):
>  -c, --check		Check whether a rule is in effect
> 
> Additional options:
>  -d, --date=value	Whether the rule is in effect on a given date
>  -r, --rule=value	The ID of the rule to check
> 
> Data:
>  -X, --xml-text=value	Use argument for XML (or stdin if '-')
> 
> 
> This tool is currently experimental.
> 
> The interface, behavior, and output may change with any version of pacemaker.
> 
> 
> Report bugs to users


(In reply to Ken Gaillot from comment #8)
> 1. Create a new CLI tool that accepts options for a timestamp (default now),
> a node name (defaulting to the local node), and a rule ID.

The tool does not accept any node name. Not a huge issue, since the constraints themselves usually specify the target node's uname (eg. when created using "pcs resource move <rsc> lifetime=<iso8601-spec>", as per the original BZ request).

 
> 2. The tool would have exit status codes (besides errors) for "rule is in
> effect" (success i.e. 0) and "rule is not in effect".


Test #1: temporarily ban a resource from running on a particular node, use crm_rule to check whether it's still in effect

> [root@virt-141 ~]# pcs resource move dummy lifetime=PT30M
> Migration will take effect until: 2019-09-04 13:17:29 +02:00
> Warning: Creating location constraint 'cli-ban-dummy-on-virt-142' with a score of -INFINITY for resource dummy on virt-142.
> 	This will prevent dummy from running on virt-142 until the constraint is removed
> 	This will be the case even if virt-142 is the last node in the cluster
> [root@virt-141 ~]# pcs constraint
> Location Constraints:
>   Resource: dummy
>     Constraint: cli-ban-dummy-on-virt-142
>       Rule: boolean-op=and score=-INFINITY
>         Expression: #uname eq string virt-142
>         Expression: date lt 2019-09-04 13:17:29 +02:00
> Ordering Constraints:
> Colocation Constraints:
> Ticket Constraints:
> [root@virt-141 ~]# crm_rule --check --rule cli-ban-dummy-on-virt-142
> No rule found with ID=cli-ban-dummy-on-virt-142 containing a date_expression
> Error checking rule: No such device or address
> [root@virt-141 ~]# echo $?
> 105

Result: crm_rule is unable to check whether a simple lifetime ban constraint is still in effect. Since this was supposed to be the primary use-case for this new tool, I'm marking this as FailedQA.

Looks like the CTS regression tests [1] for this feature are using nonsensical rules -- location constraints specifying only the lifetime, not the actual location (node) to avoid/prefer:

>     <constraints>
>       <rsc_location id="no-date-expression" rsc="dummy" score="-INFINITY" node="node01"/>
>       <rsc_location id="cli-prefer-dummy-expired" rsc="dummy">
>         <rule id="cli-prefer-rule-dummy-expired" score="INFINITY">
>           <date_expression id="cli-prefer-lifetime-end-dummy-expired" operation="lt" end=""/>
>         </rule>
>       </rsc_location>
>       <rsc_location id="cli-prefer-dummy-not-yet" rsc="dummy">
>         <rule id="cli-prefer-rule-dummy-not-yet" score="INFINITY">
>           <date_expression id="cli-prefer-lifetime-end-dummy-not-yet" operation="gt" start=""/>
>         </rule>
>       </rsc_location>
>     </constraints>

For comparison, the `pcs resource move dummy lifetime=PT30M` command generated the following constraint:

>     <constraints>
>       <rsc_location id="cli-ban-dummy-on-virt-142" rsc="dummy" role="Started">
>         <rule id="cli-ban-dummy-on-virt-142-rule" score="-INFINITY" boolean-op="and">
>           <expression id="cli-ban-dummy-on-virt-142-expr" attribute="#uname" operation="eq" value="virt-142" type="string"/>
>           <date_expression id="cli-ban-dummy-on-virt-142-lifetime" operation="lt" end="2019-09-04 13:17:29 +02:00"/>
>         </rule>
>       </rsc_location>
>     </constraints>

[1] https://github.com/ClusterLabs/pacemaker/pull/1691/files#diff-402ac06cbaaac9dc2a4c4fb1de754201

Comment 15 Patrik Hagara 2019-09-04 11:23:21 UTC

Hm, looks like another UX issue...

The `pcs constraint` command displays "rsc_location" ID ("cli-ban-dummy-on-virt-142" in this case) and `crm_rule` expects the "rule" ID ("cli-ban-dummy-on-virt-142-rule", with "-rule" appended):

> [root@virt-141 ~]# crm_rule --check --rule cli-ban-dummy-on-virt-142-rule
> Rule cli-ban-dummy-on-virt-142-rule is expired
> [root@virt-141 ~]# date
> Wed Sep  4 13:18:10 CEST 2019
> [root@virt-141 ~]# pcs constraint
> Location Constraints:
>   Resource: dummy
>     Constraint: cli-ban-dummy-on-virt-142
>       Rule: boolean-op=and score=-INFINITY
>         Expression: #uname eq string virt-142
>         Expression: date lt 2019-09-04 13:17:29 +02:00
> Ordering Constraints:
> Colocation Constraints:
> Ticket Constraints:

For reference, the constraints section from the CIB:

>     <constraints>
>       <rsc_location id="cli-ban-dummy-on-virt-142" rsc="dummy" role="Started">
>         <rule id="cli-ban-dummy-on-virt-142-rule" score="-INFINITY" boolean-op="and">
>           <expression id="cli-ban-dummy-on-virt-142-expr" attribute="#uname" operation="eq" value="virt-142" type="string"/>
>           <date_expression id="cli-ban-dummy-on-virt-142-lifetime" operation="lt" end="2019-09-04 13:17:29 +02:00"/>
>         </rule>
>       </rsc_location>
>     </constraints>

This is very confusing.
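
Until the two tools agree on which ID to show, the rule ID can be looked up from the constraint ID. A minimal sketch using only the Python standard library; the function name `rule_ids_for_constraint` is hypothetical, and the XML is the constraints section quoted above.

```python
import xml.etree.ElementTree as ET


def rule_ids_for_constraint(constraints_xml, constraint_id):
    """Return the IDs of all rules inside the rsc_location with the given ID."""
    root = ET.fromstring(constraints_xml)
    for constraint in root.iter("rsc_location"):
        if constraint.get("id") == constraint_id:
            return [rule.get("id") for rule in constraint.iter("rule")]
    return []  # no such location constraint


constraints_xml = """
<constraints>
  <rsc_location id="cli-ban-dummy-on-virt-142" rsc="dummy" role="Started">
    <rule id="cli-ban-dummy-on-virt-142-rule" score="-INFINITY" boolean-op="and">
      <expression id="cli-ban-dummy-on-virt-142-expr" attribute="#uname" operation="eq" value="virt-142" type="string"/>
      <date_expression id="cli-ban-dummy-on-virt-142-lifetime" operation="lt" end="2019-09-04 13:17:29 +02:00"/>
    </rule>
  </rsc_location>
</constraints>
"""

print(rule_ids_for_constraint(constraints_xml, "cli-ban-dummy-on-virt-142"))
```

The returned ID is the value `crm_rule --rule` expects, rather than the constraint ID that `pcs constraint` prints by default.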

Comment 16 Patrik Hagara 2019-09-04 11:39:02 UTC

Switching back to ON_QA and clearing FailedQA flag, as the UX issue is not that critical for an experimental tool to be used internally by pcs in the future.

Comment 17 Patrik Hagara 2019-09-04 13:02:36 UTC

> [root@virt-141 ~]# rpm -q pacemaker
> pacemaker-2.0.2-3.el8.x86_64
> [root@virt-141 ~]# pcs constraint show --full
> Location Constraints:
>   Resource: dummy
>    Constraint: ban-dummy-on-virt-142-cron
>      Rule: boolean-op=and score=-INFINITY  (id:ban-dummy-on-virt-142-cron-rule)
>        Expression: #uname eq string virt-142  (id:ban-dummy-on-virt-142-cron-expr)
>        Expression:  (id:ban-dummy-on-virt-142-cron-spec)
>          Date Spec: years=2019  (id:ban-dummy-on-virt-142-cron-spec-year)
>     Constraint: ban-dummy-on-virt-142-forever
>       Rule: boolean-op=and score=-INFINITY  (id:ban-dummy-on-virt-142-forever-rule)
>         Expression: #uname eq string virt-142  (id:ban-dummy-on-virt-142-forever-expr)
>     Constraint: cli-ban-dummy-on-virt-141
>       Rule: boolean-op=and score=-INFINITY  (id:cli-ban-dummy-on-virt-141-rule)
>         Expression: #uname eq string virt-141  (id:cli-ban-dummy-on-virt-141-expr)
>         Expression: date gt 2019-09-04 13:58:30 +02:00  (id:cli-ban-dummy-on-virt-141-lifetime-from)
>         Expression: date lt 2019-10-04 13:58:30 +02:00  (id:cli-ban-dummy-on-virt-141-lifetime-to)
>     Constraint: cli-ban-dummy-on-virt-142-first
>       Rule: boolean-op=and score=-INFINITY  (id:cli-ban-dummy-on-virt-142-first-rule)
>         Expression: #uname eq string virt-142  (id:cli-ban-dummy-on-virt-142-first-expr)
>         Expression: date lt 2019-09-04 14:02:32 +02:00  (id:cli-ban-dummy-on-virt-142-first-lifetime)
>     Constraint: cli-ban-dummy-on-virt-142-second
>       Rule: boolean-op=and score=-INFINITY  (id:cli-ban-dummy-on-virt-142-second-rule)
>         Expression: #uname eq string virt-142  (id:cli-ban-dummy-on-virt-142-second-expr)
>         Expression: date lt 2019-10-04 14:02:32 +02:00  (id:cli-ban-dummy-on-virt-142-second-lifetime)
> Ordering Constraints:
> Colocation Constraints:
> Ticket Constraints:
> [root@virt-141 ~]# date
> Wed Sep  4 13:59:17 CEST 2019


Test #1: non-existent rule ID

> [root@virt-141 ~]# crm_rule --check --rule nonexistent
> No rule found with ID=nonexistent containing a date_expression
> Error checking rule: No such device or address
> [root@virt-141 ~]# echo $?
> 105

Result: non-zero exit code, "no rule found" error message. Pass.


Test #2: rule without date expression

> [root@virt-141 ~]# crm_rule --check --rule ban-dummy-on-virt-142-forever-rule
> No rule found with ID=ban-dummy-on-virt-142-forever-rule containing a date_expression
> Error checking rule: No such device or address
> [root@virt-141 ~]# echo $?
> 105

Result: same non-zero exit code and error message as for non-existent rule. Slightly weird behavior ("No such device or address" for an existing rule ID), but good enough for an internal, experimental tool (for now). Pass.


Test #3: rule with multiple date expressions

> [root@virt-141 ~]# crm_rule --check --rule cli-ban-dummy-on-virt-141-rule
> More than one date_expression in cli-ban-dummy-on-virt-141-rule is not supported
> Error checking rule: Operation not supported
> [root@virt-141 ~]# echo $?
> 3

Result: unique & non-zero exit code, clear error message. Pass.


Test #4: standard ban constraint with lifetime, expired

> [root@virt-141 ~]# crm_rule --check --rule cli-ban-dummy-on-virt-142-first-rule
> Rule cli-ban-dummy-on-virt-142-first-rule is expired
> [root@virt-141 ~]# echo $?
> 110

Result: unique & non-zero exit code, rule correctly evaluated as expired. Pass.


Test #5: standard ban constraint with lifetime, still effective

> [root@virt-141 ~]# crm_rule --check --rule cli-ban-dummy-on-virt-142-second-rule
> Rule cli-ban-dummy-on-virt-142-second-rule is still in effect
> [root@virt-141 ~]# echo $?
> 0

Result: zero exit code, rule correctly evaluated as still in effect. Pass.


Test #6: standard ban constraint with lifetime, still effective now but checked against a future date

> [root@virt-141 ~]# crm_rule --check --rule cli-ban-dummy-on-virt-142-second-rule --date=2020-01-01
> Rule cli-ban-dummy-on-virt-142-second-rule is expired
> [root@virt-141 ~]# echo $?
> 110

Result: correct non-zero exit code and message as for an expired rule. Pass.


Test #7: standard ban constraint with lifetime, expired now but checked against a past date

> [root@virt-141 ~]# crm_rule --check --rule cli-ban-dummy-on-virt-142-first-rule --date=2010-01-01
> Rule cli-ban-dummy-on-virt-142-first-rule is still in effect
> [root@virt-141 ~]# echo $?
> 0

Result: correct zero exit code and message as for an effective rule. Pass.


Test #8: ban constraint with a date_spec

> [root@virt-141 ~]# crm_rule --check --rule ban-dummy-on-virt-142-cron-rule
> No rule found with ID=ban-dummy-on-virt-142-cron-rule containing a date_expression
> Error checking rule: No such device or address
> [root@virt-141 ~]# echo $?
> 105

Result: same non-zero exit code and error message as for non-existent rule. Rules containing date_spec are explicitly unsupported as per comment#8 (for now). See Test #2 for further comments. Pass.


(In reply to Ken Gaillot from comment #8)
> 1. Create a new CLI tool that accepts options for a timestamp (default now), a node name (defaulting to the local node), and a rule ID.

Node name option not present, but not required for implementing the requested functionality in pcs. Timestamp option works, as does specifying rule by its ID (with a small note in Test #2).

(In reply to Ken Gaillot from comment #8)
> 2. The tool would have exit status codes (besides errors) for "rule is in effect" (success i.e. 0) and "rule is not in effect".

Effective rules result in an exit code of 0, expired 110, unsupported 3 or 105 (see Test #2 again), non-existent 105.

(In reply to Ken Gaillot from comment #8)
> 3. The tool would print messages to stdout about whether the rule is in effect, and why not if not, with an option to create XML output instead (for reliable parsing by pcs and such). The initial implementation would check for the narrow "expired" case (rule has a single date_expression with an operation that is not "date_spec"). This output would be declared experimental and unsupported. pcs could rely on the behavior of the version in RHEL, but upstream and RHEL end users would be discouraged from using it until fully implemented.

No XML output option present (and IIUC not necessary for requested feature implementation in pcs). XML input is available, although not well documented. Reading the code reveals the "-X/--xml-text=value" option expects a CIB XML -- read either from stdin or as a string directly from argv, which is highly unusual (I'd expect it to accept a filename instead). This CIB XML, when specified, is used instead of connecting to the live CIB:

> [root@virt-141 ~]# pcs cluster cib > cib.xml
> [root@virt-141 ~]# systemctl stop pacemaker
> [root@virt-141 ~]# pcs status
> Error: cluster is not currently running on this node
> [root@virt-141 ~]# crm_rule --check --rule cli-ban-dummy-on-virt-142-second-rule -X - < cib.xml
> Rule cli-ban-dummy-on-virt-142-second-rule is still in effect
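
The same offline check could be driven from a script. This is a sketch only: the helper names `build_argv` and `check_rule_offline` are hypothetical, and the options used (`--check`, `--rule`, `--date`, `-X -`) are the ones from the help text and transcript above, which the tool itself declares experimental and subject to change.

```python
import subprocess


def build_argv(rule_id, date=None, cib_from_stdin=False):
    """Build a crm_rule command line from the options shown in its --help output."""
    argv = ["crm_rule", "--check", "--rule", rule_id]
    if date is not None:
        argv.append(f"--date={date}")
    if cib_from_stdin:
        argv += ["-X", "-"]  # read the CIB XML from stdin instead of the live cluster
    return argv


def check_rule_offline(rule_id, cib_path, date=None):
    """Run crm_rule against a saved CIB file; return (exit status, stdout text)."""
    with open(cib_path, "rb") as cib:
        result = subprocess.run(
            build_argv(rule_id, date, cib_from_stdin=True),
            stdin=cib,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    return result.returncode, result.stdout.decode().strip()


print(build_argv("cli-ban-dummy-on-virt-142-second-rule", cib_from_stdin=True))
```

Calling `check_rule_offline("cli-ban-dummy-on-virt-142-second-rule", "cib.xml")` would correspond to the last command in the transcript; it requires `crm_rule` to be installed, so it is not invoked in the sketch itself.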


Marking verified in 2.0.2-3.el8.

Comment 18 Ken Gaillot 2019-09-04 17:11:08 UTC

(In reply to Patrik Hagara from comment #13)
<snip>
> (In reply to Ken Gaillot from comment #8)
> > 1. Create a new CLI tool that accepts options for a timestamp (default now),
> > a node name (defaulting to the local node), and a rule ID.
> 
> The tool does not accept any node name. Not a huge issue, since the
> constraints themselves usually specify the target node's uname (eg. when
> created using "pcs resource move <rsc> lifetime=<iso8601-spec>", as per the
> original BZ request).

We decided to limit the scope of the initial implementation further -- I forgot to update the BZ, sorry.

As you saw, it also doesn't currently have an XML output option; we decided that the exit status codes were sufficient for pcs's use.
 
> > 2. The tool would have exit status codes (besides errors) for "rule is in
> > effect" (success i.e. 0) and "rule is not in effect".

For the record, besides errors, and success for "in effect", we added 110 for "expired", 111 for "not yet in effect", and 112 for "undetermined (rule is too complicated for current implementation)".
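
For a caller such as pcs, these statuses could be mapped as in the sketch below. The `RuleStatus` enum and `interpret_exit_status` function are hypothetical names; the numeric values are the ones listed above, and any other status (such as the 105 and 3 seen in comment 17) is treated as an error.

```python
from enum import Enum


class RuleStatus(Enum):
    """Non-error exit statuses of crm_rule --check, as listed in this comment."""
    IN_EFFECT = 0
    EXPIRED = 110
    NOT_YET_IN_EFFECT = 111
    UNDETERMINED = 112  # rule is too complicated for the current implementation


def interpret_exit_status(code):
    """Map an exit status to a RuleStatus, or None if it signals an error."""
    try:
        return RuleStatus(code)
    except ValueError:
        return None


for code in (0, 110, 111, 112, 105, 3):
    print(code, interpret_exit_status(code))
```

Keeping the error case separate from `UNDETERMINED` lets the caller distinguish "the tool could not decide" from "the tool could not run the check at all".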

<snip>

> Looks like the CTS regression tests [1] for this feature are using
> nonsensical rules -- location constraints specifying only the lifetime, not
> the actual location (node) to avoid/prefer:

Good point, but the simplified form is sufficient to test the underlying implementation.

(In reply to Patrik Hagara from comment #15)
> Hm, looks like another UX issue...
> 
> The `pcs constraint` command displays "rsc_location" ID
> ("cli-ban-dummy-on-virt-142" in this case) and `crm_rule` expects the "rule"
> ID ("cli-ban-dummy-on-virt-142-rule", with "-rule" appended):

Another good point, but this is an artifact of mixing pcs and the pacemaker command-line. pcs will use this capability internally and might not directly expose it to the user, so it wouldn't need to show the user rule IDs in that case. If pcs does decide to expose it, presumably it would change the display accordingly.

In RHEL documentation we only document the pcs interfaces, to nudge users into using only that. If a user is using the pacemaker command-line tools, they are a more advanced user, and hopefully will learn the differences between the two approaches without too much pain.

Comment 19 Patrik Hagara 2019-09-04 17:24:58 UTC

Thank you for mentioning the expected exit codes.

I missed one test case...


Test #9: ban constraint with start time in the future

> [root@virt-141 ~]# pcs constraint --full
> Location Constraints:
[...]
>     Constraint: cli-ban-dummy-on-virt-142-third
>       Rule: boolean-op=and score=-INFINITY  (id:cli-ban-dummy-on-virt-142-third-rule)
>         Expression: #uname eq string virt-142  (id:cli-ban-dummy-on-virt-142-third-expr)
>         Expression: date gt 2019-10-04 14:02:32 +02:00  (id:cli-ban-dummy-on-virt-142-third-lifetime)
> Ordering Constraints:
> Colocation Constraints:
> Ticket Constraints:
> [root@virt-141 ~]# crm_rule --check --rule cli-ban-dummy-on-virt-142-third-rule
> Rule cli-ban-dummy-on-virt-142-third-rule has not yet taken effect
> [root@virt-141 ~]# echo $?
> 111
> [root@virt-141 ~]# date
> Wed Sep  4 19:19:46 CEST 2019

Result: exit code and message match expectations. Pass.
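
The outcomes of Tests #4 to #7 and #9 are consistent with a simple model of a single "lt" or "gt" date expression, sketched below. This is a simplified illustration, not pacemaker's implementation: the `evaluate` function is hypothetical, the timestamp format is the one pcs printed above, and the behavior exactly at the boundary instant is an assumption.

```python
from datetime import datetime, timezone

# Timestamp layout used in the constraints above, e.g. "2019-10-04 14:02:32 +02:00".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def evaluate(operation, boundary, when):
    """Classify a single date expression at the timezone-aware moment `when`."""
    limit = datetime.strptime(boundary, TIMESTAMP_FORMAT)
    if operation == "lt":
        # "lt" with an end date: in effect strictly before the end.
        return "in effect" if when < limit else "expired"
    if operation == "gt":
        # "gt" with a start date: in effect strictly after the start.
        return "in effect" if when > limit else "not yet in effect"
    raise ValueError(f"unsupported operation: {operation}")


future = datetime(2020, 1, 1, tzinfo=timezone.utc)
past = datetime(2010, 1, 1, tzinfo=timezone.utc)

print(evaluate("lt", "2019-10-04 14:02:32 +02:00", future))  # as in Test #6
print(evaluate("lt", "2019-09-04 14:02:32 +02:00", past))    # as in Test #7
print(evaluate("gt", "2019-10-04 14:02:32 +02:00", past))    # same kind of rule as Test #9
```

The three calls classify the rules as expired, in effect, and not yet in effect respectively, which is why overriding the date with `--date` flips the result for an otherwise unchanged rule.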


What concerns me now are the unexpected exit codes in test cases #1, #2, #3 and #8.

Comment 21 errata-xmlrpc 2019-11-05 20:57:32 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3385