Bug 1759269
Summary: 'pcs resource description' could lead users to misunderstand 'cleanup' and 'refresh' [RHEL 7]

Product: Red Hat Enterprise Linux 7
Component: pcs
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Version: 7.6
Target Milestone: rc
Target Release: 7.9
Hardware: All
OS: Linux
Fixed In Version: pcs-0.9.169-1.el7
Doc Type: Bug Fix
Reporter: Ken Gaillot <kgaillot>
Assignee: Tomas Jelinek <tojeline>
QA Contact: cluster-qe <cluster-qe>
CC: cfeist, cluster-maint, cluster-qe, idevat, jseunghw, mlisik, mpospisi, nhostako, nwahl, omular, ondrej-redhat-developer, pkomarov, sbradley, tojeline
Doc Text:
Cause: User runs 'pcs resource cleanup | refresh' to clean the history / failures of a resource.
Consequence: If the resource has a parent resource such as a bundle, clone or group, the parent resource's history is cleaned as well.
Fix: Clarify the functionality of the commands in documentation. Provide an option to limit the operation to the specified resource only.
Result: The user is able to clean the history / failures of a specified resource without affecting its parent resources.
Clone Of: 1758969
Clones: 1805082 (view as bug list)
Last Closed: 2020-09-29 20:10:26 UTC
Bug Depends On: 1758969
Bug Blocks: 1805082, 1846412
Attachments: proposed fix + tests (1667511), proposed fix 2 (1667766)
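For illustration, a minimal sketch of the behavior change described in the Doc Text above. The resource names ('mygroup', 'rsc1') are hypothetical; --strict is the option added by the fix:

    # Default behavior (unchanged): because rsc1 is part of the group
    # 'mygroup', the cleanup applies to the whole group.
    pcs resource cleanup rsc1

    # With the fix (pcs-0.9.169-1.el7): limit the operation to rsc1 only.
    pcs resource cleanup rsc1 --strict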
Comment 3
Ondrej Faměra
2019-10-23 06:11:17 UTC
Current situation in pcs:

The 'pcs resource cleanup' command does not allow --force or any other flag which would be propagated to crm_resource --cleanup as --force. So we can add a flag to the pcs command for this purpose. Going with --force is not the best option, I think, since that flag is used for a different purpose throughout pcs.

The 'pcs resource refresh' command uses the --force flag to run the command even in cases where it would put a significant load on the cluster - if the cluster has many nodes and many resources, --force is required to run the command. So we definitely cannot use pcs's --force for crm_resource's --force.

However, 'pcs resource refresh' also has a --full flag which is propagated to crm_resource as --force. And it is documented in pcs: "Use --full to refresh a resource on all nodes, otherwise only nodes where the resource's state is known will be considered."

Ken: Has the meaning of --force in crm_resource --refresh changed since it was implemented? See bz1508350 for details.

(In reply to Tomas Jelinek from comment #4)
> Current situation in pcs:
>
> The 'pcs resource cleanup' command does not allow --force or any other flag
> which would be propagated to crm_resource --cleanup as --force. So we can
> add a flag to the pcs command for this purpose. Going with --force is not
> the best option, I think, since that flag is used for a different purpose
> throughout pcs.
>
> The 'pcs resource refresh' command uses the --force flag to run the command
> even in cases where it would put a significant load on the cluster - if the
> cluster has many nodes and many resources, --force is required to run the
> command. So we definitely cannot use pcs's --force for crm_resource's
> --force.

That's reasonable.

> However, 'pcs resource refresh' also has a --full flag which is propagated
> to crm_resource as --force. And it is documented in pcs: "Use --full to
> refresh a resource on all nodes, otherwise only nodes where the resource's
> state is known will be considered."

I was surprised to hear that. Looking into it, that idea apparently was abandoned and never implemented. I don't see any value in probing on nodes with resource-discovery=never, so I'm not sure why Beekhof originally proposed it.

> Ken: Has the meaning of --force in crm_resource --refresh changed since it
> was implemented? See bz1508350 for details.

--force for --cleanup/--refresh has always meant what it does in the new help text, both before and after the behavior of cleanup changed. This is the final help text we went with: "If the named resource is part of a group, or one numbered instance of a clone or bundled resource, the clean-up applies to the whole collective resource unless --force is given."

Confirming --force has no effect in crm_resource --refresh in pacemaker-1.1.21-3.el7 and pacemaker-2.0.3-5.el8.

Created attachment 1667511 [details]
proposed fix + tests
* updated pcs help and manpage to match pacemaker
* 'pcs (resource | stonith) (cleanup | refresh)' now has a --strict flag which translates to pacemaker's --force flag (see the sketch below)
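A rough sketch of that mapping, assuming a hypothetical resource 'rsc1' that is part of a group (the name is invented for illustration; the crm_resource translation follows the description above):

    # clean up only rsc1, not its parent group/clone/bundle
    pcs resource cleanup rsc1 --strict

    # roughly what pcs runs underneath (per the --strict -> --force translation)
    crm_resource --cleanup --resource rsc1 --force

    # without --strict, the cleanup applies to the whole collective resource
    pcs resource cleanup rsc1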
Created attachment 1667766 [details]
proposed fix 2
Make sure 'pcs resource | stonith refresh --full' works the same way as before to keep backwards compatibility for whoever was using it. The --full flag is not supposed to be documented.
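In other words (a hedged reading of the above, again with a hypothetical resource name 'rsc1'): since the undocumented --full and the new --strict both end up as crm_resource's --force, the two invocations should behave the same after the fix:

    # documented form after the fix
    pcs resource refresh rsc1 --strict

    # kept working, but undocumented, for backwards compatibility
    pcs resource refresh rsc1 --full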
After Fix:

> --strict is supported

[kid76 ~] $ pcs resource cleanup -h|head -n3|tail -n1
    cleanup [<resource id>] [--node <node>] [--strict]
[kid76 ~] $ pcs resource refresh -h|head -n3|tail -n1
    refresh [<resource id>] [--node <node>] [--strict]
[kid76 ~] $ pcs stonith cleanup -h|head -n3|tail -n1
    cleanup [<stonith id>] [--node <node>] [--strict]
[kid76 ~] $ pcs stonith refresh -h|head -n3|tail -n1
    refresh [<stonith id>] [--node <node>] [--strict]
[kid76 ~] $ pcs resource cleanup --strict
Cleaned up all resources on all nodes
[kid76 ~] $ pcs resource refresh --strict
Waiting for 1 reply from the CRMd. OK
[kid76 ~] $ pcs stonith cleanup --strict
Cleaned up all resources on all nodes
[kid76 ~] $ pcs stonith refresh --strict
Waiting for 1 reply from the CRMd. OK

> '--full' works

[kid76 ~] $ pcs resource refresh --full
Waiting for 1 reply from the CRMd. OK
[kid76 ~] $ pcs stonith refresh --full
Waiting for 1 reply from the CRMd. OK

Hi,

This is a RHEL7 bug, but just to check on RHEL8, I found:

*Not* reflected in 'pcs resource description' on RHEL 8.2:

    cleanup [<resource id>] [node=<node>] [operation=<operation> [interval=<interval>]]
        Make the cluster forget failed operations from history of the resource and
        re-detect its current state. This can be useful to purge knowledge of past
        failures that have since been resolved. If a resource id is not specified
        then all resources / stonith devices will be cleaned up. If a node is not
        specified then resources / stonith devices on all nodes will be cleaned up.

    refresh [<resource id>] [node=<node>] [--full]
        Make the cluster forget the complete operation history (including failures)
        of the resource and re-detect its current state. If you are interested in
        forgetting failed operations only, use the 'pcs resource cleanup' command.
        If a resource id is not specified then all resources / stonith devices will
        be refreshed. If a node is not specified then resources / stonith devices on
        all nodes will be refreshed. Use --full to refresh a resource on all nodes,
        otherwise only nodes where the resource's state is known will be considered.

Reflected in 'crm_resource --help' on RHEL 8.2:

    -C, --cleanup
        If resource has any past failures, clear its history and fail count.
        Optionally filtered by --resource, --node, --operation, and --interval
        (otherwise all). --operation and --interval apply to fail counts, but entire
        history is always cleared, to allow current state to be rechecked. If the
        named resource is part of a group, or one numbered instance of a clone or
        bundled resource, the clean-up applies to the whole collective resource
        unless --force is given. <================

    -R, --refresh
        Delete resource's history (including failures) so its current state is
        rechecked. Optionally filtered by --resource and --node (otherwise all). If
        the named resource is part of a group, or one numbered instance of a clone
        or bundled resource, the clean-up applies to the whole collective resource
        unless --force is given.

- pcs-0.10.4-6.el8.x86_64
- pacemaker-2.0.3-5.el8.x86_64

Do we have to clone a bug for RHEL8 for pcs?

This bz has been cloned for RHEL8 already: bz1805082

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pcs bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3964