Bug 1805082
Summary: 'pcs resource description' could lead users to misunderstand 'cleanup' and 'refresh' [RHEL 8]

Product: Red Hat Enterprise Linux 8
Component: pcs
Version: 8.2
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Tomas Jelinek <tojeline>
Assignee: Tomas Jelinek <tojeline>
QA Contact: cluster-qe <cluster-qe>
CC: cfeist, cluster-maint, cluster-qe, idevat, jseunghw, kgaillot, mlisik, mmazoure, mpospisi, nhostako, nwahl, omular, ondrej-redhat-developer, pkomarov, sbradley, tojeline
Target Milestone: rc
Target Release: 8.3
Hardware: All
OS: Linux
Fixed In Version: pcs-0.10.6-2.el8
Doc Type: Bug Fix
Doc Text:
Cause: The user runs 'pcs resource cleanup | refresh' to clean the history / failures of a resource.
Consequence: If the resource has a parent resource, such as a bundle, clone or group, the parent resource's history is cleaned as well.
Fix: Clarify the functionality of the commands in the documentation. Provide an option to limit the operation to the specified resource only.
Result: The user is able to clean the history / failures of a specified resource without affecting its parent resources.
Clone Of: 1759269
Last Closed: 2020-11-04 02:28:16 UTC
Bug Depends On: 1758969, 1759269
Description
Tomas Jelinek
2020-02-20 08:39:15 UTC
Created attachment 1667519 [details]
proposed fix + tests
* Updated pcs help and manpage to match pacemaker.
* 'pcs (resource | stonith) (cleanup | refresh)' now has a --strict flag, which translates to pacemaker's --force flag.
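The flag translation described above can be sketched as follows. This is a hypothetical illustration, not the actual pcs source; the function name `build_crm_resource_cmd` is invented, but the resulting command line matches the `--debug` output shown in the verification below ('Running: /usr/sbin/crm_resource --refresh --resource A --force').

```python
# Hypothetical sketch of how pcs could translate its '--strict' flag
# into crm_resource's '--force' flag (not the actual pcs implementation).

def build_crm_resource_cmd(action, resource_id=None, strict=False):
    """Build a crm_resource command line for 'cleanup' or 'refresh'."""
    cmd = ["/usr/sbin/crm_resource", "--" + action]
    if resource_id:
        cmd += ["--resource", resource_id]
    if strict:
        # pcs exposes '--strict'; pacemaker's crm_resource calls it '--force'.
        # Without it, cleanup/refresh of a group, clone, or bundle member
        # applies to the whole collective resource.
        cmd.append("--force")
    return cmd

print(build_crm_resource_cmd("refresh", "A", strict=True))
```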
Created attachment 1667767 [details]
proposed fix 2
Make sure 'pcs resource | stonith refresh --full' works the same way as before, preserving backward compatibility for anyone who was using it. The --full flag is intentionally left undocumented.
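The backward-compatibility handling can be sketched like this. It is a hypothetical illustration (the function name `normalize_refresh_flags` is invented), assuming only that the legacy '--full' flag behaves as an alias of '--strict' and prints the deprecation warning seen in the verification transcript.

```python
# Hypothetical sketch: keep the undocumented legacy '--full' flag working
# as a backward-compatible alias of '--strict', warning on use.
import sys

def normalize_refresh_flags(flags):
    """Replace any '--full' flag with '--strict', emitting a warning."""
    flags = list(flags)
    if "--full" in flags:
        print("Warning: '--full' has been deprecated", file=sys.stderr)
        flags = ["--strict" if f == "--full" else f for f in flags]
    return flags
```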
Additional change was made in pacemaker help, see bz1758969 comment 12 and bz1758969 comment 13. This change should be in pcs help as well.

Test:
```
[root@r8-node-01 pcs]# rpm -q pcs
pcs-0.10.6-1.el8.x86_64
[root@r8-node-01 pcs]# pcs resource
  * Resource Group: Group:
    * Delay1 (ocf::heartbeat:Delay): Started r8-node-01
    * Delay2 (ocf::heartbeat:Delay): FAILED r8-node-01
[root@r8-node-01 pcs]# pcs resource refresh Delay2 --strict
Cleaned up Delay2 on r8-node-02
Cleaned up Delay2 on r8-node-01
Waiting for 2 replies from the controller.. OK
[root@r8-node-01 pcs]# pcs resource refresh Delay2
Cleaned up Delay1 on r8-node-02
Cleaned up Delay1 on r8-node-01
Cleaned up Delay2 on r8-node-02
Cleaned up Delay2 on r8-node-01
Waiting for 4 replies from the controller.... OK
```

Moving to ASSIGNED to fix the issue described in comment 4.

Created attachment 1698432 [details]
proposed fix - documentation
Check:
```
[root@r8-node-01 ~]# rpm -q pcs
pcs-0.10.6-2.el8.x86_64
[root@r8-node-01 ]# pcs resource refresh --help | grep 'clean-up'
```

BEFORE_FIX
==========
```
[root@virt-023 ~]# rpm -q pcs
pcs-0.10.4-6.el8.x86_64
[root@virt-023 ~]# rpm -q pacemaker
pacemaker-2.0.3-5.el8.x86_64
[root@virt-023 ~]# pcs resource cleanup --help
Usage: pcs resource cleanup...
    cleanup [<resource id>] [node=<node>] [operation=<operation> [interval=<interval>]]
        Make the cluster forget failed operations from history of the resource
        and re-detect its current state. This can be useful to purge knowledge
        of past failures that have since been resolved. If a resource id is
        not specified then all resources / stonith devices will be cleaned up.
        If a node is not specified then resources / stonith devices on all
        nodes will be cleaned up.
[root@virt-023 ~]# pcs resource refresh --help
Usage: pcs resource refresh...
    refresh [<resource id>] [node=<node>] [--full]
        Make the cluster forget the complete operation history (including
        failures) of the resource and re-detect its current state. If you are
        interested in forgetting failed operations only, use the 'pcs resource
        cleanup' command. If a resource id is not specified then all resources
        / stonith devices will be refreshed. If a node is not specified then
        resources / stonith devices on all nodes will be refreshed. Use --full
        to refresh a resource on all nodes, otherwise only nodes where the
        resource's state is known will be considered.
[root@virt-023 ~]# crm_resource --help
...
 -C, --cleanup  If resource has any past failures, clear its history and fail
                count. Optionally filtered by --resource, --node, --operation,
                and --interval (otherwise all). --operation and --interval
                apply to fail counts, but entire history is always cleared, to
                allow current state to be rechecked. If the named resource is
                part of a group, or one numbered instance of a clone or
                bundled resource, the clean-up applies to the whole collective
                resource unless --force is given.
 -R, --refresh  Delete resource's history (including failures) so its current
                state is rechecked. Optionally filtered by --resource and
                --node (otherwise all). If the named resource is part of a
                group, or one numbered instance of a clone or bundled
                resource, the clean-up applies to the whole collective
                resource unless --force is given.
...
```
> pcs and pacemaker discrepancy

AFTER_FIX
=========
```
[root@virt-141 ~]# rpm -q pcs
pcs-0.10.6-3.el8.x86_64
[root@virt-141 ~]# rpm -q pacemaker
pacemaker-2.0.4-3.el8.x86_64
```

1. New '--strict' flag added for 'pcs resource|stonith refresh' and 'pcs resource|stonith cleanup'

A) GROUP
```
[root@virt-141 ~]# pcs resource
  * Resource Group: group_test:
    * A (ocf::pacemaker:Dummy): Started virt-142
    * B (ocf::pacemaker:Dummy): Started virt-142
    * C (ocf::pacemaker:Dummy): Started virt-142
```

## refresh
```
[root@virt-141 ~]# pcs resource refresh A --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
Waiting for 2 replies from the controller.. OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh A --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
Waiting for 2 replies from the controller.. OK
[root@virt-141 ~]# echo $?
0
```
> '--strict' applies refresh only to the specified resource

```
[root@virt-141 ~]# pcs resource refresh group_test --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
Waiting for 6 replies from the controller...... OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh group_test --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
.Waiting for 5 replies from the controller..... OK
[root@virt-141 ~]# echo $?
0
```
> Functionality on the whole group level remained unchanged

```
[root@virt-141 ~]# pcs resource refresh A
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
Waiting for 6 replies from the controller...... OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh A
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
Waiting for 6 replies from the controller...... OK
[root@virt-141 ~]# echo $?
0
```
> Without the --strict flag, the command is effective for all resources within the group

```
[root@virt-141 ~]# pcs resource refresh group_test
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
Waiting for 6 replies from the controller...... OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh group_test
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
.Waiting for 5 replies from the controller..... OK
[root@virt-141 ~]# echo $?
0
```
> The same result when running the command on the whole group

## cleanup
```
[root@virt-141 ~]# pcs resource cleanup A --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith cleanup A --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
[root@virt-141 ~]# echo $?
0
```
> '--strict' applies 'cleanup' only to the specified resource

```
[root@virt-141 ~]# pcs resource cleanup group_test --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith cleanup group_test --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
[root@virt-141 ~]# echo $?
0
```
> Functionality on the whole group level remained unchanged

```
[root@virt-141 ~]# pcs resource cleanup A
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith cleanup A
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
[root@virt-141 ~]# echo $?
0
```
> Without the --strict flag, the command is effective for all resources within the group

```
[root@virt-141 ~]# pcs resource cleanup group_test
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith cleanup group_test
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
[root@virt-141 ~]# echo $?
0
```
> The same result when running the command on the whole group

B) CLONE
```
[root@virt-141 ~]# pcs resource create A ocf:pacemaker:Dummy clone
[root@virt-141 ~]# pcs resource
  * Clone Set: A-clone [A]:
    * Started: [ virt-141 virt-142 ]
```

## refresh
```
[root@virt-141 ~]# pcs resource refresh A --strict
Cleaned up A:0 on virt-141
Waiting for 1 reply from the controller. OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh A --strict
Cleaned up A:0 on virt-141
Waiting for 1 reply from the controller. OK
[root@virt-141 ~]# echo $?
0
```
> '--strict' applies refresh only to the clone instance on the specific node

```
[root@virt-141 ~]# pcs resource refresh A
Cleaned up A:0 on virt-141
Cleaned up A:1 on virt-142
Waiting for 2 replies from the controller.. OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh A
Cleaned up A:0 on virt-141
Cleaned up A:1 on virt-142
Waiting for 2 replies from the controller.. OK
[root@virt-141 ~]# echo $?
0
```
> Without the flag, the command is effective on the cloned resource running on all the nodes

## cleanup
```
[root@virt-141 ~]# pcs resource cleanup A --strict
Cleaned up A:0 on virt-141
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith cleanup A --strict
Cleaned up A:0 on virt-141
[root@virt-141 ~]# echo $?
0
```
> OK

```
[root@virt-141 ~]# pcs resource cleanup A
Cleaned up A:0 on virt-141
Cleaned up A:1 on virt-142
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith cleanup A
Cleaned up A:0 on virt-141
Cleaned up A:1 on virt-142
[root@virt-141 ~]# echo $?
0
```
> OK

## --strict flag is propagated to pacemaker as --force for all tested cases
```
[root@virt-141 ~]# pcs resource refresh A --strict --debug
Running: /usr/sbin/crm_resource --refresh --resource A --force
...
[root@virt-141 ~]# pcs stonith refresh A --strict --debug
Running: /usr/sbin/crm_resource --refresh --resource A --force
...
[root@virt-141 ~]# pcs resource cleanup A --strict --debug
Running: /usr/sbin/crm_resource --cleanup --resource A --force
...
[root@virt-141 ~]# pcs stonith cleanup A --strict --debug
Running: /usr/sbin/crm_resource --cleanup --resource A --force
...
```

2. The --full flag is available for 'pcs resource refresh' just to ensure backwards compatibility

A) GROUP
```
[root@virt-141 ~]# pcs resource refresh A --full
Warning: '--full' has been deprecated
Cleaned up A on virt-142
Cleaned up A on virt-141
Waiting for 2 replies from the controller.. OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh A --full
Warning: '--full' has been deprecated
Cleaned up A on virt-142
Cleaned up A on virt-141
Waiting for 2 replies from the controller.. OK
[root@virt-141 ~]# echo $?
0
```

B) CLONE
```
[root@virt-141 ~]# pcs resource refresh A --full
Warning: '--full' has been deprecated
Cleaned up A:0 on virt-141
Waiting for 1 reply from the controller. OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh A --full
Warning: '--full' has been deprecated
Cleaned up A:0 on virt-141
Waiting for 1 reply from the controller. OK
[root@virt-141 ~]# echo $?
0
```
> '--full' is functionally the same as the '--strict' flag

3. Documentation reflects the pacemaker usage as well as the changes in pcs
```
[root@virt-141 ~]# pcs resource description
...
    cleanup [<resource id>] [node=<node>] [operation=<operation> [interval=<interval>]] [--strict]
        Make the cluster forget failed operations from history of the resource
        and re-detect its current state. This can be useful to purge knowledge
        of past failures that have since been resolved. If the named resource
        is part of a group, or one numbered instance of a clone or bundled
        resource, the clean-up applies to the whole collective resource unless
        --strict is given. If a resource id is not specified then all
        resources / stonith devices will be cleaned up. If a node is not
        specified then resources / stonith devices on all nodes will be
        cleaned up.

    refresh [<resource id>] [node=<node>] [--strict]
        Make the cluster forget the complete operation history (including
        failures) of the resource and re-detect its current state. If you are
        interested in forgetting failed operations only, use the 'pcs resource
        cleanup' command. If the named resource is part of a group, or one
        numbered instance of a clone or bundled resource, the refresh applies
        to the whole collective resource unless --strict is given. If a
        resource id is not specified then all resources / stonith devices will
        be refreshed. If a node is not specified then resources / stonith
        devices on all nodes will be refreshed.
...
```
> The help and the pcs man page have been fixed accordingly

Verified:
```
[root@controller-2 ~]# rpm -qa pcs
pcs-0.10.6-2.el8.x86_64
[root@controller-2 ~]# pcs resource refresh rabbitmq-bundle-0 --strict
Cleaned up rabbitmq-bundle-0 on controller-2
Cleaned up rabbitmq-bundle-0 on controller-1
Cleaned up rabbitmq-bundle-0 on controller-0
Waiting for 3 replies from the controller... OK
[root@controller-2 ~]# pcs stonith
  * stonith-fence_compute-fence-nova (stonith:fence_compute): Started controller-0
  * stonith-fence_ipmilan-525400743bce (stonith:fence_ipmilan): Started controller-1
  * stonith-fence_ipmilan-525400952172 (stonith:fence_ipmilan): Started controller-1
  * stonith-fence_ipmilan-525400459d1e (stonith:fence_ipmilan): Started controller-0
  * stonith-fence_ipmilan-525400aec4eb (stonith:fence_ipmilan): Started controller-2
  * stonith-fence_ipmilan-525400f1b397 (stonith:fence_ipmilan): Started controller-1
Target: controller-0
  Level 1 - stonith-fence_ipmilan-525400f1b397
Target: controller-1
  Level 1 - stonith-fence_ipmilan-525400aec4eb
Target: controller-2
  Level 1 - stonith-fence_ipmilan-525400459d1e
Target: overcloud-novacomputeiha-0
  Level 1 - stonith-fence_ipmilan-525400743bce,stonith-fence_compute-fence-nova
Target: overcloud-novacomputeiha-1
  Level 1 - stonith-fence_ipmilan-525400952172,stonith-fence_compute-fence-nova
[root@controller-2 ~]# pcs stonith refresh stonith-fence_ipmilan-525400f1b397 --strict
Cleaned up stonith-fence_ipmilan-525400f1b397 on controller-2
Cleaned up stonith-fence_ipmilan-525400f1b397 on controller-1
Cleaned up stonith-fence_ipmilan-525400f1b397 on controller-0
Waiting for 3 replies from the controller... OK
[root@controller-2 ~]# pcs resource refresh rabbitmq-bundle --strict
Cleaned up rabbitmq-bundle-podman-0 on controller-2
Cleaned up rabbitmq-bundle-podman-0 on controller-1
Cleaned up rabbitmq-bundle-podman-0 on controller-0
Cleaned up rabbitmq-bundle-0 on controller-2
Cleaned up rabbitmq-bundle-0 on controller-1
Cleaned up rabbitmq-bundle-0 on controller-0
Cleaned up rabbitmq-bundle-podman-1 on controller-2
Cleaned up rabbitmq-bundle-podman-1 on controller-1
Cleaned up rabbitmq-bundle-podman-1 on controller-0
Cleaned up rabbitmq-bundle-1 on controller-2
Cleaned up rabbitmq-bundle-1 on controller-1
Cleaned up rabbitmq-bundle-1 on controller-0
Cleaned up rabbitmq-bundle-podman-2 on controller-2
Cleaned up rabbitmq-bundle-podman-2 on controller-1
Cleaned up rabbitmq-bundle-podman-2 on controller-0
Cleaned up rabbitmq-bundle-2 on controller-2
Cleaned up rabbitmq-bundle-2 on controller-1
Cleaned up rabbitmq-bundle-2 on controller-0
Cleaned up rabbitmq:0 on rabbitmq-bundle-0
Cleaned up rabbitmq:1 on rabbitmq-bundle-1
Cleaned up rabbitmq:2 on rabbitmq-bundle-2
Waiting for 21 replies from the controller..................... OK
[root@controller-2 ~]# pcs resource refresh stonith-fence_compute-fence-nova --strict
Cleaned up stonith-fence_compute-fence-nova on controller-2
Cleaned up stonith-fence_compute-fence-nova on controller-1
Cleaned up stonith-fence_compute-fence-nova on controller-0
Waiting for 3 replies from the controller... OK
[root@controller-2 ~]# pcs resource refresh stonith-fence_compute-fence-nova
Cleaned up stonith-fence_compute-fence-nova on controller-2
Cleaned up stonith-fence_compute-fence-nova on controller-1
Cleaned up stonith-fence_compute-fence-nova on controller-0
Waiting for 3 replies from the controller... OK
[root@controller-2 ~]# pcs resource cleanup rabbitmq-bundle-0 --strict
Cleaned up rabbitmq-bundle-0 on controller-2
Cleaned up rabbitmq-bundle-0 on controller-1
Cleaned up rabbitmq-bundle-0 on controller-0
Waiting for 1 reply from the controller. OK
[root@controller-2 ~]# pcs resource cleanup rabbitmq-bundle --strict
Cleaned up rabbitmq-bundle-podman-0 on controller-2
Cleaned up rabbitmq-bundle-podman-0 on controller-1
Cleaned up rabbitmq-bundle-podman-0 on controller-0
Cleaned up rabbitmq-bundle-0 on controller-1
Cleaned up rabbitmq-bundle-0 on controller-0
Cleaned up rabbitmq-bundle-podman-1 on controller-2
Cleaned up rabbitmq-bundle-podman-1 on controller-1
Cleaned up rabbitmq-bundle-podman-1 on controller-0
Cleaned up rabbitmq-bundle-1 on controller-2
Cleaned up rabbitmq-bundle-1 on controller-1
Cleaned up rabbitmq-bundle-1 on controller-0
Cleaned up rabbitmq-bundle-podman-2 on controller-2
Cleaned up rabbitmq-bundle-podman-2 on controller-1
Cleaned up rabbitmq-bundle-podman-2 on controller-0
Cleaned up rabbitmq-bundle-2 on controller-2
Cleaned up rabbitmq-bundle-2 on controller-1
Cleaned up rabbitmq-bundle-2 on controller-0
Cleaned up rabbitmq:1 on rabbitmq-bundle-1
Cleaned up rabbitmq:2 on rabbitmq-bundle-2
[root@controller-2 ~]# pcs resource cleanup rabbitmq-bundle
Cleaned up rabbitmq-bundle-podman-0 on controller-2
Cleaned up rabbitmq-bundle-podman-0 on controller-1
Cleaned up rabbitmq-bundle-podman-0 on controller-0
Cleaned up rabbitmq-bundle-0 on controller-1
Cleaned up rabbitmq-bundle-0 on controller-0
Cleaned up rabbitmq-bundle-podman-1 on controller-2
Cleaned up rabbitmq-bundle-podman-1 on controller-1
Cleaned up rabbitmq-bundle-podman-1 on controller-0
Cleaned up rabbitmq-bundle-1 on controller-2
Cleaned up rabbitmq-bundle-1 on controller-1
Cleaned up rabbitmq-bundle-1 on controller-0
Cleaned up rabbitmq-bundle-podman-2 on controller-2
Cleaned up rabbitmq-bundle-podman-2 on controller-1
Cleaned up rabbitmq-bundle-podman-2 on controller-0
Cleaned up rabbitmq-bundle-2 on controller-2
Cleaned up rabbitmq-bundle-2 on controller-1
Cleaned up rabbitmq-bundle-2 on controller-0
Cleaned up rabbitmq:1 on rabbitmq-bundle-1
Cleaned up rabbitmq:2 on rabbitmq-bundle-2
[root@controller-2 ~]# pcs resource cleanup rabbitmq-bundle-0
Cleaned up rabbitmq-bundle-0 on controller-1
Cleaned up rabbitmq-bundle-0 on controller-0
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pcs bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4617