Bug 1805082
| Summary: | 'pcs resource description' could lead users to misunderstand 'cleanup' and 'refresh' [RHEL 8] | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Tomas Jelinek <tojeline> |
| Component: | pcs | Assignee: | Tomas Jelinek <tojeline> |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 8.2 | CC: | cfeist, cluster-maint, cluster-qe, idevat, jseunghw, kgaillot, mlisik, mmazoure, mpospisi, nhostako, nwahl, omular, ondrej-redhat-developer, pkomarov, sbradley, tojeline |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | 8.3 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | pcs-0.10.6-2.el8 | Doc Type: | Bug Fix |
| Story Points: | --- | | |
| Clone Of: | 1759269 | Environment: | |
| Last Closed: | 2020-11-04 02:28:16 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1758969, 1759269 | | |
| Bug Blocks: | | | |
| Attachments: | | | |

Doc Text:
Cause: The user runs 'pcs resource cleanup | refresh' to clean the history / failures of a resource.
Consequence: If the resource has a parent resource such as a bundle, clone or group, the parent resource's history is cleaned as well.
Fix: Clarify the functionality of the commands in the documentation. Provide an option to limit the operation to the specified resource only.
Result: The user is able to clean the history / failures of a specified resource without affecting its parent resources.
Description (Tomas Jelinek, 2020-02-20 08:39:15 UTC)
Created attachment 1667519 [details]
proposed fix + tests
* updated pcs help and manpage to match pacemaker
* 'pcs (resource | stonith) (cleanup | refresh)' now has the --strict flag which translates to pacemaker's --force flag
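For reference, a minimal sketch of the intended flag translation, shown with a hypothetical resource named A (the actual invocation is confirmed by the --debug output later in this bug):

# pcs forwards its new --strict flag to pacemaker as --force:
pcs resource cleanup A --strict
# ...is expected to run under the hood:
/usr/sbin/crm_resource --cleanup --resource A --force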
Created attachment 1667767 [details]
proposed fix 2
Make sure 'pcs resource | stonith refresh --full' works the same way as before, to keep backwards compatibility for anyone who was using it. The --full flag is not supposed to be documented.
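As a minimal backwards-compatibility sketch (assuming an existing resource A), the two flags should be interchangeable:

# undocumented, kept only for backwards compatibility:
pcs resource refresh A --full
# documented equivalent after this fix:
pcs resource refresh A --strict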
An additional change was made in the pacemaker help; see bz1758969 comment 12 and bz1758969 comment 13. This change should be in the pcs help as well. Test:
[root@r8-node-01 pcs]# rpm -q pcs
pcs-0.10.6-1.el8.x86_64
[root@r8-node-01 pcs]# pcs resource
* Resource Group: Group:
* Delay1 (ocf::heartbeat:Delay): Started r8-node-01
* Delay2 (ocf::heartbeat:Delay): FAILED r8-node-01
[root@r8-node-01 pcs]# pcs resource refresh Delay2 --strict
Cleaned up Delay2 on r8-node-02
Cleaned up Delay2 on r8-node-01
Waiting for 2 replies from the controller.. OK
[root@r8-node-01 pcs]# pcs resource refresh Delay2
Cleaned up Delay1 on r8-node-02
Cleaned up Delay1 on r8-node-01
Cleaned up Delay2 on r8-node-02
Cleaned up Delay2 on r8-node-01
Waiting for 4 replies from the controller.... OK
Moving to ASSIGNED to fix the issue described in comment 4.

Created attachment 1698432 [details]
proposed fix - documentation
Check:
[root@r8-node-01 ~]# rpm -q pcs
pcs-0.10.6-2.el8.x86_64
[root@r8-node-01 ~]# pcs resource refresh --help | grep 'clean-up'

BEFORE_FIX
==========
[root@virt-023 ~]# rpm -q pcs
pcs-0.10.4-6.el8.x86_64
[root@virt-023 ~]# rpm -q pacemaker
pacemaker-2.0.3-5.el8.x86_64
[root@virt-023 ~]# pcs resource cleanup --help
Usage: pcs resource cleanup...
cleanup [<resource id>] [node=<node>] [operation=<operation>
[interval=<interval>]]
Make the cluster forget failed operations from history of the resource
and re-detect its current state. This can be useful to purge knowledge
of past failures that have since been resolved. If a resource id is not
specified then all resources / stonith devices will be cleaned up. If a
node is not specified then resources / stonith devices on all nodes will
be cleaned up.
[root@virt-023 ~]# pcs resource refresh --help
Usage: pcs resource refresh...
refresh [<resource id>] [node=<node>] [--full]
Make the cluster forget the complete operation history (including
failures) of the resource and re-detect its current state. If you are
interested in forgetting failed operations only, use the 'pcs resource
cleanup' command. If a resource id is not specified then all resources
/ stonith devices will be refreshed. If a node is not specified then
resources / stonith devices on all nodes will be refreshed. Use --full
to refresh a resource on all nodes, otherwise only nodes where the
resource's state is known will be considered.
[root@virt-023 ~]# crm_resource --help
...
-C, --cleanup
If resource has any past failures, clear its history and fail count.
Optionally filtered by --resource, --node, --operation, and --interval (otherwise all).
--operation and --interval apply to fail counts, but entire history is always cleared,
to allow current state to be rechecked. If the named resource is part of a group, or
one numbered instance of a clone or bundled resource, the clean-up applies to the
whole collective resource unless --force is given.
-R, --refresh
Delete resource's history (including failures) so its current state is rechecked.
Optionally filtered by --resource and --node (otherwise all). If the named resource is
part of a group, or one numbered instance of a clone or bundled resource, the clean-up
applies to the whole collective resource unless --force is given.
...
> pcs and pacemaker discrepancy
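For comparison, the pacemaker-level behavior described in the help above can be exercised directly with crm_resource, which pcs wraps (a sketch, assuming a resource A that is part of a group):

# without --force, the cleanup applies to the whole collective resource (group/clone/bundle):
crm_resource --cleanup --resource A
# with --force, only A itself is cleaned up:
crm_resource --cleanup --resource A --force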
AFTER_FIX
=========
[root@virt-141 ~]# rpm -q pcs
pcs-0.10.6-3.el8.x86_64
[root@virt-141 ~]# rpm -q pacemaker
pacemaker-2.0.4-3.el8.x86_64
1. A new '--strict' flag was added for 'pcs resource|stonith refresh' and 'pcs resource|stonith cleanup'
A) GROUP
[root@virt-141 ~]# pcs resource
* Resource Group: group_test:
* A (ocf::pacemaker:Dummy): Started virt-142
* B (ocf::pacemaker:Dummy): Started virt-142
* C (ocf::pacemaker:Dummy): Started virt-142
## refresh
[root@virt-141 ~]# pcs resource refresh A --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
Waiting for 2 replies from the controller.. OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh A --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
Waiting for 2 replies from the controller.. OK
[root@virt-141 ~]# echo $?
0
> '--strict' applies refresh only to the specified resource
[root@virt-141 ~]# pcs resource refresh group_test --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
Waiting for 6 replies from the controller...... OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh group_test --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
.Waiting for 5 replies from the controller..... OK
[root@virt-141 ~]# echo $?
0
> Functionality on the whole group level remained unchanged
[root@virt-141 ~]# pcs resource refresh A
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
Waiting for 6 replies from the controller...... OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh A
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
Waiting for 6 replies from the controller...... OK
[root@virt-141 ~]# echo $?
0
> Without the --strict flag, the command is effective for all resources within the group
[root@virt-141 ~]# pcs resource refresh group_test
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
Waiting for 6 replies from the controller...... OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh group_test
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
.Waiting for 5 replies from the controller..... OK
[root@virt-141 ~]# echo $?
0
> The same result when running the command on the whole group
## cleanup
[root@virt-141 ~]# pcs resource cleanup A --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith cleanup A --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
[root@virt-141 ~]# echo $?
0
> '--strict' applies 'cleanup' only to the specified resource
[root@virt-141 ~]# pcs resource cleanup group_test --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith cleanup group_test --strict
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
[root@virt-141 ~]# echo $?
0
> Functionality on the whole group level remained unchanged
[root@virt-141 ~]# pcs resource cleanup A
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith cleanup A
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
[root@virt-141 ~]# echo $?
0
> Without the --strict flag, the command is effective for all resources within the group
[root@virt-141 ~]# pcs resource cleanup group_test
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith cleanup group_test
Cleaned up A on virt-142
Cleaned up A on virt-141
Cleaned up B on virt-142
Cleaned up B on virt-141
Cleaned up C on virt-142
Cleaned up C on virt-141
[root@virt-141 ~]# echo $?
0
> The same result when running the command on the whole group
B) CLONE
[root@virt-141 ~]# pcs resource create A ocf:pacemaker:Dummy clone
[root@virt-141 ~]# pcs resource
* Clone Set: A-clone [A]:
* Started: [ virt-141 virt-142 ]
## refresh
[root@virt-141 ~]# pcs resource refresh A --strict
Cleaned up A:0 on virt-141
Waiting for 1 reply from the controller. OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh A --strict
Cleaned up A:0 on virt-141
Waiting for 1 reply from the controller. OK
[root@virt-141 ~]# echo $?
0
> '--strict' applies the refresh only to the clone instance on the specific node
[root@virt-141 ~]# pcs resource refresh A
Cleaned up A:0 on virt-141
Cleaned up A:1 on virt-142
Waiting for 2 replies from the controller.. OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh A
Cleaned up A:0 on virt-141
Cleaned up A:1 on virt-142
Waiting for 2 replies from the controller.. OK
[root@virt-141 ~]# echo $?
0
> Without the flag, the command is effective for the cloned resource on all nodes
## cleanup
[root@virt-141 ~]# pcs resource cleanup A --strict
Cleaned up A:0 on virt-141
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith cleanup A --strict
Cleaned up A:0 on virt-141
[root@virt-141 ~]# echo $?
0
> OK
[root@virt-141 ~]# pcs resource cleanup A
Cleaned up A:0 on virt-141
Cleaned up A:1 on virt-142
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith cleanup A
Cleaned up A:0 on virt-141
Cleaned up A:1 on virt-142
[root@virt-141 ~]# echo $?
0
> OK
## --strict flag is propagated to pacemaker as --force for all tested cases
[root@virt-141 ~]# pcs resource refresh A --strict --debug
Running: /usr/sbin/crm_resource --refresh --resource A --force
...
[root@virt-141 ~]# pcs stonith refresh A --strict --debug
Running: /usr/sbin/crm_resource --refresh --resource A --force
...
[root@virt-141 ~]# pcs resource cleanup A --strict --debug
Running: /usr/sbin/crm_resource --cleanup --resource A --force
...
[root@virt-141 ~]# pcs stonith cleanup A --strict --debug
Running: /usr/sbin/crm_resource --cleanup --resource A --force
....
2. The --full flag is available for 'pcs resource refresh' just to ensure backwards compatibility
A) GROUP
[root@virt-141 ~]# pcs resource refresh A --full
Warning: '--full' has been deprecated
Cleaned up A on virt-142
Cleaned up A on virt-141
Waiting for 2 replies from the controller.. OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh A --full
Warning: '--full' has been deprecated
Cleaned up A on virt-142
Cleaned up A on virt-141
Waiting for 2 replies from the controller.. OK
[root@virt-141 ~]# echo $?
0
B) CLONE
[root@virt-141 ~]# pcs resource refresh A --full
Warning: '--full' has been deprecated
Cleaned up A:0 on virt-141
Waiting for 1 reply from the controller. OK
[root@virt-141 ~]# echo $?
0
[root@virt-141 ~]# pcs stonith refresh A --full
Warning: '--full' has been deprecated
Cleaned up A:0 on virt-141
Waiting for 1 reply from the controller. OK
[root@virt-141 ~]# echo $?
0
> '--full' is functionally the same as the '--strict' flag
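One way to confirm that equivalence (a hypothetical check; the expected result is an assumption based on the --strict --debug runs above, not output captured here):

pcs resource refresh A --full --debug
# assumption: like --strict, this should show crm_resource being run with --force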
3. Documentation reflects the pacemaker usage as well as changes in pcs
[root@virt-141 ~]# pcs resource description
...
cleanup [<resource id>] [node=<node>] [operation=<operation>
[interval=<interval>]] [--strict]
Make the cluster forget failed operations from history of the resource
and re-detect its current state. This can be useful to purge knowledge
of past failures that have since been resolved.
If the named resource is part of a group, or one numbered instance of a
clone or bundled resource, the clean-up applies to the whole collective
resource unless --strict is given.
If a resource id is not specified then all resources / stonith devices
will be cleaned up.
If a node is not specified then resources / stonith devices on all
nodes will be cleaned up.
refresh [<resource id>] [node=<node>] [--strict]
Make the cluster forget the complete operation history (including
failures) of the resource and re-detect its current state. If you are
interested in forgetting failed operations only, use the 'pcs resource
cleanup' command.
If the named resource is part of a group, or one numbered instance of a
clone or bundled resource, the refresh applies to the whole collective
resource unless --strict is given.
If a resource id is not specified then all resources / stonith devices
will be refreshed.
If a node is not specified then resources / stonith devices on all
nodes will be refreshed.
...
> pcs help and the man page have been fixed accordingly
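A quick spot-check of the shipped documentation (a sketch; the grep pattern is just a phrase from the new help text quoted above):

pcs resource cleanup --help | grep 'unless --strict'
man pcs | grep 'unless --strict'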
Verified:
[root@controller-2 ~]# rpm -qa pcs
pcs-0.10.6-2.el8.x86_64
[root@controller-2 ~]# pcs resource refresh rabbitmq-bundle-0 --strict
Cleaned up rabbitmq-bundle-0 on controller-2
Cleaned up rabbitmq-bundle-0 on controller-1
Cleaned up rabbitmq-bundle-0 on controller-0
Waiting for 3 replies from the controller... OK
[root@controller-2 ~]# pcs stonith
* stonith-fence_compute-fence-nova (stonith:fence_compute): Started controller-0
* stonith-fence_ipmilan-525400743bce (stonith:fence_ipmilan): Started controller-1
* stonith-fence_ipmilan-525400952172 (stonith:fence_ipmilan): Started controller-1
* stonith-fence_ipmilan-525400459d1e (stonith:fence_ipmilan): Started controller-0
* stonith-fence_ipmilan-525400aec4eb (stonith:fence_ipmilan): Started controller-2
* stonith-fence_ipmilan-525400f1b397 (stonith:fence_ipmilan): Started controller-1
Target: controller-0
  Level 1 - stonith-fence_ipmilan-525400f1b397
Target: controller-1
  Level 1 - stonith-fence_ipmilan-525400aec4eb
Target: controller-2
  Level 1 - stonith-fence_ipmilan-525400459d1e
Target: overcloud-novacomputeiha-0
  Level 1 - stonith-fence_ipmilan-525400743bce,stonith-fence_compute-fence-nova
Target: overcloud-novacomputeiha-1
  Level 1 - stonith-fence_ipmilan-525400952172,stonith-fence_compute-fence-nova
[root@controller-2 ~]# pcs stonith refresh stonith-fence_ipmilan-525400f1b397 --strict
Cleaned up stonith-fence_ipmilan-525400f1b397 on controller-2
Cleaned up stonith-fence_ipmilan-525400f1b397 on controller-1
Cleaned up stonith-fence_ipmilan-525400f1b397 on controller-0
Waiting for 3 replies from the controller... OK
[root@controller-2 ~]# pcs resource refresh rabbitmq-bundle --strict
Cleaned up rabbitmq-bundle-podman-0 on controller-2
Cleaned up rabbitmq-bundle-podman-0 on controller-1
Cleaned up rabbitmq-bundle-podman-0 on controller-0
Cleaned up rabbitmq-bundle-0 on controller-2
Cleaned up rabbitmq-bundle-0 on controller-1
Cleaned up rabbitmq-bundle-0 on controller-0
Cleaned up rabbitmq-bundle-podman-1 on controller-2
Cleaned up rabbitmq-bundle-podman-1 on controller-1
Cleaned up rabbitmq-bundle-podman-1 on controller-0
Cleaned up rabbitmq-bundle-1 on controller-2
Cleaned up rabbitmq-bundle-1 on controller-1
Cleaned up rabbitmq-bundle-1 on controller-0
Cleaned up rabbitmq-bundle-podman-2 on controller-2
Cleaned up rabbitmq-bundle-podman-2 on controller-1
Cleaned up rabbitmq-bundle-podman-2 on controller-0
Cleaned up rabbitmq-bundle-2 on controller-2
Cleaned up rabbitmq-bundle-2 on controller-1
Cleaned up rabbitmq-bundle-2 on controller-0
Cleaned up rabbitmq:0 on rabbitmq-bundle-0
Cleaned up rabbitmq:1 on rabbitmq-bundle-1
Cleaned up rabbitmq:2 on rabbitmq-bundle-2
Waiting for 21 replies from the controller..................... OK
[root@controller-2 ~]# pcs resource refresh stonith-fence_compute-fence-nova --strict
Cleaned up stonith-fence_compute-fence-nova on controller-2
Cleaned up stonith-fence_compute-fence-nova on controller-1
Cleaned up stonith-fence_compute-fence-nova on controller-0
Waiting for 3 replies from the controller... OK
[root@controller-2 ~]# pcs resource refresh stonith-fence_compute-fence-nova
Cleaned up stonith-fence_compute-fence-nova on controller-2
Cleaned up stonith-fence_compute-fence-nova on controller-1
Cleaned up stonith-fence_compute-fence-nova on controller-0
Waiting for 3 replies from the controller... OK
[root@controller-2 ~]# pcs resource cleanup rabbitmq-bundle-0 --strict
Cleaned up rabbitmq-bundle-0 on controller-2
Cleaned up rabbitmq-bundle-0 on controller-1
Cleaned up rabbitmq-bundle-0 on controller-0
Waiting for 1 reply from the controller. OK
[root@controller-2 ~]# pcs resource cleanup rabbitmq-bundle --strict
Cleaned up rabbitmq-bundle-podman-0 on controller-2
Cleaned up rabbitmq-bundle-podman-0 on controller-1
Cleaned up rabbitmq-bundle-podman-0 on controller-0
Cleaned up rabbitmq-bundle-0 on controller-1
Cleaned up rabbitmq-bundle-0 on controller-0
Cleaned up rabbitmq-bundle-podman-1 on controller-2
Cleaned up rabbitmq-bundle-podman-1 on controller-1
Cleaned up rabbitmq-bundle-podman-1 on controller-0
Cleaned up rabbitmq-bundle-1 on controller-2
Cleaned up rabbitmq-bundle-1 on controller-1
Cleaned up rabbitmq-bundle-1 on controller-0
Cleaned up rabbitmq-bundle-podman-2 on controller-2
Cleaned up rabbitmq-bundle-podman-2 on controller-1
Cleaned up rabbitmq-bundle-podman-2 on controller-0
Cleaned up rabbitmq-bundle-2 on controller-2
Cleaned up rabbitmq-bundle-2 on controller-1
Cleaned up rabbitmq-bundle-2 on controller-0
Cleaned up rabbitmq:1 on rabbitmq-bundle-1
Cleaned up rabbitmq:2 on rabbitmq-bundle-2
[root@controller-2 ~]# pcs resource cleanup rabbitmq-bundle
Cleaned up rabbitmq-bundle-podman-0 on controller-2
Cleaned up rabbitmq-bundle-podman-0 on controller-1
Cleaned up rabbitmq-bundle-podman-0 on controller-0
Cleaned up rabbitmq-bundle-0 on controller-1
Cleaned up rabbitmq-bundle-0 on controller-0
Cleaned up rabbitmq-bundle-podman-1 on controller-2
Cleaned up rabbitmq-bundle-podman-1 on controller-1
Cleaned up rabbitmq-bundle-podman-1 on controller-0
Cleaned up rabbitmq-bundle-1 on controller-2
Cleaned up rabbitmq-bundle-1 on controller-1
Cleaned up rabbitmq-bundle-1 on controller-0
Cleaned up rabbitmq-bundle-podman-2 on controller-2
Cleaned up rabbitmq-bundle-podman-2 on controller-1
Cleaned up rabbitmq-bundle-podman-2 on controller-0
Cleaned up rabbitmq-bundle-2 on controller-2
Cleaned up rabbitmq-bundle-2 on controller-1
Cleaned up rabbitmq-bundle-2 on controller-0
Cleaned up rabbitmq:1 on rabbitmq-bundle-1
Cleaned up rabbitmq:2 on rabbitmq-bundle-2
[root@controller-2 ~]# pcs resource cleanup rabbitmq-bundle-0
Cleaned up rabbitmq-bundle-0 on controller-1
Cleaned up rabbitmq-bundle-0 on controller-0

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pcs bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4617