Bug 2122806
| Summary: | Pacemaker does not abort transition if resource configuration is moved | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Ken Gaillot <kgaillot> |
| Component: | pacemaker | Assignee: | Reid Wahl <nwahl> |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | urgent | Docs Contact: | Steven J. Levine <slevine> |
| Priority: | urgent | | |
| Version: | 8.7 | CC: | cfeist, cluster-maint, kgaillot, mjuricek, msmazova, slevine |
| Target Milestone: | rc | Keywords: | Regression, Triaged, ZStream |
| Target Release: | 8.8 | Flags: | pm-rhel: mirror+ |
| Hardware: | All | OS: | All |
| Fixed In Version: | pacemaker-2.1.5-1.el8 | Doc Type: | Bug Fix |
| Clones: | 2125337, 2125589 (view as bug list) | Bug Blocks: | 2125337, 2125589 |
| Last Closed: | 2023-05-16 08:35:22 UTC | Type: | Bug |
| Target Upstream Version: | 2.1.5 | | |

Doc Text:

.Pacemaker now rechecks resource assignments immediately when resource order changes

As of RHEL 8.7, Pacemaker did not recheck resource assignments when the order of resources in the CIB changed with no changes to the resource definitions. If a configuration reordering would cause resources to move, the move would not take place until the next natural transition, which could be delayed by as much as the value of the `cluster-recheck-interval` cluster property. This could cause issues if resource stickiness is not configured for a resource.

With this change, Pacemaker rechecks resource assignments when the order of resources in the CIB changes, as it did in earlier Pacemaker releases. The cluster now responds immediately to these changes, if needed.
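The release note above refers to two standard Pacemaker settings. As general context, both can be adjusted with `pcs`; a minimal sketch, with illustrative values that are not taken from this bug report:

```shell
# Illustrative values only.

# The recheck interval that bounded how long a reordering could go
# unnoticed on affected builds (Pacemaker's default is 15 minutes):
pcs property set cluster-recheck-interval=15min

# Resource stickiness makes a resource prefer the node it is already
# running on, so a recalculated assignment does not move it needlessly:
pcs resource defaults update resource-stickiness=1

# Inspect the current values:
pcs property show cluster-recheck-interval
pcs resource defaults
```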
Description Ken Gaillot 2022-08-30 20:13:57 UTC

Fixed by upstream commit 04d1ba5f
after fix:
---------

> [root@virt-507 ~]# rpm -q pacemaker
> pacemaker-2.1.5-4.el8.x86_64

Configure a two-node cluster with no resource stickiness:

> [root@virt-507 ~]# pcs status
> Cluster name: STSRHTS29018
> Status of pacemakerd: 'Pacemaker is running' (last updated 2022-12-23 17:16:11 +01:00)
> Cluster Summary:
>   * Stack: corosync
>   * Current DC: virt-507 (version 2.1.5-4.el8-a3f44794f94) - partition with quorum
>   * Last updated: Fri Dec 23 17:16:11 2022
>   * Last change: Fri Dec 23 17:05:45 2022 by root via cibadmin on virt-507
>   * 2 nodes configured
>   * 2 resource instances configured
>
> Node List:
>   * Online: [ virt-507 virt-508 ]
>
> Full List of Resources:
>   * fence-virt-507 (stonith:fence_xvm): Started virt-507
>   * fence-virt-508 (stonith:fence_xvm): Started virt-508
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

> [root@virt-507 ~]# pcs resource defaults update stickiness=0
> Warning: Defaults do not apply to resources which override them with their own defined values
> [root@virt-507 ~]# pcs resource defaults
> Meta Attrs: rsc_defaults-meta_attributes
>   stickiness=0

Create three resources:

> [root@virt-507 ~]# pcs resource create rsc1 ocf:pacemaker:Dummy
> [root@virt-507 ~]# pcs resource create rsc2 ocf:pacemaker:Dummy
> [root@virt-507 ~]# pcs resource create rsc3 ocf:pacemaker:Dummy

Resources "rsc1" and "rsc3" should run on node "virt-507" and "rsc2" should run on node "virt-508":

> [root@virt-507 ~]# pcs resource
>   * rsc1 (ocf::pacemaker:Dummy): Started virt-507
>   * rsc2 (ocf::pacemaker:Dummy): Started virt-508
>   * rsc3 (ocf::pacemaker:Dummy): Started virt-507

Run `pcs cluster edit` and move the configuration of "rsc3" before "rsc1":

> [root@virt-507 ~]# pcs cluster edit
> CIB updated

Check resources:

> [root@virt-507 ~]# time crm_resource --wait
>
> real 0m2.225s
> user 0m0.011s
> sys 0m0.008s
> [root@virt-507 ~]# pcs resource
>   * rsc3 (ocf::pacemaker:Dummy): Started virt-507
>   * rsc1 (ocf::pacemaker:Dummy): Started virt-508
>   * rsc2 (ocf::pacemaker:Dummy): Started virt-508

Check log:

> [root@virt-507 ~]# tail -f /var/log/messages
> Dec 23 17:22:52 virt-507 pacemaker-controld[395397]: notice: State transition S_IDLE -> S_POLICY_ENGINE
> Dec 23 17:22:52 virt-507 pacemaker-schedulerd[395396]: notice: Actions: Move rsc1 ( virt-507 -> virt-508 )
> Dec 23 17:22:52 virt-507 pacemaker-schedulerd[395396]: notice: Calculated transition 9, saving inputs in /var/lib/pacemaker/pengine/pe-input-48.bz2
> Dec 23 17:22:52 virt-507 pacemaker-controld[395397]: notice: Initiating stop operation rsc1_stop_0 locally on virt-507
> Dec 23 17:22:52 virt-507 pacemaker-controld[395397]: notice: Requesting local execution of stop operation for rsc1 on virt-507
> Dec 23 17:22:52 virt-507 pacemaker-controld[395397]: notice: Result of stop operation for rsc1 on virt-507: ok
> Dec 23 17:22:52 virt-507 pacemaker-controld[395397]: notice: Initiating start operation rsc1_start_0 on virt-508
> Dec 23 17:22:52 virt-507 pacemaker-controld[395397]: notice: Initiating monitor operation rsc1_monitor_10000 on virt-508
> Dec 23 17:22:52 virt-507 pacemaker-controld[395397]: notice: Transition 9 (Complete=3, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-48.bz2): Complete
> Dec 23 17:22:52 virt-507 pacemaker-controld[395397]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE
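As an aside, the scheduler input saved by the transition above can be replayed offline to inspect how the move was computed. A minimal sketch using standard Pacemaker tooling; the file name comes from the "Calculated transition 9" log line, and output formatting varies by Pacemaker version:

```shell
# Replay the saved scheduler input and show node-assignment scores.
crm_simulate --xml-file /var/lib/pacemaker/pengine/pe-input-48.bz2 --simulate --show-scores
```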
Check resource configuration:

> [root@virt-507 ~]# cibadmin --query --xpath "//resources/primitive[@class='ocf']"
> <xpath-query>
>   <primitive class="ocf" id="rsc3" provider="pacemaker" type="Dummy">
>     <operations>
>       <op id="rsc3-migrate_from-interval-0s" interval="0s" name="migrate_from" timeout="20s"/>
>       <op id="rsc3-migrate_to-interval-0s" interval="0s" name="migrate_to" timeout="20s"/>
>       <op id="rsc3-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>
>       <op id="rsc3-reload-interval-0s" interval="0s" name="reload" timeout="20s"/>
>       <op id="rsc3-reload-agent-interval-0s" interval="0s" name="reload-agent" timeout="20s"/>
>       <op id="rsc3-start-interval-0s" interval="0s" name="start" timeout="20s"/>
>       <op id="rsc3-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>
>     </operations>
>   </primitive>
>   <primitive class="ocf" id="rsc1" provider="pacemaker" type="Dummy">
>     <operations>
>       <op id="rsc1-migrate_from-interval-0s" interval="0s" name="migrate_from" timeout="20s"/>
>       <op id="rsc1-migrate_to-interval-0s" interval="0s" name="migrate_to" timeout="20s"/>
>       <op id="rsc1-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>
>       <op id="rsc1-reload-interval-0s" interval="0s" name="reload" timeout="20s"/>
>       <op id="rsc1-reload-agent-interval-0s" interval="0s" name="reload-agent" timeout="20s"/>
>       <op id="rsc1-start-interval-0s" interval="0s" name="start" timeout="20s"/>
>       <op id="rsc1-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>
>     </operations>
>   </primitive>
>   <primitive class="ocf" id="rsc2" provider="pacemaker" type="Dummy">
>     <operations>
>       <op id="rsc2-migrate_from-interval-0s" interval="0s" name="migrate_from" timeout="20s"/>
>       <op id="rsc2-migrate_to-interval-0s" interval="0s" name="migrate_to" timeout="20s"/>
>       <op id="rsc2-monitor-interval-10s" interval="10s" name="monitor" timeout="20s"/>
>       <op id="rsc2-reload-interval-0s" interval="0s" name="reload" timeout="20s"/>
>       <op id="rsc2-reload-agent-interval-0s" interval="0s" name="reload-agent" timeout="20s"/>
>       <op id="rsc2-start-interval-0s" interval="0s" name="start" timeout="20s"/>
>       <op id="rsc2-stop-interval-0s" interval="0s" name="stop" timeout="20s"/>
>     </operations>
>   </primitive>
> </xpath-query>

Result: Resource "rsc1" is immediately moved to node "virt-508" after changing the resource order in the CIB.

marking VERIFIED in pacemaker-2.1.5-4.el8

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2818