Bug 1688149
| Summary: | pacemaker cluster will never settle | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | michal novacek <mnovacek> |
| Component: | pacemaker | Assignee: | Reid Wahl <nwahl> |
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | | |
| Version: | 8.0 | CC: | cluster-maint, jrehova, kgaillot, nwahl |
| Target Milestone: | pre-dev-freeze | Keywords: | Reopened, Triaged |
| Target Release: | 8.9 | Flags: | pm-rhel: mirror+ |
| Hardware: | All | | |
| OS: | All | | |
| Whiteboard: | | | |
| Fixed In Version: | pacemaker-2.1.6-4.el8 | Doc Type: | Bug Fix |
| Doc Text: | Cause: Pacemaker previously assigned clone instances to equally scored nodes without considering the instances' current nodes. Consequence: If a clone had equally scored location constraints on a subset of nodes, clone instances could be assigned to a different node each time and continuously stopped and restarted by the cluster. Fix: Instances are now assigned to their current node whenever possible. Result: Clone instances do not get restarted unnecessarily. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-11-14 15:32:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | 2.1.7 |
| Embargoed: | | | |
| Bug Depends On: | 1682116 | | |
| Bug Blocks: | | | |
| Attachments: | | | |
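
The Doc Text above describes the triggering configuration: a clone with equally scored location constraints. A minimal sketch of such a configuration, using hypothetical resource and node names that are not taken from this report:

> pcs resource create dummy ocf:pacemaker:Dummy clone    # "dummy", "node1" and "node2" below are hypothetical names
> pcs constraint location dummy-clone prefers node1      # "prefers" with no score defaults to INFINITY
> pcs constraint location dummy-clone prefers node2      # equal INFINITY score on the second node

Before the fix, the scheduler could reassign the clone instances between the equally scored nodes on every run; with the fix, each instance is kept on its current node whenever possible.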
Description (michal novacek, 2019-03-13 09:50:45 UTC)
It's actually not running happily; starting at Mar 12 16:03:37 in the logs (when the LVM-activate/Filesystem/VirtualDomain resources and their constraints are added), the resources are continuously restarting. :(

Removing the location constraints for the Filesystem resources seems to work around the problem. (They are equally scored location constraints for both nodes in a symmetric cluster, so they have no effect.)

I also see location constraints keeping various resources off the VirtualDomain resources. Those are not Pacemaker Remote nodes, so the constraints do not mean anything. However those constraints aren't causing any problems.

There is a separate issue with the simulation (but not the cluster) thinking the fence devices need to be restarted. That might interfere with the --wait as well. This is a known issue that has not been investigated.

Can you try the workaround and see if it helps? We need to fix the underlying issues, but given how difficult it is to get anything into GA at this point, a workaround would be good to have.

I can confirm that removing the positive constraint for filesystem works around the problem.

An update:

(In reply to Ken Gaillot from comment #1)
> It's actually not running happily; starting at Mar 12 16:03:37 in the logs
> (when the LVM-activate/Filesystem/VirtualDomain resources and their
> constraints are added), the resources are continuously restarting. :(

Looking at the logs more closely, I was off a bit: the configuration was being repeatedly changed during this time, so resources were starting and stopping appropriately. Problems actually start at Mar 12 16:32:27.

> Removing the location constraints for the Filesystem resources seems to work
> around the problem. (They are equally scored location constraints for both
> nodes in a symmetric cluster, so they have no effect.)

Changing the location constraints to have a score less than INFINITY also works around the problem.

Pacemaker assigns an instance number to clone instances on each node. What is going wrong here is that every time Pacemaker runs its scheduler, it assigns different instance numbers to the existing active instances compared to what it wants the final result to be, so it thinks the instances need to be moved. The cause for that still needs to be found and fixed.

> There is a separate issue with the simulation (but not the cluster) thinking
> the fence devices need to be restarted. That might interfere with the --wait
> as well. This is a known issue that has not been investigated.

As an aside, the simulation issue has been fixed, though the fix will not make it into RHEL 8.4. However, that issue does not affect --wait when used with a live cluster.

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

(In reply to RHEL Program Management from comment #13)
> After evaluating this issue, there are no plans to address it further or fix
> it in an upcoming release. Therefore, it is being closed. If plans change
> such that this issue will be fixed in an upcoming release, then the bug can
> be reopened.

This is still a high priority and I am hopeful the fix will be in RHEL 8.5. Once we are further along in 8.5 release planning, we will likely reopen this.

This is fixed by upstream commit 018ad6d5.
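
For reference, the workaround discussed earlier in this thread can be sketched as follows, using hypothetical resource, node, and constraint IDs (the real IDs in an affected cluster differ and should be taken from `pcs constraint --full`):

> pcs constraint --full                                 # list constraints together with their IDs
> pcs constraint remove location-fs1-node1-INFINITY     # "fs1" stands in for the affected Filesystem resource
> pcs constraint remove location-fs1-node2-INFINITY     # removing the equal INFINITY preferences is one workaround
> pcs constraint location fs1 prefers node1=100         # alternatively, re-add the preferences with a score
> pcs constraint location fs1 prefers node2=100         # below INFINITY, which also avoids the problem

Either variant avoids the equal INFINITY scores that trigger the repeated reassignment described in this bug.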
Version of pacemaker:

> [root@virt-143:~]# rpm -q pacemaker
> pacemaker-2.1.6-7.el8.x86_64

Setting of cluster with stopped cluster on node virt-134:

> [root@virt-143:~]# pcs status
> Cluster name: STSRHTS30001
> Cluster Summary:
>   * Stack: corosync (Pacemaker is running)
>   * Current DC: virt-143 (version 2.1.6-7.el8-6fdc9deea29) - partition with quorum
>   * Last updated: Wed Aug 23 17:00:47 2023 on virt-143
>   * Last change: Wed Aug 23 16:37:15 2023 by root via cibadmin on virt-134
>   * 2 nodes configured
>   * 2 resource instances configured
>
> Node List:
>   * Online: [ virt-143 ]
>   * OFFLINE: [ virt-134 ]
>
> Full List of Resources:
>   * fence-virt-134 (stonith:fence_xvm): Started virt-143
>   * fence-virt-143 (stonith:fence_xvm): Started virt-143
>
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/enabled

Creating a cloned resource group called base_group-clone with interleave=true:

> [root@virt-143:~]# pcs resource create base-a ocf:pacemaker:Dummy --group base_group
> [root@virt-143:~]# pcs resource create base-b ocf:pacemaker:Dummy --group base_group
> [root@virt-143:~]# pcs resource clone base_group interleave=true

Creating a working CIB copy:

> [root@virt-143:~]# pcs cluster cib > /tmp/cib.xml

Creating a cloned resource group called dependent_group-clone with interleave=true:

> [root@virt-143:~]# pcs -f /tmp/cib.xml resource create dependent-a ocf:pacemaker:Dummy --group dependent_group
> [root@virt-143:~]# pcs -f /tmp/cib.xml resource create dependent-b ocf:pacemaker:Dummy --group dependent_group
> [root@virt-143:~]# pcs -f /tmp/cib.xml resource clone dependent_group interleave=true

Colocating dependent_group-clone with base_group-clone:

> [root@virt-143:~]# pcs -f /tmp/cib.xml constraint colocation add dependent_group-clone with base_group-clone

Creating an ordering constraint to start base_group-clone before dependent_group-clone:

> [root@virt-143:~]# pcs -f /tmp/cib.xml constraint order start base_group-clone then start dependent_group-clone
> Adding base_group-clone dependent_group-clone (kind: Mandatory) (Options: first-action=start then-action=start)

Creating INFINITY location constraints for dependent_group-clone on both nodes:

> [root@virt-143:~]# pcs -f /tmp/cib.xml constraint location dependent_group-clone prefers virt-134
> Warning: Validation for node existence in the cluster will be skipped
> [root@virt-143:~]# pcs -f /tmp/cib.xml constraint location dependent_group-clone prefers virt-143
> Warning: Validation for node existence in the cluster will be skipped

Creating dependent-a's state file on both nodes:

> [root@virt-143:~]# touch /var/run/Dummy-dependent-a.state
> [root@virt-134:~]# touch /var/run/Dummy-dependent-a.state

Pushing the working configuration to the live CIB:

> [root@virt-143:~]# pcs cluster cib-push --config /tmp/cib.xml
> CIB updated

Result: The dependent_group resources do not get moved. Instead, the second resource (dependent-b) in each dependent_group instance gets started on the node where the first resource (dependent-a) is already running.

> [root@virt-143:~]# vim /var/log/pacemaker/pacemaker.log
> Aug 23 17:00:36 virt-143 pacemaker-controld [59482] (do_state_transition) notice: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph
> ...
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (recurring_op_for_active) info: Start 10s-interval monitor for dependent-a:0 on virt-134
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (recurring_op_for_active) info: Start 10s-interval monitor for dependent-b:0 on virt-134
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (recurring_op_for_active) info: Start 10s-interval monitor for dependent-a:1 on virt-143
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (recurring_op_for_active) info: Start 10s-interval monitor for dependent-b:1 on virt-143
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave fence-virt-134 (Started virt-143)
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave fence-virt-143 (Started virt-134)
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave base-a:0 (Started virt-143)
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave base-b:0 (Started virt-143)
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave base-a:1 (Started virt-134)
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave base-b:1 (Started virt-134)
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (log_list_item) notice: Actions: Start dependent-a:0 ( virt-134 )
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (log_list_item) notice: Actions: Start dependent-b:0 ( virt-134 )
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (log_list_item) notice: Actions: Start dependent-a:1 ( virt-143 )
> Aug 23 17:14:17 virt-143 pacemaker-schedulerd[59481] (log_list_item) notice: Actions: Start dependent-b:1 ( virt-143 )
> ...
> Aug 23 17:14:18 virt-143 pacemaker-controld [59482] (abort_transition_graph) info: Transition 6 aborted by operation dependent-a_monitor_0 'modify' on virt-143: Event failed | magic=0:0;9:6:7:9f8accaf-aa51-415c-a2f4-99a6d5b15d73 cib=0.13.7 source=process_graph_event:548 complete=false
> ...
> Aug 23 17:14:18 virt-143 pacemaker-controld [59482] (do_state_transition) info: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd
> ...
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (recurring_op_for_active) info: Start 10s-interval monitor for dependent-a:0 on virt-143
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (recurring_op_for_active) info: Start 10s-interval monitor for dependent-b:0 on virt-143
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (recurring_op_for_active) info: Start 10s-interval monitor for dependent-a:1 on virt-134
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (recurring_op_for_active) info: Start 10s-interval monitor for dependent-b:1 on virt-134
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave fence-virt-134 (Started virt-143)
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave fence-virt-143 (Started virt-134)
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave base-a:0 (Started virt-143)
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave base-b:0 (Started virt-143)
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave base-a:1 (Started virt-134)
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave base-b:1 (Started virt-134)
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave dependent-a:0 (Started virt-143)
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (log_list_item) notice: Actions: Start dependent-b:0 ( virt-143 )
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (rsc_action_default) info: Leave dependent-a:1 (Started virt-134)
> Aug 23 17:14:18 virt-143 pacemaker-schedulerd[59481] (log_list_item) notice: Actions: Start dependent-b:1 ( virt-134 )
> ...
> Aug 23 17:14:18 virt-143 pacemaker-controld [59482] (execute_rsc_action) notice: Initiating start operation dependent-b_start_0 locally on virt-143 | action 34
> ...
> Aug 23 17:14:18 virt-143 pacemaker-controld [59482] (execute_rsc_action) notice: Initiating start operation dependent-b_start_0 on virt-134 | action 43

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:6970
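
As an aside that is not part of the original verification transcript: with the fixed packages, one simple way to confirm that the cluster actually settles after pushing such a configuration is to wait for all pending scheduler actions to finish and then re-check resource placement, for example:

> crm_resource --wait     # returns once the cluster settles; on the unfixed scheduler the continuous restarts kept it from settling
> pcs status              # confirm the dependent_group instances stayed on their current nodes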