Bug 2010084
Summary: | VirtualDomain migration run twice and resource failed |  |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | tbsky <tbskyd>
Component: | pacemaker | Assignee: | Reid Wahl <nwahl>
Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe>
Severity: | medium | Docs Contact: | Steven J. Levine <slevine>
Priority: | unspecified | CC: | cluster-maint, kgaillot, msmazova, nwahl, slevine
Version: | 8.5 | Keywords: | Reopened, Triaged
Target Milestone: | rc | Flags: | pm-rhel: mirror+
Target Release: | 8.9 | Hardware: | x86_64
OS: | Linux | Type: | Bug
Fixed In Version: | pacemaker-2.1.6-1.el8 | Doc Type: | Bug Fix
Last Closed: | 2023-11-14 15:32:34 UTC |  |

Doc Text:

.Pacemaker Designated Controller elections no longer finalized until all pending actions are complete

When a cluster elects a new Designated Controller (DC), all nodes send their current history to the new DC, which saves it to the CIB. As a consequence, if actions were already in progress when a new DC was elected, and the actions finished after the nodes sent their current history to the new DC, the actions' results could be lost. With this fix, DC elections are not finalized until all pending actions are complete, and no action results are lost.
Description (tbsky, 2021-10-03 15:28:52 UTC)

Created attachment 1828624: migration log at destination node
*** Bug 2010086 has been marked as a duplicate of this bug. ***

---

(In reply to tbsky from comment #0)
> Please check the migration-source.log file logged at node-1. Resource kvm-16
> started migrate_to at "09:15:32" and succeeded at "09:15:33", but then at
> "09:15:33" pacemaker wanted to migrate again. That failed because the VM was
> "Not active locally".
>
> Check the migration-destination.log logged at node-2. The first migration
> seems to have been ignored, since the node was busy doing other things.
>
> Maybe the situation is like bug 2000595?

The situation has the same trigger: the DC election is being repeated frequently, which causes each node to re-record its resource history in the CIB status section. It's a timing issue that goes like this:

    Oct 02 09:15:32 love-1-test.example.com pacemaker-controld [3325405] (te_rsc_command) notice: Initiating migrate_to operation kvm-16_migrate_to_0 on love-2-test.example.com | action 41

^ The first migration is started.

    Oct 02 09:15:32 love-1-test.example.com pacemaker-based [3325400] (cib_process_request) info: Completed cib_replace operation for section configuration: OK (rc=0, origin=love-1-test.example.com/cibadmin/2, version=0.29071.0)
    ...
    Oct 02 09:15:32 love-1-test.example.com pacemaker-controld [3325405] (do_state_transition) info: State transition S_TRANSITION_ENGINE -> S_ELECTION | input=I_ELECTION cause=C_FSA_INTERNAL origin=do_cib_replaced

^ Something modifies the configuration with a full replacement (likely a pcs command, I'm guessing in a script or cron job somewhere), which triggers a new DC election.
    Oct 02 09:15:33 love-1-test.example.com pacemaker-controld [3325405] (process_graph_event) info: Transition 14127 action 41 (kvm-16_migrate_to_0 on love-2-test.example.com) confirmed: ok | rc=0 call-id=19369

^ The migration completes successfully and is recorded in the resource history.

    Oct 02 09:15:33 love-1-test.example.com pacemaker-controld [3325405] (controld_delete_node_state) info: Deleting resource history for node love-2-test.example.com (via CIB call 142828) | xpath=//node_state[@uname='love-2-test.example.com']/lrm

^ However, the node's resource history is immediately erased due to the election.

    Oct 02 09:15:33 love-1-test.example.com pacemaker-based [3325400] (cib_perform_op) info: ++ <lrm_resource id="kvm-16" type="VirtualDomain" class="ocf" provider="heartbeat">
    Oct 02 09:15:33 love-1-test.example.com pacemaker-based [3325400] (cib_perform_op) info: ++ <lrm_rsc_op id="kvm-16_last_failure_0" operation_key="kvm-16_monitor_0" operation="monitor" crm-debug-origin="build_active_RAs" crm_feature_set="3.11.0" transition-key="33:14062:7:4a7f8335-f28c-4644-9e9d-fa408252ce19" transition-magic="0:0;33:14062:7:4a7f8335-f28c-4644-9e9d-fa408252ce19" exit-reason="" on_node="love-2-test.example.com" call-id="19236" rc-code="0" op-status=
    Oct 02 09:15:33 love-1-test.example.com pacemaker-based [3325400] (cib_perform_op) info: ++ <lrm_rsc_op id="kvm-16_last_0" operation_key="kvm-16_start_0" operation="start" crm-debug-origin="build_active_RAs" crm_feature_set="3.11.0" transition-key="36:14123:0:4a7f8335-f28c-4644-9e9d-fa408252ce19" transition-magic="0:0;36:14123:0:4a7f8335-f28c-4644-9e9d-fa408252ce19" exit-reason="" on_node="love-2-test.example.com" call-id="19364" rc-code="0" op-status="0" interval
    Oct 02 09:15:33 love-1-test.example.com pacemaker-based [3325400] (cib_perform_op) info: ++ </lrm_resource>

^ When the node updates its resource history, it doesn't include the migration result, so the cluster thinks the migration still needs to be done.

I suspect this is a timing issue related to the election-triggered history refresh and the migration action happening at the same time. The sequence appears to be:

1. The migration command completes and a new DC election is initiated around the same time.
2. For the election, the node grabs its resource history and sends it to the DC (before the migration result has been recorded).
3. For the migration, the node receives the result and records it in the resource history.
4. The DC receives the originally sent resource history and overwrites the (newer) resource history in the CIB with it.

The fix for bz2000595 (erasing the old history and overwriting it with the new in one operation) won't help here, since the problem is a result arriving after the new history has already been grabbed. I'm not sure at this point what the fix should be. Perhaps the newly elected DC should re-request the resource history if the most recent action result arrived after the timestamp in the history. The best workaround in the meantime will be to identify what's repeatedly replacing the CIB configuration, and see if there's a way to reduce that frequency.

---

Thanks again for the detailed analysis. I run scripts to create/migrate/delete VMs on two nodes to test cluster stability. The scripts lock each other, so at any time only one "pcs" command can run in the whole cluster. Unfortunately, the result needs time to spread, so locking didn't prevent the race. To work around bz2000595 I run "crm_mon -1" multiple times until I get the correct output, but this kind of race seems hard to work around.

---

One possibility would be to use the --wait option with the pcs commands (the ones that support it). That will wait until the cluster settles before returning. Another way of doing the same thing would be to run "crm_resource --wait".

---

Indeed, pcs has many commands with a "--wait" option. I will try testing with it. Thanks a lot for the hint!

---

Fixed upstream by commit 39a497e.

---

Thanks a lot for the fix!
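Conceptually, the lost-update race described above, and the behavior of the fix, can be modeled with a toy sketch. This is a minimal Python illustration under stated assumptions, not Pacemaker code: the dictionaries, function names, and history strings are all invented for the example, and the "fix" branch only mimics the idea of finishing pending actions before the election snapshot is taken.

```python
# Toy model of the CIB resource-history race (illustrative only, not
# Pacemaker internals). A node snapshots its local history for a newly
# elected DC while an action is still in flight; the DC later writes
# that stale snapshot into the CIB, losing the action's result.

def race_without_fix():
    cib = {}                                   # DC's view of resource history
    node_history = {"kvm-16": "start: ok"}     # node's local history

    snapshot = dict(node_history)              # election: history sent to new DC
    node_history["kvm-16"] = "migrate_to: ok"  # migration result arrives later
    cib["kvm-16"] = snapshot["kvm-16"]         # DC writes the stale snapshot
    return cib                                 # migration result is lost

def race_with_fix():
    cib = {}
    node_history = {"kvm-16": "start: ok"}
    pending = ["kvm-16_migrate_to_0"]

    # Conceptual behavior of the fix: don't finalize the election until
    # all pending actions are complete and their results recorded.
    while pending:
        pending.pop()
        node_history["kvm-16"] = "migrate_to: ok"

    snapshot = dict(node_history)              # snapshot now has the result
    cib["kvm-16"] = snapshot["kvm-16"]
    return cib

print(race_without_fix())  # {'kvm-16': 'start: ok'} -> stale, migration re-run
print(race_with_fix())     # {'kvm-16': 'migrate_to: ok'} -> result preserved
```

In the unfixed path, the stale snapshot makes the cluster believe the migration never ran, so it schedules it again, which is exactly the "migration run twice" failure reported here.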
The story of the commit is a good read. It is great that we don't need to keep this race in mind when doing our jobs.

---

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

---

Why does it keep doing this...

---

Verified as SanityOnly in pacemaker-2.1.6-1.el8.

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (pacemaker bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:6970