Bug 472786
Summary: cluster view inconsistent after "service cman stop; service cman start"

| Field | Value | Field | Value |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Nate Straz <nstraz> |
| Component: | cman | Assignee: | Christine Caulfield <ccaulfie> |
| Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | urgent | CC: | cfeist, cluster-maint, cward, edamato, jplans, matt, mrappa, rlerch, tao |
| Version: | 5.3 | Keywords: | ZStream |
| Target Milestone: | rc | Target Release: | --- |
| Hardware: | All | OS: | Linux |
| Whiteboard: | | Fixed In Version: | cman-2.0.100-1.el5 |
| Doc Type: | Bug Fix | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2009-09-02 11:06:09 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | Bug Depends On: | |
| Bug Blocks: | 510510 | Attachments: | |

Doc Text:

> Cause: When a node leaves the cluster normally, it sends a message to the other nodes that sets its state to LEAVING. Only when the node actually disappears from openais is its state set to DOWN.
>
> Consequence: If the node is restarted quickly, the node UP message arrives before the expected node DOWN message (which gets cancelled). But cman only looks for DOWN nodes when marking nodes as back up again, so the node appears to stay DOWN.
>
> Fix: The check for a node transition to the UP state now accepts nodes in the LEAVING state as well as the DOWN state.
>
> Result: Quickly restarting a node using `cman_tool leave; cman_tool join` correctly updates the node state in cman.
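The state-transition logic described in the Doc Text can be sketched as follows. This is a minimal, hypothetical illustration (the actual cman source is C and uses different names), showing why a node that rejoins while still in LEAVING used to stay apparently dead:

```python
# Minimal sketch of the transition described in the Doc Text; the names
# DOWN/LEAVING/MEMBER and mark_node_up are invented for illustration,
# not taken from the cman source.
DOWN, LEAVING, MEMBER = "DOWN", "LEAVING", "MEMBER"

def mark_node_up(state, fixed=True):
    """Return the new state when a node's UP message arrives.

    Before the fix, only DOWN nodes were eligible to come back up, so a
    node still in LEAVING (its DOWN message cancelled by the quick
    rejoin) never transitioned. The fix also accepts LEAVING.
    """
    eligible = (DOWN, LEAVING) if fixed else (DOWN,)
    return MEMBER if state in eligible else state

# A quickly restarted node: UP arrives while it is still LEAVING.
print(mark_node_up(LEAVING, fixed=False))  # → LEAVING (bug: stays down)
print(mark_node_up(LEAVING, fixed=True))   # → MEMBER  (fixed)
```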
---

I was able to easily reproduce this on RHEL 5.2. You can use "service cman restart" instead of the compound command.

This is a bug that was fixed quite some time ago in STABLE2, but the patch never got into RHEL. It has now:

```
commit 6325f9d1d135d2a86974a3ffc36f62d0693080d1
Author: Christine Caulfield <ccaulfie>
Date:   Wed Dec 3 10:46:20 2008 +0000

    cman: Fix inconsistent state if a node leaves/joins quickly
```

This is in git for 5.4 - do we want a 5.3 patch too? It's not on the blocker list, but it's a pretty stupid bug with a small fix.

---

Glad I found this to verify I wasn't going crazy. 2 days fighting with this. Seeing this same issue as described here and can reproduce it in my simple 4-node cluster. The twist I'm seeing is when you add in clvmd via `service clvmd stop; service cman stop; service cman start; service clvmd start`, and it becomes a real mess.

The node that was removed and rejoined looks good:

```
Node  Sts   Inc   Joined               Name
   1   M   1344   2009-01-27 08:30:24  www
   2   M   1352   2009-01-27 08:30:27  xxx
   3   M   1348   2009-01-27 08:30:26  yyy
   4   M   1340   2009-01-27 08:30:24  zzz
```

but the three remaining nodes, not so much:

```
Node  Sts   Inc   Joined               Name
   1   M   1340   2009-01-27 08:30:21  www
   2   M   1352   2009-01-27 08:30:27  xxx
   3   M   1348   2009-01-27 08:30:26  yyy
   4   X   1344                        zzz
```

One of them says:

```
Jan 27 08:31:40 www kernel: [  236.425837] dlm: connect from non cluster node
```

dlm_send then pegs the CPU on the three remaining nodes:

```
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9500 root      20  -5     0    0    0 R  100  0.0   4:39.65 dlm_send
```

Task list:

```
dlm_send      R  running task     0  9317    175   9318  9316 (L-TLB)
dlm_recoverd  D ffff81000102df80  0  9318    175         9317 (L-TLB)
 ffff81045c51ddd0 0000000000000046 ffff81000102f5a0 ffff81045c18f580
 ffff81000001dc00 000000000000000a ffff810451d89860 ffff81010f71c100
 00000096af1198d5 0000000000000181 ffff810451d89a48 0000000500000000
Call Trace:
 [<ffffffff8843ccb0>] :dlm:rcom_response+0x0/0xb
 [<ffffffff8843dd48>] :dlm:dlm_wait_function+0xdc/0x135
 [<ffffffff8009db21>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8843d59f>] :dlm:dlm_rcom_status+0xa4/0x179
 [<ffffffff884399a2>] :dlm:dlm_recover_members+0x36d/0x45c
 [<ffffffff8009d909>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8843e817>] :dlm:dlm_recoverd+0x11d/0x47f
 [<ffffffff8843e6fa>] :dlm:dlm_recoverd+0x0/0x47f
 [<ffffffff80032360>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8009d909>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032262>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11
```

After about 9 minutes these three remaining nodes panic when the oom-killer starts a massacre:

```
automount invoked oom-killer: gfp_mask=0x200d2, order=0, oomkilladj=0
Call Trace:
 [<ffffffff800c39dd>] out_of_memory+0x8e/0x2f5
 [<ffffffff8000f2eb>] __alloc_pages+0x245/0x2ce
 [<ffffffff8003213d>] read_swap_cache_async+0x45/0xd8
 [<ffffffff800c9472>] swapin_readahead+0x60/0xd3
 [<ffffffff80009027>] __handle_mm_fault+0x9bc/0xe5c
 [<ffffffff80066b9a>] do_page_fault+0x4cb/0x830
 [<ffffffff80030d9a>] do_fork+0x148/0x1c1
 [<ffffffff8005dde9>] error_exit+0x0/0x84
```

etc.
Patch fixes the issue; suggest updating ASAP so no one has to endure my pain :)

---

Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: the Cause/Consequence/Fix/Result text given in the Doc Text field above.

---

~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results and, if available, update the Verified field with the appropriate value. Questions can be posted to this bug or your customer or partner representative.

---

Verified with cman-2.0.110-1.el5 on i386 using new test case hokeypokey. I made it through 326 restarts before stopping the test. We will be including 20 iterations of hokeypokey in future regression testing.

---

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1341.html
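The hokeypokey test case itself is not included in this report. A minimal sketch of a restart-loop check in the same spirit might look like the following; `restart_and_check` and `default_run` are invented names, and the command runner is injectable so the verification logic can be exercised without a real cluster:

```python
# Hypothetical sketch of one hokeypokey-style iteration; not the actual
# test case from this report.
import subprocess

def default_run(cmd):
    """Run a shell command and return its stdout (assumes a cluster node)."""
    return subprocess.run(cmd, shell=True, capture_output=True,
                          text=True).stdout

def restart_and_check(run=default_run):
    """Restart cman on this node, then verify its view of the cluster
    shows every member back in state 'M' (not stuck in 'X')."""
    run("service cman stop && service cman start")
    out = run("cman_tool nodes")
    # Skip the "Node Sts Inc Joined Name" header; column 2 is the status.
    return all(line.split()[1] == "M"
               for line in out.strip().splitlines()[1:])
```

A regression run would then call `restart_and_check()` in a loop (e.g. 20 iterations) and fail on the first `False`.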
---

Created attachment 324502 [details]: /var/log/messages from all marathon nodes

Description of problem:

After quickly stopping and starting the cman service on one node in a cluster, the cluster membership becomes inconsistent across the cluster.

Version-Release number of selected component (if applicable):

* cman-2.0.95-1.el5
* openais-0.80.3-19.el5

How reproducible: 100%

Steps to Reproduce:
1. Start a cluster
2. On one node run `service cman stop && service cman start`
3. Check `cman_tool nodes` on all cluster nodes

Actual results:

After running the command on marathon-01, marathon-03 and marathon-05 in sequence:

marathon-01:
```
Node  Sts   Inc   Joined               Name
   1   M   5336   2008-11-24 09:47:33  marathon-01
   2   M   5340   2008-11-24 09:47:33  marathon-02
   3   X   5340                        marathon-03
   4   M   5340   2008-11-24 09:47:33  marathon-04
   5   X   5340                        marathon-05
```

marathon-02:
```
Node  Sts   Inc   Joined               Name
   1   X   5332                        marathon-01
   2   M   5312   2008-11-24 09:33:13  marathon-02
   3   X   5324                        marathon-03
   4   M   5328   2008-11-24 09:33:14  marathon-04
   5   X   5320                        marathon-05
```

marathon-03:
```
Node  Sts   Inc   Joined               Name
   1   M   5348   2008-11-24 10:12:44  marathon-01
   2   M   5348   2008-11-24 10:12:44  marathon-02
   3   M   5344   2008-11-24 10:12:44  marathon-03
   4   M   5348   2008-11-24 10:12:44  marathon-04
   5   X   5348                        marathon-05
```

marathon-04:
```
Node  Sts   Inc   Joined               Name
   1   X   5332                        marathon-01
   2   M   5328   2008-11-24 09:33:18  marathon-02
   3   X   5328                        marathon-03
   4   M   5316   2008-11-24 09:33:17  marathon-04
   5   X   5328                        marathon-05
```

marathon-05:
```
Node  Sts   Inc   Joined               Name
   1   M   5356   2008-11-24 10:14:55  marathon-01
   2   M   5356   2008-11-24 10:14:55  marathon-02
   3   M   5356   2008-11-24 10:14:55  marathon-03
   4   M   5356   2008-11-24 10:14:55  marathon-04
   5   M   5352   2008-11-24 10:14:54  marathon-05
```

marathon-01 (`cman_tool status`):
```
Version: 6.1.0
Config Version: 1
Cluster Name: marathon
Cluster Id: 27036
Cluster Member: Yes
Cluster Generation: 5356
Membership state: Cluster-Member
Nodes: 5
Expected votes: 5
Total votes: 3
Quorum: 3
Active subsystems: 7
Flags: Dirty
Ports Bound: 0
Node name: marathon-01
Node ID: 1
Multicast addresses: 239.192.105.6
Node addresses: 10.15.89.71
```

marathon-02:
```
Version: 6.1.0
Config Version: 1
Cluster Name: marathon
Cluster Id: 27036
Cluster Member: Yes
Cluster Generation: 5356
Membership state: Cluster-Member
Nodes: 5
Expected votes: 5
Total votes: 2
Quorum: 3 Activity blocked
Active subsystems: 7
Flags: Dirty
Ports Bound: 0
Node name: marathon-02
Node ID: 2
Multicast addresses: 239.192.105.6
Node addresses: 10.15.89.72
```

marathon-03:
```
Version: 6.1.0
Config Version: 1
Cluster Name: marathon
Cluster Id: 27036
Cluster Member: Yes
Cluster Generation: 5356
Membership state: Cluster-Member
Nodes: 5
Expected votes: 5
Total votes: 4
Quorum: 3
Active subsystems: 7
Flags: Dirty
Ports Bound: 0
Node name: marathon-03
Node ID: 3
Multicast addresses: 239.192.105.6
Node addresses: 10.15.89.73
```

marathon-04:
```
Version: 6.1.0
Config Version: 1
Cluster Name: marathon
Cluster Id: 27036
Cluster Member: Yes
Cluster Generation: 5356
Membership state: Cluster-Member
Nodes: 5
Expected votes: 5
Total votes: 2
Quorum: 3 Activity blocked
Active subsystems: 7
Flags: Dirty
Ports Bound: 0
Node name: marathon-04
Node ID: 4
Multicast addresses: 239.192.105.6
Node addresses: 10.15.89.74
```

marathon-05:
```
Version: 6.1.0
Config Version: 1
Cluster Name: marathon
Cluster Id: 27036
Cluster Member: Yes
Cluster Generation: 5356
Membership state: Cluster-Member
Nodes: 5
Expected votes: 5
Total votes: 5
Quorum: 3
Active subsystems: 7
Flags: Dirty
Ports Bound: 0
Node name: marathon-05
Node ID: 5
Multicast addresses: 239.192.105.6
Node addresses: 10.15.89.75
```

Expected results:

Cluster membership should be consistent even after a quick stop & start of services.

Additional info:
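The per-node `cman_tool nodes` listings above can be compared mechanically. A hypothetical helper, assuming the output format shown in this report (header row, then columns Node / Sts / Inc / Joined / Name, with the name last on each line):

```python
# Hypothetical sketch for spotting the inconsistency shown above;
# parse_nodes and inconsistent_members are invented names, not cman tools.
def parse_nodes(output):
    """Map node name -> status ('M' or 'X') from `cman_tool nodes` output."""
    states = {}
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        states[fields[-1]] = fields[1]  # name is last, status is column 2
    return states

def inconsistent_members(views):
    """Given {viewer_host: output}, return names whose status differs
    depending on which node you ask."""
    parsed = {h: parse_nodes(o) for h, o in views.items()}
    names = set().union(*(p.keys() for p in parsed.values()))
    return sorted(n for n in names
                  if len({p.get(n) for p in parsed.values()}) > 1)

view_a = """Node  Sts   Inc   Joined               Name
   1   M   1344   2009-01-27 08:30:24  www
   4   M   1340   2009-01-27 08:30:24  zzz"""
view_b = """Node  Sts   Inc   Joined               Name
   1   M   1340   2009-01-27 08:30:21  www
   4   X   1344                        zzz"""
print(inconsistent_members({"zzz": view_a, "www": view_b}))  # → ['zzz']
```

Run against the five marathon views above, this would flag marathon-01, marathon-03, and marathon-05, matching the inconsistency the report describes.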