Description of problem:
This happens whenever qdisk is in use if we shut down qdiskd. Effects are unknown.

Version-Release number of selected component (if applicable):
Linux tng3-2 2.6.18-48.el5 #1 SMP Mon Sep 17 17:26:31 EDT 2007 i686 i686 i386 GNU/Linux

How reproducible:
100%

Steps to Reproduce:
1. Start qdiskd on a node
2. Wait for it to become part of the quorate qdisk set
3. Stop qdiskd

Actual results:
dlm: closing connection to node 0

Expected results:
Node 0 doesn't exist; it's a quorum disk.

Additional info:
The following trace may or may not be related to this problem, but it bears noting since rgmanager got stuck in lockspace open/creation coincidentally after the kernel reported the above:

clurgmgrd     D 4D7C0F96  3244  2623      1          2908 (NOTLB)
       d2f27e90 00000086 c04876be 4d7c0f96 00000015 c047d3a8 00000001 df59baa0
       df79e550 4d7f5cba 00000015 00034d24 00000000 df59bbac c13f4ee0 00000001
       00000000 df59baa0 00000002 00000008 00000018 00000008 e03e24b8 e03e24b4
Call Trace:
 [<c04876be>] mntput_no_expire+0x11/0x6a
 [<c047d3a8>] link_path_walk+0xb3/0xbd
 [<c0604a50>] __mutex_lock_slowpath+0x45/0x74
 [<c0604a8e>] .text.lock.mutex+0xf/0x14
 [<e03d0773>] dlm_new_lockspace+0x1b/0x79b [dlm]
 [<c045467a>] find_get_page+0x18/0x38
 [<c04570b1>] filemap_nopage+0x192/0x315
 [<c045fd14>] __handle_mm_fault+0x353/0x87b
 [<e03d5d16>] device_write+0x30f/0x4b5 [dlm]
 [<e03d5a07>] device_write+0x0/0x4b5 [dlm]
 [<c0470217>] vfs_write+0xa1/0x143
 [<c0470809>] sys_write+0x3c/0x63
 [<c0404eff>] syscall_call+0x7/0xb
Either dlm_controld needs to ignore the disk it gets from cman_get_nodes(), or cman_get_nodes() shouldn't return the disk as a cluster member.
The options are:
a) Change cman_get_nodes() to not return the quorum disk, and add an API call to get the qdisk name that cman_tool can call.
b) Make everybody ignore nodeid 0.

Hmm, put like that, option a) sounds like the best deal.
Created attachment 214471 [details]
Patch to remove qdisk from cman_get_nodes()

This is the patch to remove the qdisk from cman_get_nodes() and into its own API call, along with attendant changes to cman_tool. As a quick fix, getting dlm_controld to ignore nodeid 0 would be a much smaller patch!
This patch will definitely affect: rgmanager

This patch may have effects on other parts, including: ccsd, lvm2-cluster
rgmanager reports the quorum disk status along with node status. Changes at least to clustat would be required in order to preserve output / apps which use clustat for information.

(standard output snippet)

  Member Name                        ID   Status
  ------ ----                        ---- ------
  tng3-1                                1 Online, Local, rgmanager
  tng3-2                                2 Online, rgmanager
  tng3-3                                3 Online, rgmanager
  tng3-5                                5 Online, rgmanager
  /dev/sdd1                             0 Online, Quorum Disk

(xml snippet)

<nodes>
  <node name="tng3-1" state="1" local="1" estranged="0" rgmanager="1" qdisk="0" nodeid="0x00000001"/>
  <node name="tng3-2" state="1" local="0" estranged="0" rgmanager="1" qdisk="0" nodeid="0x00000002"/>
  <node name="tng3-3" state="1" local="0" estranged="0" rgmanager="1" qdisk="0" nodeid="0x00000003"/>
  <node name="tng3-5" state="1" local="0" estranged="0" rgmanager="1" qdisk="0" nodeid="0x00000005"/>
  <node name="/dev/sdd1" state="1" local="0" estranged="0" rgmanager="0" qdisk="1" nodeid="0x00000000"/>
</nodes>
According to lon, I may be hitting this while doing recovery testing with qdisk. I'll either see a node deadlock during start-up *or* see that node eventually come up and then any lvm cmd will hang.
lvm2-cluster is fine as it is, or with the patch - it doesn't care about qdisk and already ignores nodeid 0. I'm not sure about ccsd but I think the same applies. Yes, I can believe this will cause problems with new nodes arriving in the cluster. Connection "0" in the DLM is the listening socket. So any new connections received after this message will be refused.
It appears that I am definitely hitting this while running revolver in a qdisk cluster. Every time, I eventually end up seeing that "dlm: closing connection to node 0" along with clvmd stuck in dlm:dlm_new_lockspace.

Oct  4 09:49:45 taft-02 kernel: clvmd         D ffffffff801405e0     0  5972      1          5997  5981 (NOTLB)
Oct  4 09:49:45 taft-02 kernel:  ffff81021198dd98 0000000000000086 00000000000000d0 00000000000000d0
Oct  4 09:49:45 taft-02 kernel:  000000000000000a ffff8101ffd52080 ffffffff802dcae0 00000025bed1ce44
Oct  4 09:49:45 taft-02 kernel:  00000000000010c7 ffff8101ffd52268 ffff810200000000 ffff81021b7bf820
Oct  4 09:49:45 taft-02 kernel: Call Trace:
Oct  4 09:49:45 taft-02 kernel:  [<ffffffff800610e7>] wait_for_completion+0x79/0xa2
Oct  4 09:49:45 taft-02 kernel:  [<ffffffff800884ac>] default_wake_function+0x0/0xe
Oct  4 09:49:45 taft-02 kernel:  [<ffffffff8014129e>] kobject_register+0x33/0x3a
Oct  4 09:49:45 taft-02 kernel:  [<ffffffff8845726b>] :dlm:dlm_new_lockspace+0x734/0x88b
Oct  4 09:49:45 taft-02 kernel:  [<ffffffff8845cdc7>] :dlm:device_write+0x414/0x5ca
Oct  4 09:49:45 taft-02 kernel:  [<ffffffff800161c7>] vfs_write+0xce/0x174
Oct  4 09:49:45 taft-02 kernel:  [<ffffffff80016a94>] sys_write+0x45/0x6e
Oct  4 09:49:45 taft-02 kernel:  [<ffffffff8005b28d>] tracesys+0xd5/0xe0
Created attachment 215951 [details]
Make dlm_controld always ignore node ID 0

Patch marks node ID 0 as dead.
Just a note that running with the fix (in cman-2.0.73-1.1.x86_64.rpm) that lon built appears to solve this issue.
Re-modified.
Mar 27 14:18:06 molly openais[1587]: [CMAN ] lost contact with quorum device

However, there was no indication of "dlm: Closing connection to node 0". This test was performed on 2.0.81.

I also tested using the test case for the bz directly:

[root@molly ~]# ./cman_port_bug
process_cman_event - PORTOPENED 2
process_cman_event - STATECHANGE 0
cman_is_listening(0x1f1d2010, 1, 192) = 0
cman_is_listening(0x1f1d2010, 2, 192) = 1
process_cman_event - PORTOPENED 1
process_cman_event - STATECHANGE 0
cman_is_listening(0x1f1d2010, 1, 192) = -1 (errno = 107)
cman_is_listening(0x1f1d2010, 2, 192) = 1
process_cman_event - STATECHANGE 0
cman_is_listening(0x1f1d2010, 1, 192) = -1 (errno = 107)
cman_is_listening(0x1f1d2010, 2, 192) = 1
process_cman_event - STATECHANGE 0
cman_is_listening(0x1f1d2010, 1, 192) = 0
cman_is_listening(0x1f1d2010, 2, 192) = 1
process_cman_event - STATECHANGE 0
cman_is_listening(0x1f1d2010, 1, 192) = 0
cman_is_listening(0x1f1d2010, 2, 192) = 1
process_cman_event - STATECHANGE 0
cman_is_listening(0x1f1d2010, 1, 192) = 0
cman_is_listening(0x1f1d2010, 2, 192) = 1

At no point during the join/boot process did I receive the error message that the node had the port open (but without a PORTOPENED message).
Marking verified.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0347.html