Bug 650033 - clvmd locks up when there is a DLM problem
Summary: clvmd locks up when there is a DLM problem
Keywords:
Status: CLOSED DUPLICATE of bug 599327
Alias: None
Product: Fedora
Classification: Fedora
Component: corosync
Version: 13
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Steven Dake
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-11-05 03:37 UTC by Madison Kelly
Modified: 2016-04-26 17:08 UTC
CC List: 10 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2010-12-23 16:01:58 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
Trace from pastebin location (9.48 KB, text/plain)
2010-11-29 14:54 UTC, Zdenek Kabelac

Description Madison Kelly 2010-11-05 03:37:25 UTC
Description of problem:

I debated whether this is a clvmd or dlm issue.

For some time, I've run into problems where clvmd would lock up hard enough that 'kill -9 `pidof clvmd`' would not stop it. This forced a reboot, or often a fence (since the reboot would never complete), to restore the node. Further, I would often (always?) also need to reboot the surviving node before I could restart clvmd on the first node. Herein lies the major problem.
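When it gets into this state, clvmd is presumably stuck in uninterruptible sleep (state 'D') inside the kernel, which is why 'kill -9' has no effect. A quick way to confirm (this is just how I check; the /proc/<pid>/stack file needs a kernel built with CONFIG_STACKTRACE):

  # 'D' in the STAT column means uninterruptible sleep; WCHAN shows
  # the kernel function the process is blocked in
  ps -o pid,stat,wchan:30,cmd -C clvmd
  # dump the kernel stack of the hung task (root only)
  cat /proc/$(pidof clvmd)/stack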

I've finally found a reliable, if stupid, way to trigger this condition. If you set the MTU of the totem interface on one node higher than on the second (say, 2000 vs. 1500) and then try to start clvmd, it will start successfully on the higher-MTU node and hang while starting on the lower-MTU node. After this, you will not be able to stop or restart clvmd on the first node until it is rebooted.
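For the record, this is all it takes to set up the mismatch (eth1 is the totem interface on my nodes; substitute yours):

  # on node1 only; node2 stays at the default 1500
  ip link set dev eth1 mtu 2000
  # verify the mismatch
  ip link show dev eth1 | grep -o 'mtu [0-9]*'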

Regardless of the cause, clvmd (or dlm, if that is the source) should eventually fail and exit rather than wait endlessly. This is a very big problem because of the impact on the second node.

Version-Release number of selected component (if applicable):

- All are 64bit packages
cman-3.0.17-1
Corosync Cluster Engine, version '1.2.8' SVN revision '3035'
Cluster LVM daemon version: 2.02.73(2) (2010-09-18)
dlm_controld 3.0.17 (built Oct  7 2010 06:55:07)
Linux kernel 2.6.32, myoung's 170 Xen dom0

How reproducible:

100% with the intentional misconfiguration, random otherwise.

Steps to Reproduce:
1. Set the MTU of the totem interface of one node to be higher than another.
2. Create a DRBD device (or possibly use an iSCSI device) and use it to create a clustered PV -> VG -> LV
3. Configure a simple 2-node cluster
4. set the LV to clustered
5. Start clvmd (a rough command sketch of these steps follows below)
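A rough command sequence for the steps above, assuming the DRBD resource is already up as /dev/drbd0 and cman is running; the device and volume names are just examples from my setup:

  # step 1, on one node only: mismatch the totem MTU (eth1 assumed)
  ip link set dev eth1 mtu 2000
  # steps 2 and 4: make the DRBD device a clustered PV/VG
  pvcreate /dev/drbd0
  vgcreate --clustered y vg_test /dev/drbd0
  # (on a healthy cluster, the LV itself would then be:
  #   lvcreate -L 1G -n lv_test vg_test)
  # step 5: start clvmd on both nodes; the lower-MTU node hangs here
  service clvmd start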
  
Actual results:

clvmd will wait forever while starting. Trying to stop it with ctrl+c will fail. Killing the start PID will return the terminal, but the 'clvmd -T30' process will remain and cannot be killed. The second node may have started clvmd successfully and will not appear locked, but stopping clvmd there will cause the same infinite wait.
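While the start hangs, the dlm side can be watched from another terminal (assuming the cluster3 userspace tools that ship with cman/dlm are installed):

  # list dlm lockspaces; the clvmd lockspace will likely show a
  # membership change that never completes
  dlm_tool ls
  # dump dlm_controld's debug buffer, where the timeout messages land
  dlm_tool dump
  # show fence/dlm group membership as cman sees it
  group_tool ls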

Expected results:

On error, clvmd (or dlm) should time out and exit, returning control of the node. Failing that, a timeout should trigger a fence call against the node that caused the timeout.

Additional info:

Below is a link to the log file, starting from when cman (corosync) first starts on the node with the lower MTU (which was known to fail).

Line 61 is where clvmd was started. On line 111, after the second timeout message, there is a different message ('dlm: writequeue empty for nodeid 1') that may be relevant. The messages continued until the node was fenced (the reboot hung).

http://pastebin.com/qNQ0B7GN

Again, the MTU issue is artificial, but it reliably reproduces a problem I've seen happen seemingly at random for some time.

Comment 1 Zdenek Kabelac 2010-11-29 14:54:53 UTC
Created attachment 463503 [details]
Trace from pastebin location

Trace from pastebin to keep it local to bz.
(Removed common prefix from the trace)

Comment 2 Zdenek Kabelac 2010-11-29 14:56:54 UTC
Looks like the system is stuck inside some internal dlm code. Could this possibly be some misconfiguration? Asking the DLM maintainer what he thinks about this problem.

Comment 3 David Teigland 2010-11-29 15:36:57 UTC
If there are issues other than the MTU one, we should try to identify them specifically instead of working around them in other layers without ever identifying the root cause.

For the MTU misconfiguration: if corosync cannot operate with a given setting, it should validate or detect the misconfiguration and report an error at startup.
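In the meantime, pinning totem's message MTU explicitly should at least remove the dependence on whatever interface MTU happens to be configured. A minimal corosync.conf fragment (the bind address is a placeholder; under cman the equivalent attribute would go on the <totem> element in cluster.conf):

  totem {
      version: 2
      # keep netmtu identical on every node, and no larger than the
      # smallest interface MTU in the ring
      netmtu: 1500
      interface {
          ringnumber: 0
          bindnetaddr: 192.168.1.0
          mcastport: 5405
      }
  }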

All that is not to say that it wouldn't be a good idea to make the higher-level subsystems (dlm, clvm, others) interruptible and capable of cleanly backing out, but the problems identified so far do not seem to justify that amount of work.

Comment 4 Madison Kelly 2010-11-29 16:26:06 UTC
There are other issues; unfortunately, I've not been able to identify them yet.

Comment 5 Milan Broz 2010-12-23 14:56:12 UTC
Reassigning to corosync; this is not clvmd's fault. Not sure there is enough info here, though.

Can you please explain what the other issues mentioned in comment #4 are?

Comment 6 Steven Dake 2010-12-23 16:01:58 UTC
Bug 599327 addresses the problem where corosync cannot be exited when totem is unable to form a configuration (the MTU problem described in this bugzilla).

Regards
-steve

*** This bug has been marked as a duplicate of bug 599327 ***

