Red Hat Bugzilla – Bug 142874
Assertion failed on line 128 of file /home/snark/code/head/cluster/dlm-kernel/src/reccomms.c
Last modified: 2009-04-16 16:29:47 EDT
Created attachment 108570 [details]
dumps from dlm
Again, not sure if this is everything; I think I'm limited by the scrollback.
Created attachment 108739 [details]
Full dlm assert dump
Got this one again (finally).
Turned on screen logging and got the full output this time. Also included output
from the other nodes. Clocks are synced across all nodes.
Created attachment 108836 [details]
email describing how this bug was hit
This is a copy of the email sent on the linux-cluster mailing list.
When the dlm reports -ENOBUFS (-105), it means that no kernel memory
could be allocated to send a network message. The reccomms function
asserts when it sees this error; the remote_stage function doesn't
(but it probably should).
It's not clear that there's anything wrong with the dlm here. Reducing
the drop_count in lock_dlm might help simply by causing gfs to cache
fewer locks and reduce memory usage.
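
For illustration only, here is a minimal userspace sketch of the other way a sender could react to -ENOBUFS: retry a bounded number of times and then hand the error back to the caller rather than asserting. This is not the dlm's code and not something proposed in this bug. The names fake_send and send_or_report, and the retry policy itself, are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a low-level send routine (not a dlm function).
 * It fails with -ENOBUFS until *failures_left reaches zero, simulating a
 * temporary shortage of buffer memory. */
static int fake_send(const void *buf, size_t len, int *failures_left)
{
    (void)buf;
    (void)len;
    if (*failures_left > 0) {
        (*failures_left)--;
        return -ENOBUFS;
    }
    return 0;
}

/* Assumed policy: try up to max_attempts times, then return the last
 * result to the caller instead of asserting. */
static int send_or_report(const void *buf, size_t len,
                          int *failures_left, int max_attempts)
{
    int rv = -ENOBUFS;

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        rv = fake_send(buf, len, failures_left);
        if (rv != -ENOBUFS)
            break;  /* success, or an error that retrying will not fix */
    }
    return rv;
}
```

With two simulated failures and five attempts, send_or_report returns 0. With ten simulated failures and three attempts, it returns -ENOBUFS and leaves the decision to the caller. Whether retrying is safe in the real recovery path is a separate question that this sketch does not answer.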
Comment #3 is related to bug 139738, not this one.
*** This bug has been marked as a duplicate of 142844 ***
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.