Red Hat Bugzilla – Bug 409221
DLM: panic after device_write
Last modified: 2008-05-21 11:02:53 EDT
Description of problem:
While trying to do LVM2 cluster testing I ran into a problem where the systems
are panicking and dropping into the system monitor. I was able to get
backtraces in a few instances and found that the common function was device_write.
Version-Release number of selected component (if applicable):
Easily without LVM
Steps to Reproduce:
1. On multiple nodes in the cluster use dlm_tool to join and leave a lockspace.
Unable to handle kernel paging request for data at address 0x28002482100922a0
Faulting instruction address: 0xc00000000035d820
cpu 0x0: Vector: 300 (Data Access) at [c00000006f777510]
pc: c00000000035d820: ._spin_lock+0x20/0x88
lr: d000000000a74f08: .dlm_user_add_ast+0xec/0x330 [dlm]
current = 0xc0000000016582b0
paca = 0xc000000000474e00
pid = 2856, comm = clvmd
enter ? for help
[c00000006f777810] d000000000a74f08 .dlm_user_add_ast+0xec/0x330 [dlm]
[c00000006f7778c0] d000000000a603a0 .dlm_add_ast+0x3c/0x158 [dlm]
[c00000006f777960] d000000000a64138 .queue_cast+0x12c/0x15c [dlm]
[c00000006f7779f0] d000000000a6619c .do_unlock+0xcc/0xf4 [dlm]
[c00000006f777a80] d000000000a66d38 .unlock_lock+0x64/0xa0 [dlm]
[c00000006f777b20] d000000000a6a300 .dlm_user_unlock+0xc4/0x1a4 [dlm]
[c00000006f777c10] d000000000a74b80 .device_write+0x4f0/0x78c [dlm]
[c00000006f777cf0] c0000000000ebda8 .vfs_write+0x118/0x200
[c00000006f777d90] c0000000000ec518 .sys_write+0x4c/0x8c
[c00000006f777e30] c0000000000086a4 syscall_exit+0x0/0x40
--- Exception: c00 (System Call) at 000000000fc8f37c
Activating VGs: Unable to handle kernel paging request for data at address
Faulting instruction address: 0xc0000000000e2464
cpu 0x0: Vector: 300 (Data Access) at [c000000071597620]
pc: c0000000000e2464: .cache_alloc_refill+0x124/0x264
lr: c0000000000e2404: .cache_alloc_refill+0xc4/0x264
current = 0xc00000000274dae0
paca = 0xc000000000474e00
pid = 2817, comm = clvmd
enter ? for help
[c000000071597950] c0000000000e2bf0 .kmem_cache_alloc+0xac/0xd8
[c0000000715979e0] d000000000a6ddf8 .allocate_lkb+0x28/0x60 [dlm]
[c000000071597a60] d000000000a66fd8 .create_lkb+0x24/0x198 [dlm]
[c000000071597b00] d000000000a6bb10 .dlm_user_request+0x68/0x20c [dlm]
[c000000071597c10] d000000000a74aa4 .device_write+0x414/0x78c [dlm]
[c000000071597cf0] c0000000000ebda8 .vfs_write+0x118/0x200
[c000000071597d90] c0000000000ec518 .sys_write+0x4c/0x8c
[c000000071597e30] c0000000000086a4 syscall_exit+0x0/0x40
--- Exception: c00 (System Call) at 000000000fc4f37c
SP (f75ee730) is in userspace
System should not panic during testing.
Once I got a set of logical volumes created, restarting clvmd on any node would
cause it to panic.
I took libdlm+dlm_tool from cvs HEAD, compiled as 64 bit, and they work fine.
Next, same libdlm+dlm_tool from cvs HEAD, compiled as 32 bit, and they work fine.
# ldd ./dlm_tool | grep libdlm
libdlm.so.DEVEL => /usr/lib/libdlm.so.DEVEL (0x0ff90000)
# ldd /usr/sbin/clvmd | grep libdlm
libdlm.so.2 => /usr/lib/libdlm.so.2 (0x0fba0000)
# ls -l /usr/lib/libdlm.so.DEVEL
lrwxrwxrwx 1 root root 26 Dec 3 15:44 /usr/lib/libdlm.so.DEVEL ->
and /usr/lib/libdlm.so.2 -> libdlm.so.2.0.73*, but I'm not sure what
that really means, if anything much.
Next, tried clvmd with the new libdlm:
# rm /usr/lib/libdlm.so.2
# cd /usr/lib; ln -s libdlm.so.DEVEL.1196717827 libdlm.so.2
and clvmd now starts up fine on all the nodes; and also shuts down
fine on all.
I'm using cvs HEAD because I can't get libdlm to compile from the RHEL5
branch, probably something dumb I'm doing. But, there's no difference
between libdlm source in RHEL5 and HEAD. So, despite my ignorance about
all this build stuff, I'm inclined to say that the code is fine, and there's
something wrong with the ppc rpm builds.
Chris, can you take a look at the builds to see if there is something odd there?
Adding the TestBlocker flag as this is preventing me from getting through ppc
Moving this up to a beta blocker.
Some documentation on how to work with these ppc nodes.
doral, basic, newport, kent
Use qe's "console" program, e.g. 'console kent', from some
machine that has it installed (use null.msp.redhat.com if you'd like).
On some errors, the machines will drop you into the system monitor over
the console. To reboot from there, you do 'zr'. You can get a backtrace
from there too with 't'.
You can also use sysrq:
Jan 11 13:39:45 <refried> dct: you can send it a sysrq
Jan 11 13:39:52 <refried> ^ecl0 b
Jan 11 13:41:16 <refried> that's in sequence ^E c l 0 b
Jan 11 13:39:22 <dean> dct: you can always use the apc if all else fails.
(although I think these machines take forever to come up when power cycled)
Compile userland programs on basic; the others are missing some rpms needed to build:
cd cluster/dlm/tests/usertest/; gcc dlmtest2.c -I../../lib -o dlmtest2 -ldlm
I then scp this to the other nodes.
Compile experimental dlm.ko modules using the rhel51 linux source tree in
- cd linux-rhel51/fs/dlm
- edit files, add printk's, etc
- remain in fs/dlm dir to build...
- make -C /lib/modules/`uname -r`/build M=`pwd`
- insmod ./dlm.ko
The tree with my own dlm debugging is /root/linux-dct/. Most of my own
debugging is trying to determine whether there could be a race handling
userland lkb's or a refcounting problem with userland lkb's. I haven't
found anything wrong, though.
In my own testing, I've been starting a limited set of the clustering stuff:
mount -t configfs none /sys/kernel/config
mount -t debugfs none /sys/kernel/debug
I ran make_panic on gfs on the four nodes all weekend without a problem.
I've reproduced the problems just running dlmtest2 stress, so it's not the
fault of clvmd.
The way we've usually been reproducing the problem is running
'service clvmd start' on one or more nodes. Sometimes it takes a couple of
stop/start cycles before we see it, sometimes it happens right away. service
clvmd start activates a number of lv's, so some dlm locking takes place.
(lvm was set up by qe's 'activator' lvm test.)
This is massive memory corruption caused by the compat32 code not checking the
lock name length when it copies the lock information from userspace 32 bit
structure to 64 bit kernel space.
The dlm_unlock call does not specify a name length in the structure passed into
the kernel, so it can contain garbage. This causes the kernel to try and copy
<garbage> bytes into its kernel 64-bit version of the data structure. However,
it has only allocated enough memory to hold the bare structure, not any sort of
The proper fix is two-fold:
1) Fix libdlm to zero namelen before passing it into the kernel. This will fix
the bug and is the easiest thing to do if building kernels is a problem in the
2) Proper bounds checking of the input data in the kernel. Doing just 1) leaves
an exploitable DoS bug.
I'll produce patches for these in the morning.
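For illustration only (these are not the actual patches), the two halves of the fix described above might look roughly like the userspace sketch below. Every name in it (sketch_hdr32, sketch_lock64, sketch_build_unlock, sketch_compat_copy, SKETCH_RESNAME_MAXLEN) is hypothetical, and the structure layouts and the 64-byte cap are assumptions chosen for the example, not the real dlm definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed upper bound on a lock name; the real limit may differ. */
#define SKETCH_RESNAME_MAXLEN 64

/* Hypothetical 32-bit userspace header; name bytes follow it in the buffer. */
struct sketch_hdr32 {
    uint32_t flags;
    uint32_t lkid;
    uint8_t  namelen;
    uint8_t  pad[3];
};

/* Hypothetical 64-bit kernel-side copy of the same request. */
struct sketch_lock64 {
    uint64_t flags;
    uint32_t lkid;
    uint8_t  namelen;
    char     name[SKETCH_RESNAME_MAXLEN];
};

/* Fix 1 (library side): zero the whole header so that an unlock request,
 * which carries no name, never hands a garbage namelen to the kernel.
 * Returns the number of bytes placed in buf. */
size_t sketch_build_unlock(unsigned char *buf, uint32_t lkid)
{
    struct sketch_hdr32 hdr;

    memset(&hdr, 0, sizeof(hdr));
    hdr.lkid = lkid;
    memcpy(buf, &hdr, sizeof(hdr));
    return sizeof(hdr);
}

/* Fix 2 (kernel side): validate namelen against both the destination
 * capacity and the number of bytes actually written before copying.
 * count is the total length of the write. */
int sketch_compat_copy(struct sketch_lock64 *dst,
                       const unsigned char *buf, size_t count)
{
    struct sketch_hdr32 hdr;

    if (count < sizeof(hdr))
        return -EINVAL;
    memcpy(&hdr, buf, sizeof(hdr));

    if (hdr.namelen > SKETCH_RESNAME_MAXLEN ||
        hdr.namelen > count - sizeof(hdr))
        return -EINVAL;

    memset(dst, 0, sizeof(*dst));
    dst->flags = hdr.flags;
    dst->lkid = hdr.lkid;
    dst->namelen = hdr.namelen;
    memcpy(dst->name, buf + sizeof(hdr), hdr.namelen);
    return 0;
}
```

With only the library-side change, any program that writes to the device directly can still pass a bogus length, which is why the check in sketch_compat_copy is needed as well.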
userland fix checked into RHEL5 branch:
Checking in libdlm.c;
/cvs/cluster/cluster/dlm/lib/libdlm.c,v <-- libdlm.c
new revision: 126.96.36.199; previous revision: 188.8.131.52
Created attachment 291975 [details]
Patch for RHEL-5 kernel
This is the patch for the RHEL-5 kernel.
You can download this test kernel from http://people.redhat.com/dzickus/el5
I can get through the LVM test suite with the 5.2 bits.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.