Bug 143449

Summary: Kernel Oops starting clvmd: kernel BUG at mm/slab.c:1453!
Product: [Retired] Red Hat Cluster Suite Reporter: Adam "mantis" Manthei <amanthei>
Component: dlm Assignee: Christine Caulfield <ccaulfie>
Status: CLOSED DUPLICATE QA Contact: Cluster QE <mspqa-list>
Severity: medium Docs Contact:
Priority: medium    
Version: 4 CC: cluster-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2005-05-10 13:21:34 UTC Type: ---

Description Adam "mantis" Manthei 2004-12-21 00:16:37 UTC
Description of problem:
While trying to reproduce bug #143448, I ran into the following:
DLM <CVS> (built Dec 13 2004 15:02:57) installed
kmem_cache_create: duplicate cache dlm_conn
------------[ cut here ]------------
kernel BUG at mm/slab.c:1453!
invalid operand: 0000 [#1]
Modules linked in: dlm(U) parport_pc lp parport autofs4 i2c_dev i2c_core cman(U) sunrpc md5 ipv6 dm_mod button battery ac uhci_hcd hw_random e1000 floppy ext3 jbd qla2300 qla2xxx scsi_transport_fc sd_mod scsi_mod
CPU:    0
EIP:    0060:[<c014997f>]    Tainted: GF     VLI
EFLAGS: 00010202   (2.6.9-1.906_EL)
EIP is at kmem_cache_create+0x417/0x48a
eax: c0312877   ebx: f68fa480   ecx: c03fca80   edx: f8a0c462
esi: f8a0c46b   edi: f8a0c46b   ebp: f68fa608   esp: f4ef6ea4
ds: 007b   es: 007b   ss: 0068
Process clvmd (pid: 17708, threadinfo=f4ef6000 task=f4d37730)
Stack: 23499435 c0000000 f4d37730 f8a0c462 00000080 00000000 00000000 f71fec80
       fffffff4 f8a01c14 00000000 00000000 00000000 1d244b3c 00000000 0000000a
       f8a0c3fb 00000000 00000000 00000000 f6210a50 00000000 f6afa300 00000005
Call Trace:
 [<f8a01c14>] lowcomms_start+0x21a/0x2c5 [dlm]
 [<f89fdec0>] threads_start+0x20/0x3e [dlm]
 [<f89fdeff>] init_internal+0x17/0x30 [dlm]
 [<f89fe7f0>] dlm_new_lockspace+0x38/0x5f [dlm]
 [<f89f726c>] register_lockspace+0xa3/0x149 [dlm]
 [<f89f7cb3>] do_user_create_lockspace+0x21/0x32 [dlm]
 [<f89f8e29>] dlm_write+0x153/0x1ab [dlm]
 [<c01623a9>] vfs_write+0xb6/0xe2
 [<c0162473>] sys_write+0x3c/0x62
 [<c0301bfb>] syscall_call+0x7/0xb
Code: eb 04 19 c0 0c 01 85 c0 75 29 ff 74 24 0c 68 77 28 31 c0 e8 d9 66 fd ff 58 b9 80 ca 3f c0 5a ff 05 80 ca 3f c0 0f 8e 71 1b 00 00 <0f> 0b ad 05 c2 27 31 c0 8b 6d 00 eb 86 8b 54 24 04 b8 00 f0 ff


Version-Release number of selected component (if applicable):
ccs-0.9-0
kernel-2.6.9-1.906_EL
dlm-kernel-2.6.9-3.1
kernel-2.6.9-1.641_EL
kernel-utils-2.4-13.1.37
cman-kernel-2.6.9-3.3
fence-1.3-1
dlm-1.0-0.pre9.1
cman-1.0-0.pre5.0
GFS-kernel-2.6.9-4.2

How reproducible:
I have not tried to reproduce it yet.

Steps to Reproduce:
1. start and stop the clvmd init.d script a few times
  
Actual results:


Expected results:


Additional info:

Comment 1 Christine Caulfield 2005-01-26 16:45:42 UTC
I'm not really sure what's causing this. I can only assume that the
"dlm_conn" slab cache is not being deleted when lowcomms shuts down
(presumably because it's still got things in it).

This might help:

Checking in lowcomms.c;
/cvs/cluster/cluster/dlm-kernel/src/lowcomms.c,v  <--  lowcomms.c
new revision: 1.22.2.3; previous revision: 1.22.2.2
done

If anyone needs to reopen this, please include any other kernel
messages that appear above the BUG(), if there are any.
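
To make the suspected sequence concrete, here is a toy user-space model in Python. It is only a sketch of the hypothesis above, not the dlm or mm/slab.c code: the SlabRegistry class, its methods and second_start_fails() are hypothetical names invented for illustration. It assumes that destroying a cache is refused while objects are still allocated from it, and that the stop path ignores that refusal, so the name "dlm_conn" is still registered when lowcomms starts again.

```python
class SlabRegistry:
    """Toy stand-in for the kernel's list of named slab caches (hypothetical)."""

    def __init__(self):
        self._caches = {}  # cache name -> number of live objects

    def create(self, name):
        if name in self._caches:
            # The kernel in this report hits BUG() at this point;
            # the model raises an exception instead.
            raise RuntimeError(f"kmem_cache_create: duplicate cache {name}")
        self._caches[name] = 0

    def alloc(self, name):
        self._caches[name] += 1

    def free(self, name):
        self._caches[name] -= 1

    def destroy(self, name):
        """Return True if the cache was removed, False if objects remain."""
        if self._caches[name] > 0:
            return False  # assumption: a non-empty cache stays registered
        del self._caches[name]
        return True


def second_start_fails(leak):
    """Start, stop and restart lowcomms; report whether the restart fails."""
    reg = SlabRegistry()
    reg.create("dlm_conn")      # first lowcomms start
    reg.alloc("dlm_conn")       # one connection object is allocated
    if not leak:
        reg.free("dlm_conn")    # clean shutdown frees it first
    reg.destroy("dlm_conn")     # return value ignored, as hypothesised
    try:
        reg.create("dlm_conn")  # second lowcomms start
    except RuntimeError:
        return True
    return False


if __name__ == "__main__":
    print("restart fails with leaked object:", second_start_fails(leak=True))
    print("restart fails after clean stop:", second_start_fails(leak=False))
```

In this model the restart only fails when an object is left in the cache at shutdown, which matches the "duplicate cache dlm_conn" message appearing after clvmd has been started and stopped a few times.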

Comment 2 Christine Caulfield 2005-05-10 13:21:34 UTC

*** This bug has been marked as a duplicate of 157295 ***