Bug 213005 - [dlm-kernel] DLM oops in kref_put when umounting
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Christine Caulfield
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-10-30 15:27 UTC by Christine Caulfield
Modified: 2007-11-30 22:07 UTC (History)

Fixed In Version: 5.0.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-11-17 20:05:15 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch to fix (this has gone upstream to Steve) (1015 bytes, patch)
2006-11-02 14:42 UTC, Christine Caulfield

Description Christine Caulfield 2006-10-30 15:27:36 UTC
Description of problem:
Quite often, when umounting a GFS filesystem on my 8-node cluster, one node will
oops as shown below. This hangs any further DLM operations in the cluster until
that node is rebooted.

Version-Release number of selected component (if applicable):


How reproducible:
Very easily on my 8-node i686 cluster

Steps to Reproduce:
1. Mount a GFS filesystem on all 8 nodes
2. umount it on all 8 nodes
3. Repeat until it oopses (this usually happens very quickly for me)
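
The steps above could be scripted roughly as follows. This is a minimal sketch for a single node, not the script used for this report: the device path, the mount point, the `gfs2` filesystem type and the helper names `cycle_commands` and `run_cycles` are all assumptions made for illustration, and the same loop would need to run on every node in the cluster. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Hypothetical values: substitute the real shared device and mount point.
DEVICE = "/dev/vg_cluster/gfs_test"
MOUNTPOINT = "/mnt/gfs_test"


def cycle_commands(device, mountpoint, fstype="gfs2"):
    """Return the two commands that make up one mount/umount cycle."""
    return [
        ["mount", "-t", fstype, device, mountpoint],
        ["umount", mountpoint],
    ]


def run_cycles(count, dry_run=True):
    """Run `count` mount/umount cycles; return how many completed.

    In dry-run mode the commands are printed instead of executed.
    Otherwise the loop stops at the first command that exits non-zero.
    """
    done = 0
    for _ in range(count):
        for cmd in cycle_commands(DEVICE, MOUNTPOINT):
            if dry_run:
                print(" ".join(cmd))
            elif subprocess.run(cmd).returncode != 0:
                return done
        done += 1
    return done


if __name__ == "__main__":
    run_cycles(3)  # dry run: prints three mount/umount pairs
```

Calling `run_cycles(100, dry_run=False)` as root would perform real mounts. An oops is likely to show up in the kernel log rather than as a failed command, so the console or `dmesg` output still needs watching.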
  
Actual results:
Oops on at least one node.

Expected results:
Clean umount on all nodes.

Additional info:
Oops text:

BUG: unable to handle kernel paging request at virtual address db79d830
 printing eip:                     
c01f54d1                           
*pde = 0006e067                    
*pte = 1b79d000                    
Oops: 0000 [#1]                    
SMP DEBUG_PAGEALLOC                
Modules linked in: lock_nolock lock_dlm dlm gfs2 configfs sctp ipv6 dm_round_robin iscsi_tcp libiscsi scsi_transport_iscsi dm_multipath
CPU:    0                          
EIP:    0060:[<c01f54d1>]    Not tainted VLI
EFLAGS: 00010206   (2.6.19-rc3 #2) 
EIP is at kref_put+0x55/0x7c       
eax: db79d830   ebx: db79d830   ecx: c9014f40   edx: c8d25f60
esi: c01f4f7d   edi: d4ea67dc   ebp: d241ff14   esp: d241fefc
ds: 007b   es: 007b   ss: 0068     
Process dlm_controld (pid: 3871, ti=d241e000 task=d2bdd5b0 task.ti=d241e000)
Stack: d2bdd5b0 bf9e2726 d241ff20 00000046 00000000 db79d818 d241ff24 c01f4939
       db79d830 c01f4f7d d241ff3c c01914d1 db79d818 00000008 c8d37dc8 c9014f40
       d241ff70 c015ad7c c8d37dc8 c9014f40 00000000 00000000 00000000 c8d25f60
Call Trace:                        
 [<c0103d85>] show_trace_log_lvl+0x26/0x3c
 [<c0103e38>] show_stack_log_lvl+0x9d/0xa5
 [<c01041e8>] show_registers+0x1af/0x249
 [<c01045a9>] die+0x1dd/0x2c6      
 [<c0111932>] do_page_fault+0x488/0x562
 [<c03294b9>] error_code+0x39/0x40 
 [<c01f4939>] kobject_put+0x1f/0x21
 [<c01914d1>] sysfs_release+0x31/0x7d
 [<c015ad7c>] __fput+0xdc/0x1a7    
 [<c015ae5e>] fput+0x17/0x19       
 [<c01586d7>] filp_close+0x61/0x6a 
 [<c0158f0d>] sys_close+0x7c/0xb1  
 [<c0102ecd>] sysenter_past_esp+0x56/0x8d
 =======================           
Code: 75 29 c7 44 24 0c 55 b9 39 c0 c7 44 24 08 35 00 00 00 c7 44 24 04 5e f0 34
c0 c7 04 24 ba c7 33 c0 e8 38 49 f2 ff e8 b0 e9 f0 ff <8b> 03 48 74 0c 90 ff 0b
0f 94 c0 31 d2 84 c0 74 0a 89 1c 24 ff
EIP: [<c01f54d1>] kref_put+0x55/0x7c SS:ESP 0068:d241fefc
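
To make the trace above easier to read, the frames can be pulled out mechanically. The following is a minimal sketch and not part of the original report: it parses the `[<addr>] symbol+0xoff/0xsize` lines copied from the oops, drops the fault-reporting frames, and lists the remaining calls outermost first. The helper name `parse_frames` and the choice of which frames count as fault reporting are assumptions made for illustration.

```python
import re

# Call Trace lines copied from the oops text above.
TRACE = """\
 [<c0103d85>] show_trace_log_lvl+0x26/0x3c
 [<c0103e38>] show_stack_log_lvl+0x9d/0xa5
 [<c01041e8>] show_registers+0x1af/0x249
 [<c01045a9>] die+0x1dd/0x2c6
 [<c0111932>] do_page_fault+0x488/0x562
 [<c03294b9>] error_code+0x39/0x40
 [<c01f4939>] kobject_put+0x1f/0x21
 [<c01914d1>] sysfs_release+0x31/0x7d
 [<c015ad7c>] __fput+0xdc/0x1a7
 [<c015ae5e>] fput+0x17/0x19
 [<c01586d7>] filp_close+0x61/0x6a
 [<c0158f0d>] sys_close+0x7c/0xb1
 [<c0102ecd>] sysenter_past_esp+0x56/0x8d
"""

FRAME_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")

# Frames treated as fault reporting rather than the faulting path (assumption).
FAULT_FRAMES = {
    "show_trace_log_lvl", "show_stack_log_lvl", "show_registers",
    "die", "do_page_fault", "error_code",
}


def parse_frames(text):
    """Return (address, symbol, offset, size) tuples in printed order."""
    return [
        (int(addr, 16), sym, int(off, 16), int(size, 16))
        for addr, sym, off, size in FRAME_RE.findall(text)
    ]


frames = parse_frames(TRACE)

# The trace is printed innermost first, so reverse it to read caller -> callee.
call_path = [sym for _, sym, _, _ in reversed(frames) if sym not in FAULT_FRAMES]
call_path.append("kref_put")  # from the EIP line: the fault is at kref_put+0x55
print(" -> ".join(call_path))

# Distance between the faulting address (eax/ebx) and the nearby pointer
# db79d818 that appears in the stack dump.
fault_addr = 0xDB79D830
stack_ptr = 0xDB79D818
print(hex(fault_addr - stack_ptr))
```

Read this way, the fault occurs while dlm_controld closes a sysfs file: the close reaches `sysfs_release`, which drops a kobject reference, and `kref_put` then reads from db79d830. With DEBUG_PAGEALLOC enabled, that pattern is consistent with the object having already been freed while the file was still open, and the 0x18 gap is consistent with the reference count being embedded in a structure that starts at db79d818. Both readings are inferences from the trace, not statements made in the report.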

Comment 1 Christine Caulfield 2006-11-02 14:42:57 UTC
Created attachment 140135
Patch to fix (this has gone upstream to Steve)

Comment 2 Kiersten (Kerri) Anderson 2006-11-02 15:34:30 UTC
Devel ACK and proposing this for beta2 blocker status.  The problem has not yet
shown up in QE testing but will most likely impact the mount-stress tests.  The
patch has been posted to rhkernel-list, and we would like it considered for the
final beta2 kernel respin.

Comment 3 Rob Kenna 2006-11-02 15:44:46 UTC
Moved to RHEL5 beta and dlm-kernel.  Also provided pm_ack.

Comment 4 Jay Turner 2006-11-02 15:55:34 UTC
QE ack for RHEL5B2.

Comment 5 Kiersten (Kerri) Anderson 2006-11-02 16:49:04 UTC
Changing component to kernel for patch tracking.

Comment 6 Don Zickus 2006-11-06 03:28:11 UTC
Included in kernel-2.6.18-1.2744.el5.

Comment 7 Jay Turner 2006-11-17 20:05:15 UTC
Patch confirmed in 2.6.18-1.2747.el5.

