Description of problem:
Running looping mount-unmount on multiple nodes would eventually trigger
something like this:

BUG: unable to handle kernel paging request at virtual address 94949494
 printing eip:
c0434f13
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: gfs lock_nolock lock_dlm dlm gfs2 sd_mod sg configfs iscsi_tcp libiscsi scsi_transport_iscsi scsi_mod autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 parport_pc lp parport floppy 8139too 8139cp serio_raw pcspkr mii dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU:    0
EIP:    0060:[<c0434f13>]    Not tainted VLI
EFLAGS: 00010246   (2.6.20 #2)
EIP is at queue_work+0x3a/0x4d
eax: 94949494   ebx: 00000000   ecx: caabe8d4   edx: d7469944
esi: 00000000   edi: df81715c   ebp: c07a6da8   esp: c07a6da0
ds: 007b   es: 007b   ss: 0068
Process umount.gfs (pid: 23091, ti=c07a6000 task=d7932030 task.ti=cbe05000)
Stack: df81712c caf802f4 c07a6db0 e09e3715 c07a6e08 c05f5ee3 00000046 00000246
       dd0ecc3c f157c0ef df81712c df81715c d2c94220 ffffffff 00000000 c0609616
       c0609673 00000246 dd0ecc2c 00000103 d2c94220 df81712c c062a198 d2c94220
Call Trace:
 [<c0404e2e>] show_trace_log_lvl+0x1a/0x2f
 [<c0404ede>] show_stack_log_lvl+0x9b/0xa3
 [<c040507a>] show_registers+0x194/0x26a
 [<c0405271>] die+0x121/0x280
 [<c062bf4b>] do_page_fault+0x3e9/0x4b5
 [<c062a7c4>] error_code+0x7c/0x84
 [<e09e3715>] lowcomms_data_ready+0x25/0x27 [dlm]
 [<c05f5ee3>] tcp_data_queue+0x521/0xa74
 [<c05f7cb4>] tcp_rcv_established+0x72d/0x7e1
 [<c05fd043>] tcp_v4_do_rcv+0x28/0x2d0
 [<c05ff4df>] tcp_v4_rcv+0x87d/0x8f2
 [<c05e6b25>] ip_local_deliver+0x168/0x213
 [<c05e6985>] ip_rcv+0x418/0x450
 [<c05ca60f>] netif_receive_skb+0x2db/0x35a
 [<e0871ed7>] cp_rx_poll+0x34d/0x480 [8139cp]
 [<c05cc144>] net_rx_action+0x9c/0x194
 [<c042af98>] __do_softirq+0x64/0xc6
 [<c0405fcd>] do_softirq+0x5c/0xac
 [<c042ade5>] local_bh_enable+0x7e/0x8a
 [<c05cc47a>] dev_queue_xmit+0x23e/0x260
 [<c05eb3e2>] ip_output+0x1e4/0x21e
 [<c05eabdf>] ip_queue_xmit+0x3a3/0x3e4
 [<c05f8a7d>] tcp_transmit_skb+0x613/0x641
 [<c05fa426>] __tcp_push_pending_frames+0x6dc/0x798
 [<c05fb256>] tcp_send_fin+0x135/0x13f
 [<c05f0b8b>] tcp_close+0x22b/0x52b
 [<c0608dbc>] inet_release+0x43/0x49
 [<c05c17b2>] sock_release+0x17/0x68
 [<e09e2a90>] close_connection+0x1f/0x67 [dlm]
 [<e09e3d42>] dlm_lowcomms_stop+0xdd/0x13c [dlm]
 [<e09e0e53>] threads_stop+0xd/0x14 [dlm]
 [<e09e10a3>] dlm_release_lockspace+0x249/0x267 [dlm]
 [<e09f6ebb>] gdlm_unmount+0x27/0x55 [lock_dlm]
 [<e0a6c62e>] gfs2_unmount_lockproto+0x19/0x35 [gfs2]
 [<e0ac7761>] gfs_lm_unmount+0x1a/0x1c [gfs]
 [<e0ad3e2c>] gfs_put_super+0x178/0x1ae [gfs]
 [<c04733b2>] generic_shutdown_super+0x55/0xbe
 [<c047343b>] kill_block_super+0x20/0x32
 [<e0ad0f67>] gfs_kill_sb+0x8/0xa [gfs]
 [<c04734fb>] deactivate_super+0x5d/0x6f
 [<c0484a8e>] mntput_no_expire+0x42/0x72
 [<c0477928>] path_release_on_umount+0x15/0x18
 [<c0485bf6>] sys_umount+0x1e3/0x217
 [<c0485c43>] sys_oldumount+0x19/0x1b
 [<c0403e98>] syscall_call+0x7/0xb
 =======================
Code: 0f ba 2a 00 19 c0 31 db 85 c0 75 2c 8b 1d 2c 96 81 c0 8d 41 08 39 41 08 8d 42 04 0f 45 de 39 42 04 74 04 0f 0b eb fe 8b 01 f7 d0 <8b> 04 98 bb 01 00 00 00 e8 28 ff ff ff 89 d8 5b 5e 5d c3 89 c2
EIP: [<c0434f13>] queue_work+0x3a/0x4d SS:ESP 0068:c07a6da0

Entering kdb (current=0xd7932030, pid 23091) on processor 0 Oops: Oops
due to oops @ 0xc0434f13
eax = 0x94949494 ebx = 0x00000000 ecx = 0xcaabe8d4 edx = 0xd7469944
esi = 0x00000000 edi = 0xdf81715c esp = 0xc07a6da0 eip = 0xc0434f13
ebp = 0xc07a6da8 xss = 0x00000068 xcs = 0x00000060 eflags = 0x00010246
xds = 0x0000007b xes = 0x0000007b origeax = 0xffffffff &regs = 0xc07a6d68

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
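A note on reading the faulting address, offered as an interpretation rather than a confirmed diagnosis. The bytes around the fault in the Code: line (8b 01 f7 d0 <8b> 04 98) decode to a load through ecx, a bitwise NOT of eax, and then the faulting indexed load through eax. So the bad pointer 94949494 is the inverse of a value read from memory. Assuming this kernel was built with slab debugging, where freed objects are filled with the poison byte 0x6b, the small sketch below checks whether the faulting address is an inverted poison pattern. The helper names are made up for illustration.

```python
# Assumption: 32-bit kernel with slab debugging enabled, where freed
# objects are filled with the poison byte 0x6b.
POISON_FREE = 0x6B
MASK32 = 0xFFFFFFFF


def invert32(value):
    """Bitwise NOT truncated to 32 bits, like `not %eax` on i386."""
    return ~value & MASK32


def is_repeated_byte(value, byte):
    """True if the 32-bit value consists of the same byte four times."""
    return value == int.from_bytes(bytes([byte]) * 4, "big")


def looks_like_inverted_poison(fault_addr):
    """True if fault_addr is the bitwise inverse of a slab-poison word."""
    return is_repeated_byte(invert32(fault_addr), POISON_FREE)


fault_addr = 0x94949494  # eax / faulting address from the oops above
print(hex(invert32(fault_addr)))               # 0x6b6b6b6b
print(looks_like_inverted_poison(fault_addr))  # True
```

If the slab-poisoning assumption holds, queue_work() read its pointer out of memory that had already been freed. That would fit the trace, where lowcomms_data_ready() runs from softirq context while dlm_lowcomms_stop() is tearing the connection down.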
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
This request was evaluated by Red Hat Kernel Team for inclusion in a Red Hat Enterprise Linux maintenance release, and has moved to bugzilla status POST.
Patch posted to rhkernel: http://post-office.corp.redhat.com/archives/rhkernel-list/2007-June/msg00158.html
Patch reposted to rhkernel: http://post-office.corp.redhat.com/archives/rhkernel-list/2007-June/msg01069.html
Included in kernel 2.6.18-24.el5. You can download this test kernel from http://people.redhat.com/dzickus/el5
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0959.html