Bug 143658 - Kernel panic stopping cluster, removing modules
Status: CLOSED WORKSFORME
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: gnbd
Version: 4
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Ben Marzinski
QA Contact: Cluster QE
Reported: 2004-12-23 09:52 EST by Derek Anderson
Modified: 2009-04-16 15:57 EDT
Doc Type: Bug Fix
Last Closed: 2005-02-10 13:54:41 EST

Description Derek Anderson 2004-12-23 09:52:33 EST
Description of problem:
I had a three-node cluster running cluster_rebuild/cluster_cleanup
through the night and received this Oops and panic on two of the
nodes.  It appeared to happen while removing modules in the cleanup
stage.  From the "Modules linked in" list, it seems that all of the
GFS modules had already been removed, so this may be a kernel
problem.  I'll file it here for now and see if it can be reproduced.

releasing gnbd class
Unable to handle kernel paging request at virtual address e02e3220
 printing eip:
e02e3220
*pde = 014d7067
Oops: 0000 [#1]
SMP
Modules linked in: ipv6 parport_pc lp parport autofs4 sunrpc e1000 floppy sg microcode uhci_hcd ehci_hcd button battery ac ext3 jbd qla2300 qla2xxx scsi_transport_fc sd_mod scsi_mod
CPU:    0
EIP:    0060:[<e02e3220>]    Not tainted VLI
EFLAGS: 00010246   (2.6.9)
EIP is at 0xe02e3220
eax: c40b4e00   ebx: c40b4e00   ecx: 00000000   edx: c40b4ef4
esi: 00000007   edi: c40b4ff4   ebp: c40b4e20   esp: c03a9f2c
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, threadinfo=c03a8000 task=c0342ac0)
Stack: c02c7abe c40b4e00 007829b8 c02c89f0 c40b4e00 c02c88b0 c13fa9e0 c03a9f54
       c012b2f9 c03a9f54 c03a9f54 c03a9f54 00000004 00000001 c03a6788 c03ddb40
       00000000 c01273c6 0000000a 00000046 00000000 c03db484 00452007 c012740d
Call Trace:
 [<c02c7abe>] tcp_write_err+0x7e/0x130
 [<c02c89f0>] tcp_keepalive_timer+0x140/0x2b0
 [<c02c88b0>] tcp_keepalive_timer+0x0/0x2b0
 [<c012b2f9>] run_timer_softirq+0xd9/0x170
 [<c01273c6>] __do_softirq+0xb6/0xd0
 [<c012740d>] do_softirq+0x2d/0x30
 [<c01189e5>] smp_apic_timer_interrupt+0x85/0xf0
 [<c010699e>] apic_timer_interrupt+0x1a/0x20
 [<c0104030>] default_idle+0x0/0x40
 [<c0104059>] default_idle+0x29/0x40
 [<c01040eb>] cpu_idle+0x3b/0x50
 [<c03aa8ec>] start_kernel+0x18c/0x1d0
 [<c03aa370>] unknown_bootoption+0x0/0x180
Code:  Bad EIP value.
 <0>Kernel panic - not syncing: Fatal exception in interrupt

Version-Release number of selected component (if applicable):
DEVEL.1103743813 (built Dec 22 2004 13:31:43)

How reproducible:
First time seen.

Comment 1 Ben Marzinski 2005-01-06 15:55:48 EST
I've tried all day, and I can't reproduce this. Since the gnbd module
isn't even linked in when this happens, I doubt that this is a gnbd
bug.
Comment 2 Derek Anderson 2005-02-10 13:54:41 EST
I haven't seen this again.  Closing.
