Bug 118728 - repeatedly cycling the network causes a kernel panic
Status: CLOSED CANTFIX
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: ia64 Linux
Priority: high Severity: high
Assigned To: John W. Linville
QA Contact: Brian Brock
Reported: 2004-03-19 11:33 EST by Rick Burchett
Modified: 2007-11-30 17:07 EST
Doc Type: Bug Fix
Last Closed: 2005-11-01 14:03:23 EST

Description Rick Burchett 2004-03-19 11:33:51 EST
Description of problem: Repeatedly bringing a group of Ethernet
interfaces down and up (ifdown and ifup) causes a kernel panic.  The
type of NIC doesn't seem to matter: I've reproduced this with Broadcom
10/100/1000 cards, e1000 cards, and quad cards (DEC 4-port 10/100
cards).  It doesn't seem to happen with just two interfaces, but with
four or more it always happens.


Version-Release number of selected component (if applicable):
RHEL 3.0 Update 1 (2.4.21-9.EL)

How reproducible: Repeatedly run ifup and ifdown on several interfaces
(more than 2; fewer than 7); the kernel panic occurs within the hour.


Steps to Reproduce:
1. Use these scripts:

#!/usr/bin/ksh
# startMcaGen by Rick Burchett
mcaGen2 eth1&
sleep 1
mcaGen2 eth2&
sleep 1
mcaGen2 eth3&
sleep 1
mcaGen2 eth4&


#!/usr/bin/ksh
#mcaGen2 by Rick Burchett
#set -x
# get starting seconds
let sSeconds=$(date +%s)

getElapsedTime() {
   #get current seconds
   let cSeconds=$(date +%s)
   ((eSeconds=cSeconds - sSeconds))
   #compute elapsed time
   ((hours=(eSeconds / 60)/60))
   ((eSeconds-=hours * 60 * 60))
   ((minutes=eSeconds / 60))
   ((seconds=eSeconds - minutes * 60))

   print "Elapsed time: $hours:$minutes:$seconds"
}


while true
do
        echo "Shutting down $1"
        ifdown "$1"
        # NOTE: the message below still announces a sleep, but the sleep
        # itself is commented out, so the interface comes back up at once.
        echo "$1 shutdown; Sleeping for 15 seconds"
        getElapsedTime
        #sleep 15
        echo "Starting up $1"
        ifup "$1"
        echo -n "Pinging $1"
        echo
        # INSERT AN ADDRESS TO PING HERE WITH THE PING COMMAND
        #ping -c 100 -i 0 <ipaddr>
        echo "Sleeping for 15 seconds"
        getElapsedTime
        sleep 15
        getElapsedTime
done


2. Edit startMcaGen and mcaGen2 to match your environment (for example,
   the interface names and the ping address).
3. Run startMcaGen
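
As an alternative to editing startMcaGen for each machine, the launcher could take the interface names as arguments. The following is only a sketch under stated assumptions, not the reporter's script: it is written for a POSIX sh (it should also run under ksh), and the MCAGEN and STAGGER variables are hypothetical hooks added here so the launcher can be dry-run without touching any interface.

```shell
#!/bin/sh
# Hypothetical variant of startMcaGen: the interfaces to cycle are given as
# arguments instead of being hard-coded.
# MCAGEN and STAGGER are assumed override hooks, not part of the original.
MCAGEN=${MCAGEN:-mcaGen2}   # per-interface worker script
STAGGER=${STAGGER:-1}       # seconds between worker launches

start_all() {
    for ifc in "$@"; do
        "$MCAGEN" "$ifc" &      # one background worker per interface
        sleep "$STAGGER"
    done
    wait                        # block until every worker exits
}

# Only launch when interfaces were actually passed on the command line.
if [ "$#" -gt 0 ]; then
    start_all "$@"
fi
```

Saved under a name of your choosing, it would be invoked with the interface list, e.g. "eth1 eth2 eth3 eth4". Setting MCAGEN=echo and STAGGER=0 in the environment only prints the interface names, which is a quick way to check the argument handling before running the real test.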
  
Actual results: Kernel panic after several minutes (less than an hour
in all observed tests)


Expected results: No kernel panic


Additional info:
Sleeping for 15 seconds
Elapsed time: 0:18:47
Elapsed time: 0:18:47
Elapsed time: 0:18:45
Shutting down eth9
Shutting down eth7
Elapsed time: 0:18:44
Shutting down eth10
Elapsed time: 0:18:46
Shutting down eth8
eth7 shutdown; Sleeping for 15 seconds
eth8 shutdown; Sleeping for 15 seconds
Elapsed time: 0:18:47
Starting up eth7
eth10 shutdown; Sleeping for 15 seconds
Elapsed time: 0:18:46
ip_tables: (C) 2000-2002 Netfilter core team
Starting up eth8
eth9 shutdown; Sleeping for 15 seconds
Elapsed time: 0:18:44
Starting up eth10
Elapsed time: 0:18:45
Starting up eth9
Unable to handle kernel paging request at virtual address a0000000004710d0
modprobe[13965]: Oops 8813272891392

Pid: 13965, comm:             modprobe
EIP is at .plt [iptable_filter] 0x130 (2.4.21-9.EL)
psr : 0000101008026038 ifs : 8000000000000001 ip  : [<a0000000017111f0>]    Tainted: GF
unat: 0000000000000000 pfs : 0000000000000286 rsc : 0000000000000003
rnat: 0000000000000000 bsps: e000000004cabd00 pr  : 80000000f5a59619
ldrs: 0000000000000000 ccv : 0000000080000000 fpsr: 0009804c0270033f
b0  : a0000000017104c0 b6  : e00000000471ef80 b7  : a000000001710440
f6  : 1003e6db6db6db6db6db7 f7  : 1003e6000000000027e00
f8  : 1003e000000000002ec5f f9  : 1003efffffffffffffe03
r1  : a000000001711050 r2  : a000000001711050 r3  : 0000000000000000
r8  : a000000000008008 r9  : 0000000000000001 r10 : 0000000000000060
r11 : 0000000000000003 r12 : e00000011bd87e50 r13 : e00000011bd80000
r14 : 0000000080000000 r15 : a0000000004710d0 r16 : a000000001711240
r17 : 0000000000000004 r18 : e000000004b6a844 r19 : 000000007fffffff
r20 : 0000000000000004 r21 : e000000004b6a6c0 r22 : e000000004acad20
r23 : 0000000080000000 r24 : 0000000000000000 r25 : 0000000000000000
r26 : 0000000000000000 r27 : 0000000000000000 r28 : 0000000000000000
r29 : 0000000080000000 r30 : 0000000000000000 r31 : a000000001710048

Call Trace: [<e000000004415620>] sp=0xe00000011bd87a60 bsp=0xe00000011bd81318 show_stack [kernel] 0x80
[<e000000004430550>] sp=0xe00000011bd87c20 bsp=0xe00000011bd812e8 die [kernel] 0x1b0
[<e0000000044527f0>] sp=0xe00000011bd87c20 bsp=0xe00000011bd81290 ia64_do_page_fault [kernel] 0x310
[<e00000000440e6e0>] sp=0xe00000011bd87cb0 bsp=0xe00000011bd81290 ia64_leave_kernel [kernel] 0x0
[<a0000000017111f0>] sp=0xe00000011bd87e50 bsp=0xe00000011bd81288 .plt [iptable_filter] 0x130
[<a0000000017104c0>] sp=0xe00000011bd87e50 bsp=0xe00000011bd81260 fini [iptable_filter] 0x80
[<e00000000448c6f0>] sp=0xe00000011bd87e50 bsp=0xe00000011bd81230 free_module [kernel] 0x390
[<e00000000448a340>] sp=0xe00000011bd87e50 bsp=0xe00000011bd81180 sys_delete_module [kernel] 0x3c0
[<e00000000440e6c0>] sp=0xe00000011bd87e60 bsp=0xe00000011bd81180 ia64_ret_from_syscall [kernel] 0x0
Kernel panic: Fatal exception
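
Read from the bottom up, the trace shows the faulting modprobe process going through sys_delete_module, free_module and fini [iptable_filter] before the bad access in .plt [iptable_filter]. To make that path easier to see in a saved oops, a small filter along these lines could list just the "symbol [module]" pairs. This is a sketch, not a standard tool: the function name trace_symbols is made up for illustration, and it assumes a grep that supports -o and -E and a sed that supports -E (GNU and BSD versions do).

```shell
#!/bin/sh
# Print "symbol [module]" for each frame of a 2.4-style oops read from stdin.
# Pasted reports may wrap a frame across lines, so newlines are folded into
# spaces before matching.
trace_symbols() {
    tr '\n' ' ' |
        grep -oE '[A-Za-z_.][A-Za-z0-9_.]* \[[A-Za-z0-9_]+\] 0x[0-9a-f]+' |
        sed -E 's/ 0x[0-9a-f]+$//'   # drop the trailing offset
}
```

It would be used as "trace_symbols < oops.txt", where oops.txt is whatever file the console output was captured to.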
Comment 1 Ernie Petrides 2005-10-03 19:38:26 EDT
Reassigning to John Linville.

Rick Burchett, does this bug report need to remain private to HP?  If not,
could you please uncheck the "HP Confidential Group" box below?  Thanks.
Comment 2 John W. Linville 2005-10-04 11:00:55 EDT
Is this still a problem w/ U6 kernels?  I think this may be a duplicate of bug 
151054. 
 
Could you try a U6 (or later) kernel, and post the results?  Alternatively, 
you can try the test kernels here: 
 
   http://people.redhat.com/linville/kernels/rhel3/ 
 
Thanks! 
Comment 3 John W. Linville 2005-11-01 14:02:30 EST
Closed due to lack of response.  Please reopen when the requested information
becomes available...thanks!
