Red Hat Bugzilla – Bug 119519
(NET 83815)Kernel panic on overwhelming number of TCP requests
Last modified: 2007-11-30 17:10:39 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1)
Description of problem:
When running a large number (1,000,000) of tests using ab from Apache
2.0.46 (newer versions of ab have a few problems, therefore this one)
to test Apache 2.0.49, the kernel panics intermittently. It has
happened at least 5 times so far on my system.
The command I used to test Apache 2.0.49 (compiled from source with
only --enable-shared option) was:
ab -c 10 -k -n 1000000 http://localhost/inquiry.sm
The options mean:
-c: concurrency level
-k: keep-alive requests on
-n: the total number of requests to perform
Sidenote: The file being fetched, inquiry.sm, is a mod_spin macro file
(mod_spin is an Apache 2 module concocted by yours truly). This means
that a shared library is linked into Apache 2, some code in it
executed and the result then served out to the world. Now, even if
mod_spin has a gazillion bugs (probably true), the kernel should not
panic. Therefore, the problem is with the kernel.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Download Apache 2.0.49, compile with --enable-shared, install.
Alternatively, use some other software for the same test.
2. Overwhelm the system with a huge number of concurrent TCP requests.
Actual Results: Kernel panics. I will provide some more details once
the machine dies again and I catch its death messages somehow (if
anyone can tell me what's the easiest way to do this, that would be
Expected Results: The kernel should not panic.
Additional info: The test was done on an HP Pavilion ZE4201 notebook.
More info is here:
OK, the thing finally crashed (jeez - I can't believe I'm actually
saying this :-). Here is what I see on the screen (hand typed, sorry
if there are errors - my vision is getting blurry from all the hex):
[<c020f01e>] netif_receive_skb [kernel] 0x13e (0xc2f01d1c)
[<c020f15d>] process_backlog [kernel] 0x6d (0xc2f01d3c)
[<c020f26a>] net_rx_action [kernel] 0x6a (0xc2f01d54)
[<c0121e45>] do_softirq [kernel] 0x95 (0xc2f01d70)
[<c02173e5>] .txt.lock.netfilter [kernel] 0xb6 (0xc2f01d88)
[<c02297c0>] ip_queue_xmit2 [kernel] 0x0 (0xc2f01db0)
[<c02284b3>] ip_queue_xmit [kernel] 0x483 (0xc2f01dc8)
[<c02297c0>] ip_queue_xmit2 [kernel] 0x0 (0xc2f01de0)
[<c023df8a>] tcp_v4_send_check [kernel] 0x4a (0xc2f01dfc)
[<c0238928>] tcp_transmit_skb [kernel] 0x3b8 (0xc2f01e1c)
[<c0239604>] tcp_write_xmit [kernel] 0x184 (0xc2f01e1c)
[<c022dfce>] tcp_sendmsg [kernel] 0x5de (0xc2f01e84)
[<c01188a0>] recalc_task_prio [kernel] 0x90 (0xc2f01ea4)
[<c024bb02>] inet_recvmsg [kernel] 0x52 (0xc2f01ee0)
[<c024bb62>] inet_sendmsg [kernel] 0x42 (0xc2f01efc)
[<c0206f9b>] sock_sendmsg [kernel] 0x6b (0xc2f01f10)
[<c020722e>] sock_write [kernel] 0xae (0xc2f01f54)
[<c0144103>] sys_write [kernel] 0xa3 (0xc2f01f94)
[<c0109747>] system_call [kernel] 0x33 (0xc2f01fc0)
Code: 0f 0b 62 00 03 0c 28 c0 e9 6c fd ff ff 8d 74 26 00 55 57 56
<0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing
Hope this helps.
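A note on reading the trace above. Assuming each line has the form
[<return address>] symbol [module] offset (stack slot), which is how this
2.4 kernel appears to print it, subtracting the offset from the address
gives the start of the function, and that value can be compared against
System.map as a check on the hand-typed numbers. A minimal sketch
(func_start is a hypothetical helper name, not an existing tool):

```shell
# func_start ADDR OFFSET
# Print, in hex, the start address of the function containing ADDR,
# given the offset shown next to the symbol name in the trace.
# Both arguments are written with a leading 0x.
func_start() {
    printf '%x\n' "$(( $1 - $2 ))"
}
```

For the first frame, func_start 0xc020f01e 0x13e prints c020eee0. If the
System.map of the running kernel lists netif_receive_skb at that address,
the frame was most likely copied down correctly.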
Well, the panic message and register dump have scrolled off.
When it dies, is the keyboard still responsive? If so, enable the sysrq
key and capture thread and register dumps, then sync and reboot. The
traces should show up in /var/log/messages.
Or else use a serial console to the box.
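The sysrq suggestion above can be sketched roughly as follows. This
assumes the kernel was built with magic SysRq support and that the
control file lives at the usual procfs path. enable_sysrq is a
hypothetical helper name, and its optional argument exists only so the
function can be tried against a scratch file:

```shell
# enable_sysrq [CTL]
# Turn on the magic SysRq key by writing 1 to the control file.
# CTL defaults to the standard procfs location; writing there needs root.
enable_sysrq() {
    ctl=${1:-/proc/sys/kernel/sysrq}
    echo 1 > "$ctl"
}
```

After running enable_sysrq as root, Alt+SysRq+T dumps the task list,
Alt+SysRq+P dumps registers, Alt+SysRq+S syncs the disks and Alt+SysRq+B
reboots. For the serial console route, a boot parameter along the lines
of console=ttyS0,115200 console=tty0 is the usual starting point. The
port and speed here are assumptions and depend on the hardware.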
How long does it take to crash? Please run slabtop every so many
seconds and save the output.
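One way to collect the periodic slabtop output requested above is a
small sampling loop that timestamps each snapshot and syncs after every
write, so the log has a chance of surviving the crash. This is only a
sketch: sample_to_log is a hypothetical helper name, and the interval,
count and log path in the usage note are arbitrary choices.

```shell
# sample_to_log INTERVAL COUNT LOGFILE CMD [ARGS...]
# Run CMD COUNT times, INTERVAL seconds apart, appending a timestamp
# header and the command's output (stdout and stderr) to LOGFILE.
sample_to_log() {
    interval=$1
    count=$2
    logfile=$3
    shift 3
    i=0
    while [ "$i" -lt "$count" ]; do
        {
            echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
            "$@" 2>&1
        } >> "$logfile"
        # Flush to disk so the last snapshot is not lost in a panic.
        sync
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Run as root in a second terminal while ab is hammering the server, for
example: sample_to_log 10 360 /var/tmp/slabtop.log slabtop -o
(the -o option makes slabtop print one snapshot and exit, assuming the
installed procps version supports it). The timestamp of the last
snapshot also gives a rough time-to-crash.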
OK, I'll try what you suggested. BTW, I had sysrq key support in and I
tried to sync, but nothing showed up in /var/log/messages. Not sure
why. I'll be more careful next time I crash the box and I'll try to
catch more info.
As for the time needed to crash the box - that varies. Sometimes it'll
go down after a few minutes or so, sometimes it needs half an hour.
Sometimes it'll run through all the tests just fine.
Thanks for the hints.
Switched to FC2. We'll see what 2.6 does with it.
Not sure if the new problem is related to this bug or not, but I'll put
the info here anyway. It may be useful.
The platform this time is FC2, kernel 2.6.5-1.358. I'm using Apache
2.0.49 (compiled from source) with libapreq2 (from current CVS) to
upload some files through an HTML form (all to/from localhost - I'm
not connected to the network at all). When I do that with a relatively
big file (around 9.5 MB), on occasion the machine will hang. There is
nothing on the screen or log files that would suggest the error type -
everything just freezes.
Not sure if this is some kind of hardware problem (I've bumped the
BIOS up on this notebook to the latest available from HP) or if it is
kernel related like before.
Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases,
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/