From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050513 Firefox/1.0.4
Description of problem:
When running our application, the kernel gives the following panic (with kernel 2.6.9-5.EL):
Kernel panic - not syncing: net/ipv4/tcp_timer.c:211: spin_lock(net/ipv4/tcp_minisocks.c:e36dc0a0) already locked by net/ipv4/tcp_ipv4.c/1790
With the SMP kernel (2.6.9-5.EL), the system hangs completely (it does not accept input from the console) and does not print any kernel panic message.
Our application uses the QUEUE target with iptables to handle packet filtering in user space. It also uses TCP/UDP/multicast sockets and is quite CPU intensive. The application runs entirely in user space.
This happens on all EL4 systems we have tried. It does not happen with EL3 or with the other 2.6.x kernel-based distros we have tried.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Start the application
2. After a few minutes to a few hours, the system stops responding
Actual Results: Kernel panic or system hangs completely
Expected Results: System should stay responsive
Similar bug entered for FC3:
James, as author of the QUEUE iptables code, could you take a quick
look at this one? By far, the one constant is that everyone seeing
this bug is using QUEUE.
There also seems to be no indication that TUX is being used, but it
would be nice for the bug reporter to tell us whether they are in fact
using TUX, as that adds yet another variable.
TUX is not used when this happens.
Created attachment 116818
Test program to recreate the issue.
(In reply to comment #1)
I gather it's this (patch upstream and also in url):
Is the patch (mentioned by James) included in any RH4 kernels?
It's in the current RHEL4 kernel CVS, so it should be available in the U2 kernel,
which is to be released very soon.
We have now done two days of testing with a 2.6.9-11 kernel that has the given
patch applied. The system is more stable, but it has panicked three times with:
Kernel panic - not syncing: Fatal exception in interrupt
Part of the stack trace (handwritten, so it may contain some typos):
We are trying to collect more information using the netdump utility. If anybody
has any ideas, they are welcome.
I submitted a new bug about the case with the patched kernel
I just checked 2.6.9-22 and did not find the patch (I did find the patch for
IPv6, though). Does anyone know which version includes it?
The issue you encountered may be fixed in U4 public beta kernel.
Can you please give it a try, and post your results here?