Bug 169600 - SMP kernel crash when use as LVS router
Status: CLOSED DUPLICATE of bug 174990
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Thomas Graf
QA Contact: Brian Brock
Blocks: 181409
Reported: 2005-09-29 22:59 EDT by soul916
Modified: 2014-06-18 04:28 EDT
Doc Type: Bug Fix
Last Closed: 2006-03-29 13:31:55 EST


Description soul916 2005-09-29 22:59:32 EDT
Description of problem:
When LVS is used on RHEL4 with an SMP kernel on an SMP machine, the kernel
crashes without any error message.
I use RHEL4 as an LVS router on a Dell 1850 server.
I tried kernel versions 2.6.9-17.EL and 2.6.9-11.EL; all the SMP kernels have
the problem, while non-SMP kernels do not seem to have this issue.
When an SMP kernel is used on the LVS router, the machine crashes after a few
hours without any error message: the whole system stops responding, the monitor
shows nothing, and the keyboard NumLock LED will not light.

Version-Release number of selected component (if applicable):
2.6.9-17.EL and 2.6.9-11.EL

How reproducible:
Configure an LVS router on an SMP machine with an SMP kernel.
Use a script to generate load on the router.
After a few hours the router will crash.

Steps to Reproduce:
1. Configure an LVS router on an SMP machine with an SMP kernel
2. Write a script to generate load on the router
3. After a few hours the router will crash
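
The load script from step 2 was never attached to this report. As a rough
illustration only, a script of that kind could look like the sketch below. It
is not the reporter's script: the function names, the connection count, and the
thread count are all assumptions, and it simply opens many short-lived TCP
connections to whatever address and port the virtual service listens on.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def open_connection(host, port, timeout=2.0):
    """Open and immediately close one TCP connection; return True on success."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, etc. all count as a failure.
        return False


def generate_load(host, port, total=1000, workers=20, timeout=2.0):
    """Open `total` short-lived connections using `workers` threads.

    Returns the number of connections that were established.
    The defaults are arbitrary illustrative values, not tuned numbers.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda _: open_connection(host, port, timeout), range(total)
        )
        return sum(results)
```

To keep the router busy for hours, generate_load() would be called in a loop
against the virtual IP address of the LVS service, for example
generate_load("192.0.2.10", 80), where 192.0.2.10 is a placeholder address.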
  
Actual results:
The system crashes after a few hours.

Expected results:
The system keeps running without crashing.

Additional info:
Comment 1 Jason Baron 2005-10-07 11:45:50 EDT
OK. It would be helpful if we could get a trace of the crash, if there is
one. Can you hook up a serial console? Is there anything in /var/log/messages?
The relevant script might also be helpful. Thanks.
Comment 2 soul916 2005-10-12 00:44:47 EDT
Are you sure Red Hat tested the LVS function before releasing RHEL4?
I'm sure that when the system crashes there is nothing about pulse, the kernel,
or nanny in /var/log/messages. I read the log carefully; the last entry before
the crash is cron executing /usr/bin/mrtg or /usr/lib/sa/sa1.
I found that the crash also appears under the non-SMP kernel, but the system
stays up longer before crashing than with the SMP kernel.
I tested the 2.6.9-22.EL SMP and non-SMP kernels, and the system crashed with
both.
I found that the script is not necessary. You can configure an LVS cluster on
RHEL4, and the system will crash within a few days without any load.
I'm sorry, but I can't hook up a serial console to the server.
Comment 3 Ralf Sticklies 2005-12-04 04:18:35 EST
The following shows up on the console:

Kernel panic - not syncing: fs/block_dev.c:396: spin_lock 
fs/block_dev.c:c035d88

See also:
http://archive.linuxvirtualserver.org/html/lvs-users/2005-07/msg00124.html

Probable solution: a new kernel is needed:
http://archive.linuxvirtualserver.org/html/lvs-users/2005-07/msg00131.html
Comment 4 Pascal Gauthier 2006-03-26 17:24:10 EST
Is there any development on this one? I have the same problem. When I checked
the CentOS bugzilla, I saw that they have added these two patches:


http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/broken-out/ipvs-deadlock-fix.patch
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/broken-out/cancel_rearming_delayed_work.patch


REF: http://bugs.centos.org/view.php?id=1201
Comment 5 Ian Neubert 2006-03-29 13:25:57 EST
I am seeing the same crashes on two single CPU i386 boxes running LVS +
keepalived together. They crashed about every other day. I have since
reinstalled both with Fedora Core 4 with kernel 2.6.15-1.1833_FC4, and have had
no problems.
Comment 6 Jason Baron 2006-03-29 13:31:55 EST
OK, this looks like a duplicate of bug 174990, for which we have a patch in the
current RHEL4 kernel. Please find test kernels at: http://people.redhat.com/~jbaron/rhel4/

*** This bug has been marked as a duplicate of 174990 ***
Comment 7 Ian Neubert 2006-03-29 19:59:42 EST
Jason: just to clarify, is this patch included in 2.6.9-34.EL? Or is it pending
inclusion and currently only in your test kernel?

Unfortunately, I'm not able to access bug 174990 to check myself. Thanks!
Comment 8 Jason Baron 2006-05-05 06:47:04 EDT
The patch is not in -34. It's currently only in my test kernel, but this is the
beta kernel for U4.
Comment 9 Linda Wang 2006-05-09 17:48:33 EDT
Errata tool cleanup: added to the U4 CANFIX list for tracking purposes.
