Description of problem:
When using LVS on RHEL4 with an SMP kernel on an SMP machine, the kernel crashes without any error message. I use RHEL4 as an LVS router on a Dell 1850 server. I have tried kernel versions 2.6.9-17.EL and 2.6.9-11.EL; all the SMP kernels have the problem, while the non-SMP kernels do not seem to have this issue. When an SMP kernel is used on the LVS router, the machine crashes after a few hours without any error message: the whole system is unresponsive, the monitor shows nothing, and the keyboard NumLock LED does not light.
Version-Release number of selected component (if applicable):
2.6.9-17.EL and 2.6.9-11.EL
How reproducible:
Configure an LVS router on an SMP machine with an SMP kernel. Use a script to generate load on the router. After a few hours the router will crash.
Steps to Reproduce:
1. Configure an LVS router on an SMP machine with an SMP kernel.
2. Write a script to generate load on the router.
3. After a few hours the router will crash.
Actual results:
Expected results:
The system will crash after a few hours.
Additional info:
OK. It would be helpful if we could get a trace of the crash, if there is one. Can you hook up a serial console? Is there anything in /var/log/messages? The relevant script might also be helpful. Thanks.
Are you sure that Red Hat tested the LVS functionality before releasing RHEL4? I am sure that when the system crashes there is nothing about pulse, the kernel, or nanny in /var/log/messages. I read the log carefully: the last entry before the crash is cron executing /usr/bin/mrtg or /usr/lib/sa/sa1. I have found that the crash also occurs with the non-SMP kernel, but the uptime before the crash is longer than with the SMP kernel. I tested with the 2.6.9-22.EL SMP and non-SMP kernels, and the system crashed with both. I have also found that the script is not necessary: you can configure an LVS cluster on RHEL4 and the system will crash within a few days without any load. I'm sorry, but I cannot hook up a serial console to the server.
The following shows up on the console: Kernel panic - not syncing: fs/block_dev.c:396: spin_lock fs/block_dev.c:c035d88 See also http://archive.linuxvirtualserver.org/html/lvs-users/2005-07/msg00124.html Probable solution: a new kernel is needed: http://archive.linuxvirtualserver.org/html/lvs-users/2005-07/msg00131.html
Is there any development on this one? I have the same problem. When I checked the CentOS bugzilla, I saw that they have added these two patches: http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/broken-out/ipvs-deadlock-fix.patch http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/broken-out/cancel_rearming_delayed_work.patch REF: http://bugs.centos.org/view.php?id=1201
I am seeing the same crashes on two single-CPU i386 boxes running LVS and keepalived together. They crashed about every other day. I have since reinstalled both with Fedora Core 4 and kernel 2.6.15-1.1833_FC4, and have had no problems.
OK, this looks like a dup of bug 174990, which we have a patch for in the current RHEL4 kernel. Please find test kernels at: http://people.redhat.com/~jbaron/rhel4/ *** This bug has been marked as a duplicate of 174990 ***
Jason: just to clarify, is this patch included in 2.6.9-34.EL? Or is it pending inclusion and currently only in your test kernel? Unfortunately, I'm not able to access bug 174990 to check myself. Thanks!
The patch is not in -34. It's currently only in my test kernel, but this is the beta kernel for U4.
errata tool clean up, add to U4 CANFIX list for tracking purposes.