Bug 176939 - Crash while running IPVS
Status: CLOSED DUPLICATE of bug 167398
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Thomas Graf
QA Contact: Brian Brock
Reported: 2006-01-04 10:48 EST by daryl herzmann
Modified: 2014-06-18 04:28 EDT (History)

Doc Type: Bug Fix
Last Closed: 2007-02-01 13:06:53 EST

Attachments
netdump log (1.64 KB, text/plain)
2006-01-04 10:48 EST, daryl herzmann
netdump log (31.49 KB, text/plain)
2006-02-13 12:10 EST, daryl herzmann

Description daryl herzmann 2006-01-04 10:48:47 EST
Description of problem:
I have a Sun Fire x2100 that runs as a redundant LVS Director.  The machine
locks up after varying lengths of time while running as the LVS Director.  The
machine appears to be stable when not running IPVS.

With the rhel4u2 kernel, the machine would hard lock without invoking
netdump.  I installed the rhel4u3 beta kernel and now get a netdump.

Version-Release number of selected component (if applicable):
2.6.9-27.EL #1 Tue Dec 20 19:11:47 EST 2005 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
After running IPVS for a while, the machine will hard lock.

Additional info:
I will attach the netdump log file.  Note that netdump never fully reboots the
machine.  The netdump server logs many messages like these:

Dec 30 23:32:01 server netdump[2472]: Got too many timeouts in handshaking, ignoring client x.x.x.x

Dec 30 23:32:04 server netdump[2472]: Got too many timeouts waiting for SHOW_STATUS for client x.x.x.x, rebooting it

Comment 1 daryl herzmann 2006-01-04 10:48:47 EST
Created attachment 122761 [details]
netdump log
Comment 2 daryl herzmann 2006-01-26 10:18:27 EST

The lockups continue.  They seem to be about 7-10 days apart.  The netdump
failures continue as well.  Perhaps I should file a separate bug on netdump failing?

Comment 3 Jason Baron 2006-01-26 10:28:07 EST
Yes, please file a separate bug for the netdump failures.  Thanks.
Comment 4 daryl herzmann 2006-01-26 11:17:20 EST
Thanks.  Done.

Comment 5 daryl herzmann 2006-02-13 12:10:15 EST
Created attachment 124570 [details]
netdump log 

Wow, for whatever reason netdump worked this time when the machine crashed!  I
have attached what appears to be the full 'log'.  The machine also rebooted
properly.  Thanks!
Comment 6 daryl herzmann 2006-04-10 11:29:59 EDT

Just an update.  This machine hasn't crashed since the upgrade to
2.6.9-34.ELsmp.  Uptime is currently at 33 days, which is at least 3 times as
long as it would ever stay up previously.  I am getting some of these messages
in dmesg now:

eth1: too many iterations (6) in nv_nic_irq.

Not sure whether that is related.  eth1 uses the 'forcedeth' driver.

Comment 7 daryl herzmann 2006-04-18 23:52:34 EDT
Oh well, it just crashed again and netdump didn't work :(  My luck ran out.
Comment 8 daryl herzmann 2006-06-07 16:30:56 EDT
Another crash and netdump failed again.
Comment 9 Jason Baron 2006-08-18 11:22:16 EDT
Hmm, have you tried U4?  There are likely relevant fixes there.
Comment 10 daryl herzmann 2006-08-18 11:24:42 EDT
Hi Jason,

I reinstalled both machines with EL3 and haven't had a single crash on either
IPVS director.  Knock on wood!

Comment 11 Lon Hohberger 2007-02-01 09:51:28 EST

This looks strikingly related to #220149
Comment 13 daryl herzmann 2007-02-01 09:58:30 EST

Just to update.  I continue to use EL3 on both directors without issue.  Sorry
that I can't help with this bug anymore.

Comment 14 Lon Hohberger 2007-02-01 10:08:08 EST
Daryl, I am thinking this is just a repeat of #167398.  If the box runs out of
memory because of IPVS, that can cause pretty serious problems.
Comment 15 Lon Hohberger 2007-02-01 13:06:53 EST
Closing -> Dup of 167398

*** This bug has been marked as a duplicate of 167398 ***
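
For anyone checking whether a director is affected by the memory exhaustion
suggested in comment 14, the following is a minimal sketch, not something
taken from this report.  It assumes the IPVS connection table lives in a slab
cache named "ip_vs_conn" and that /proc/slabinfo uses the 2.x column layout
(name, active_objs, num_objs, objsize, ...).  The function names are
hypothetical.

```python
# Sketch: estimate memory held by the IPVS connection slab cache.
# Assumptions (not from the bug report): the cache is named "ip_vs_conn" and
# /proc/slabinfo lines look like "name active_objs num_objs objsize ...".

def slab_usage(slabinfo_text, cache="ip_vs_conn"):
    """Return (active_objs, num_objs, bytes_allocated) for `cache`, or None."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Header lines start with "slabinfo" or "#", so they never match.
        if len(fields) >= 4 and fields[0] == cache:
            active, total, objsize = (int(f) for f in fields[1:4])
            return active, total, total * objsize
    return None


def read_slab_usage(path="/proc/slabinfo", cache="ip_vs_conn"):
    """Read `path` (usually requires root) and report usage for `cache`."""
    with open(path) as f:
        return slab_usage(f.read(), cache)
```

Calling read_slab_usage() periodically as root and logging the result would
show whether the cache grows steadily between crashes; a None result means no
cache with that name was found, for example because the ip_vs module is not
loaded.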
