Red Hat Bugzilla – Bug 176939
Crash while running IPVS
Last modified: 2014-06-18 04:28:43 EDT
Description of problem:
I have a Sun Fire x2100 that runs as a redundant LVS Director. The machine
locks up after varying amounts of time while running as the LVS Director. The
machine appears to be stable when not running IPVS.
While running the rhel4u2 kernel, the machine would hard lock without invoking
netdump. I installed the rhel4u3 beta kernel and am now getting a netdump.
Version-Release number of selected component (if applicable):
2.6.9-27.EL #1 Tue Dec 20 19:11:47 EST 2005 x86_64 x86_64 x86_64 GNU/Linux
After running IPVS for a while, the machine will hard lock.
Will attach the netdump log file. Note that netdump never fully reboots the
machine. The netdump server logs lots of these messages:
Dec 30 23:32:01 server netdump: Got too many timeouts in handshaking,
ignoring client x.x.x.x
Dec 30 23:32:04 server netdump: Got too many timeouts waiting for
SHOW_STATUS for client x.x.x.x, rebooting it
Created attachment 122761
The lockups continue. They seem to be about 7-10 days apart. The netdump
failures continue as well. Perhaps I should file a separate bug on netdump failing?
Yes, please file a separate bug for the netdump failures. Thanks.
Created attachment 124570
Wow, for whatever reason netdump worked this time when the machine crashed! I
have attached what appears to be the full 'log'. The machine also rebooted
properly as well. Thanks!
Just for an update. This machine hasn't crashed since the upgrade to
2.6.9-34.ELsmp. Uptime is currently at 33 days, which is at least 3 times as
long as it would ever stay up previously. I am getting some of these messages
in dmesg now:
eth1: too many iterations (6) in nv_nic_irq.
Not sure if that is related or not. eth1 is using 'forcedeth'
Oh well, just crashed again and netdump didn't work :( My luck ran out
Another crash and netdump failed again.
Hmm, have you tried U4? There are likely relevant fixes there.
I reinstalled both machines to EL3 and haven't had a single crash on either IPVS
director. Knock on wood!
This looks strikingly related to #220149
Just to update. I continue to use EL3 on both directors without issue. Sorry
that I can't help with this bug anymore.
Daryl, I am thinking this is just a repeat of #167398. If the box runs out of
memory because of IPVS, that can cause pretty serious problems.
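If anyone wants to check the memory theory on a box that is still running, here is a rough sketch that estimates how much slab memory the IPVS connection table holds. It assumes the connection cache shows up in /proc/slabinfo as "ip_vs_conn" and that the columns follow the slabinfo 2.x layout (name, active_objs, num_objs, objsize); the function name slab_cache_bytes is just made up for this example, and reading /proc/slabinfo may need root.

```python
def slab_cache_bytes(slabinfo_text, cache_name="ip_vs_conn"):
    """Estimate bytes held by one slab cache from /proc/slabinfo text.

    Assumes slabinfo 2.x columns: name active_objs num_objs objsize ...
    Returns None if the cache is not listed.
    """
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == cache_name:
            num_objs = int(fields[2])   # objects allocated in the cache
            objsize = int(fields[3])    # size of one object in bytes
            return num_objs * objsize
    return None


if __name__ == "__main__":
    try:
        with open("/proc/slabinfo") as f:
            used = slab_cache_bytes(f.read())
    except OSError as err:
        print("could not read /proc/slabinfo:", err)
    else:
        if used is None:
            print("ip_vs_conn cache not found (is IPVS loaded?)")
        else:
            print("ip_vs_conn slab usage: %.1f MiB" % (used / (1024.0 * 1024.0)))
```

Running this from cron every few minutes and logging the output would show whether the number keeps climbing in the days before a lockup.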
Closing -> Dup of 167398
*** This bug has been marked as a duplicate of 167398 ***