Bug 235177

Summary: netdump times out over bonded interfaces
Product: Red Hat Enterprise Linux 4 Reporter: Bryn M. Reeves <bmr>
Component: kernel Assignee: Thomas Graf <tgraf>
Status: CLOSED DUPLICATE QA Contact: Martin Jenner <mjenner>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.4 CC: agospoda, peterm, rkhan, rrajaram, sputhenp, tao
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2008-06-13 20:55:51 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Bryn M. Reeves 2007-04-04 10:47:07 UTC
Description of problem:
Netdump is configured on a host using bonded interfaces (mode 1). When a crash
is triggered on the client the server logs the following messages and fails to
create a vmcore:

Mar 12 08:52:01 oraoms netdump[3375]: Got too many timeouts in handshaking, ignoring client 10.201.1.100
Mar 12 08:52:04 oraoms netdump[3375]: Got too many timeouts waiting for SHOW_STATUS for client 10.201.1.100, rebooting it
Mar 12 08:52:56 oraoms netdump[3375]: Got too many timeouts in handshaking, ignoring client 10.201.1.100
Mar 12 08:52:59 oraoms netdump[3375]: Got too many timeouts waiting for SHOW_STATUS for client 10.201.1.100, rebooting it


Version-Release number of selected component (if applicable):
netdump-server-0.7.16-2-i386
netdump-0.7.16-2-i386
2.6.9-42.0.3.ELsmp [netdump client]

How reproducible:
100%

Steps to Reproduce:
1. Configure a mode 1 bonded interface
2. Configure netdump to use the bonded interface
3. Trigger a crash (e.g. sysrq-c)
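
A minimal sketch of what the steps above might look like on the client, shown as
config file fragments. This is an assumption about a typical RHEL 4 setup, not
the reporter's actual configuration: the interface name bond0, the miimon value,
and the server address placeholder are all hypothetical.

```
# /etc/modprobe.conf (assumed): load the bonding driver in mode 1 (active-backup)
alias bond0 bonding
options bond0 mode=1 miimon=100

# /etc/sysconfig/netdump (assumed): send the dump over the bonded interface
DEV=bond0
NETDUMPADDR=<netdump server IP>

# Step 3, run as root on the client (this crashes the machine):
#   echo 1 > /proc/sys/kernel/sysrq
#   echo c > /proc/sysrq-trigger
```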
  
Actual results:
Mar 12 08:52:01 oraoms netdump[3375]: Got too many timeouts in handshaking, ignoring client 10.201.1.100
Mar 12 08:52:04 oraoms netdump[3375]: Got too many timeouts waiting for SHOW_STATUS for client 10.201.1.100, rebooting it
Mar 12 08:52:56 oraoms netdump[3375]: Got too many timeouts in handshaking, ignoring client 10.201.1.100
Mar 12 08:52:59 oraoms netdump[3375]: Got too many timeouts waiting for SHOW_STATUS for client 10.201.1.100, rebooting it

Expected results:
A vmcore created and populated on the server.

Additional info:
Netpoll support for bonded interfaces was introduced in RHEL4 U4 (2.6.9-37.EL):

* Fri May 19 2006 Jason Baron <jbaron> [2.6.9-37]

- Introduce netpoll over bonded interfaces (Thomas Graf) [174184 126164 190162 146164]

Comment 1 RHEL Program Management 2007-05-09 05:01:53 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 3 Andy Gospodarek 2007-05-24 20:38:30 UTC
This should be resolved in 4.5, right Thomas?

Comment 4 Thomas Graf 2007-05-24 21:59:23 UTC
It's probably the known ARP issue. Please try adding a permanent ARP entry
on the netdump server for the corresponding client and see if the problem
disappears.
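
For reference, a permanent ARP entry of the kind suggested here could be added
on the netdump server roughly as sketched below. The client IP is the one from
the server log in this report. The MAC address is a placeholder (an assumption)
and has to be replaced with the real MAC of the client's bonded interface. The
helper name static_arp_cmd is hypothetical. It only prints the command so it can
be reviewed before being run as root.

```shell
#!/bin/sh
# Sketch: build the command that pins a permanent ARP entry for the
# netdump client on the netdump server.

CLIENT_IP="10.201.1.100"          # client address from the server log
CLIENT_MAC="00:11:22:33:44:55"    # placeholder, NOT the real MAC (assumption)

# Print the net-tools command instead of executing it.
# $1 = client IP, $2 = client MAC
static_arp_cmd() {
    printf 'arp -s %s %s\n' "$1" "$2"
}

static_arp_cmd "$CLIENT_IP" "$CLIENT_MAC"
```

Running the printed command as root on the server and then checking `arp -n`
should show the client's entry as static. If the handshake timeouts disappear
with the entry in place, that points at ARP resolution as the cause.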

Comment 7 RHEL Program Management 2007-09-07 19:35:53 UTC
This request was previously evaluated by Red Hat Product Management
for inclusion in the current Red Hat Enterprise Linux release, but
Red Hat was unable to resolve it in time.  This request will be
reviewed for a future Red Hat Enterprise Linux release.

Comment 8 Thomas Graf 2008-06-13 20:55:51 UTC

*** This bug has been marked as a duplicate of 239551 ***