Bug 201049 - nbd-server on RHEL4U3 kernel fails to accept nbd-client connection
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Priority: medium
Severity: high
Assigned To: Neil Horman
QA Contact: Brian Brock
Reported: 2006-08-02 11:21 EDT by Mike Snitzer
Modified: 2007-11-30 17:07 EST
Doc Type: Bug Fix
Last Closed: 2006-08-16 07:19:03 EDT

Description Mike Snitzer 2006-08-02 11:21:44 EDT
Description of problem:

When nbd-client and nbd-server run on separate RHEL4U3 servers with the
2.6.9-34.ELsmp kernel, the nbd-client connection to the nbd-server fails to negotiate.

There are active threads on the nbd-general mailing list that discuss this
issue with the nbd-server.  The relevant threads can be found here:
http://sourceforge.net/mailarchive/forum.php?thread_id=26829697&forum_id=40388
http://sourceforge.net/mailarchive/forum.php?thread_id=28939130&forum_id=40388

The executive summary of those threads is: the nbd-server's accept() hangs,
leaving the nbd-client's negotiation hanging.  The concern is that the kernel
is what triggers this failure in nbd's user-space socket code.
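
The client-side symptom can be illustrated in isolation.  The following is a
minimal sketch, not nbd code: it assumes a plain TCP listener on localhost that
never reaches accept(), standing in for the stalled nbd-server.  The function
name and the timeout value are hypothetical.  On Linux the kernel normally
completes the TCP handshake and queues the connection, so the client's connect()
returns and the client then blocks waiting for the first bytes of the greeting.

```python
import socket

def probe_stalled_server(timeout=1.0):
    """Connect to a listener that never calls accept() and wait for a greeting."""
    # Hypothetical stand-in for a server whose accept() is never reached.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        # The kernel queues the connection in the listen backlog, so
        # connect() succeeds even though nobody has accepted it yet.
        client.connect(("127.0.0.1", port))
        try:
            client.recv(8)            # wait for the first bytes of a greeting
            return "greeting received"
        except socket.timeout:
            return "connected, but no greeting"
    finally:
        client.close()
        listener.close()

if __name__ == "__main__":
    print(probe_stalled_server())
```

A probe of this shape only shows that a successful connect() does not prove the
server has accepted the connection; it says nothing about why the server stalls.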

When a stock kernel.org kernel >= 2.6.15 is used on the nbd-server system, the
hanging nbd-client negotiation with the nbd-server has not been reproduced.

Version-Release number of selected component (if applicable):
2.6.9-34.ELsmp

How reproducible:
This nbd-server accept() hang happens frequently in production on numerous
servers, but no reliable synthetic reproducer has been identified.

Steps to Reproduce:
1. Start nbd-server (any version from 2.7.3 through 2.8.5) on a server running
2.6.9-34.ELsmp.
2. Make an nbd-client connection to the nbd-server while the server is busy with moderate I/O.
3. Reboot the nbd-client system (or restart nbd-client); eventually the
nbd-client's negotiation hangs.
  
Actual results:
nbd-client will eventually hang waiting for nbd-server's accept() with the
following output: 
Negotiation: ..

Expected results:

The nbd-client should complete the connection to the nbd-server with something like:

Negotiation: ..size = 102400KB
bs=1024, sz=102400
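
For reference, the size reported above comes from the greeting the server sends
during negotiation.  The sketch below parses that greeting under the assumption
that the server uses the old-style NBD handshake as described in the NBD
protocol documentation (8-byte "NBDMAGIC" password, 8-byte magic, 8-byte export
size in bytes, 128 reserved bytes).  Whether nbd 2.7.3 through 2.8.5 use exactly
this layout is an assumption here, and the helper names are hypothetical.

```python
import struct

# Assumed old-style greeting layout: 8-byte password, 8-byte magic,
# 8-byte export size in bytes, then 128 reserved bytes.
INIT_PASSWD = b"NBDMAGIC"
CLISERV_MAGIC = 0x00420281861253
GREETING_LEN = 8 + 8 + 8 + 128

def parse_greeting(data):
    """Return the export size in bytes, or raise ValueError on a bad greeting."""
    if len(data) < GREETING_LEN:
        raise ValueError("short greeting: got %d bytes" % len(data))
    # Big-endian: 8-byte string followed by two unsigned 64-bit integers.
    passwd, magic, size = struct.unpack(">8sQQ", data[:24])
    if passwd != INIT_PASSWD:
        raise ValueError("unexpected password %r" % passwd)
    if magic != CLISERV_MAGIC:
        raise ValueError("unexpected magic 0x%x" % magic)
    return size

def describe(size_bytes):
    """Format the size the way the expected client output shows it."""
    return "size = %dKB" % (size_bytes >> 10)

if __name__ == "__main__":
    # Synthetic greeting for a 102400 KB export, built only for illustration.
    sample = struct.pack(">8sQQ", INIT_PASSWD, CLISERV_MAGIC, 102400 * 1024)
    sample += bytes(128)
    print(describe(parse_greeting(sample)))
```

A greeting shorter than the assumed length raises ValueError, which is the case
a client stuck in negotiation would correspond to.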

Additional info:
Comment 1 Neil Horman 2006-08-16 07:19:03 EDT
The thread that you provided:
http://sourceforge.net/mailarchive/forum.php?thread_id=28939130&forum_id=40388
already seems to give the answer that you need.  This isn't a bug in the kernel's
TCP accept code, but rather an application bug.  You need to use version 2.8.6
or later of the nbd server code.  Upgrade your nbd software and try again.
