Red Hat Bugzilla – Bug 201049
nbd-server on RHEL4U3 kernel fails to accept nbd-client connection
Last modified: 2007-11-30 17:07:26 EST
Description of problem:
When the nbd-client and nbd-server run on separate RHEL4U3 2.6.9-34.ELsmp
based servers, the nbd-client connection to the nbd-server fails to negotiate.
There is an active thread on the nbd-general mailing list that speaks to this
issue with the nbd-server. The relevant threads can be found here:
The executive summary of those threads is: the nbd-server's accept() is hanging,
leaving the nbd-client's negotiation hanging. The concern is that the kernel,
rather than nbd's user-space socket code, is what is causing this bug.
When a stock kernel.org kernel >= 2.6.15 is used on the nbd-server system, the
hanging nbd-client negotiation with the nbd-server has not been reproduced.
Version-Release number of selected component (if applicable):
This nbd-server accept() hang happens frequently in production on numerous
servers but a reliable synthetic reproducer test has not been identified.
Steps to Reproduce:
1. start nbd-server (be it version 2.7.3 thru 2.8.5) on a server running
2. make an nbd-client connection to nbd-server while the server is busy with moderate IO
3. rebooting the nbd-client system (or restarting nbd-client) eventually causes the
nbd-client's negotiation to hang
nbd-client will eventually hang waiting for nbd-server's accept() with the
the nbd-client should complete the connection to the nbd-server with something like:
Negotiation: ..size = 102400KB
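Since no reliable reproducer has been identified, a small probe run in a loop against the server may help tell a stalled negotiation apart from a refused or dropped connection. The following is only a sketch under stated assumptions: the function name `probe_negotiation` and its return strings are hypothetical, and it assumes the server opens the handshake by sending the 8-byte string "NBDMAGIC", as the 2.x nbd-server does. A TCP connect can succeed from the kernel's listen backlog even when the server never calls accept(), so the probe treats "connected but no greeting within the timeout" as a hang.

```python
import socket

# Assumption: the server's first 8 handshake bytes are "NBDMAGIC".
NBD_INIT_PASSWD = b"NBDMAGIC"


def probe_negotiation(host, port, timeout=5.0):
    """Connect and wait for the start of the NBD handshake.

    Returns "ok" if the greeting arrives, "hang" if the connect or the
    read times out, "closed" if the peer closes early, "unexpected" if
    other bytes arrive, and "error" for any other socket failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            data = b""
            # recv() may return fewer bytes than requested, so loop.
            while len(data) < len(NBD_INIT_PASSWD):
                chunk = sock.recv(len(NBD_INIT_PASSWD) - len(data))
                if not chunk:
                    return "closed"
                data += chunk
            return "ok" if data == NBD_INIT_PASSWD else "unexpected"
    except socket.timeout:
        # Must be caught before OSError, which is its base class.
        return "hang"
    except OSError:
        return "error"
```

Calling `probe_negotiation(host, port)` repeatedly while the server is under IO load, and logging each "hang" result with a timestamp, would give a rough measure of how often the stall occurs; the host and port are whatever the nbd-server was started with.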
This thread that you provided:
seems to give the answer that you need already. This isn't a bug in the kernel's
TCP accept code, but rather an application bug. You need to use version 2.8.6
or later of the nbd server code. Upgrade your nbd software and try again.