Bug 154040 - Kernel RPC client doesn't reuse TCP port when reconnecting
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Hardware: All
OS: Linux
Priority: high
Severity: high
Assigned To: Ric Wheeler
QA Contact: Brian Brock
Keywords: Reopened
Depends On:
Blocks: 430698
Reported: 2005-04-06 14:31 EDT by Chuck Lever
Modified: 2010-03-16 13:41 EDT
Doc Type: Bug Fix
Last Closed: 2010-03-16 13:41:27 EDT
Description Chuck Lever 2005-04-06 14:31:42 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050225 Firefox/1.0.1

Description of problem:
NFS servers use a duplicate reply cache (DRC) to detect client retransmissions and minimize the risk of replaying non-idempotent RPC requests.  Such reply caches are typically keyed on client IP address, client port, and RPC XID.

If an NFS server or the network causes an RPC-over-TCP socket to be dropped, the Linux RPC client does not reconnect to the server using the same port number.  This means that any items in the server's DRC that are keyed to the old port number are no longer usable, and the server can't detect replays on the new connection.
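
To illustrate the failure mode described above, here is a minimal toy model of a reply cache keyed on (client IP, client port, XID). It is only a sketch: the class name, the reply format, and the addresses are made up for illustration and do not correspond to any real NFS server code.

```python
from typing import Dict, Optional, Tuple

# Hypothetical cache key, following the description: (client IP, client port, RPC XID).
DrcKey = Tuple[str, int, int]


class DuplicateReplyCache:
    """Toy model of a server-side duplicate reply cache (not real nfsd code)."""

    def __init__(self) -> None:
        self._replies: Dict[DrcKey, str] = {}
        self.executions = 0  # how many times a request was actually executed

    def handle(self, client_ip: str, client_port: int, xid: int, op: str) -> str:
        key = (client_ip, client_port, xid)
        cached: Optional[str] = self._replies.get(key)
        if cached is not None:
            # Retransmission recognized: replay the stored reply, do not re-execute.
            return cached
        # Cache miss: the server executes the (possibly non-idempotent) operation.
        self.executions += 1
        reply = f"done:{op}#{self.executions}"
        self._replies[key] = reply
        return reply


drc = DuplicateReplyCache()
first = drc.handle("192.0.2.10", 800, 0x1234, "REMOVE file")
# Retransmission on the same connection (same port): served from the cache.
same_port = drc.handle("192.0.2.10", 800, 0x1234, "REMOVE file")
# Retransmission after reconnecting from a different port: cache miss, executed again.
new_port = drc.handle("192.0.2.10", 801, 0x1234, "REMOVE file")
```

In this model the request with the same XID is executed twice once the client port changes, which is the replay of a non-idempotent request that the cache is meant to prevent.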

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.  Start an I/O workload on an NFS client (TCP mount)
2.  Note the TCP connection information via "netstat"
3.  Cause the connection to drop (temporary server or network outage)
4.  Note the TCP connection again

Actual Results:  The client reconnects to the server using a different port on the new connection than was used on the old one.

Expected Results:  The client should reconnect to the server using the same port, regardless of TCP TIME_WAIT requirements.  (Note that this is the behavior exhibited by the NFS/RPC reference implementation on Solaris).
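
As a rough userspace analogy for the expected behavior (not the kernel fix itself), a client can remember its source port and bind to it explicitly before reconnecting. The helper name below is hypothetical. SO_REUSEADDR only ensures that bind() is not refused because of a lingering TIME_WAIT socket; whether a subsequent connect() to the same server succeeds immediately still depends on the TCP stack.

```python
import socket


def socket_bound_to(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Create a TCP socket bound to a caller-chosen local port, ready for connect()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow bind() even if an old connection from this port lingers in TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    return sock


# Let the OS pick a port for the "first connection", then remember it.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))
saved_port = first.getsockname()[1]
first.close()

# On "reconnect", ask for the same port again instead of a fresh ephemeral one.
second = socket_bound_to(saved_port)
reused_port = second.getsockname()[1]
second.close()
```

No connection is made in this sketch; it only shows that the saved port can be requested again rather than left to the OS.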

Additional info:

This is a potential data corruption bug, so I am marking the severity of this bugzilla "High."  This is a problem in all releases of RHEL.

Note that the "expected behavior" is the same as exhibited by the NFS/RPC reference implementation on Solaris.

I am working on a fix for 2.6 mainline.
Comment 3 RHEL Product and Program Management 2007-05-09 07:25:38 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 12 RHEL Product and Program Management 2008-08-02 22:27:41 EDT
Product Management has reviewed and declined this request.  You may appeal this
decision by reopening this request.
