Bug 62500 - write() to socket doesn't set PSH flag on packets = MTU
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.1
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Arjan van de Ven
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2002-04-01 21:01 UTC by John Dalbec
Modified: 2007-04-18 16:41 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2002-06-14 12:25:20 UTC
Embargoed:


Attachments:
Patch to set the PUSH flag on the last packet of outgoing TCP messages even when packet size = MSS (308 bytes, patch)
2002-06-13 18:55 UTC, John Dalbec
Patch which is installed and fixed the PSH problem. (1.09 KB, patch)
2002-06-14 12:25 UTC, David Miller

Description John Dalbec 2002-04-01 21:01:37 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.79 [en] (Windows NT 5.0; U)

Description of problem:
Calling write() on a TCP/IP socket does not set the PSH flag on packets whose length equals the MTU.  Many TCP/IP stacks tolerate this, and it only matters when every packet sent by the write() has length equal to the MTU.  My specific problem is that the exim mailer (3.34) cannot deliver emails of length 1448 bytes or 2896 bytes to SMTP running on IBM VM/ESA 2.4, because they fit into exactly one or two packets of length equal to the MTU.  Apparently the VM TCP/IP stack will queue locally destined packets up to a total of 4096 bytes unless it sees a PSH flag.
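
The sizes 1448 and 2896 are consistent with full-sized TCP segments on a 1500-byte Ethernet MTU when the TCP timestamp option is in use. The sketch below works through that arithmetic. The 1500-byte MTU, the 20-byte IPv4 and TCP headers, the 12 bytes of timestamp option, and the function names are all assumptions for illustration; none of them are stated in the report.

```python
# Assumed values (not from the report): Ethernet MTU, base IPv4 and TCP
# header sizes, and the TCP timestamp option padded to 12 bytes.
MTU = 1500
IPV4_HEADER = 20
TCP_HEADER = 20
TCP_TIMESTAMP_OPTION = 12


def payload_per_segment(mtu=MTU):
    """Bytes of application data carried by one full-sized segment."""
    return mtu - IPV4_HEADER - TCP_HEADER - TCP_TIMESTAMP_OPTION


def fills_segments_exactly(message_len, mtu=MTU):
    """True if the message splits into full-sized segments with no remainder."""
    return message_len > 0 and message_len % payload_per_segment(mtu) == 0


print(payload_per_segment())         # 1448
print(fills_segments_exactly(2896))  # True
```

Under these assumptions, a message of 1448 or 2896 bytes leaves no short trailing segment, which is the case the report says goes out without PSH.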

Version-Release number of selected component (if applicable):
2.4.9-31


How reproducible:
Always

Steps to Reproduce:
1. Run tcpdump.
2. Send a message via exim.  You may need to adjust the message length so that it exactly fills a whole number of packets, each of length equal to the MTU.  Make sure you're not using TLS.


Actual Results:  The packets containing the message data do not have the PSH flag set.

Expected Results:  At least the last data packet should have the PSH flag set.
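
For checking a capture by hand, the PSH bit is bit 0x08 of the flags byte at offset 13 of the TCP header. The following is a minimal sketch of that check on raw header bytes. The helper names `has_psh` and `build_header`, and the port and window values used to build a sample header, are made up for illustration.

```python
import struct

TCP_FLAG_PSH = 0x08  # PSH bit in the TCP flags byte
TCP_FLAG_ACK = 0x10  # ACK bit, used here only to build sample headers


def has_psh(tcp_header: bytes) -> bool:
    """Return True if the PSH bit is set in a raw TCP header."""
    if len(tcp_header) < 20:
        raise ValueError("TCP header must be at least 20 bytes")
    return bool(tcp_header[13] & TCP_FLAG_PSH)


def build_header(flags: int) -> bytes:
    """Build a minimal 20-byte TCP header with placeholder field values."""
    # Fields: src port, dst port, seq, ack, data offset (5 words), flags,
    # window, checksum, urgent pointer.
    return struct.pack("!HHIIBBHHH", 1025, 25, 0, 0, 5 << 4, flags, 8192, 0, 0)


print(has_psh(build_header(TCP_FLAG_ACK)))                 # False
print(has_psh(build_header(TCP_FLAG_ACK | TCP_FLAG_PSH)))  # True
```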

Additional info:

See RFC 793.  I would argue that the write() interface is equivalent to an implementation of send() without a push flag for the purposes of interpreting the RFC.

Comment 1 Arjan van de Ven 2002-04-01 21:24:40 UTC
This is a bug in the VM tcp/ip stack; please ask IBM for the hotfix for this.

Comment 2 John Dalbec 2002-04-01 21:41:40 UTC
Do you know of a hotfix for this or do you want me to ask them for one?  When we consulted IBM, they said this was a Linux bug.  Can you help me find evidence to 
the contrary?

Comment 3 Arjan van de Ven 2002-04-01 21:52:33 UTC
The TCP RFSs state rather clearly that the user should not be made to wait
indefinitely for queued received data just because PSH is not set. PSH is advisory.

Comment 4 Arjan van de Ven 2002-04-01 21:54:24 UTC
s/RFS/RFC/

Comment 5 John Dalbec 2002-04-02 16:20:17 UTC
I don't know my way around the RFCs.  :-(  Could you give me an RFC number and section?

Comment 6 Arjan van de Ven 2002-04-03 11:53:54 UTC
for example:
rfc793 section 2.8:
  There is no necessary relationship between push functions and segment
  boundaries.  The data in any particular segment may be the result of a
  single SEND call, in whole or part, or of multiple SEND calls.

  The purpose of push function and the PUSH flag is to push data through
  from the sending user to the receiving user.  It does not provide a
  record service.

  There is a coupling between the push function and the use of buffers
  of data that cross the TCP/user interface.  Each time a PUSH flag is
  associated with data placed into the receiving user's buffer, the
  buffer is returned to the user for processing even if the buffer is
  not filled.  If data arrives that fills the user's buffer before a
  PUSH is seen, the data is passed to the user in buffer size units.



while this states the behavior of what to do WHEN you get a PUSH, it also states
that there doesn't need to be a relation between segment boundaries and PUSHes.
Also NOWHERE does it say you MUST send a PUSH (other than in final packets and
urgent packets)....
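
The buffer behavior in the quoted passage can be sketched as a small simulation. This is an illustration only, not any real stack's code: the `deliver` function is hypothetical, and the 4096-byte buffer is taken from the reporter's description of the VM stack. It models a receiver that passes data to the user only when a PSH arrives or the buffer fills.

```python
def deliver(segments, buffer_size=4096):
    """Simulate a receiver that hands data up on PSH or on a full buffer.

    segments: iterable of (payload_length, psh_flag) tuples.
    Returns (delivered, held): the chunk sizes passed to the user and the
    number of bytes still queued afterwards.
    """
    delivered = []
    held = 0
    for length, psh in segments:
        held += length
        # Data that fills the buffer is passed up in buffer-size units.
        while held >= buffer_size:
            delivered.append(buffer_size)
            held -= buffer_size
        # A PSH returns the buffer to the user even if it is not full.
        if psh and held:
            delivered.append(held)
            held = 0
    return delivered, held


print(deliver([(1448, False), (1448, False)]))  # ([], 2896)
print(deliver([(1448, False), (1448, True)]))   # ([2896], 0)
```

In this model, two 1448-byte segments without PSH stay queued, which matches the stall described in the report.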

Comment 7 John Dalbec 2002-04-03 14:51:52 UTC
RFC 1122, 4.2.2.2:

            A TCP MAY implement PUSH flags on SEND calls.  If PUSH flags
            are not implemented, then the sending TCP: (1) must not
            buffer data indefinitely, and (2) MUST set the PSH bit in
            the last buffered segment (i.e., when there is no more
            queued data to be sent).

Does Linux implement PUSH flags on SEND calls?  I can't find any userland settings for this.
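
Rule (2) in the quoted text can be sketched as follows. This is not the kernel's code: `segment_with_push` is a hypothetical helper, and the 1448-byte MSS is assumed from the message sizes in the report. It shows PSH being set on the last segment of the queued data, including when that segment is exactly MSS bytes long.

```python
def segment_with_push(data_len, mss=1448):
    """Split data_len bytes into segments, setting PSH on the last one.

    Returns a list of (payload_length, psh_flag) tuples.
    """
    segments = []
    remaining = data_len
    while remaining > 0:
        size = min(mss, remaining)
        remaining -= size
        # PSH marks the segment after which no queued data remains.
        segments.append((size, remaining == 0))
    return segments


print(segment_with_push(2896))  # [(1448, False), (1448, True)]
print(segment_with_push(1500))  # [(1448, False), (52, True)]
```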

Comment 8 Michael K. Johnson 2002-04-03 15:13:00 UTC
Dave, can you reconcile these two quotes in context?

Comment 9 David Miller 2002-04-03 16:46:41 UTC
It looks like indeed we are required to fix this.
I'll cook up a patch.


Comment 10 John Dalbec 2002-04-04 18:52:50 UTC
Since davem acknowledges that there is a problem, I'm reopening the bug.

Comment 11 John Dalbec 2002-06-13 18:55:07 UTC
Created attachment 60886
Patch to set the PUSH flag on the last packet of outgoing TCP messages even when packet size = MSS

Comment 12 David Miller 2002-06-14 04:26:10 UTC
We already installed a fix into our kernel sources, there is no need
for you to provide a new one and this bug should be closed.


Comment 13 John Dalbec 2002-06-14 12:07:49 UTC
This bug is *not* fixed in kernel 2.4.9-34;  I had to install a kernel with this patch to fix it.
It looks like you fixed do_tcp_sendpages but tcp_sendmsg needs to have the same fix applied.

Comment 14 David Miller 2002-06-14 12:25:15 UTC
Created attachment 60965
Patch which is installed and fixed the PSH problem.

Comment 15 David Miller 2002-06-14 12:27:06 UTC
I attached the patch which is installed and fixes the problem.
As you can clearly see, it modifies tcp_sendmsg which is where
the problem is. It does not change do_tcp_sendpages, that case
actually got this bit right.

And if you look it is nearly identical to your patch.

It is in our tree, I remember sending this out to Arjan several
times.


Comment 16 John Dalbec 2002-06-14 12:38:32 UTC
OK, so this will be in the next errata kernel (whenever that comes out...)?

