Bug 1803806 - TCP small-write output delay
Summary: TCP small-write output delay
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 31
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-02-17 13:44 UTC by Jeremy Harris
Modified: 2020-11-24 17:07 UTC
CC List: 17 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2020-11-24 17:07:57 UTC
Type: Bug
Embargoed:


Attachments
Packet capture showing the issue (9.62 KB, application/octet-stream)
2020-02-17 13:44 UTC, Jeremy Harris
Packet capture with workaround (8.64 KB, application/octet-stream)
2020-02-17 13:45 UTC, Jeremy Harris
kernel log since boot (94.94 KB, text/plain)
2020-02-17 13:46 UTC, Jeremy Harris

Description Jeremy Harris 2020-02-17 13:44:41 UTC
Created attachment 1663516
Packet capture showing the issue

1. Please describe the problem:

Small (< 1 MSS) writes/sends are delayed by about 200 ms.

This happens with TCP_NODELAY set, and with the net.ipv4.tcp_autocork sysctl set to either 0 or 1.  If a setsockopt() that turns TCP_CORK off is done immediately after the send() call, the
segment carrying the data does go out promptly (see attached packet captures: "plain" is without the additional setsockopt, "uncork" is with it).

Application-level debug timestamps show the send() call being made promptly, and the ~200 ms delay appearing before the data is available to read (as reported by a select() call).

It feels like an autocork operation, but... why?
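
For readers who want to try the workaround described above, here is a minimal Python sketch. It is an illustration only, not the code of the application in this report; the helper names send_uncorked and loopback_pair are hypothetical. socket.TCP_CORK is Linux-specific, so the sketch skips the option on platforms where it is missing.

```python
import socket

# TCP_CORK exists only on Linux; fall back to None elsewhere.
TCP_CORK = getattr(socket, "TCP_CORK", None)


def send_uncorked(sock: socket.socket, data: bytes) -> None:
    """Send data, then clear TCP_CORK straight after the send.

    Clearing the option right after send() is the workaround from the
    report; the socket is never actually corked beforehand.
    """
    sock.sendall(data)
    if TCP_CORK is not None:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_CORK, 0)


def loopback_pair():
    """Return a connected (client, server) TCP pair on 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    # The report states that TCP_NODELAY was set on the connection.
    client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return client, server
```

Usage would be along the lines of client, server = loopback_pair() followed by send_uncorked(client, b"EHLO test\r\n"). Whether this changes the timing on a given kernel has to be checked with a packet capture, as in the attachments.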


2. What is the Version-Release number of the kernel:

  5.4.18-200.fc31.x86_64


3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

  Unknown

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

  100% in test here.
  - application is doing request/response traffic, with small packet sizes, on a loopback TCP connection
  - observe with wireshark

  See sample pcaps attached; traffic is SMTP (on a nonstandard port).  Frames 14, 18 in "plain" are the smoking gun.
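
The application-level timestamps mentioned in the description can be approximated with a sketch like the one below. It times the gap between a small send() and the moment select() reports the data as readable on a loopback connection. This is not a reproducer: the traffic in this report is multi-step SMTP request/response, and a single one-shot write may well show no delay. The function name measure_delivery_delay, the payload and the timeout are assumptions made for the illustration.

```python
import select
import socket
import time


def measure_delivery_delay(payload: bytes = b"250 OK\r\n",
                           timeout: float = 2.0) -> float:
    """Return seconds between a small send() and select() readability."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    try:
        # Same socket option as in the report.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.monotonic()
        client.sendall(payload)
        # Wait until the receiving side sees data, or give up.
        readable, _, _ = select.select([server], [], [], timeout)
        elapsed = time.monotonic() - start
        if not readable:
            raise TimeoutError("no data became readable within the timeout")
        return elapsed
    finally:
        client.close()
        server.close()
```

A value near 0.2 from a harness like this, driven with the real request/response pattern, would match the frames called out in the "plain" capture.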

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

   TBC

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

   None known

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Comment 1 Jeremy Harris 2020-02-17 13:45:51 UTC
Created attachment 1663518
Packet capture with workaround

Comment 2 Jeremy Harris 2020-02-17 13:46:29 UTC
Created attachment 1663519
kernel log since boot

Comment 3 Jeremy Harris 2020-02-18 19:48:11 UTC
The effect of the issue is, for a testcase of about 85 short SMTP messages going down one SMTP connection:
20 seconds wallclock time with the bug versus 4.3 seconds with the workaround (x86_64 laptop, loopback).
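
As a rough cross-check of these figures against the ~200 ms delay, the extra time per message can be worked out as below. This is back-of-envelope arithmetic on the numbers quoted in this comment, not a measurement; the variable names are made up for the example.

```python
messages = 85             # short SMTP messages in the testcase
with_bug_s = 20.0         # wallclock seconds with the bug
with_workaround_s = 4.3   # wallclock seconds with the workaround

# Extra wallclock time attributable to the delay, averaged per message.
extra_per_message_s = (with_bug_s - with_workaround_s) / messages
print(f"{extra_per_message_s * 1000:.0f} ms extra per message")
# prints "185 ms extra per message"
```

That average is in line with something like one stall of about 200 ms per message, although the comment does not break the time down per exchange.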

Comment 4 Justin M. Forbes 2020-03-03 16:27:22 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 31 kernel bugs.

Fedora 31 has now been rebased to 5.5.7-200.fc31.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 32, and are still experiencing this issue, please change the version to Fedora 32.

If you experience different issues, please open a new bug report for those.

Comment 5 Jeremy Harris 2020-03-04 12:55:35 UTC
Problem still present on 5.5.7-200.fc31.x86_64

Comment 6 Jeremy Harris 2020-05-08 13:05:15 UTC
Still present in 5.6.8-200.fc31.x86_64
Also present in 5.6.8-300.fc32.x86_64

In 4.13.16-100.fc25.x86_64 the extraneous delay seems to be about 120 ms; not quite so bad but still unpleasant.
In 3.14.27-100.fc19.x86_64 it is 200 ms again.
So I'd not say this is a regression, but it does have a major impact on request/response traffic.

Comment 7 Ben Cotton 2020-11-03 17:08:04 UTC
This message is a reminder that Fedora 31 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '31'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 31 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 8 Ben Cotton 2020-11-24 17:07:57 UTC
Fedora 31 changed to end-of-life (EOL) status on 2020-11-24. Fedora 31 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

