Bug 128215

Summary: Extreme network slowdown (NOT tcp_window_scaling)
Product: [Fedora] Fedora Reporter: Tim Waugh <twaugh>
Component: kernelAssignee: David Miller <davem>
Status: CLOSED RAWHIDE QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: rawhideCC: bnocera, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 2.6.7-1.499 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2004-07-29 16:12:49 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Fix for TCP window wrapping bug none

Description Tim Waugh 2004-07-20 09:18:20 UTC
Description of problem:
I'm seeing extreme network slowdowns for some connections.

x86_64 (192.168.1.6)
2.6.7-1.488
 |
 | 10Mbps Ethernet
 |
HUB (IBM)
 |
 | 10Mbps Ethernet
 |
i686 (192.168.1.1)
2.6.7-1.492

192.168.1.6 is running 2.6.7-1.488 instead of 492 because 492 causes
its USB mouse not to work.

The TCP connection is initiated from 192.168.1.6 and is a VNC session
over SSH.  Everything goes well until suddenly 192.168.1.6 drops its
TCP window from 65k to 128 bytes -- and 192.168.1.1 is pausing for
1/20th of a second before sending each packet.

Version-Release number of selected component (if applicable):
2.6.7-1.488

How reproducible:
Enough to prevent me using 192.168.1.6 as a terminal.  It doesn't seem
to be something I can trigger, but just sometimes happens.

Here is some of the captured traffic -- I have the full connection
from 'tcpdump -w' if you'd like it.

09:45:47.359735 IP 192.168.1.6.32773 > 192.168.1.1.ssh: S
1369854677:1369854677(0) win 5840 <mss 1460,sackOK,timestamp 229918
0,nop,wscale 0>
09:45:47.360167 IP 192.168.1.1.ssh > 192.168.1.6.32773: S
1566339641:1566339641(0) ack 1369854678 win 5792 <mss
1460,sackOK,timestamp 87666523 229918>
09:45:47.360200 IP 192.168.1.6.32773 > 192.168.1.1.ssh: . ack 1 win
5840 <nop,nop,timestamp 229918 87666523>
09:45:47.443991 IP 192.168.1.1.ssh > 192.168.1.6.32773: P 1:26(25) ack
1 win 5792 <nop,nop,timestamp 87666607 229918>
[...]
09:47:35.399458 IP 192.168.1.6.32773 > 192.168.1.1.ssh: P
39777:39825(48) ack 1528466 win 65160 <nop,nop,timestamp 337971 87774586>
09:47:35.399901 IP 192.168.1.1.ssh > 192.168.1.6.32773: . ack 39825
win 9120 <nop,nop,timestamp 87774587 337971>
09:47:35.446854 IP 192.168.1.1.ssh > 192.168.1.6.32773: P
1528466:1529330(864) ack 39825 win 9120 <nop,nop,timestamp 87774634
337971>
09:47:35.447117 IP 192.168.1.6.32773 > 192.168.1.1.ssh: P
39825:39873(48) ack 1529330 win 128 <nop,nop,timestamp 338019 87774634>
09:47:35.447703 IP 192.168.1.1.ssh > 192.168.1.6.32773: . ack 39873
win 9120 <nop,nop,timestamp 87774635 338019>
09:47:35.702146 IP 192.168.1.1.ssh > 192.168.1.6.32773: P
1529330:1529458(128) ack 39873 win 9120 <nop,nop,timestamp 87774890
338019>
09:47:35.741505 IP 192.168.1.6.32773 > 192.168.1.1.ssh: . ack 1529458
win 128 <nop,nop,timestamp 338314 87774890>
[...]
09:47:37.349815 IP 192.168.1.6.32773 > 192.168.1.1.ssh: . ack 1530498
win 128 <nop,nop,timestamp 339922 87776538>
09:47:37.562882 IP 192.168.1.1.ssh > 192.168.1.6.32773: P
1530498:1530626(128) ack 40305 win 9120 <nop,nop,timestamp 87776751
339922>
09:47:37.562906 IP 192.168.1.6.32773 > 192.168.1.1.ssh: . ack 1530626
win 128 <nop,nop,timestamp 340135 87776751>
09:47:37.773680 IP 192.168.1.1.ssh > 192.168.1.6.32773: P
1530626:1530754(128) ack 40305 win 9120 <nop,nop,timestamp 87776962
340135>
09:47:37.773705 IP 192.168.1.6.32773 > 192.168.1.1.ssh: . ack 1530754
win 128 <nop,nop,timestamp 340346 87776962>
09:47:37.983735 IP 192.168.1.1.ssh > 192.168.1.6.32773: P
1530754:1530882(128) ack 40305 win 9120 <nop,nop,timestamp 87777172
340346>
09:47:37.983758 IP 192.168.1.6.32773 > 192.168.1.1.ssh: . ack 1530882
win 128 <nop,nop,timestamp 340556 87777172>
09:47:38.123170 IP 192.168.1.6.32773 > 192.168.1.1.ssh: P
40305:40353(48) ack 1530882 win 128 <nop,nop,timestamp 340695 87777172>
[...]

Note that 192.168.1.1 is delaying by 0.2s for each packet by this
point, and yet netstat -ntoc output looked like this:

Proto Recv-Q Send-Q Local Address           Foreign Address        
State Timer
tcp    57030      0 127.0.0.1:34985         127.0.0.1:5901         
ESTABLISHED off (0.00/0/0)
tcp     2886  98304 127.0.0.1:5901          127.0.0.1:34985        
ESTABLISHED unkn-4 (5.26/0/0)
tcp        0  58528 ::ffff:192.168.1.1:22   ::ffff:192.168.1.:32773
ESTABLISHED unkn-4 (0.08/0/0)
[...]
Proto Recv-Q Send-Q Local Address           Foreign Address        
State       Timer
tcp    57030      0 127.0.0.1:34985         127.0.0.1:5901         
ESTABLISHED off (0.00/0/0)
tcp     2886  98304 127.0.0.1:5901          127.0.0.1:34985        
ESTABLISHED unkn-4 (4.25/0/0)
tcp        0  57888 ::ffff:192.168.1.1:22   ::ffff:192.168.1.:32773
ESTABLISHED unkn-4 (0.07/0/0)
Proto Recv-Q Send-Q Local Address           Foreign Address        
State       Timer
tcp    57030      0 127.0.0.1:34985         127.0.0.1:5901         
ESTABLISHED off (0.00/0/0)
tcp     2886  98304 127.0.0.1:5901          127.0.0.1:34985        
ESTABLISHED unkn-4 (3.24/0/0)
tcp        0  57208 ::ffff:192.168.1.1:22   ::ffff:192.168.1.:32773
ESTABLISHED unkn-4 (0.07/0/0)
[...]

In other words, it was not an application delay but a kernel delay.

192.168.1.1 has a National Semi|DP83815 (MacPhyter).
192.168.1.6 has an nVidia Corporation|nForce3.

Comment 1 Tim Waugh 2004-07-20 11:20:56 UTC
This also happens when 192.168.1.1 is running 2.6.6-1.435 (and
196.168.1.6 is still running 2.6.7-1.488).

Comment 2 David Miller 2004-07-20 18:21:59 UTC
What you are seeing is "window scaling", that is why the advertised
window drops right after connection setup.  This is a standard TCP
feature that has been around for years, and now Linux is advertising
a window scale of "7" by default.

Some device (router or NAT box usually, some cases it is a DSL box
doing port forwarding) is corrupting the connection due to the
presence of window scaling.

It is not a Linux bug.


Comment 3 Tim Waugh 2004-07-21 08:37:56 UTC
I don't think that's it, because the tcpdump capture was performed
*on* 192.168.1.6, and so the 192.168.1.6 packets are accurate.  I know
about the window scaling problem with some hardware, and that's why I
reported this separately.  This looks different.

The packets where the problem happens are these four:

09:47:35.399458 IP 192.168.1.6.32773 > 192.168.1.1.ssh: P
39777:39825(48) ack 1528466 win 65160 <nop,nop,timestamp 337971 87774586>

Here the window is advertised as 65160 *bytes* -- there is no wscale
option in this packet.

09:47:35.399901 IP 192.168.1.1.ssh > 192.168.1.6.32773: . ack 39825
win 9120 <nop,nop,timestamp 87774587 337971>
09:47:35.446854 IP 192.168.1.1.ssh > 192.168.1.6.32773: P
1528466:1529330(864) ack 39825 win 9120 <nop,nop,timestamp 87774634
337971>

There is nothing unusual about the reverse traffic.

09:47:35.447117 IP 192.168.1.6.32773 > 192.168.1.1.ssh: P
39825:39873(48) ack 1529330 win 128 <nop,nop,timestamp 338019 87774634>

Suddenly the window is 128 *bytes* -- again, there is no wscale option
here.  These packets were captured *on* 192.168.1.6 using 'tcpdump -w
capture -i eth0'.

Comment 4 Tim Waugh 2004-07-21 08:44:26 UTC
> and now Linux is advertising
> a window scale of "7" by default.

Incidentally, look at the trace: wscale 0 appears in the first SYN,
not wscale 7.  In fact, there are no wscale options in the entire
thing other than that initiating SYN.

Full pcap file available if you want it.

Comment 5 Tim Waugh 2004-07-21 09:18:45 UTC
I have also tried disabling tcp_window_scaling on both machines, but
still have similar problems: at some point during the connection
packets will

(a) fragmented due to a small window size (1072 on 192.168.1.6 running
2.6.7-1.488 compared to 23168 on 192.168.1.1 running 2.6.6-1.435)

(b) delayed at 192.168.1.1, presumably due to it filling up the
advertised window on 192.168.1.6.

Again netstat -ntoc on 192.168.1.1 shows a send buffer slowly emptying.

So this can't be a window scaling problem alone.

Comment 6 Tim Waugh 2004-07-21 11:46:28 UTC
Fixing platform field.

Comment 7 Tim Waugh 2004-07-21 13:42:57 UTC
This seems to be linked to tcp_moderate_rcvbuf.  I haven't seen a
repeat of the problem since setting it to zero.

Comment 8 David Miller 2004-07-21 23:36:01 UTC
First of all, window scale options don't go into the
individual packets.  They only appear in the initial
SYN/SYN+ACK exchange, and then if enabled they apply
for the entirety of the connection.

What you will see happen is that the end running the
newer 2.6.x kernels will advertise a window scale of
"7" and the other end (something running 2.4.x for
example) will use a window scale of "0".

Please do an experiment for me, set "tcp_default_win_scale" to
zero on the 2.6.x system(s).  Let me know what happens.
I hope this isn't some x86_64 weird miscompilation issue.


Comment 9 Tim Waugh 2004-07-22 08:59:18 UTC
They are both runing 2.6.x.  I've observed the problem symptoms
several times this morning, and this is the state of things:

x86_64 (192.168.1.6)
2.6.7-1.488
tcp_moderate_rcvbuf=1
tcp_window_scaling=1
tcp_default_win_scale=0

i686 (192.168.1.1)
2.6.7-1.435
tcp_moderate_rcvbuf=0
tcp_window_scaling=0
tcp_default_win_scale=0

I've now changed tcp_moderate_rcvbuf to 0 on 192.168.1.6 so I can do
some work.  I have never seen the problem symptoms with
tcp_moderate_rcvbuf=0 on 192.168.1.6.

Comment 10 David Miller 2004-07-22 20:53:43 UTC
Thanks to your excellent reporting I think I know where the
problem is, we're letting the window overflow 16-bits somewhere.
I'll try to figure out what check is missing.


Comment 11 David Miller 2004-07-22 21:12:34 UTC
Created attachment 102158 [details]
Fix for TCP window wrapping bug

This should fix it.  We were only validating the
range of the window field on connection startup,
not in tcp_select_window() where it needs to happen
as well.

Comment 12 Tim Waugh 2004-07-26 15:05:10 UTC
This patch certainly fixes the problem here.  Thanks!