Description of problem:

I set up my FC3 machine as my firewalling gateway when a previous machine died. Network throughput seemed poor, and some sites as good as stopped working altogether. Examining a tcpdump of the poor connections showed something very odd: incoming packets were frequently getting dropped. But if I connect from a machine _behind_ the firewall, the incoming packets don't get dropped. I have a reliable broadband connection.

TCP pushes seem to get through okay. It appears to be when we've got past slow start, and the remote host tries to send multiple packets at once, that the packets get dropped.

Here's a sample of a tcpdump from a connection to a remote news server, requesting the news active file, just with "telnet news.individual.net nntp". The push at the top is me sending the "list active" line:

00:51:19.644839 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145 > individual.net.nntp: P 2552681086:2552681099(13) ack 3452392115 win 6432 <nop,nop,timestamp 273176760 3387426>
00:51:19.714611 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145: . ack 13 win 49152 <nop,nop,timestamp 3387447 273176760>
00:51:19.718526 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145: . 1:1449(1448) ack 13 win 49152 <nop,nop,timestamp 3387447 273176760>
00:51:19.718652 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145 > individual.net.nntp: . ack 1449 win 8688 <nop,nop,timestamp 273176834 3387447>
00:51:19.727559 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145: . 1449:2897(1448) ack 13 win 49152 <nop,nop,timestamp 3387447 273176760>
00:51:19.727690 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145 > individual.net.nntp: . ack 2897 win 11584 <nop,nop,timestamp 273176843 3387447>
00:51:19.730451 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145: . 4097:5545(1448) ack 13 win 49152 <nop,nop,timestamp 3387447 273176760>
00:51:19.730566 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145 > individual.net.nntp: . ack 2897 win 11584 <nop,nop,timestamp 273176846 3387447,nop,nop,sack sack 1 {4097:5545} >
00:51:19.772992 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145: . 5545:6993(1448) ack 13 win 49152 <nop,nop,timestamp 3387447 273176760>
00:51:19.773116 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145 > individual.net.nntp: . ack 2897 win 11584 <nop,nop,timestamp 273176888 3387447,nop,nop,sack sack 1 {4097:6993} >
00:51:19.778491 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145: . 8441:9889(1448) ack 13 win 49152 <nop,nop,timestamp 3387447 273176760>
00:51:19.778618 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145 > individual.net.nntp: . ack 2897 win 11584 <nop,nop,timestamp 273176894 3387447,nop,nop,sack sack 2 {8441:9889}{4097:6993} >
00:51:19.794663 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145: . 9889:11337(1448) ack 13 win 49152 <nop,nop,timestamp 3387447 273176760>
00:51:19.794764 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145 > individual.net.nntp: . ack 2897 win 11584 <nop,nop,timestamp 273176910 3387447,nop,nop,sack sack 2 {8441:11337}{4097:6993} >
00:51:19.803763 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145: . 11337:12785(1448) ack 13 win 49152 <nop,nop,timestamp 3387447 273176760>
00:51:19.803809 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145 > individual.net.nntp: . ack 2897 win 11584 <nop,nop,timestamp 273176919 3387447,nop,nop,sack sack 2 {8441:12785}{4097:6993} >
00:51:19.853043 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145: . 2897:4345(1448) ack 13 win 49152 <nop,nop,timestamp 3387447 273176760>
00:51:19.853102 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145 > individual.net.nntp: . ack 6993 win 14480 <nop,nop,timestamp 273176968 3387447,nop,nop,sack sack 2 {4097:4345}{8441:12785} >

And so on. This particular remote server plays very poorly with TCP retransmits later on, and assumes the worst with congestion, and eventually the 250KiB or so active file download times out after 15 minutes. This site is worse than most, but from what I can see, it affects all connections.

Here is the same operation, performed by a (Windows) machine behind the same gateway with no configuration changes:

00:53:22.108688 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720 > individual.net.nntp: P 11:13(2) ack 1 win 17265
00:53:22.151819 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . ack 13 win 49152
00:53:22.155356 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 1:1461(1460) ack 13 win 49152
00:53:22.156509 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: P 2921:4097(1176) ack 13 win 49152
00:53:22.158246 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720 > individual.net.nntp: . ack 1461 win 17520 <nop,nop,sack sack 1 {2921:4097} >
00:53:22.161174 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 1461:2921(1460) ack 13 win 49152
00:53:22.163170 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720 > individual.net.nntp: . ack 4097 win 17520
00:53:22.166279 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 4097:5557(1460) ack 13 win 49152
00:53:22.172699 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 5557:7017(1460) ack 13 win 49152
00:53:22.174601 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720 > individual.net.nntp: . ack 7017 win 17520
00:53:22.179170 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 7017:8477(1460) ack 13 win 49152
00:53:22.184477 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 8477:9937(1460) ack 13 win 49152
00:53:22.186498 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720 > individual.net.nntp: . ack 9937 win 17520
00:53:22.204593 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 9937:11397(1460) ack 13 win 49152
00:53:22.206028 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 11397:12857(1460) ack 13 win 49152
00:53:22.208011 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720 > individual.net.nntp: . ack 12857 win 17520
00:53:22.210327 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 12857:14317(1460) ack 13 win 49152
00:53:22.215131 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 14317:15777(1460) ack 13 win 49152
00:53:22.217085 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720 > individual.net.nntp: . ack 15777 win 16900
00:53:22.221728 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 15777:17237(1460) ack 13 win 49152
00:53:22.224848 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 17237:18697(1460) ack 13 win 49152
00:53:22.226741 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720 > individual.net.nntp: . ack 18697 win 13980
00:53:22.232569 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 18697:20157(1460) ack 13 win 49152
00:53:22.237553 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720: . 20157:21617(1460) ack 13 win 49152
00:53:22.239545 IP cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.4720 > individual.net.nntp: . ack 21617 win 11060

As you can see, there's one packet lost, but that's just an aberrant glitch. The download proceeds happily with no further loss. Note that the point at which things go wrong in the first dump corresponds to the point in the second dump where the remote host sends two packets back to back. Disabling SACKs makes no difference.

I have tried adding a blanket allowance for news.individual.net as an experiment:

iptables -I INPUT 1 -m tcp -p tcp -s individual.net -j ACCEPT

but that made no difference. I can send my /etc/sysconfig/iptables if it would be helpful, although I am reluctant to do that for security reasons. I even briefly disabled my iptables rules completely to see if that helped, but it didn't, and lsmod confirmed that the ipt* modules had been unloaded.

Just ask if there is more I can do to debug the problem, although I don't have any easy means to spy externally on what gets sent between the gateway machine and the firewall router, as I have no spare hub. I may be able to get one. I would be surprised if it didn't show the packets coming in, though. I haven't seen any reports of anything similar, which seems all the odder.

Version-Release number of selected component (if applicable):
2.6.12-1.1376_FC3 for athlon
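For anyone wanting to spot the missing segments without reading the dumps by eye, here is a rough sketch in Python. It is not part of the original report: the function name `find_gaps` and the regular expression are illustrative assumptions, and it only understands the classic tcpdump text format shown above (`start:end(len)`). It reports every point where a segment from the given sender starts beyond the highest byte seen so far, which from the receiver's side means loss or reordering.

```python
import re

# Matches the "start:end(len)" sequence range in classic tcpdump text output.
SEQ_RE = re.compile(r"\s(\d+):(\d+)\((\d+)\)\s")


def find_gaps(lines, sender):
    """Return (expected, got) pairs for segments from `sender` that start
    beyond the highest sequence number seen so far."""
    expected = None  # next byte we expect, once the first segment is seen
    gaps = []
    for line in lines:
        # Only look at packets sent by the remote host.
        if f"IP {sender} >" not in line:
            continue
        match = SEQ_RE.search(line)
        if not match:
            continue  # pure ACK, no payload range
        start, end = int(match.group(1)), int(match.group(2))
        if expected is not None and start > expected:
            gaps.append((expected, start))
        if expected is None or end > expected:
            expected = end  # retransmits of old data do not move this back
    return gaps


if __name__ == "__main__":
    sample = [
        "00:51:19.727559 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145: . 1449:2897(1448) ack 13 win 49152 <nop,nop,timestamp 3387447 273176760>",
        "00:51:19.730451 IP individual.net.nntp > cpc4-cmbg5-3-0-cust166.cmbg.cable.ntl.com.58145: . 4097:5545(1448) ack 13 win 49152 <nop,nop,timestamp 3387447 273176760>",
    ]
    print(find_gaps(sample, "individual.net.nntp"))  # [(2897, 4097)]
```

A gap reported here does not by itself say where the packet went missing (remote host, path, NIC, or local stack); it only shows what the capturing host saw.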
Created attachment 118829
tcpdump showing packet loss from direct connection to remote host
Created attachment 118830
tcpdump showing no real packet loss from connection behind firewall to remote host
Those tcpdumps formatted really badly inline. I have attached them as files to make them easier to read.
I have now determined that the NIC was getting CRC errors, but it was a card-specific issue: faulty hardware. Sigh. I'm not sure why it manifested differently for NATed connections and connections terminated on the gateway itself, but switching to another NIC fixed the issue. Closing.
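In case it helps anyone hitting similar symptoms, below is a small sketch of one way to check a NIC's receive error counters before blaming the firewall. It assumes a Linux kernel that exposes per-interface counters under /sys/class/net/<iface>/statistics; the helper name `read_rx_counters` and the interface name `eth0` are placeholders, not something taken from this report.

```python
from pathlib import Path

# Counters of interest; files that are missing are simply skipped.
COUNTER_NAMES = ("rx_packets", "rx_errors", "rx_crc_errors", "rx_dropped")


def read_rx_counters(iface, root="/sys/class/net"):
    """Return a dict of receive counters for `iface`, read from sysfs.

    `root` is a parameter only so the function can be pointed at a
    different directory (for example in a test).
    """
    stats_dir = Path(root) / iface / "statistics"
    counters = {}
    for name in COUNTER_NAMES:
        path = stats_dir / name
        if path.exists():
            counters[name] = int(path.read_text().strip())
    return counters


if __name__ == "__main__":
    # "eth0" is an assumed interface name; substitute your own.
    print(read_rx_counters("eth0"))
```

A steadily climbing rx_crc_errors value while traffic flows would point towards cabling or the card itself rather than iptables or the TCP stack.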