Description of problem:
After the recent kernel upgrade to 4.19.5-200.fc28.x86_64, TX errors started to appear with the "r8169.ko" module. We use MTU=9000 as the default setting for all our kickstart installs. Unfortunately, the Ethernet hardware with the lspci ID given below began to fail consistently when the MTU is set to 9000:

03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)

If I manually run "ifconfig enp3s0 mtu 1500", no TX errors appear in ifconfig. If the MTU is set to 8192, TX errors still occur, but at a much lower rate. At 9000 it becomes a blocker.

Version-Release number of selected component (if applicable):
Kernel 4.19.5-200.fc28.x86_64 - Module r8169

How reproducible:
kernel-4.18.18-200.fc28.x86_64 did not have this problem. On many machines, upgrading the kernel causes the MTU=9000 TX errors problem.

Steps to Reproduce:
1. Start with kernel 4.18
2. Set enp3s0 to MTU=9000 using ifconfig.
3. Test with iperf3
4. Upgrade kernel from 4.18 to 4.19
5. Test again with iperf3

Actual results:
This is the iperf3 result that I get with MTU=9000:

[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   463 KBytes  3.79 Mbits/sec   12   8.74 KBytes
[  5]   1.00-2.00   sec  0.00 Bytes   0.00 bits/sec    13   8.74 KBytes
[  5]   2.00-3.00   sec   122 KBytes  1.00 Mbits/sec    9   8.74 KBytes
[  5]   3.00-4.00   sec  0.00 Bytes   0.00 bits/sec     1   8.74 KBytes
[  5]   4.00-5.00   sec  0.00 Bytes   0.00 bits/sec     4   8.74 KBytes
[  5]   5.00-6.00   sec   122 KBytes  1.00 Mbits/sec    3   8.74 KBytes
[  5]   6.00-7.00   sec  0.00 Bytes   0.00 bits/sec     1   8.74 KBytes
[  5]   7.00-8.00   sec  0.00 Bytes   0.00 bits/sec     1   8.74 KBytes
[  5]   8.00-9.00   sec  0.00 Bytes   0.00 bits/sec     4   8.74 KBytes
[  5]   9.00-10.00  sec  0.00 Bytes   0.00 bits/sec     1   8.74 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.04  sec   708 KBytes   578 Kbits/sec   49             sender
-----------------------------------------------------------

# ifconfig
enp3s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.203.54  netmask 255.255.255.0  broadcast 192.168.203.255
        inet6 fe80::b6b5:2fff:fedd:11f9  prefixlen 64  scopeid 0x20<link>
        ether b4:b5:2f:dd:11:f9  txqueuelen 1000  (Ethernet)
        RX packets 1253998  bytes 1259641265 (1.1 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2512095  bytes 4874193021 (4.5 GiB)
        TX errors 79  dropped 0 overruns 0  carrier 0  collisions 0

Expected results:
"TX errors 0" in ifconfig and transfer bitrates of at least about 400-600 Mbits/s.
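The report tracks the problem through the "TX errors" counter in ifconfig. As a minimal sketch, assuming the net-tools ifconfig text layout shown in the report, the counter can be extracted and compared between two snapshots like this. The function names are hypothetical helpers and not part of any existing tool.

```python
import re


def tx_errors(ifconfig_text: str) -> int:
    """Return the 'TX errors' counter found in net-tools style ifconfig output."""
    match = re.search(r"TX errors\s+(\d+)", ifconfig_text)
    if match is None:
        raise ValueError("no 'TX errors' field found in the given text")
    return int(match.group(1))


def tx_errors_delta(before: str, after: str) -> int:
    """Number of TX errors that occurred between two ifconfig snapshots."""
    return tx_errors(after) - tx_errors(before)


# Two lines copied from the ifconfig output in the report.
SAMPLE = (
    "TX packets 2512095 bytes 4874193021 (4.5 GiB)\n"
    "TX errors 79 dropped 0 overruns 0 carrier 0 collisions 0\n"
)

if __name__ == "__main__":
    print("TX errors in sample:", tx_errors(SAMPLE))
```

Taking one snapshot before and one after an iperf3 or ping run makes it easier to tell whether a given test actually raised the counter.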
Hi,

The first thing to do is make sure it's still a problem in the latest Rawhide kernels. If it is, the issue should be reported upstream. The folks/lists to email are:

* Realtek linux nic maintainers <nic_swsd>
* Heiner Kallweit <hkallweit1>
* "David S. Miller" <davem>
* netdev.org
* linux-kernel.org

Please feel free to Cc me (jcline) as well.
Hello Jeremy,

Today I tried the same things with Rawhide's kernel. I can confirm that the same behaviour is still exhibited with kernel 4.20.0-0.rc6.git2.1.fc30.x86_64 from the Rawhide repository, so I lowered the MTU value back to 1500.
What is the exact network chip model (output of "dmesg | grep XID")?
What you could also do to check whether the issue is in the r8169 driver or somewhere else in the network stack: replace the r8169 driver in 4.19 with the one from 4.18 and build a kernel. Then check whether the issue is still there.
Hello Heiner, I tried your suggestions. Results are as follows: # dmesg | grep XID [ 2.385607] r8169 0000:03:00.0 eth0: RTL8168e/8111e, b4:b5:2f:dd:11:f9, XID 2c200000, IRQ 27 -------------------------------------------------------------------------------- I replaced rawhide 4.20 kernel's r8169 module with 4.18's. Kernel and replaced module details are as follows: # uname -a Linux pc 4.20.0-0.rc6.git2.1.fc30.x86_64 #1 SMP Thu Dec 13 22:58:18 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux # modinfo r8169 filename: /lib/modules/4.20.0-0.rc6.git2.1.fc30.x86_64/kernel/drivers/net/ethernet/realtek/r8169.ko.xz firmware: rtl_nic/rtl8107e-2.fw firmware: rtl_nic/rtl8107e-1.fw firmware: rtl_nic/rtl8168h-2.fw firmware: rtl_nic/rtl8168h-1.fw firmware: rtl_nic/rtl8168g-3.fw firmware: rtl_nic/rtl8168g-2.fw firmware: rtl_nic/rtl8106e-2.fw firmware: rtl_nic/rtl8106e-1.fw firmware: rtl_nic/rtl8411-2.fw firmware: rtl_nic/rtl8411-1.fw firmware: rtl_nic/rtl8402-1.fw firmware: rtl_nic/rtl8168f-2.fw firmware: rtl_nic/rtl8168f-1.fw firmware: rtl_nic/rtl8105e-1.fw firmware: rtl_nic/rtl8168e-3.fw firmware: rtl_nic/rtl8168e-2.fw firmware: rtl_nic/rtl8168e-1.fw firmware: rtl_nic/rtl8168d-2.fw firmware: rtl_nic/rtl8168d-1.fw version: 2.3LK-NAPI license: GPL description: RealTek RTL-8169 Gigabit Ethernet driver author: Realtek and the Linux r8169 crew <netdev.org> srcversion: 1E60506EAE4F9AE11D6E4D4 alias: pci:v00000001d00008168sv*sd00002410bc*sc*i* alias: pci:v00001737d00001032sv*sd00000024bc*sc*i* alias: pci:v000016ECd00000116sv*sd*bc*sc*i* alias: pci:v00001259d0000C107sv*sd*bc*sc*i* alias: pci:v00001186d00004302sv*sd*bc*sc*i* alias: pci:v00001186d00004300sv*sd*bc*sc*i* alias: pci:v00001186d00004300sv00001186sd00004B10bc*sc*i* alias: pci:v000010ECd00008169sv*sd*bc*sc*i* alias: pci:v000010FFd00008168sv*sd*bc*sc*i* alias: pci:v000010ECd00008168sv*sd*bc*sc*i* alias: pci:v000010ECd00008167sv*sd*bc*sc*i* alias: pci:v000010ECd00008161sv*sd*bc*sc*i* alias: pci:v000010ECd00008136sv*sd*bc*sc*i* alias: 
pci:v000010ECd00008129sv*sd*bc*sc*i* depends: mii retpoline: Y intree: Y name: r8169 vermagic: 4.18.12-200.fc28.x86_64 SMP mod_unload sig_id: PKCS#7 signer: sig_key: sig_hashalgo: md4 signature: 30:82:02:CF:06:09:2A:86:48:86:F7:0D:01:07:02:A0:82:02:C0:30: 82:02:BC:02:01:01:31:0D:30:0B:06:09:60:86:48:01:65:03:04:02: 01:30:0B:06:09:2A:86:48:86:F7:0D:01:07:01:31:82:02:99:30:82: 02:95:02:01:01:30:70:30:63:31:0F:30:0D:06:03:55:04:0A:0C:06: 46:65:64:6F:72:61:31:22:30:20:06:03:55:04:03:0C:19:46:65:64: 6F:72:61:20:6B:65:72:6E:65:6C:20:73:69:67:6E:69:6E:67:20:6B: 65:79:31:2C:30:2A:06:09:2A:86:48:86:F7:0D:01:09:01:16:1D:6B: 65:72:6E:65:6C:2D:74:65:61:6D:40:66:65:64:6F:72:61:70:72:6F: 6A:65:63:74:2E:6F:72:67:02:09:00:FA:51:0C:66:37:68:AF:A2:30: 0B:06:09:60:86:48:01:65:03:04:02:01:30:0D:06:09:2A:86:48:86: F7:0D:01:01:01:05:00:04:82:02:00:8D:FD:8B:D5:00:85:B7:E2:AA: 7D:05:A1:64:05:DD:4A:3B:75:FE:F1:C7:C1:8E:CC:49:8D:15:A2:45: EE:EA:51:7E:FE:91:92:15:0B:E1:DE:DF:8E:22:8E:97:E1:7B:AB:31: CF:C1:67:5A:7B:FB:30:E2:8C:0F:F6:67:97:4D:93:9F:23:3B:C1:73: 11:D5:9F:B7:76:F0:44:E4:4F:A5:EB:FF:0E:DD:47:09:D7:55:3A:27: 97:AA:0E:1D:75:87:08:84:E4:D0:F4:AE:F3:63:A1:02:2B:49:C5:EB: 30:03:7B:85:AE:A1:E8:D0:DA:71:EB:B7:7D:A6:98:35:D8:64:FE:E2: 6D:76:6A:7B:42:E3:62:EB:A4:89:7B:10:0C:3F:74:86:9D:06:45:1D: 71:D6:A9:EB:BD:15:17:7A:29:23:68:06:B6:83:F3:E7:86:61:CE:AF: 39:5B:B3:EE:29:B0:75:7E:1D:B5:F2:3B:F9:2C:A4:BF:2C:E0:EC:A2: A2:AD:D9:35:32:89:3A:7A:CD:27:20:11:49:F0:16:C0:82:CA:FE:CA: EF:CE:96:39:1D:5C:E0:6C:7D:EE:58:6C:5F:36:E1:A5:BF:9A:4A:50: 12:23:42:C0:C7:0E:0A:4B:8C:24:4C:1E:E3:BD:0E:6F:E0:27:32:C2: 55:3C:96:47:7A:B6:08:8F:CB:C8:B6:C4:79:72:40:B0:D1:BB:A5:7C: 0F:6E:99:46:97:69:BB:13:43:60:8C:3F:5E:90:A2:BE:DB:D7:5A:92: C4:5E:C8:95:7A:EC:C3:22:D4:E5:CE:C8:A0:73:33:09:3D:F9:CF:FE: 3F:53:52:20:EB:BF:29:BD:61:2E:55:B1:82:9C:D5:5B:71:57:67:A6: 8D:60:AD:0C:06:32:80:16:01:41:B2:E0:98:A4:06:33:58:46:49:E7: 08:FF:4D:CD:85:7E:C5:5C:43:D5:26:9B:F2:30:3F:97:65:46:56:D7: 
05:F4:3E:E9:81:F6:BF:C1:0F:34:F5:ED:BC:AB:C3:12:F0:48:E9:5A:
0C:45:57:75:83:BE:B1:7C:46:40:BF:EE:59:07:1A:40:9E:8D:48:FA:
1F:D0:2D:AD:36:A2:0D:AB:B7:27:48:C2:51:DA:BE:C6:02:C9:F0:A9:
DA:E6:43:2E:54:B5:BF:14:3C:E6:F8:FB:1C:46:B3:BB:A5:A7:11:D2:
E3:BD:A0:34:63:ED:F0:20:69:57:3A:AE:6D:9D:43:D9:30:DA:8E:73:
47:3B:A0:6E:2A:DD:CA:ED:AC:80:DA:5D:6D:0C:87:D4:C5:65:37:A3:
F2:80:22:60:4E:57:31:3F:20:7D:6D:44:0B:39:A9:D0:EB:33:B3:D2:
80:54:9C
parm: use_dac:Enable PCI DAC. Unsafe on 32 bit PCI slot. (int)
parm: debug:Debug verbosity level (0=none, ..., 16=all) (int)
-------------------------------------------------------------------------------------------------

After replacing 4.20's r8169 module with 4.18's, ifconfig did not report any TX errors. However, bitrates seem 50% slower than on another reference system. This looks odd, since I remember getting higher values with this machine.

The following are the iperf3 results (first RX, then TX) of the test system:

# iperf3 -s
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   108 MBytes   906 Mbits/sec
[  5]   1.00-2.00   sec   112 MBytes   941 Mbits/sec
[  5]   2.00-3.00   sec   112 MBytes   941 Mbits/sec
[  5]   3.00-4.00   sec   112 MBytes   941 Mbits/sec
[  5]   4.00-5.00   sec   112 MBytes   941 Mbits/sec
[  5]   5.00-6.00   sec   112 MBytes   941 Mbits/sec
[  5]   6.00-7.00   sec   112 MBytes   941 Mbits/sec
[  5]   7.00-8.00   sec   112 MBytes   941 Mbits/sec
[  5]   8.00-9.00   sec   112 MBytes   942 Mbits/sec
[  5]   9.00-10.00  sec   112 MBytes   941 Mbits/sec
[  5]  10.00-10.04  sec  4.28 MBytes   936 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.04  sec  1.10 GBytes   938 Mbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  43.5 MBytes   365
Mbits/sec 0 119 KBytes [ 5] 1.00-2.00 sec 44.7 MBytes 375 Mbits/sec 0 126 KBytes [ 5] 2.00-3.00 sec 44.7 MBytes 375 Mbits/sec 0 184 KBytes [ 5] 3.00-4.00 sec 44.9 MBytes 376 Mbits/sec 0 191 KBytes [ 5] 4.00-5.00 sec 44.3 MBytes 372 Mbits/sec 0 235 KBytes [ 5] 5.00-6.00 sec 45.0 MBytes 377 Mbits/sec 0 267 KBytes [ 5] 6.00-7.00 sec 44.2 MBytes 371 Mbits/sec 0 267 KBytes [ 5] 7.00-8.00 sec 44.2 MBytes 371 Mbits/sec 0 279 KBytes [ 5] 8.00-9.00 sec 44.2 MBytes 371 Mbits/sec 0 279 KBytes [ 5] 9.00-10.00 sec 45.2 MBytes 379 Mbits/sec 0 387 KBytes [ 5] 10.00-10.04 sec 1.62 MBytes 360 Mbits/sec 0 387 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.04 sec 446 MBytes 373 Mbits/sec 0 sender # ifconfig enp3s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000 inet 192.168.203.54 netmask 255.255.255.0 broadcast 192.168.203.255 inet6 fe80::b6b5:2fff:fedd:11f9 prefixlen 64 scopeid 0x20<link> ether b4:b5:2f:dd:11:f9 txqueuelen 1000 (Ethernet) RX packets 1967046 bytes 2483699476 (2.3 GiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 1589794 bytes 2176194825 (2.0 GiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 ------------------------------------------------------------------------------------------------------------------- Kernel 4.18.12-200.fc28.x86_64 with MTU=9000 reference system's iperf3 results are as follows: # iperf3 -s ----------------------------------------------------------- Server listening on 5201 ----------------------------------------------------------- [ ID] Interval Transfer Bitrate [ 5] 0.00-1.00 sec 108 MBytes 904 Mbits/sec [ 5] 1.00-2.00 sec 112 MBytes 941 Mbits/sec [ 5] 2.00-3.00 sec 112 MBytes 941 Mbits/sec [ 5] 3.00-4.00 sec 112 MBytes 941 Mbits/sec [ 5] 4.00-5.00 sec 112 MBytes 942 Mbits/sec [ 5] 5.00-6.00 sec 112 MBytes 941 Mbits/sec [ 5] 6.00-7.00 sec 112 MBytes 941 Mbits/sec [ 5] 7.00-8.00 sec 112 MBytes 941 Mbits/sec [ 5] 8.00-9.00 sec 112 MBytes 941 Mbits/sec [ 5] 
9.00-10.00 sec 112 MBytes 941 Mbits/sec [ 5] 10.00-10.04 sec 4.54 MBytes 941 Mbits/sec - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate [ 5] 0.00-10.04 sec 1.10 GBytes 938 Mbits/sec receiver ----------------------------------------------------------- Server listening on 5201 ----------------------------------------------------------- [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-1.00 sec 78.7 MBytes 660 Mbits/sec 0 228 KBytes [ 5] 1.00-2.00 sec 80.0 MBytes 671 Mbits/sec 0 228 KBytes [ 5] 2.00-3.00 sec 80.5 MBytes 675 Mbits/sec 0 250 KBytes [ 5] 3.00-4.00 sec 80.0 MBytes 671 Mbits/sec 0 273 KBytes [ 5] 4.00-5.00 sec 80.3 MBytes 674 Mbits/sec 0 288 KBytes [ 5] 5.00-6.00 sec 80.4 MBytes 675 Mbits/sec 0 288 KBytes [ 5] 6.00-7.00 sec 79.9 MBytes 670 Mbits/sec 0 304 KBytes [ 5] 7.00-8.00 sec 80.8 MBytes 678 Mbits/sec 0 304 KBytes [ 5] 8.00-9.00 sec 80.8 MBytes 678 Mbits/sec 0 304 KBytes [ 5] 9.00-10.00 sec 80.8 MBytes 678 Mbits/sec 0 304 KBytes [ 5] 10.00-10.04 sec 4.29 MBytes 924 Mbits/sec 0 448 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.04 sec 807 MBytes 674 Mbits/sec 0 sender
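To put the "50% slower" estimate on a firmer footing, the two sender summary lines above (373 Mbits/sec on the test system, 674 Mbits/sec on the 4.18.12 reference system) can be compared directly. The following is only an illustrative calculation; the function name is a hypothetical helper.

```python
def slowdown_percent(reference_mbits: float, measured_mbits: float) -> float:
    """Relative TX slowdown of the measured system versus the reference, in percent."""
    if reference_mbits <= 0:
        raise ValueError("reference bitrate must be positive")
    return (reference_mbits - measured_mbits) / reference_mbits * 100.0


# Sender summary bitrates quoted in the comment above (Mbits/sec).
REFERENCE_TX = 674.0   # kernel 4.18.12 reference system, MTU 9000
TEST_TX = 373.0        # kernel 4.20-rc6 with the 4.18 r8169 module, MTU 9000

if __name__ == "__main__":
    print(f"TX slowdown: {slowdown_percent(REFERENCE_TX, TEST_TX):.1f} %")
```

With these two figures the TX rate of the test system is roughly 45% below the reference, which is in line with the reporter's rough estimate.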
Strange, so the issue doesn't seem to be only the driver's fault. A few more things to test:

1. Could you please dump the chip registers (ethtool -d <if>) under both kernel versions and check for differences?

2. I only have a spec for the RTL8168b (which has a 7K jumbo limit), but according to this spec the MTPS value multiplied by 128 determines the max tx packet size. With 0x3f this would be 8064 bytes. You said that with MTU 8192 you still get a few errors; how is it if you lower this to e.g. MTU 8000?

3. The vendor driver uses a different setting for MTPS with jumbo packets. This shouldn't be an issue because the value of the mainline driver worked with 4.18 and is unchanged in 4.19. But just to test it: Could you please replace 0x3f with 0x24 in the following code piece and test?

static void r8168e_hw_jumbo_enable(struct rtl8169_private *tp)
{
	RTL_W8(tp, MaxTxPacketSize, 0x3f);
	RTL_W8(tp, Config3, RTL_R8(tp, Config3) | Jumbo_En0);
	RTL_W8(tp, Config4, RTL_R8(tp, Config4) | 0x01);
	rtl_tx_performance_tweak(tp, PCI_EXP_DEVCTL_READRQ_512B);
}

Most appreciated would be if you could bisect to find the commit which reduces performance with jumbo packets and/or causes the tx errors.
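The MTPS arithmetic from point 2 can be written out as a small sketch. It assumes, as the comment itself does, that the 128-byte unit from the RTL8168b spec also applies to this chip; the constant and function names are hypothetical.

```python
# Unit of the MaxTxPacketSize (MTPS) register, per the RTL8168b spec cited above.
# Assumption: the same unit applies to the RTL8168e discussed in this report.
MTPS_UNIT_BYTES = 128


def max_tx_packet_size(mtps: int) -> int:
    """Maximum TX packet size in bytes implied by an 8-bit MTPS register value."""
    if not 0 <= mtps <= 0xFF:
        raise ValueError("MTPS is an 8-bit register value")
    return mtps * MTPS_UNIT_BYTES


if __name__ == "__main__":
    for value in (0x3F, 0x24):
        print(f"MTPS {value:#04x} -> {max_tx_packet_size(value)} bytes")
```

Under that assumption, the mainline value 0x3f gives 8064 bytes and the suggested test value 0x24 gives 4608 bytes.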
I am not adept at programming. I'll try my best.. 1.> Linux 4.20.0-0.rc6.git2.1.fc30.x86_64 #1 SMP Thu Dec 13 22:58:18 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux Left: "vermagic: 4.18.12-200.fc28.x86_64" Right: "vermagic: 4.20.0-0.rc6.git2.1.fc30.x86_64" #diff ethtool.418 ethtool.420 -y RealTek RTL8168e/8111e registers: RealTek RTL8168e/8111e registers: -------------------------------------------------------- -------------------------------------------------------- 0x00: MAC Address b4:b5:2f:dd:11:f9 0x00: MAC Address b4:b5:2f:dd:11:f9 0x08: Multicast Address Filter 0x00400840 0x00800080 0x08: Multicast Address Filter 0x00400840 0x00800080 0x10: Dump Tally Counter Command 0x0d970000 0x00000002 | 0x10: Dump Tally Counter Command 0x0f202000 0x00000002 0x20: Tx Normal Priority Ring Addr 0x05b8b000 0x00000002 | 0x20: Tx Normal Priority Ring Addr 0xfda5f000 0x00000001 0x28: Tx High Priority Ring Addr 0x04000000 0x04000000 0x28: Tx High Priority Ring Addr 0x04000000 0x04000000 0x30: Flash memory read/write 0x00000000 0x30: Flash memory read/write 0x00000000 0x34: Early Rx Byte Count 0 0x34: Early Rx Byte Count 0 0x36: Early Rx Status 0x00 0x36: Early Rx Status 0x00 0x37: Command 0x0c 0x37: Command 0x0c Rx on, Tx on Rx on, Tx on 0x3C: Interrupt Mask 0x803f 0x3C: Interrupt Mask 0x803f SERR LinkChg RxNoBuf TxErr TxOK RxErr RxOK SERR LinkChg RxNoBuf TxErr TxOK RxErr RxOK 0x3E: Interrupt Status 0x0000 0x3E: Interrupt Status 0x0000 0x40: Tx Configuration 0x2f200700 0x40: Tx Configuration 0x2f200700 0x44: Rx Configuration 0x0002870e 0x44: Rx Configuration 0x0002870e 0x48: Timer count 0xec17aa78 | 0x48: Timer count 0xb7376f74 0x4C: Missed packet counter 0x17ada2 | 0x4C: Missed packet counter 0x37729e 0x50: EEPROM Command 0x00 0x50: EEPROM Command 0x00 0x51: Config 0 0x00 0x51: Config 0 0x00 0x52: Config 1 0x0f 0x52: Config 1 0x0f 0x53: Config 2 0x1c 0x53: Config 2 0x1c 0x54: Config 3 0x64 0x54: Config 3 0x64 0x55: Config 4 0x54 0x55: Config 4 0x54 0x56: Config 5 0x82 0x56: Config 5 
0x82 0x58: Timer interrupt 0x00000000 0x58: Timer interrupt 0x00000000 0x5C: Multiple Interrupt Select 0x0000 0x5C: Multiple Interrupt Select 0x0000 0x60: PHY access 0x8001796d 0x60: PHY access 0x8001796d 0x64: TBI control and status 0x00000000 0x64: TBI control and status 0x00000000 0x68: TBI Autonegotiation advertisement (ANAR) 0xf030 0x68: TBI Autonegotiation advertisement (ANAR) 0xf030 0x6A: TBI Link partner ability (LPAR) 0x0000 0x6A: TBI Link partner ability (LPAR) 0x0000 0x6C: PHY status 0xf3 0x6C: PHY status 0xf3 0x84: PM wakeup frame 0 0x00000000 0xec17d3db | 0x84: PM wakeup frame 0 0x00000000 0xb7378a43 0x8C: PM wakeup frame 1 0x00000000 0x00000088 0x8C: PM wakeup frame 1 0x00000000 0x00000088 0x94: PM wakeup frame 2 (low) 0x00000000 0x04000000 0x94: PM wakeup frame 2 (low) 0x00000000 0x04000000 0x9C: PM wakeup frame 2 (high) 0x00004000 0x10000000 0x9C: PM wakeup frame 2 (high) 0x00004000 0x10000000 0xA4: PM wakeup frame 3 (low) 0x04000002 0x04000100 0xA4: PM wakeup frame 3 (low) 0x04000002 0x04000100 0xAC: PM wakeup frame 3 (high) 0x04000000 0x00000000 0xAC: PM wakeup frame 3 (high) 0x04000000 0x00000000 0xB4: PM wakeup frame 4 (low) 0x00000000 0x00000000 0xB4: PM wakeup frame 4 (low) 0x00000000 0x00000000 0xBC: PM wakeup frame 4 (high) 0x00000000 0x00000000 0xBC: PM wakeup frame 4 (high) 0x00000000 0x00000000 0xC4: Wakeup frame 0 CRC 0x0000 0xC4: Wakeup frame 0 CRC 0x0000 0xC6: Wakeup frame 1 CRC 0x0000 0xC6: Wakeup frame 1 CRC 0x0000 0xC8: Wakeup frame 2 CRC 0x0000 0xC8: Wakeup frame 2 CRC 0x0000 0xCA: Wakeup frame 3 CRC 0x0000 0xCA: Wakeup frame 3 CRC 0x0000 0xCC: Wakeup frame 4 CRC 0x0000 0xCC: Wakeup frame 4 CRC 0x0000 0xDA: RX packet maximum size 0x4000 0xDA: RX packet maximum size 0x4000 0xE0: C+ Command 0x20e1 0xE0: C+ Command 0x20e1 VLAN de-tagging VLAN de-tagging RX checksumming RX checksumming 0xE2: Interrupt Mitigation 0x5151 0xE2: Interrupt Mitigation 0x5151 TxTimer: 5 TxTimer: 5 TxPackets: 1 TxPackets: 1 RxTimer: 5 RxTimer: 5 RxPackets: 1 
RxPackets: 1 0xE4: Rx Ring Addr 0xffb11000 0x00000001 | 0xE4: Rx Ring Addr 0xfdac0000 0x00000001 0xEC: Early Tx threshold 0x3f 0xEC: Early Tx threshold 0x3f 0xF0: Func Event 0x00000000 0xF0: Func Event 0x00000000 0xF4: Func Event Mask 0x00000000 0xF4: Func Event Mask 0x00000000 0xF8: Func Preset State 0x0003ffff 0xF8: Func Preset State 0x0003ffff 0xFC: Func Force Event 0x00000000 0xFC: Func Force Event 0x00000000 Linux 4.18.12-200.fc28.x86_64 #1 SMP Thu Oct 4 15:46:35 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux vermagic: 4.18.12-200.fc28.x86_64 RealTek RTL8168e/8111e registers: -------------------------------------------------------- 0x00: MAC Address 7c:4b:e6:34:5a:13 0x08: Multicast Address Filter 0x00400040 0x00800081 0x10: Dump Tally Counter Command 0x11bf8000 0x00000002 0x20: Tx Normal Priority Ring Addr 0x0ea47000 0x00000002 0x28: Tx High Priority Ring Addr 0x3c042200 0x6c0468b2 0x30: Flash memory read/write 0x00000000 0x34: Early Rx Byte Count 0 0x36: Early Rx Status 0x00 0x37: Command 0x0c Rx on, Tx on 0x3C: Interrupt Mask 0x803f SERR LinkChg RxNoBuf TxErr TxOK RxErr RxOK 0x3E: Interrupt Status 0x0000 0x40: Tx Configuration 0x2f200700 0x44: Rx Configuration 0x0002870e 0x48: Timer count 0x5e57d8b0 0x4C: Missed packet counter 0x57d972 0x50: EEPROM Command 0x00 0x51: Config 0 0x00 0x52: Config 1 0x0f 0x53: Config 2 0x1c 0x54: Config 3 0x64 0x55: Config 4 0x55 0x56: Config 5 0x82 0x58: Timer interrupt 0x00000000 0x5C: Multiple Interrupt Select 0x0000 0x60: PHY access 0x8005cde1 0x64: TBI control and status 0x00000000 0x68: TBI Autonegotiation advertisement (ANAR) 0xf030 0x6A: TBI Link partner ability (LPAR) 0x0000 0x6C: PHY status 0xf3 0x84: PM wakeup frame 0 0x00000000 0x5e57e4f2 0x8C: PM wakeup frame 1 0x00000000 0x788026e2 0x94: PM wakeup frame 2 (low) 0x78641350 0x680420f0 0x9C: PM wakeup frame 2 (high) 0x4920a390 0x680c2353 0xA4: PM wakeup frame 3 (low) 0xbc0222f6 0x4c260337 0xAC: PM wakeup frame 3 (high) 0x4c2620d2 0x00000000 0xB4: PM wakeup frame 4 (low) 
0x00000000 0x00000000 0xBC: PM wakeup frame 4 (high) 0x00000000 0x00000000 0xC4: Wakeup frame 0 CRC 0x0000 0xC6: Wakeup frame 1 CRC 0x0000 0xC8: Wakeup frame 2 CRC 0x0000 0xCA: Wakeup frame 3 CRC 0x0000 0xCC: Wakeup frame 4 CRC 0x0000 0xDA: RX packet maximum size 0x4000 0xE0: C+ Command 0x20e1 VLAN de-tagging RX checksumming 0xE2: Interrupt Mitigation 0x5151 TxTimer: 5 TxPackets: 1 RxTimer: 5 RxPackets: 1 0xE4: Rx Ring Addr 0x0ea49000 0x00000002 0xEC: Early Tx threshold 0x3f 0xF0: Func Event 0x00000000 0xF4: Func Event Mask 0x00000000 0xF8: Func Preset State 0x0003ffff 0xFC: Func Force Event 0x00000000 I'll continue posting when I find some more time. Perhaps tomorrow..
(In reply to Heiner Kallweit from comment #6)
> 3. The vendor driver uses a different setting for MTPS with jumbo packets.
> This shouldn't be an issue because the value of the mainline driver worked
> with 4.18 and is unchanged in 4.19.
> But just to test it: Could you please replace 0x3f with 0x24 in the
> following code piece and test?
>
> static void r8168e_hw_jumbo_enable(struct rtl8169_private *tp)
> {
> RTL_W8(tp, MaxTxPacketSize, 0x3f);
> RTL_W8(tp, Config3, RTL_R8(tp, Config3) | Jumbo_En0);
> RTL_W8(tp, Config4, RTL_R8(tp, Config4) | 0x01);
> rtl_tx_performance_tweak(tp, PCI_EXP_DEVCTL_READRQ_512B);
> }

esenkweb, I started a build with this change: https://koji.fedoraproject.org/koji/taskinfo?taskID=31941290

It should be done in a couple of hours, after which you can install it directly from Koji.
I just tested jumbo packets between an RTL8168g on Linux and an RTL8168h on Windows with iperf3. With MTU 9000 I got a few isolated retries but a good rate; with MTU 8000 everything was fine. Changing the interrupt coalescing settings helped to avoid the retries at MTU 9000.

So what you could try: reduce the interrupt coalescing settings. With "ethtool -c <if>" you can see the current settings; you could try reducing tx-packets and tx-usecs with "ethtool -C".
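As a rough sketch of the suggestion above, the following builds an "ethtool -C" command line without running it. Note that ethtool names the packet-count parameter "tx-frames". The interface name and the two values are placeholders chosen for illustration, not recommended settings, and the helper name is hypothetical.

```python
import shlex


def coalesce_command(ifname: str, tx_usecs: int, tx_frames: int) -> list:
    """Build (but do not run) an 'ethtool -C' argument list for TX coalescing."""
    if tx_usecs < 0 or tx_frames < 0:
        raise ValueError("coalescing values must not be negative")
    return [
        "ethtool", "-C", ifname,
        "tx-usecs", str(tx_usecs),
        "tx-frames", str(tx_frames),
    ]


if __name__ == "__main__":
    # Placeholder values; inspect the current ones first with: ethtool -c enp3s0
    print(shlex.join(coalesce_command("enp3s0", tx_usecs=0, tx_frames=1)))
```

Whether the driver accepts a particular combination depends on the kernel and chip, so the printed command should be tried by hand as root rather than applied blindly.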
Hello Heiner,

Sorry for my late response, I've been sick for a while. As you might have guessed, everyday admin tasks leave me little room to run tests... Anyway, I tried to profile the erroneous behaviour more precisely this time. I have not been able to try the build you suggested yet. Here is a brief report of my findings:

First of all, I tried to recreate the TX errors with iperf3; however, I could not reproduce them. Nevertheless, the impact on bitrates was still there. Then I tried pinging with the parameters mentioned below and got what I wanted: TX errors appeared above a specific payload length.

-------------------------------------------------------------------------------------------------------------------------------------------------
Linux pc 4.20.0-0.rc6.git2.1.fc30.x86_64 #1 SMP Thu Dec 13 22:58:18 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

###Module vermagic: 4.20.0-0.rc6.git2.1.fc30.x86_64 SMP mod_unload###
AND
###Module vermagic: 4.18.12-200.fc28.x86_64 SMP mod_unload###

Both modules exhibit the same behaviour under kernel 4.20.

-- iperf3 results are around the sample below whenever the MTU is higher than 1500, no matter what value I set. I tried 2K, 3K, 4K, 5K, 6K, 7K, 8K and 8000. Starting from MTU 1501, the bitrates below, with a narrow standard deviation, are observed throughout the whole range up to 9000.
[  5]   5.00-6.00   sec   112 MBytes   941 Mbits/sec (RX)
[  5]   5.00-6.00   sec  44.2 MBytes   371 Mbits/sec (TX)

-- With MTU 1500, a significant bitrate increase is observed, approximately double:
[  5]   4.00-5.00   sec   112 MBytes   941 Mbits/sec (RX)
[  5]   5.00-6.00   sec  73.1 MBytes   613 Mbits/sec (TX)

-- As you may have noticed, RX transfers are not impeded. Although TX bandwidth clearly suffers significantly, I could not observe the TX errors counter rising in ifconfig during these tests. However, the following ping command triggered TX errors:

## ping -M do -s 8149 192.168.203.54 (This one consistently raises the "TX errors" value in ifconfig.)
## ping -M do -s 8148 192.168.203.54 (With a payload 1 byte smaller, no TX errors occur at the pinged host.)

---------------------------------------------------------------------------------------------------------------------------------
Now let's see what happens in the same tests on my reference host, which has the same mainboard and runs:
Linux pc 4.18.12-200.fc28.x86_64 #1 SMP Thu Oct 4 15:46:35 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

###Module vermagic: 4.18.12-200.fc28.x86_64 SMP mod_unload

The samples below, again, have a very narrow standard deviation.

-- iperf3 test with MTU 9000:
[  5]   5.00-6.00   sec   112 MBytes   941 Mbits/sec (RX)
[  5]   5.00-6.00   sec  80.7 MBytes   677 Mbits/sec (TX)

-- iperf3 test with MTU 1500:
[  5]   5.00-6.00   sec   112 MBytes   941 Mbits/sec (RX)
[  5]   5.00-6.00   sec   110 MBytes   924 Mbits/sec (TX)

-- iperf3 test with MTU 1501:
[  5]   5.00-6.00   sec   112 MBytes   942 Mbits/sec (RX)
[  5]   5.00-6.00   sec  77.7 MBytes   652 Mbits/sec (TX)

-- Pinging the reference host with "ping -M do -s 8972" and with random smaller values does not produce TX errors, even with the -f parameter.

-----------------------------------------------------------------------------------------------------------------------------------------------
So it feels like (speaking as a noob in programming), with kernels newer than 4.18, beginning with 4.19.x, something might have changed in the layers above the r8169 module. I see a strange correlation between the TX bitrates: it feels like there has always been a problem with MTU 9000, and it got worse beginning with the kernel 4.19.x series.

I have to leave the office right now; I hope I'll find time to continue next week. I'll continue trying your suggestions and answering your questions.
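For reference, the "-s" values used in the ping tests above are ICMP payload sizes, not packet sizes. A small sketch of the conversion, assuming IPv4 without IP options (20-byte header) and the 8-byte ICMP echo header; the names are hypothetical helpers.

```python
IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def ip_packet_size(ping_payload: int) -> int:
    """IP packet size produced by 'ping -s <ping_payload>' over IPv4."""
    return ping_payload + ICMP_HEADER_BYTES + IPV4_HEADER_BYTES


def max_ping_payload(mtu: int) -> int:
    """Largest 'ping -s' value that fits into one unfragmented packet at this MTU."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


if __name__ == "__main__":
    for payload in (8148, 8149, 8972):
        print(f"ping -s {payload} -> {ip_packet_size(payload)} byte IP packet")
```

So "-s 8972" fills an MTU of 9000 exactly, while the failing and passing payloads of 8149 and 8148 bytes correspond to IP packets of 8177 and 8176 bytes.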
Today I tried your koji build:

Linux pc 4.19.14-300.rhbz1660095.fc29.x86_64 #1 SMP Thu Jan 10 17:28:44 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Module r8169 vermagic: 4.19.14-300.rhbz1660095.fc29.x86_64 SMP mod_unload

Unfortunately, TX transfer rates seem worse than ever. RX rates look unaffected. I'll only paste TX rates below:

For MTU 9000:
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   288 KBytes  2.36 Mbits/sec    3   8.74 KBytes
[  5]   1.00-2.00   sec  0.00 Bytes   0.00 bits/sec     1   8.74 KBytes
[  5]   2.00-3.00   sec  0.00 Bytes   0.00 bits/sec     0   8.74 KBytes
[  5]   3.00-4.00   sec  0.00 Bytes   0.00 bits/sec     3   8.74 KBytes
[  5]   4.00-5.00   sec  0.00 Bytes   0.00 bits/sec     1   8.74 KBytes
[  5]   5.00-6.00   sec  0.00 Bytes   0.00 bits/sec     0   8.74 KBytes
[  5]   6.00-7.00   sec  0.00 Bytes   0.00 bits/sec     1   8.74 KBytes
[  5]   7.00-8.00   sec  0.00 Bytes   0.00 bits/sec     0   8.74 KBytes
[  5]   8.00-9.00   sec  0.00 Bytes   0.00 bits/sec     0   8.74 KBytes
[  5]   9.00-10.00  sec  0.00 Bytes   0.00 bits/sec     1   8.74 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.04  sec   288 KBytes   235 Kbits/sec   10             sender

For MTU 8000:
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   287 KBytes  2.35 Mbits/sec   10   7.76 KBytes
[  5]   1.00-2.00   sec   116 KBytes   954 Kbits/sec   13   7.76 KBytes
[  5]   2.00-3.00   sec  0.00 Bytes   0.00 bits/sec     2   7.76 KBytes
[  5]   3.00-4.00   sec  0.00 Bytes   0.00 bits/sec     0   7.76 KBytes
[  5]   4.00-5.00   sec  0.00 Bytes   0.00 bits/sec     1   7.76 KBytes
[  5]   5.00-6.00   sec  0.00 Bytes   0.00 bits/sec     0   7.76 KBytes
[  5]   6.00-7.00   sec  0.00 Bytes   0.00 bits/sec     0   7.76 KBytes
[  5]   7.00-8.00   sec  0.00 Bytes   0.00 bits/sec     4   7.76 KBytes
[  5]   8.00-9.00   sec  0.00 Bytes   0.00 bits/sec     2   7.76 KBytes
[  5]   9.00-10.00  sec  0.00 Bytes   0.00 bits/sec     1   7.76 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.04  sec   404 KBytes   329 Kbits/sec   33             sender

For MTU 1500:
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  76.6 MBytes   642 Mbits/sec    0    243 KBytes
[  5]   1.00-2.00   sec  76.2 MBytes   640 Mbits/sec    0    257 KBytes
[  5]   2.00-3.00   sec  76.2 MBytes   640 Mbits/sec    0    284 KBytes
[  5]   3.00-4.00   sec  75.0 MBytes   629 Mbits/sec    0    284 KBytes
[  5]   4.00-5.00   sec  76.2 MBytes   640 Mbits/sec    0    284 KBytes
[  5]   5.00-6.00   sec  76.2 MBytes   640 Mbits/sec    0    315 KBytes
[  5]   6.00-7.00   sec  76.2 MBytes   640 Mbits/sec    0    315 KBytes
[  5]   7.00-8.00   sec  76.2 MBytes   640 Mbits/sec    0    328 KBytes
[  5]   8.00-9.00   sec  75.0 MBytes   629 Mbits/sec    0    328 KBytes
[  5]   9.00-10.00  sec  76.2 MBytes   640 Mbits/sec    0    328 KBytes
[  5]  10.00-10.04  sec  2.50 MBytes   535 Mbits/sec    0    328 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.04  sec   763 MBytes   637 Mbits/sec    0             sender

For MTU 1501:
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  48.2 MBytes   405 Mbits/sec    0    200 KBytes
[  5]   1.00-2.00   sec  47.5 MBytes   398 Mbits/sec    0    208 KBytes
[  5]   2.00-3.00   sec  46.2 MBytes   388 Mbits/sec    0    217 KBytes
[  5]   3.00-4.00   sec  47.5 MBytes   398 Mbits/sec    0    225 KBytes
[  5]   4.00-5.00   sec  46.2 MBytes   388 Mbits/sec    0    233 KBytes
[  5]   5.00-6.00   sec  46.2 MBytes   388 Mbits/sec    0    233 KBytes
[  5]   6.00-7.00   sec  47.5 MBytes   398 Mbits/sec    0    243 KBytes
[  5]   7.00-8.00   sec  47.5 MBytes   398 Mbits/sec    0    243 KBytes
[  5]   8.00-9.00   sec  46.2 MBytes   388 Mbits/sec    0    243 KBytes
[  5]   9.00-10.00  sec  46.2 MBytes   388 Mbits/sec    0    243 KBytes
[  5]  10.00-10.04  sec  2.50 MBytes   516 Mbits/sec    0    243 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.04  sec   472 MBytes   394 Mbits/sec    0             sender

There seems to be a significant drop at MTU 1501. I then increased the MTU in 1000-byte steps and narrowed the range with smaller steps until I pinpointed an abrupt breaking point at MTU 7940. Up to this breaking point, TX bitrate measurements were hovering around 400-450 Mbits/sec.

With MTU 7940:
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  52.1 MBytes   437 Mbits/sec    1    216 KBytes
[  5]   1.00-2.00   sec  53.6 MBytes   450 Mbits/sec    3    216 KBytes
[  5]   2.00-3.00   sec  53.4 MBytes   448 Mbits/sec    0    223 KBytes
[  5]   3.00-4.00   sec  53.4 MBytes   448 Mbits/sec    1    216 KBytes
[  5]   4.00-5.00   sec  53.4 MBytes   448 Mbits/sec    0    216 KBytes
[  5]   5.00-6.00   sec  53.4 MBytes   448 Mbits/sec    1    216 KBytes
[  5]   6.00-7.00   sec  53.4 MBytes   448 Mbits/sec    0    223 KBytes
[  5]   7.00-8.00   sec  53.4 MBytes   448 Mbits/sec    0    223 KBytes
[  5]   8.00-9.00   sec  53.4 MBytes   448 Mbits/sec    0    223 KBytes
[  5]   9.00-10.00  sec  53.4 MBytes   448 Mbits/sec    5    216 KBytes
[  5]  10.00-10.04  sec  2.17 MBytes   460 Mbits/sec    0    216 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.04  sec   535 MBytes   447 Mbits/sec   11             sender

With MTU 7941:
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   285 KBytes  2.33 Mbits/sec   13   15.4 KBytes
[  5]   1.00-2.00   sec   116 KBytes   947 Kbits/sec    9   15.4 KBytes
[  5]   2.00-3.00   sec  0.00 Bytes   0.00 bits/sec     8   15.4 KBytes
[  5]   3.00-4.00   sec  0.00 Bytes   0.00 bits/sec     5   7.70 KBytes
[  5]   4.00-5.00   sec  0.00 Bytes   0.00 bits/sec     1   7.70 KBytes
[  5]   5.00-6.00   sec  0.00 Bytes   0.00 bits/sec     0   7.70 KBytes
[  5]   6.00-7.00   sec  0.00 Bytes   0.00 bits/sec     1   7.70 KBytes
[  5]   7.00-8.00   sec  0.00 Bytes   0.00 bits/sec     0   7.70 KBytes
[  5]   8.00-9.00   sec  0.00 Bytes   0.00 bits/sec     0   7.70 KBytes
[  5]   9.00-10.00  sec   123 KBytes  1.01 Mbits/sec    5   7.70 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.04  sec   524 KBytes   427 Kbits/sec   42             sender

From this point onwards, TX rates look like the MTU 7941 result above. Fiddling with the ethtool coalesce settings did not make any significant difference. RX rates are still unaffected at around 900 Mbits/sec.
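The manual narrowing that found the 7940/7941 boundary can also be done as a binary search, which would need about 13 measurements between MTU 1500 and 9000. This is a minimal sketch assuming a single good-to-bad threshold. The probe below is a stand-in that simply encodes the reported result; a real probe would set the MTU, run iperf3 and decide whether the TX rate collapsed. All names are hypothetical.

```python
from typing import Callable


def find_last_good_mtu(low_good: int, high_bad: int,
                       is_good: Callable[[int], bool]) -> int:
    """Largest MTU for which is_good() holds.

    Assumes low_good is known to work, high_bad is known to fail, and there is
    exactly one good-to-bad threshold between them.
    """
    if low_good >= high_bad:
        raise ValueError("low_good must be smaller than high_bad")
    while high_bad - low_good > 1:
        mid = (low_good + high_bad) // 2
        if is_good(mid):
            low_good = mid   # threshold lies above mid
        else:
            high_bad = mid   # threshold lies at or below mid
    return low_good


def simulated_probe(mtu: int) -> bool:
    """Stand-in for a real measurement; encodes the reported breaking point."""
    return mtu <= 7940


if __name__ == "__main__":
    print("last good MTU:", find_last_good_mtu(1500, 9000, simulated_probe))
```

The same halving idea is what a kernel bisect applies to commits instead of MTU values.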
Thanks for the analysis. Then I think what's left is to bisect to find the change(s) causing the issues with jumbo packets from a certain size.
This message is a reminder that Fedora 28 is nearing its end of life. On 2019-May-28 Fedora will stop maintaining and issuing updates for Fedora 28. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 28 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Hello,

Today I noticed that this bug no longer occurs with the kernel below:

Linux pc 5.0.6-100.fc28.x86_64 #1 SMP Wed Apr 3 16:14:34 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

FYI. All the best...