Description of problem:
A few days ago I updated my computers to kernel-3.11.9-100.fc18.x86_64, and today to kernel-3.11.10-100.fc18.x86_64. Two of the computers work correctly over IPv4, but over IPv6 the speed is very slow. I set up the following test:

Computer A: Red Hat Enterprise Linux 6.5 with IPv6, 1 Gbit connection, address X:X:X:222::2.
Computers B1, B2, ...: personal computers with IPv6 and 1 Gbit connections.
All computers are connected to the same switch.

I run iperf -V -s on the Red Hat server. If I run iperf -V -c X:X:X:222::2 sequentially on the client computers, I get a speed of 900-1000 Mbits/sec, but on two computers I get only 60-70 Mbits/sec. If I reverse the test* on the two computers that are slow, the speed is 900-1000 Mbits/sec.

* Computer B runs iperf -V -s and computer A runs iperf -V -c.

The kernels work fine on these motherboards:
Asus P6T WS PRO
Intel DP35DP
Intel S3210SH
Dell PowerEdge 840

But fail on these motherboards:
Asus P8P67 LE
Asus P8H77-V LE

Version-Release number of selected component (if applicable):
kernel-3.11.9-100 and kernel-3.11.10-100

How reproducible:
Install these kernels on a computer with an Asus P8P67 LE or Asus P8H77-V LE motherboard and run:
iperf -V -c <IPv6 iperf server address>
The speed is very slow.

Steps to Reproduce:
1. Install kernel 3.11.9-100 or 3.11.10-100 on an Asus P8P67 LE or Asus P8H77-V LE motherboard.
2. Run iperf -V -c <IPv6 iperf server address> over a 1 Gbit connection.
3. The speed is 60-70 Mbits/sec.

Actual results:
iperf -V -c <IPv6 iperf server address> reports 60-70 Mbits/sec.

Expected results:
iperf -V -c <IPv6 iperf server address> reports 900-1000 Mbits/sec.

Additional info:
Over IPv4 the computers work fine; iperf -c <IPv4 iperf server address> reports 900-1000 Mbits/sec on all computers.
I have updated the systems to Fedora 19 (kernel 3.11.10-200.fc19.x86_64) and the problem remains.
I have updated the kernel to version 3.12.5-200.fc19.x86_64 and the problem remains.
Can we get an lspci from the two bad computers? Also the output of:

ip a
ethtool -i <interface>
ethtool -k <interface>

Thanks, Michele
Created attachment 844978 [details]
Output of lspci and ethtool commands
The new kernel 3.12.6-200.fc19.x86_64 does not solve the problem.
Do you have a binary tcpdump capture of the iperf session (and possibly the IPv4 variant of the session for comparison)?
Created attachment 845088 [details]
Output of the iperf command (IPv4 and IPv6)

Hi,

I ran the iperf commands; their output is:

[root@amparo ~]# iperf -c 147.156.223.157 -t 1
------------------------------------------------------------
Client connecting to 147.156.223.157, TCP port 5001
TCP window size: 22.9 KByte (default)
------------------------------------------------------------
[  3] local 147.156.222.34 port 42647 connected with 147.156.223.157 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec   112 MBytes   939 Mbits/sec

[root@amparo ~]# iperf -V -c 2001:720:1014:222::2 -t 1
------------------------------------------------------------
Client connecting to 2001:720:1014:222::2, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[  3] local 2001:720:1014:222:f66d:4ff:fe09:8938 port 34195 connected with 2001:720:1014:222::2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  8.38 MBytes  70.2 Mbits/sec

I captured the traffic with:

tcpdump -i p5p1 -w IPv4 host 147.156.223.157
tcpdump -i p5p1 -w IPv6 host 2001:720:1014:222::2

The files IPv4 and IPv6 are zipped in the data.zip file.

Thanks, Enrique
IPv6 traffic seems to "pause" every ~0.2s, hence the bad performance.

You mention that:
"""
The kernels work fine in motherboards:
Asus P6T WS PRO
Intel DP35DP
Intel S3210SH
Dell PowerEdge 840
But fail in motherboards:
Asus P8P67 LE
Asus P8H77-V LE
"""

Do I understand correctly that it is the same r8169 card in all those motherboards? Or do the other motherboards have different NICs? (In which case the issue would seem to be more r8169-related.)
Hi Michele,

No, only the Asus P6T WS PRO, the Asus P8P67 LE and the Asus P8H77-V LE have similar NICs:

Asus P6T WS PRO -> Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 02)
Asus P8P67 LE -> Realtek Semiconductor Co., Ltd. RTL8111/8168 PCI Express Gigabit Ethernet controller (rev 09)
Asus P8H77-V LE -> Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)

But the computer that has an Asus P6T WS PRO motherboard (slopez) works fine:

[root@slopez ~]# iperf -V -c 2001:720:1014:222::2 -t 1
------------------------------------------------------------
Client connecting to 2001:720:1014:222::2, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[  3] local 2001:720:1014:88:22cf:30ff:fef1:a3df port 36165 connected with 2001:720:1014:222::2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec   110 MBytes   922 Mbits/sec

That's why I listed all the motherboard models I had tested. I apologize if that confused you.

I think that the problem is the driver, but I don't understand why it works fine on slopez (Asus P6T WS PRO)... the NIC revision?

Thanks, Enrique
Hi Enrique,

ah, I see now. Odd indeed. Could you maybe expand on which boxes you've run all the tests between, and their results? It might very well be that we find out that a single NIC/box is slowing down the IPv6 test (it might be either the client or the server slowing down the IPv6 run). At least I'd hope so; otherwise the plot thickens quite a bit.

thanks, Michele
Created attachment 845610 [details]
Computer description and iperf test result

Hi Michele,

The server is always mirror.uv.es (2001:720:1014:222::2), an S5520HC motherboard with Red Hat Enterprise Linux 6.5 x86_64. Its NICs are Intel Corporation 82575EB Gigabit Network Connection (rev 02) with the igb driver.

I attached a ZIP with two files:
TarjetaRed.pdf -> Table with the name of the clients, motherboard, NIC and driver.
ResIperf.txt -> Output of running the iperf command on the clients.

Note that the iperf server is always mirror.uv.es and the test was run sequentially on the clients.

If it's any help, the kernel prior to 3.11.9 worked properly (I can't remember the version).

Thanks, Enrique
Hi Enrique,

ok, so to recap: the iperf server is always mirror.uv.es (2001:720:1014:222::2) with igb. The tests with the clients are:

r8169 - Asus P6T WS PRO
[root@slopez ~]# iperf -V -c 2001:720:1014:222::2 -t 1
[  3]  0.0- 1.0 sec   110 MBytes   925 Mbits/sec

e1000 - Intel S3210SH
[root@crash ~]# iperf -V -c 2001:720:1014:222::2 -t 1
[  3]  0.0- 1.0 sec  71.6 MBytes   600 Mbits/sec

e1000e - Intel DP35DP
[root@crunch1 ~]# iperf -V -c 2001:720:1014:222::2 -t 1
[  3]  0.0- 1.0 sec   111 MBytes   927 Mbits/sec

e1000e - Intel DP35DP
[root@crunch2 ~]# iperf -V -c 2001:720:1014:222::2 -t 1
[  3]  0.0- 1.0 sec   111 MBytes   928 Mbits/sec

tg3 - Dell PE 840
[root@smagris1 ~]# iperf -V -c 2001:720:1014:222::2 -t 1
[  3]  0.0- 1.0 sec  96.4 MBytes   808 Mbits/sec

tg3 - Dell PE 840
[root@smagris2 ~]# iperf -V -c 2001:720:1014:222::2 -t 1
[  3]  0.0- 1.0 sec  99.2 MBytes   832 Mbits/sec

tg3 - Dell PE 840
[root@smagris3 ~]# iperf -V -c 2001:720:1014:222::2 -t 1
[  3]  0.0- 1.0 sec  97.6 MBytes   818 Mbits/sec

r8169 - Asus P8P67 LE
[root@amparo ~]# iperf -V -c 2001:720:1014:222::2 -t 1
[  3]  0.0- 1.0 sec  7.75 MBytes  62.7 Mbits/sec

r8169 - Asus P8H77-V LE
[root@cordon3] Could not be tested, but is a slow one

Since you mentioned that some previous kernel used to work correctly, my recommendation would be to start bisecting a bit. Start with installing a 3.10.x kernel and then take it from there. If you need some help with bisecting, let me know (there are many guides around).

hth, Michele
Hi Michele,

I have downloaded and installed kernel-3.9.5-301.fc19.x86_64, the first Fedora 19 kernel. When I run the iperf command, the output is:

[root@amparo ~]# uname -a
Linux amparo 3.9.5-301.fc19.x86_64 #1 SMP Tue Jun 11 19:39:38 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@amparo ~]# iperf -V -c 2001:720:1014:222::2 -t 1
------------------------------------------------------------
Client connecting to 2001:720:1014:222::2, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[  3] local 2001:720:1014:222:f66d:4ff:fe09:8938 port 45826 connected with 2001:720:1014:222::2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec   111 MBytes   927 Mbits/sec

It works fine! The last kernel that works fine is the one before kernel-3.11.9-200.fc19.x86_64 (I think it is kernel-3.11.8-200.fc19.x86_64), but I cannot find it in the mirrors. Can you provide me with this kernel version? Is there a repository of old kernel versions? I would like to download and install versions 3.11.8 and 3.11.9 to tell you exactly the version where it fails.

Thanks, Enrique
Hi Enrique,

you can find all the builds in koji. Specifically, the kernel ones are here:
http://koji.fedoraproject.org/koji/packageinfo?packageID=8

3.11.8 for fc19 is here:
http://koji.fedoraproject.org/koji/buildinfo?buildID=478117

regards, Michele
Hi Michele,

I didn't know about this repository, thanks. I downloaded and installed kernel versions 3.11.8 and 3.11.9 and ran the iperf command. The outputs are:

[root@amparo ~]# uname -a
Linux amparo 3.11.8-200.fc19.x86_64 #1 SMP Wed Nov 13 16:29:59 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@amparo ~]# iperf -V -c 2001:720:1014:222::2 -t 1
------------------------------------------------------------
Client connecting to 2001:720:1014:222::2, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[  3] local 2001:720:1014:222:f66d:4ff:fe09:8938 port 49638 connected with 2001:720:1014:222::2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec   111 MBytes   928 Mbits/sec

Works fine!

[root@amparo ~]# uname -a
Linux amparo 3.11.9-200.fc19.x86_64 #1 SMP Wed Nov 20 21:22:24 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@amparo ~]# iperf -V -c 2001:720:1014:222::2 -t 1
------------------------------------------------------------
Client connecting to 2001:720:1014:222::2, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[  3] local 2001:720:1014:222:f66d:4ff:fe09:8938 port 49989 connected with 2001:720:1014:222::2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  8.25 MBytes  68.9 Mbits/sec

Doesn't work!

The problem appears in kernel 3.11.9; kernel 3.11.8 works properly. I think we should focus on the r8169 module, although it works fine on one computer; but you know the kernel better.

Regards, Enrique
Hi Enrique,

interesting. This makes the set of potential changesets much smaller:

$ git log --oneline v3.11.8..v3.11.9
56a766f media: sh_vou: almost forever loop in sh_vou_try_fmt_vid_out()
1e7c2cd usbcore: set lpm_capable field for LPM capable root hubs
a35bbad usb: fail on usb_hub_create_port_device() errors
b2c2f76 usb: fix cleanup after failure in hub_configure()
c03642e backlight: atmel-pwm-bl: fix deferred probe from __init
be85221 misc: atmel_pwm: add deferred-probing support
b53ef13 iwlwifi: pcie: add new SKUs for 7000 & 3160 NIC series
57b0a9d perf: Fix perf ring buffer memory ordering
747b007 drm/i915/dp: workaround BIOS eDP bpp clamping issue
33e3df4 tracing: Fix potential out-of-bounds in trace_get_user()
0e5f119 ALSA: hda - hdmi: Fix reported channel map on common default layouts
01535e4 USB: add new zte 3g-dongle's pid to option.c
e55433c hyperv-fb: add pci stub
583d159 Thermal: x86_pkg_temp: change spin lock
edd6447 xen-netback: transition to CLOSED when removing a VIF
5fe1417 xen-netback: Handle backend state transitions in a more robust way
4e9728a ipv6: reset dst.expires value when clearing expire flag
1731edc ipv6: ip6_dst_check needs to check for expired dst_entries
2ce4f60 tcp: gso: fix truesize tracking
16eb627 cxgb3: Fix length calculation in write_ofld_wr() on 32-bit architectures
6f54c27 xen-netback: use jiffies_64 value to calculate credit timeout
1527a1e virtio-net: correctly handle cpu hotplug notifier during resuming
6047108 net: flow_dissector: fail on evil iph->ihl
e697716 net: sctp: do not trigger BUG_ON in sctp_cmd_delete_tcb
51ce609 net/mlx4_core: Fix call to __mlx4_unregister_mac

In terms of Fedora-specific patches we have:

e1db685 Add patch to fix rhel5.9 KVM guests (rhbz 967652)
0daff16 Add bugzilla/upstream-status notes to 24hz audio patch
4c2b97b Add patch to fix crash from slab when using md-raid mirrors (rhbz 1031086)
59378ff Add patches from Pierre Ossman to fix 24Hz/24p radeon audio (rhbz 1010679)
0b654a6 Add patch to fix ALX phy issues after resume (rhbz 1011362)
09060dc Fix ipv6 sit panic with packet size > mtu (from Michele Baldessari) (rhbz 1015905)
67ce21f CVE-2013-4563: net: large udp packet over IPv6 over UFO-enabled device with TBF qdisc panic (rhbz 1030015 1030017)

The most likely changes are:

1) 4e9728a ipv6: reset dst.expires value when clearing expire flag
2) 1731edc ipv6: ip6_dst_check needs to check for expired dst_entries
3) 2ce4f60 tcp: gso: fix truesize tracking
4) 09060dc Fix ipv6 sit panic with packet size > mtu (from Michele Baldessari) (rhbz 1015905)
5) 67ce21f CVE-2013-4563: net: large udp packet over IPv6 over UFO-enabled device with TBF qdisc panic (rhbz 1030015 1030017)

5) is UDP-only, so it should not affect this BZ.
4) is composed of:
9037c3579a277f3a23ba476664629fda8c35f7c4 "ip6_output: fragment outgoing reassembled skb properly"
6aafeef03b9d9ecf255f3a80ed85ee070260e1ae "netfilter: push reasm skb through instead of original frag skbs"
3) hits IPv4-only code.
2) I don't see how this could be relevant.
1) I don't see how this could be relevant.

So my hunch would be that 4) is a good candidate, although it does not yet explain why only one system seems to be affected.

Do you have any netfilter rules on the amparo box? Any rules in general, I mean.

Would you be able to compile a kernel without the two commits of 4) and see if the problem goes away?

(Note: I'll be flying the next couple of days, so I'll resume looking at this mid-week.)

regards, Michele
I'd check your routing table to see if the dst entry to ::2 is expiring. 1731edc could be causing previously unexpiring dst entries to expire properly now, requiring a new neighbor solicitation every 0.2 seconds or so, which may introduce delays.
Hi,

I will compile the kernel this week, when I have some time.

The netfilter rules are:

[root@amparo ~]# ip6tables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all      anywhere             anywhere             state RELATED,ESTABLISHED
ACCEPT     ipv6-icmp    anywhere             anywhere
ACCEPT     all      anywhere             anywhere
ACCEPT     tcp      anywhere             anywhere             state NEW tcp dpt:ssh
ACCEPT     udp      anywhere             anywhere             state NEW udp dpt:ipp
ACCEPT     udp      anywhere             ff02::fb/128         state NEW udp dpt:mdns
ACCEPT     tcp      anywhere             anywhere             state NEW tcp dpt:ipp
ACCEPT     udp      anywhere             anywhere             state NEW udp dpt:ipp
REJECT     all      anywhere             anywhere             reject-with icmp6-adm-prohibited

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
REJECT     all      anywhere             anywhere             reject-with icmp6-adm-prohibited

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

I didn't find anything strange. The IPv6 routing table is:

[root@amparo ~]# ip -6 route
2001:720:1014:222::/64 dev p5p1 proto kernel metric 256 expires 2591996sec
fe80::/64 dev vmnet1 proto kernel metric 256
fe80::/64 dev vmnet8 proto kernel metric 256
fe80::/64 dev p5p1 proto kernel metric 256
default via fe80::20e:d6ff:feb7:400 dev p5p1 proto ra metric 1024 expires 1796sec

It seems correct, but I ran a test. In one shell I ran:

while true; do ip -6 route >> ip6route.txt; echo "==========" >> ip6route.txt; sleep 1; done

and in another shell I ran iperf -V -c 2001:720:1014:222::2 -t 10. The file ip6route.txt contains:

...
2001:720:1014:222::2 dev p5p1 metric 0 cache expires -4379514sec
2001:720:1014:222::/64 dev p5p1 proto kernel metric 256 expires 2591812sec
fe80::/64 dev vmnet1 proto kernel metric 256
fe80::/64 dev vmnet8 proto kernel metric 256
fe80::/64 dev p5p1 proto kernel metric 256
default via fe80::20e:d6ff:feb7:400 dev p5p1 proto ra metric 1024 expires 1612sec
==========
2001:720:1014:222::2 dev p5p1 metric 0 cache expires -4379515sec
2001:720:1014:222::/64 dev p5p1 proto kernel metric 256 expires 2591811sec
fe80::/64 dev vmnet1 proto kernel metric 256
fe80::/64 dev vmnet8 proto kernel metric 256
fe80::/64 dev p5p1 proto kernel metric 256
default via fe80::20e:d6ff:feb7:400 dev p5p1 proto ra metric 1024 expires 1611sec
==========
...

The "cache expires" value is negative. Is this correct? This happens on all computers on my network.

Thanks, Enrique
It's not right, but I don't think it's a catastrophic problem (the problem is in iproute2, in its treatment of an unsigned kernel value as signed). That said, it makes it very difficult to determine exactly what the lifetime of that cached route is (you'll note it counts up instead of down). IIRC, that route is cloned from the gateway route with the expiration time of 2591812. That said, it seems odd to me that they have different expiration times. Looking at the source, it seems we can update the expiration time if we are updating path MTU information and the /proc/sys/net/ipv6/route/mtu_expires value is set to something very low. It doesn't appear that you have MTU information set, but I could be wrong.

Can you do 3 things:
1) Check your value for /proc/sys/net/ipv6/route/mtu_expires.
2) Back out commit 1731edc and test that kernel, as that seems to be the most likely reason we're dropping dst entries that have had their expirations reduced quickly.
3) Augment the function rt6_check_expired to log a message when a dst entry is expired?
Hi Neil,

The value of /proc/sys/net/ipv6/route/mtu_expires is 600. The other computers have the same value.

I don't know how to back out commit 1731edc, but I applied patch-3.11.9 to kernel 3.11 and modified the source code of net/ipv6/route.c as follows:

...
static struct dst_entry *ip6_dst_check(struct dst_entry *dst, u32 cookie)
{
...
/*
	if (rt6_check_expired(rt))
		return NULL;
*/
	return dst;
}
...

And I ran the iperf command:

[root@amparo ~]# iperf -V -c 2001:720:1014:222::2 -t 1
------------------------------------------------------------
Client connecting to 2001:720:1014:222::2, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[  3] local 2001:720:1014:222:f66d:4ff:fe09:8938 port 53408 connected with 2001:720:1014:222::2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  40.2 MBytes   337 Mbits/sec

[root@amparo ~]# iperf -V -c 2001:720:1014:222::2 -t 10
------------------------------------------------------------
Client connecting to 2001:720:1014:222::2, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[  3] local 2001:720:1014:222:f66d:4ff:fe09:8938 port 53409 connected with 2001:720:1014:222::2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec   946 MBytes   794 Mbits/sec

It has improved...

"3) Augment the function rt6_check_expired to log a message when a dst entry is expired?"

Can you help me with this? What function can I use to write to /var/log/messages or another file?

Thanks, Enrique
The 600 value is good; that's the default value, and it suggests that MTU updates are not at fault here. The removal of the rt6_check_expired call was sufficient for the tests I suggested in 2 and 3. It indicates that commit e3bc10bd95d7fcc3f2ac690c6ff22833ea6781d6 is causing this problem (that's the upstream sha1 for the "ipv6: ip6_dst_check needs to check for expired dst_entries" fix). The fix itself isn't wrong, but it seems to be uncovering either another bug or a misconfiguration in your network. What would really be useful here is a reproducer. How do you assign global IPv6 addresses in your network? Do you use SLAAC, DHCPv6 or manual assignment? Do you have a tcpdump that shows some router advertisements or DHCPv6 transactions from your network that I can look over?
Hey, just FYI, I've received a similar report to this one and have an internal reproducer for it. I'll update this bz when I have some results from that investigation.
I think I see the problem: ip6_rt_copy needs to call rt6_update_expires.
Created attachment 847770 [details]
SLAAC capture traffic

Hi Neil,

I read your comments 22 and 23, but I'm answering your comment 21. The protocol used to assign global IPv6 addresses in our network is SLAAC. I attached a ZIP file with the tcpdump; the filter used is:

tcpdump -i p5p1 -w slaac ether <my ethernet MAC> and ip6

Besides, I modified the function rt6_check_expired() as follows:

static bool rt6_check_expired(const struct rt6_info *rt)
{
	static unsigned long int call = 0, expire = 0;

	if ((++call % 1000) == 0)
		printk(KERN_DEBUG "r6_check_expired called %lu times\n", call);
	if (rt->rt6i_flags & RTF_EXPIRES) {
		if (time_after(jiffies, rt->dst.expires)) {
			if ((++expire % 1000) == 0)
				printk(KERN_DEBUG "r6_check_expired suceed %lu times\n", expire);
			return true;
		}
	} else if (rt->dst.from) {
		return rt6_check_expired((struct rt6_info *) rt->dst.from);
	}
	return false;
}

If I uncomment, in the ip6_dst_check() function, the code:

	if (rt6_check_expired(rt))
		return NULL;

and run iperf -V -c 2001:720:1014:222::2, the output of the dmesg command is:

[   83.299503] r6_check_expired called 1000 times
[   83.301892] r6_check_expired called 2000 times
[   83.303622] r6_check_expired suceed 1000 times
[   83.303819] r6_check_expired called 3000 times
...
[   84.093472] r6_check_expired called 46000 times
[   84.336506] r6_check_expired suceed 17000 times
[   84.336740] r6_check_expired called 47000 times
[   84.338378] r6_check_expired called 48000 times

More than 48000 calls and 17000 expires! And the transfer rate is 62.5 Mbits/sec.

If I comment out the code, the dmesg output has no r6_check_expired lines, so it is rarely called, and the transfer rate is 926 Mbits/sec.

I think that the number of calls to the rt6_check_expired() function is the problem.

Best regards, Enrique
Hi Neil,

I think that the solution is, as you said in comment 23, to call rt6_update_expires() in ip6_rt_copy(). I changed the ip6_rt_copy() function as follows:

static struct rt6_info *ip6_rt_copy(struct rt6_info *ort,
				    const struct in6_addr *dest)
{
	struct net *net = dev_net(ort->dst.dev);
	struct rt6_info *rt = ip6_dst_alloc(net, ort->dst.dev, 0,
					    ort->rt6i_table);

	if (rt) {
		...
		rt->rt6i_table = ort->rt6i_table;
		/* The following line is added */
		rt6_update_expires(rt, net->ipv6.sysctl.ip6_rt_mtu_expires);
	}
	return rt;
}

And the iperf command returns:

[root@amparo ~]# iperf -V -c 2001:720:1014:222::2 -t 1
------------------------------------------------------------
Client connecting to 2001:720:1014:222::2, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[  3] local 2001:720:1014:222:f66d:4ff:fe09:8938 port 39961 connected with 2001:720:1014:222::2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec   111 MBytes   926 Mbits/sec

The rt6_check_expired() function is called more than 97000 times, but it never reports a route as expired. Could this be the solution?

Best regards, Enrique
It could be, but I'm not sure yet. I tried the same thing with success, but then I attempted another fix which I felt was more correct (changing the condition under which we call rt6_set_from). That modified the flow such that the cloned route sets the from pointer in the dst_entry and clears the cloned expires flag, so the cloned dst_entry should use the parent's (from pointer's) dst information. I've validated that's happening, but just the same we're not getting better performance, despite the parent route not expiring. I'm looking some more.
Actually, your patch is also wrong by way of the use of net->ipv6.sysctl.ip6_rt_mtu_expires. If we were going to preserve the expiration of this route, we would set it to the expiration of the parent route that we are copying from.
Created attachment 849636 [details] patch to ensure route expiration is set Hey, I think this is going to be our fix. I'm still not convinced that we don't just need to set the from pointer on the copied route, but I'm still looking into that. Could you please test this and confirm that it fixes the problem? Thanks!
Hi Neil,

Your patch solves the problem. I applied the patch to the 3.11.9 kernel, rebooted and ran the iperf -V command. The output is:

[root@amparo ~]# iperf -V -c 2001:720:1014:222::2 -t 1
------------------------------------------------------------
Client connecting to 2001:720:1014:222::2, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[  3] local 2001:720:1014:222:f66d:4ff:fe09:8938 port 39545 connected with 2001:720:1014:222::2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  50.4 MBytes   421 Mbits/sec

[root@amparo ~]# iperf -V -c 2001:720:1014:222::2 -t 10
------------------------------------------------------------
Client connecting to 2001:720:1014:222::2, TCP port 5001
TCP window size: 22.7 KByte (default)
------------------------------------------------------------
[  3] local 2001:720:1014:222:f66d:4ff:fe09:8938 port 39546 connected with 2001:720:1014:222::2 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.00 GBytes   860 Mbits/sec

Works fine!

Just a comment: I first applied the 3.11.9 general patches:

patch -p1 < ../patch-3.11.9

and then your patch:

patch -p1 < ../route.patch

and the output was:

patching file net/ipv6/route.c
Hunk #1 succeeded at 1859 (offset -52 lines).

Are you going to apply this patch to a new kernel version?

Thanks, Enrique
I'm going to post this upstream, and if it's accepted, yes, I'll backport it to Fedora.
Grr, looks like someone beat me to it. Upstream commit 24f5b855e17df7e355eacd6c4a12cc4d6a6c9ff0 is what we need.
OK, I've committed the backport to the tree; the next f19 kernel should have it fixed.
kernel-3.12.8-300.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/kernel-3.12.8-300.fc20
kernel-3.12.8-200.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/kernel-3.12.8-200.fc19
Package kernel-3.12.8-300.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.12.8-300.fc20'
as soon as you are able to, then reboot. Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-1062/kernel-3.12.8-300.fc20
then log in and leave karma (feedback).
kernel-3.12.8-200.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report.
kernel-3.12.8-300.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.