Description of problem:
Regression. Kernel 3.11 hangs almost immediately after login prompt appears. 3.10 works fine.

Version-Release number of selected component:
kernel-3.11.6-200.fc19.x86_64

How reproducible:
Every time I boot with 3.11 kernel.

Actual results:
System hang. See attached screen shot.

Expected results:
Kernel booting normally.

Additional info:
I suspect network adapter (via_velocity) has something to do with it:

02:00.0 Ethernet controller [0200]: VIA Technologies, Inc. VT6120/VT6121/VT6122 Gigabit Ethernet Adapter [1106:3119] (rev 82)
03:00.0 Ethernet controller [0200]: VIA Technologies, Inc. VT6120/VT6121/VT6122 Gigabit Ethernet Adapter [1106:3119] (rev 82)
Created attachment 815568 [details]
Photo of kernel stack trace

Kernel was booted without rhgb and quiet options.
Apparently known regression: http://comments.gmane.org/gmane.linux.network/284445 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=727041
I have been experiencing this very same bug since I upgraded to fc19.i686, i.e., started to use kernel 3.11. I am able to boot just fine since my VIA_VELOCITY ethernet card is a secondary one, but the system freezes as soon as the card is put into use -- a ping to its address is enough -- and it happens so fast that I don't even get crash messages...

Linux haar 3.11.4-201.fc19.i686 #1 SMP Thu Oct 10 14:59:49 UTC 2013 i686 i686 i386 GNU/Linux
AMD Athlon(tm) II X4 620 Processor
05:00.0 Ethernet controller: VIA Technologies, Inc. VT6120/VT6121/VT6122 Gigabit Ethernet Adapter (rev 82)
Alex, Juha,

could you try the ML posted patch and confirm that it fixes the issue for you as well?

diff --git a/drivers/net/ethernet/via/via-velocity.c b/drivers/net/ethernet/via/via-velocity.c
index d022bf9..64c42be 100644
--- a/drivers/net/ethernet/via/via-velocity.c
+++ b/drivers/net/ethernet/via/via-velocity.c
@@ -2172,16 +2172,13 @@ static int velocity_poll(struct napi_struct *napi, int budget)
 	unsigned int rx_done;
 	unsigned long flags;
 
-	spin_lock_irqsave(&vptr->lock, flags);
 	/*
 	 * Do rx and tx twice for performance (taken from the VIA
 	 * out-of-tree driver).
 	 */
-	rx_done = velocity_rx_srv(vptr, budget / 2);
-	velocity_tx_srv(vptr);
-	rx_done += velocity_rx_srv(vptr, budget - rx_done);
+	rx_done = velocity_rx_srv(vptr, budget);
+	spin_lock_irqsave(&vptr->lock, flags);
 	velocity_tx_srv(vptr);
-
 	/* If budget not fully consumed, exit the polling mode */
 	if (rx_done < budget) {
 		napi_complete(napi);

Once you confirm I'll ping Francois/netdev again.

Thanks,
Michele
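Note on the diff above: besides moving the RX pass out of the locked region, it replaces the two half-budget RX passes with a single full-budget pass. The following is a toy Python model of the budget accounting only, not kernel code. The `rx_srv` helper and the deque "ring" are hypothetical stand-ins, and locking and interrupt state (which are what the patch actually changes) are not modelled. Under those assumptions, it suggests that both shapes handle the same number of packets and reach the same "exit polling mode" decision.

```python
from collections import deque


def rx_srv(ring, budget):
    """Toy stand-in for velocity_rx_srv(): consume up to `budget` packets."""
    done = 0
    while ring and done < budget:
        ring.popleft()
        done += 1
    return done


def poll_old(ring, budget):
    """Pre-patch shape: two RX passes with a TX pass in between."""
    rx_done = rx_srv(ring, budget // 2)
    # The driver ran velocity_tx_srv() here; this model has no TX ring.
    rx_done += rx_srv(ring, budget - rx_done)
    return rx_done, rx_done < budget  # (packets handled, leave polling mode?)


def poll_new(ring, budget):
    """Post-patch shape: one RX pass using the whole budget."""
    rx_done = rx_srv(ring, budget)
    return rx_done, rx_done < budget


if __name__ == "__main__":
    for pending in (0, 10, 64, 100):
        old = poll_old(deque(range(pending)), 64)
        new = poll_new(deque(range(pending)), 64)
        print(pending, old, new)
```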
The proposed fix corrected the problem for me!! Thanks Michele. Alex
Thanks for confirming Alex, I've pinged Francois again to push it upstream.
I've exchanged mails with Francois and he will push it upstream. I'll put a note here once it hits the net tree or Linus' tree. Thanks for testing Alex, Michele
Created attachment 824353 [details]
Revised patch

Hi Alex & Juha,

could you please test the new revised patch from Francois. This one should be safe against MTU changes, whereas the previous one was not.

If you could test it like the following, that'd be great:
- run some netperf/iperf
- during the above network load change the MTU to a few values in a loop

Let me know if it works for you or if there are any issues.

Thanks again,
Michele
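One way to script the "change the MTU in a loop" step is sketched below in Python. It is only an illustration: the interface name, the MTU values, the delay, and the number of rounds are placeholders to be adapted, and the helper names are made up for this note. It builds `ip link set dev <iface> mtu <value>` command lines and, by default, only prints them; actually applying them needs root and should be done while iperf or netperf is running in another terminal.

```python
import itertools
import subprocess
import time


def mtu_commands(iface, mtus, rounds):
    """Build `ip link set` command lines cycling through `mtus` `rounds` times."""
    cycle = itertools.islice(itertools.cycle(mtus), rounds * len(mtus))
    return [["ip", "link", "set", "dev", iface, "mtu", str(m)] for m in cycle]


def run(commands, delay=5.0, dry_run=True):
    """Print the commands (dry run) or execute them with a pause in between."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # requires root privileges
            time.sleep(delay)


if __name__ == "__main__":
    # "eth0" and the MTU values are placeholders, not taken from the report.
    run(mtu_commands("eth0", [1500, 1000, 1200], rounds=2))
```

Keeping the dry run as the default makes it easy to check the command list before touching a live interface.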
(In reply to Michele Baldessari from comment #9)
> Created attachment 824353 [details]
> Revised patch
>
> Hi Alex & Juha,
>
> could you please test the new revised patch from Francois. This one should
> be safe against MTU changes, whereas the previous one was not.
>
> If you could test it like the following, that'd be great:
> - run some netperf/iperf
> - during the above network load change the MTU to a few values in a loop
>
> Let me know if it works for you or if there are any issues.
>
> Thanks again,
> Michele

Hello Michele,

Thanks for the new patch. The driver is working just as well, but I am not sure the result regarding the MTU change is as expected.

What follows is an iperf chitchat between a client and a server, both using the VIA ethernet card and the patched via_velocity driver (on a 3.11.7-200.fc19.i686 kernel). The chitchat goes on OK until I change the MTU on the server side from 1500 to, say, 3000 (with "ifconfig p4p1 mtu 3000 up"). From that moment onwards the server (or would it be the client?) misses all the messages, as can be seen below. Is that OK?

But things get worse if I change the MTU on the client side during the chitchat. The iperf output stalls and never resumes... Fortunately, a Ctrl-C interrupts iperf just fine and no error messages are shown.

Regards,
Alex

iperf client/server chitchat:
------------------------------------------------------------
server side:

harten|~>
[  9] local 192.168.1.105 port 5001 connected with 192.168.1.102 port 43296
[  9]  0.0-3288.1 sec  79.2 MBytes  202 Kbits/sec
[  9] MSS size 1448 bytes (MTU 1500 bytes, ethernet)

client side:

mallat|~> iperf -c lc_harten -P 1 -i 1 -f m -t 20
------------------------------------------------------------
Client connecting to lc_harten, TCP port 5001
TCP window size: 0.02 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.102 port 43296 connected with 192.168.1.105 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  11.4 MBytes  95.4 Mbits/sec
[  3]  1.0- 2.0 sec  11.1 MBytes  93.3 Mbits/sec
[  3]  2.0- 3.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  3.0- 4.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  4.0- 5.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  5.0- 6.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  6.0- 7.0 sec  11.0 MBytes  92.3 Mbits/sec
[  3]  7.0- 8.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3]  8.0- 9.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3]  9.0-10.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 10.0-11.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 11.0-12.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 12.0-13.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 13.0-14.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 14.0-15.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 15.0-16.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 16.0-17.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 17.0-18.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 18.0-19.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 19.0-20.0 sec  0.62 MBytes  5.24 Mbits/sec
[  3]  0.0-20.0 sec  79.2 MBytes  33.2 Mbits/sec
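Side note on the "MSS size 1448 bytes (MTU 1500 bytes, ethernet)" line in the server output: the figure is consistent with plain header arithmetic. The Python sketch below assumes IPv4 without IP options and TCP with the timestamps option enabled (12 bytes per segment); neither assumption is confirmed by the report, and the function name is made up for this note.

```python
IPV4_HEADER = 20     # bytes, assuming no IP options
TCP_HEADER = 20      # bytes, base header without options
TCP_TIMESTAMPS = 12  # bytes, 10-byte timestamps option padded to 12


def tcp_payload_per_segment(mtu, timestamps=True):
    """Payload bytes per TCP segment for a given MTU (IPv4 assumed)."""
    overhead = IPV4_HEADER + TCP_HEADER
    if timestamps:
        overhead += TCP_TIMESTAMPS
    return mtu - overhead


if __name__ == "__main__":
    for mtu in (1500, 3000, 9000):
        print(mtu, tcp_payload_per_segment(mtu))
```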
Hi Alex, a couple of questions: - did the previous version of the patch also exhibit this behaviour? - if both client and server have mtu set to 3000 things work correctly yes? (i.e. it is only the mtu change itself that breaks things) I'm assuming here that client and server are connected through a medium that supports L2 frames > 1500, yes? Thanks, Michele
(In reply to Michele Baldessari from comment #11)
> Hi Alex,
>
> a couple of questions:
> - did the previous version of the patch also exhibit this behaviour?
> - if both client and server have mtu set to 3000 things work correctly yes?
> (i.e. it is only the mtu change itself that breaks things)
>
> I'm assuming here that client and server are connected through a medium
> that supports L2 frames > 1500, yes?
>
> Thanks,
> Michele

Hello Michele,

I apologise for taking so long to answer...

I use these VIA cards as a secondary network between some machines in our lab. I am not sure about the L2 frames support in this intranet since I will have to look around in the building to find out which switch the cables go to... So, I guess NO to L2 frames support and have made further tests with MTU <= 1500. And this is what I got:

a) both previous and current patches behave in the same way:

- there seems to be a limit in the workable size of MTU on the server. For instance, iperf would work with 1600 but not with 1700... But I think this is related to the L2 support. More about this below.

- iperf chitchat works with any MTU starting value <= 1500 (on both client and server); an MTU change -- on either the client or the server -- breaks the message exchange for a while, but it resumes later, as can be seen below:

mallat|~> iperf -c lc_harten -P 1 -i 1 -f m -t 20
------------------------------------------------------------
Client connecting to lc_harten, TCP port 5001
TCP window size: 0.02 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.102 port 43329 connected with 192.168.1.105 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  11.4 MBytes  95.4 Mbits/sec
[  3]  1.0- 2.0 sec  11.1 MBytes  93.3 Mbits/sec
[  3]  2.0- 3.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  3.0- 4.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  4.0- 5.0 sec  10.8 MBytes  90.2 Mbits/sec
[  3]  5.0- 6.0 sec  0.00 MBytes  0.00 Mbits/sec  <<<< MTU change from 1500 to 1000 on the server
[  3]  6.0- 7.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3]  7.0- 8.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3]  8.0- 9.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3]  9.0-10.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 10.0-11.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 11.0-12.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 12.0-13.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 13.0-14.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 14.0-15.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 15.0-16.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 16.0-17.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 17.0-18.0 sec  0.88 MBytes  7.34 Mbits/sec
[  3] 18.0-19.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3] 19.0-20.0 sec  11.1 MBytes  93.3 Mbits/sec
[  3]  0.0-20.0 sec  79.1 MBytes  33.2 Mbits/sec

mallat|~> iperf -c lc_harten -P 1 -i 1 -f m -t 20
------------------------------------------------------------
Client connecting to lc_harten, TCP port 5001
TCP window size: 0.02 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.102 port 43330 connected with 192.168.1.105 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  11.4 MBytes  95.4 Mbits/sec
[  3]  1.0- 2.0 sec  11.1 MBytes  93.3 Mbits/sec
[  3]  2.0- 3.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  3.0- 4.0 sec  5.12 MBytes  43.0 Mbits/sec
[  3]  4.0- 5.0 sec  0.00 MBytes  0.00 Mbits/sec  <<<< MTU change from 1500 to 1000 on the client
[  3]  5.0- 6.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3]  6.0- 7.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3]  7.0- 8.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3]  8.0- 9.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3]  9.0-10.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 10.0-11.0 sec  2.62 MBytes  22.0 Mbits/sec
[  3] 11.0-12.0 sec  10.8 MBytes  90.2 Mbits/sec
[  3] 12.0-13.0 sec  11.0 MBytes  92.3 Mbits/sec
[  3] 13.0-14.0 sec  10.9 MBytes  91.2 Mbits/sec
[  3] 14.0-15.0 sec  10.9 MBytes  91.2 Mbits/sec
[  3] 15.0-16.0 sec  10.9 MBytes  91.2 Mbits/sec
[  3] 16.0-17.0 sec  10.9 MBytes  91.2 Mbits/sec
[  3] 17.0-18.0 sec  10.9 MBytes  91.2 Mbits/sec
[  3] 18.0-19.0 sec  10.9 MBytes  91.2 Mbits/sec
[  3] 19.0-20.0 sec  10.9 MBytes  91.2 Mbits/sec
[  3]  0.0-20.0 sec   140 MBytes  58.5 Mbits/sec

I connected the two VIA cards directly using a cat 5e ethernet cable and repeated the tests without L2 frame support issues involved. The iperf chitchat would now work for MTU > 1500 (up to 9000), and the bandwidth was much greater, as expected, but the behaviour regarding an MTU change during the chitchat was exactly as before:

mallat|.../ethernet/via> iperf -c lc_harten -P 1 -i 1 -f m -t 20
------------------------------------------------------------
Client connecting to lc_harten, TCP port 5001
TCP window size: 0.06 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.102 port 35426 connected with 192.168.1.105 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  66.6 MBytes   559 Mbits/sec
[  3]  1.0- 2.0 sec  66.2 MBytes   556 Mbits/sec
[  3]  2.0- 3.0 sec  66.1 MBytes   555 Mbits/sec
[  3]  3.0- 4.0 sec  66.4 MBytes   557 Mbits/sec
[  3]  4.0- 5.0 sec  66.4 MBytes   557 Mbits/sec
[  3]  5.0- 6.0 sec  66.1 MBytes   555 Mbits/sec
[  3]  6.0- 7.0 sec  66.4 MBytes   557 Mbits/sec
[  3]  7.0- 8.0 sec  66.2 MBytes   556 Mbits/sec
[  3]  8.0- 9.0 sec  66.4 MBytes   557 Mbits/sec
[  3]  9.0-10.0 sec  66.2 MBytes   556 Mbits/sec
[  3] 10.0-11.0 sec  19.4 MBytes   163 Mbits/sec
[  3] 11.0-12.0 sec  0.00 MBytes  0.00 Mbits/sec  <<< MTU change from 9000 to 3000 on the server
[  3] 12.0-13.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 13.0-14.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 14.0-15.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 15.0-16.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3] 16.0-17.0 sec  28.0 MBytes   235 Mbits/sec
[  3] 17.0-18.0 sec  66.2 MBytes   556 Mbits/sec
[  3] 18.0-19.0 sec  66.1 MBytes   555 Mbits/sec
[  3] 19.0-20.0 sec  66.1 MBytes   555 Mbits/sec
[  3]  0.0-20.0 sec   909 MBytes   381 Mbits/sec

mallat|.../ethernet/via> iperf -c lc_harten -P 1 -i 1 -f m -t 20
------------------------------------------------------------
Client connecting to lc_harten, TCP port 5001
TCP window size: 0.02 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.102 port 35427 connected with 192.168.1.105 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  93.5 MBytes   784 Mbits/sec
[  3]  1.0- 2.0 sec  93.2 MBytes   782 Mbits/sec
[  3]  2.0- 3.0 sec  93.2 MBytes   782 Mbits/sec
[  3]  3.0- 4.0 sec  93.2 MBytes   782 Mbits/sec
[  3]  4.0- 5.0 sec  55.8 MBytes   468 Mbits/sec
[  3]  5.0- 6.0 sec  0.00 MBytes  0.00 Mbits/sec  <<< MTU change from 9000 to 3000 on the client
[  3]  6.0- 7.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3]  7.0- 8.0 sec  28.8 MBytes   241 Mbits/sec
[  3]  8.0- 9.0 sec  93.1 MBytes   781 Mbits/sec
[  3]  9.0-10.0 sec  93.4 MBytes   783 Mbits/sec
[  3] 10.0-11.0 sec  93.1 MBytes   781 Mbits/sec
[  3] 11.0-12.0 sec  93.1 MBytes   781 Mbits/sec
[  3] 12.0-13.0 sec  93.4 MBytes   783 Mbits/sec
[  3] 13.0-14.0 sec  93.1 MBytes   781 Mbits/sec
[  3] 14.0-15.0 sec  93.4 MBytes   783 Mbits/sec
[  3] 15.0-16.0 sec  93.4 MBytes   783 Mbits/sec
[  3] 16.0-17.0 sec  93.2 MBytes   782 Mbits/sec
[  3] 17.0-18.0 sec  93.4 MBytes   783 Mbits/sec
[  3] 18.0-19.0 sec  90.4 MBytes   758 Mbits/sec
[  3] 19.0-20.0 sec  92.4 MBytes   775 Mbits/sec
[  3]  0.0-20.0 sec  1573 MBytes   660 Mbits/sec

Regards,
Alex
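The length of the stall after each MTU change can be read off the interval lines of the iperf reports above. A small Python sketch for doing that automatically follows; the regular expression is written against the `-f m` output format shown in this report and is an assumption about that format rather than a general iperf parser, and `longest_stall` is a name made up for this note.

```python
import re

# Matches interval lines such as "[  3]  5.0- 6.0 sec  0.00 MBytes  0.00 Mbits/sec"
INTERVAL = re.compile(
    r"\[\s*\d+\]\s+(\d+\.\d+)-\s*(\d+\.\d+)\s+sec\s+(\d+(?:\.\d+)?)\s+MBytes"
)


def longest_stall(report):
    """Return (start, end) of the longest run of zero-transfer intervals, or None."""
    best = None
    run_start = run_end = None
    for start, end, mbytes in INTERVAL.findall(report):
        start, end, mbytes = float(start), float(end), float(mbytes)
        if mbytes == 0.0:
            if run_start is None:
                run_start = start
            run_end = end
            if best is None or run_end - run_start > best[1] - best[0]:
                best = (run_start, run_end)
        else:
            run_start = None  # any non-zero interval ends the current run
    return best


SAMPLE = """\
[  3]  4.0- 5.0 sec  55.8 MBytes   468 Mbits/sec
[  3]  5.0- 6.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3]  6.0- 7.0 sec  0.00 MBytes  0.00 Mbits/sec
[  3]  7.0- 8.0 sec  28.8 MBytes   241 Mbits/sec
"""

if __name__ == "__main__":
    print(longest_stall(SAMPLE))  # prints (5.0, 7.0)
```

The sample is an excerpt of the last run above (MTU changed from 9000 to 3000 on the client), where traffic is at zero for two reporting intervals.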
Hi Alex, thanks for your tests. Well, as long as the traffic eventually starts again, a certain amount of downtime after an MTU change can be reasonable, depending on what the driver needs to do. I'll tell Francois that this patch works (equally) well as the other one so we can get this upstream. Thanks, Michele
The patch has been submitted upstream, with a request to include it in stable 3.11.x and 3.12.y.
I've added the patch. It isn't going to make 3.11.10, and that's the last 3.11 stable release. F20 is shipping with that, so we need this as a patch there anyway. Proposing as an F20 freeze exception.
+1 FE. Hanging on boot is bad, yo.
+1 FE.
Discussed at 2013-11-27 freeze exception review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-11-27/f20-blocker-review-3.2013-11-27-17.01.log.txt . Accepted as a freeze exception issue; hanging on boot is clearly not good, and can't be cleanly fixed post-release.
kernel-3.11.10-300.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/kernel-3.11.10-300.fc20
Package kernel-3.11.10-300.fc20: * should fix your issue, * was pushed to the Fedora 20 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing kernel-3.11.10-300.fc20' as soon as you are able to, then reboot. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2013-22531/kernel-3.11.10-300.fc20 then log in and leave karma (feedback).
kernel-3.11.10-200.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/kernel-3.11.10-200.fc19
kernel-3.11.10-100.fc18 has been submitted as an update for Fedora 18. https://admin.fedoraproject.org/updates/kernel-3.11.10-100.fc18
kernel-3.11.10-300.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.
kernel-3.11.10-200.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report.
kernel-3.11.10-100.fc18 has been pushed to the Fedora 18 stable repository. If problems still persist, please make note of it in this bug report.