Description of problem:
Sometimes, on resume, the eth0 interface (which is not physically connected) reports unusual values in ifconfig:

eth0      Link encap:Ethernet  HWaddr 00:26:22:XX:XX:XX
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:1271860075133760 errors:7631173335704445 dropped:2543728740202110 overruns:1271864370101055 frame:6359313260570685
          TX packets:1271864370101055 errors:5087457480404220 dropped:0 overruns:1271864370101055 carrier:2543728740202110
          collisions:6359321850505275 txqueuelen:1000
          RX bytes:1271860075133760 (1.1 PiB)  TX bytes:1271864370101055 (1.1 PiB)
          Interrupt:45

This breaks dhclient and thus NetworkManager, for any interface.

Version-Release number of selected component (if applicable):
3.3.0-4.fc16.x86_64

Apr 4 08:27:15 KiwiBook dhclient[14703]: Bad line reading interface information
Apr 4 08:27:15 KiwiBook dhclient[14703]: Error getting interface information.
Apr 4 08:27:15 KiwiBook NetworkManager[1044]: Bad line reading interface information ...
Apr 4 08:27:15 KiwiBook dhclient[14703]: exiting.
Apr 4 08:27:15 KiwiBook NetworkManager[1044]: <info> (wlan0): DHCPv4 client pid 14703 exited with status 1

How reproducible:
Sometimes.

Steps to Reproduce:
1. Suspend the system
2. Wake up the system

Actual results:
NetworkManager won't reconnect and stops working for any interface.

Expected results:
Everything is fine.

Additional info:
Smolt of the machine: http://www.smolts.org/client/show/pub_5b3d2153-a96a-4833-bd41-e19763fa6671
rmmod / insmod atl1c resets the interface statistics. NetworkManager works again.
Happened again, using kernel 3.3.0-8.fc16.x86_64
Happened again on kernel-3.3.4-3.fc16.x86_64, this time after a reboot for upgrading the kernel, so this is not strictly related to suspend/resume.
Output of cat /proc/net/dev:

Inter-|   Receive                                                |  Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
…
  eth0: 88274463624106 88274462819165 529646776884810 176548925628270 88274462814135 441372314070675 0 88274462814135 88274463554144 88274462823115 353115031125720 0 88278757781430 441393788907150 176557515562863 0
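For anyone who wants to watch for this condition automatically, here is a minimal Python sketch that parses /proc/net/dev text into named counters. The function names and the "more errors than packets" check are assumptions made for illustration, not anything taken from the driver or from NetworkManager, and the sample contains only the header and the eth0 line quoted above (the elided lines are left out).

```python
# Column order of /proc/net/dev: 8 receive fields, then 8 transmit fields.
FIELDS = [
    "rx_bytes", "rx_packets", "rx_errs", "rx_drop", "rx_fifo", "rx_frame",
    "rx_compressed", "rx_multicast",
    "tx_bytes", "tx_packets", "tx_errs", "tx_drop", "tx_fifo", "tx_colls",
    "tx_carrier", "tx_compressed",
]


def parse_proc_net_dev(text):
    """Return {interface: {field: int}} for every data line in the text."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain '|' but no colon
        name, _, rest = line.partition(":")
        values = rest.split()
        if len(values) != len(FIELDS) or not all(v.isdigit() for v in values):
            continue  # skip anything that is not a 16-column counter line
        stats[name.strip()] = dict(zip(FIELDS, map(int, values)))
    return stats


def looks_corrupted(counters):
    """Crude, assumed heuristic: more errors than packets in either direction."""
    return (counters["rx_errs"] > counters["rx_packets"]
            or counters["tx_errs"] > counters["tx_packets"])


SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
  eth0: 88274463624106 88274462819165 529646776884810 176548925628270 88274462814135 441372314070675 0 88274462814135 88274463554144 88274462823115 353115031125720 0 88278757781430 441393788907150 176557515562863 0
"""

stats = parse_proc_net_dev(SAMPLE)
for iface, counters in stats.items():
    print(iface, "suspicious" if looks_corrupted(counters) else "ok")
```

On a live system the same function could be fed the contents of /proc/net/dev instead of SAMPLE.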
We updated the atl1c driver in kernel-3.3.6-3.fc17; perhaps the update fixes this issue. Please test.
So, do you still hit this issue on updated kernels?
Experienced this again on Fedora 17 (after an upgrade), using kernel 3.4.3-1.fc17.x86_64
Still happens using kernel 3.4.4-5.fc17.x86_64
I see now that once the problem appears the numbers continue growing:

eth0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 00:26:22:51:eb:9b  txqueuelen 1000  (Ethernet)
        RX packets 251285651528565  bytes 251285651528565 (228.5 TiB)
        RX errors 1507713909171390  dropped 502571303057130  overruns 251285651528565  frame 1256428257642825
        TX packets 251285651528565  bytes 251285651528565 (228.5 TiB)
        TX errors 1005142606114260  dropped 0 overruns 251285651528565  carrier 502571303057130  collisions 1256428257642825

eth0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 00:26:22:51:eb:9b  txqueuelen 1000  (Ethernet)
        RX packets 251298536430450  bytes 251298536430450 (228.5 TiB)
        RX errors 1507791218582700  dropped 502597072860900  overruns 251298536430450  frame 1256492682152250
        TX packets 251298536430450  bytes 251298536430450 (228.5 TiB)
        TX errors 1005194145721800  dropped 0 overruns 251298536430450  carrier 502597072860900  collisions 1256492682152250
Still happens in Fedora 18, kernel 3.7.2-204.fc18.x86_64
Hmm, it appears that this may be a hardware issue, as we're just reading hardware registers into our software stats block. Maybe the hardware registers stop clearing properly. When they start to grow large like this, do they take a big jump and then grow linearly again, or do they start to grow exponentially?
ping, any response here?
I'm sorry, but I'm currently not using that system. I hope to find some time next week to look at how the numbers grow.
ping, any update?
This message is a reminder that Fedora 17 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 17. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 17 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Hello, I just experienced this again, on kernel 3.13.3-201.fc20.x86_64. The counters are growing linearly at a huge speed:

# ( for i in {1..3}; do ifconfig eth0; sleep 1; done ) | grep -E [TR]X
        RX packets 20169166417320  bytes 20169166417320 (18.3 TiB)
        RX errors 121014998503920  dropped 40338332834640  overruns 20169166417320  frame 100845832086600
        TX packets 20169166417320  bytes 20169166417320 (18.3 TiB)
        TX errors 80676665669280  dropped 0 overruns 20169166417320  carrier 40338332834640  collisions 100845832086600
        RX packets 20182051319205  bytes 20182051319205 (18.3 TiB)
        RX errors 121092307915230  dropped 40364102638410  overruns 20182051319205  frame 100910256596025
        TX packets 20182051319205  bytes 20182051319205 (18.3 TiB)
        TX errors 80728205276820  dropped 0 overruns 20182051319205  carrier 40364102638410  collisions 100910256596025
        RX packets 20194936221090  bytes 20194936221090 (18.3 TiB)
        RX errors 121169617326540  dropped 40389872442180  overruns 20194936221090  frame 100974681105450
        TX packets 20194936221090  bytes 20194936221090 (18.3 TiB)
        TX errors 80779744884360  dropped 0 overruns 20194936221090  carrier 40389872442180  collisions 100974681105450

Sorry for the long delay; I stopped using that laptop for some time, but I am now starting to use it roughly daily.
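The three samples above can be checked arithmetically. The sketch below computes the per-sample deltas for the "RX packets" and "RX errors" values and tests them against 2^32 - 1 (0xFFFFFFFF). Comparing against that constant is an assumption chosen for illustration. That the values turn out to be exact multiples of it would be consistent with 32-bit registers reading back as all ones, but nothing in this report confirms that interpretation.

```python
ALL_ONES_32 = 0xFFFFFFFF  # 2**32 - 1; the comparison constant is an assumption

# Values copied from the three ifconfig samples, taken about one second apart.
rx_packets = [20169166417320, 20182051319205, 20194936221090]
rx_errors = [121014998503920, 121092307915230, 121169617326540]


def deltas(samples):
    """Differences between consecutive samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


packet_deltas = deltas(rx_packets)
error_deltas = deltas(rx_errors)

for name, values, steps in (("RX packets", rx_packets, packet_deltas),
                            ("RX errors", rx_errors, error_deltas)):
    # Linear growth: every step between samples is the same size.
    print(name, "steps:", steps)
    # How many times 0xFFFFFFFF fits into each step, and what is left over.
    print(name, "step / 0xFFFFFFFF:", [divmod(s, ALL_ONES_32) for s in steps])
    # Remainder of each absolute counter value modulo 0xFFFFFFFF.
    print(name, "value % 0xFFFFFFFF:", [v % ALL_ONES_32 for v in values])
```

Both counters advance by a constant step: 3 x 0xFFFFFFFF for packets and 18 x 0xFFFFFFFF for errors, with no remainder in either the steps or the absolute values.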
OK, that's great, but you still need to answer my question in comment 10; just reopening this doesn't help.
Sorry, I forgot to answer that part. It makes a big jump and then grows linearly (with no cable attached since boot).
OK, thank you. The stats update mechanism relies on the notion that all the stats registers are read-clear. The fact that the stats take a big jump and then grow linearly suggests that occasionally some of the read-clear registers don't clear, resulting in a large double count. I think you're seeing a hardware problem. You may want to check with Atheros to see if you can find an errata sheet or firmware update for your NIC.
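The double-counting argument can be illustrated with a toy model. The sketch below is not the atl1c code; the class and function names are hypothetical. It accumulates a software total by repeatedly reading one simulated register, once with read-clear working and once with the register failing to clear.

```python
class StatsRegister:
    """Toy read-clear hardware counter (hypothetical, not the atl1c layout)."""

    def __init__(self, clears_on_read=True):
        self.value = 0
        self.clears_on_read = clears_on_read

    def count(self, events):
        """Hardware side: record new events."""
        self.value += events

    def read(self):
        """Driver side: read the register; a healthy one resets to zero."""
        value = self.value
        if self.clears_on_read:
            self.value = 0
        return value


def poll(register, events_per_poll):
    """Add each register read to a running software total, as a stats update would."""
    total = 0
    history = []
    for events in events_per_poll:
        register.count(events)
        total += register.read()
        history.append(total)
    return history


# Five events arrive before the first poll, then the link is idle.
healthy = poll(StatsRegister(clears_on_read=True), [5, 0, 0, 0])
stuck = poll(StatsRegister(clears_on_read=False), [5, 0, 0, 0])

print("healthy:", healthy)  # total stays flat once traffic stops
print("stuck:  ", stuck)    # same stale value is re-added on every poll
```

With a register that does not clear, the software total keeps growing by the stale value on every poll even though no new events occur, which matches the "big jump, then linear growth" pattern described here.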
*********** MASS BUG UPDATE ************** This bug has been in a needinfo state for several weeks and is being closed with insufficient data due to inactivity. If this is still an issue with Fedora 20, please feel free to reopen the bug and provide the additional information requested.
I experience the same in Ubuntu Precise 32-bit (with linux 3.2.0-58-generic-pae and the kernels Precise originally shipped with), at least when my EEE PC 1015PN is unplugged from Ethernet. I ran a test to answer comment 10: the error count stays at 0 for several days and does not change across suspend/resume cycles. Then it suddenly jumps to almost 2^32, but not quite: 4294967274 (it may have been higher between samples, which I took once every 60 seconds). After that the error count decreases linearly. If I suspend, wait some time, then resume, the error count continues from its previous value, i.e. it does not change while suspended. At least this is true for the "RX packets" column; I can check the others, but the main fact is that they all suddenly go from 0 to ~2^32.
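A jump to just under 2^32 followed by a linear decrease is what unsigned 32-bit wraparound would produce if the same large value were added on every stats update. The sketch below assumes, without confirmation from this report, that the counter is held in 32 bits (as an `unsigned long` would be on a 32-bit kernel) and that each update adds 0xFFFFFFFF. The function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep results within an unsigned 32-bit range


def accumulate_32bit(reading, updates):
    """Add the same reading to a 32-bit counter repeatedly, wrapping on overflow."""
    total = 0
    history = []
    for _ in range(updates):
        total = (total + reading) & MASK32
        history.append(total)
    return history


# Assumption: every stats update adds 0xFFFFFFFF to a counter that started at 0.
history = accumulate_32bit(0xFFFFFFFF, 22)

print(history[:3])   # first values after the jump
print(history[-1])   # value after 22 updates
```

Under these assumptions the counter jumps to 4294967295 on the first update and then drops by 1 per update, reaching 4294967274 after 22 updates, the value reported above.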
Also, if I reload atl1c module ("modprobe -r atl1c", then "modprobe atl1c"), then all stats reset to zeroes.
I am also having the same issue and can reproduce it (within 2 minutes to one day). I have tested with the last 4 non-debug kernels for Fedora 19, but there are few or no errors in messages/dmesg, as the kernel thinks everything is OK. I believe this hardware worked without problems until roughly November 2013, but since the problem was sporadic it was attributed to something else. Now it is much worse. I have all logs, dmesg, ifconfig, ethtool. atl1c appears to be up, and ifconfig reports that it has link, but with massive errors. It appears the PCI bus may get reset or fail, as modules complain that they cannot talk to their hardware or that it is in a bad state. I have removed/disabled possibly suspect PCI devices and the problem continues. I have run memtest86 and memtester, turned off NetworkManager, disabled all other networking such as Bluetooth, and removed/disabled all utilities such as network monitoring tools. I have tried unplugging the Ethernet cable. I can ifdown/rmmod/modprobe atl1c/ifup the device and I can access the network again, most of the time. But it has knocked other devices offline, so a restart is necessary. Thank you for any assistance.
Created attachment 909121: ethtool, ifconfig, dmesg, var/log/messages
Created attachment 909314: lspci -vvvn