Description of problem: I have an IBM thinkpad T60p (2008-93G with 3GB ram) running Fedora 11. When I copy a large file to or from my machine using scp the copy starts fine, but during or after the copy network manager reports that the link has been lost and reconnects to the network. This usually repeats again a minute or two later, even if the transfer has already completed. The link light on the network interface actually goes off briefly (<1s) and the following messages appear in /var/log/messages Aug 3 14:45:00 localhost NetworkManager: <info> (eth0): carrier now OFF (device state 8) Aug 3 14:45:00 localhost NetworkManager: <info> (eth0): device state change: 8 -> 2 (reason 40) Aug 3 14:45:00 localhost NetworkManager: <info> (eth0): deactivating device (reason: 40). Aug 3 14:45:00 localhost NetworkManager: <info> eth0: canceled DHCP transaction, dhcp client pid 405 Aug 3 14:45:00 localhost NetworkManager: <WARN> check_one_route(): (eth0) error -34 returned from rtnl_route_del(): Sucess#012 Aug 3 14:45:00 localhost dnsmasq[2344]: no servers found in /etc/resolv.conf, will retry Aug 3 14:45:01 localhost kernel: e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: None Aug 3 14:45:01 localhost kernel: 0000:02:00.0: eth0: 10/100 speed: disabling TSO Aug 3 14:45:01 localhost NetworkManager: <info> (eth0): carrier now ON (device state 2) Aug 3 14:45:01 localhost NetworkManager: <info> (eth0): device state change: 2 -> 3 (reason 40) Aug 3 14:45:01 localhost NetworkManager: <info> Activation (eth0) starting connection 'ethernet dhcp' things I tried that didn't fix it: - connecting the network cable to the laptop directly when on battery power - downgrading my kernel to the F11 release version - connecting directly to my desktop machine rather than via the switch (although it did seem to improve matters and it used 1Gb/s rather than 100Mb/s) Version-Release number of selected component (if applicable): both of: kernel-2.6.29.4-167.fc11.i586 kernel-2.6.29.6-217.2.3.fc11.i586 How reproducible: By copying a 100MB file, every time. Steps to Reproduce: 1. sftp remotemachine (via ethernet) 2. get largefile (arrives at around 8MByte/sec for me 3. keep an eye on /var/log/messages and throughput Actual results: After less than a minute, the above messages are printed, the network link light goes off briefly, network manager says the network connection was disconnected and begins reconnecting, and the transfer stalls until the interface is configured again. This usually repeats itself a minute or two later. Expected results: There are no messages printed in the log, the network link stays up and the file is transferred. Additional info: This wasn't a problem when I was on fedora 10, and I upgraded to fedora 11 within a week of its release. However I don't do large file transfers that often, but I have only noticed it in the last month or two. So there is a small chance that it's a hardware failure but I can't find any more detailed logging to enable. 02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller [root@john-laptop ~]# ethtool eth0 Settings for eth0: Supported ports: [ TP ] Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full Supports auto-negotiation: Yes Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full Advertised auto-negotiation: Yes Speed: 100Mb/s Duplex: Full Port: Twisted Pair PHYAD: 1 Transceiver: internal Auto-negotiation: on Supports Wake-on: pumbag Wake-on: g Current message level: 0x00000001 (1) Link detected: yes eth0 Link encap:Ethernet HWaddr 00:16:41:A7:69:46 inet addr:9.20.209.223 Bcast:9.20.209.255 Mask:255.255.255.0 inet6 addr: fe80::216:41ff:fea7:6946/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:235175 errors:0 dropped:0 overruns:0 frame:0 TX packets:287639 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:100 RX bytes:49798262 (47.4 MiB) TX bytes:193502419 (184.5 MiB) Memory:ee000000-ee020000
I am seeing something similar with F11 (fully up to date) on my machine - with smolt info: http://www.smolts.org/client/show/pub_34ba7c2a-3891-44a5-b011-98990af7b916 Sep 16 10:05:21 physics30 kernel: b44: eth0: powering down PHY Sep 16 10:05:21 physics30 NetworkManager: <info> (eth0): carrier now OFF (device state 8) Sep 16 10:05:21 physics30 kernel: b44: eth0: Link is down. Sep 16 10:05:21 physics30 NetworkManager: <info> (eth0): device state change: 8 -> 2 (reason 40) Sep 16 10:05:21 physics30 NetworkManager: <info> (eth0): deactivating device (reason: 40). Sep 16 10:05:21 physics30 NetworkManager: <info> eth0: canceled DHCP transaction, dhcp client pid 2317 Sep 16 10:05:21 physics30 NetworkManager: <WARN> check_one_route(): (eth0) error -34 returned from rtnl_route_del(): Sucess#012 Sep 16 10:05:21 physics30 avahi-daemon[1117]: Withdrawing address record for 144.32.35.20 on eth0. Sep 16 10:05:21 physics30 avahi-daemon[1117]: Leaving mDNS multicast group on interface eth0.IPv4 with address 144.32.35.20. Sep 16 10:05:21 physics30 avahi-daemon[1117]: Interface eth0.IPv4 no longer relevant for mDNS. Sep 16 10:05:22 physics30 ntpd[1467]: Deleting interface #6 eth0, 144.32.35.20#123, interface stats: received=24, sent=24, dropped=0, active_time=787 secs Sep 16 10:05:24 physics30 NetworkManager: <info> (eth0): carrier now ON (device state 2) Sep 16 10:05:24 physics30 NetworkManager: <info> (eth0): device state change: 2 -> 3 (reason 40) Sep 16 10:05:24 physics30 kernel: b44: eth0: Link is up at 100 Mbps, full duplex. Sep 16 10:05:24 physics30 kernel: b44: eth0: Flow control is off for TX and off for RX. Sep 16 10:05:24 physics30 NetworkManager: <info> Activation (eth0) starting connection 'Auto Ethernet' Sep 16 10:05:24 physics30 NetworkManager: <info> (eth0): device state change: 3 -> 4 (reason 0) Sep 16 10:05:24 physics30 NetworkManager: <info> Activation (eth0) Stage 1 of 5 (Device Prepare) scheduled... Sep 16 10:05:24 physics30 NetworkManager: <info> Activation (eth0) Stage 1 of 5 (Device Prepare) started... Sep 16 10:05:24 physics30 NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) scheduled... Sep 16 10:05:24 physics30 NetworkManager: <info> Activation (eth0) Stage 1 of 5 (Device Prepare) complete. Sep 16 10:05:24 physics30 NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) starting... Sep 16 10:05:24 physics30 NetworkManager: <info> (eth0): device state change: 4 -> 5 (reason 0) Sep 16 10:05:24 physics30 NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) successful. Sep 16 10:05:24 physics30 NetworkManager: <info> Activation (eth0) Stage 3 of 5 (IP Configure Start) scheduled. Sep 16 10:05:24 physics30 NetworkManager: <info> Activation (eth0) Stage 2 of 5 (Device Configure) complete. Sep 16 10:05:24 physics30 NetworkManager: <info> Activation (eth0) Stage 3 of 5 (IP Configure Start) started... Sep 16 10:05:24 physics30 NetworkManager: <info> (eth0): device state change: 5 -> 7 (reason 0) Sep 16 10:05:24 physics30 NetworkManager: <info> Activation (eth0) Beginning DHCP transaction. Sep 16 10:05:24 physics30 NetworkManager: <info> dhclient started with pid 2753 Sep 16 10:05:24 physics30 NetworkManager: <info> Activation (eth0) Stage 3 of 5 (IP Configure Start) complete. Sep 16 10:05:24 physics30 dhclient: Internet Systems Consortium DHCP Client 4.1.0p1 Sep 16 10:05:24 physics30 dhclient: Copyright 2004-2009 Internet Systems Consortium. Sep 16 10:05:24 physics30 dhclient: All rights reserved. Sep 16 10:05:24 physics30 dhclient: For info, please visit http://www.isc.org/sw/dhcp/ Sep 16 10:05:24 physics30 dhclient: Sep 16 10:05:24 physics30 NetworkManager: <info> DHCP: device eth0 state changed normal exit -> preinit Sep 16 10:05:24 physics30 dhclient: Listening on LPF/eth0/00:02:b3:a6:04:42 Sep 16 10:05:24 physics30 dhclient: Sending on LPF/eth0/00:02:b3:a6:04:42 Sep 16 10:05:24 physics30 dhclient: Sending on Socket/fallback Sep 16 10:05:26 physics30 dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67 Sep 16 10:05:26 physics30 dhclient: DHCPACK from 144.32.128.17 Sep 16 10:05:26 physics30 NetworkManager: <info> DHCP: device eth0 state changed preinit -> reboot Sep 16 10:05:26 physics30 NetworkManager: <info> Activation (eth0) Stage 4 of 5 (IP Configure Get) scheduled... Sep 16 10:05:26 physics30 NetworkManager: <info> Activation (eth0) Stage 4 of 5 (IP Configure Get) started... Sep 16 10:05:26 physics30 NetworkManager: <info> address 144.32.35.20 Sep 16 10:05:26 physics30 NetworkManager: <info> prefix 23 (255.255.254.0) Sep 16 10:05:26 physics30 NetworkManager: <info> gateway 144.32.34.253 Sep 16 10:05:26 physics30 NetworkManager: <info> hostname 'physics30' Sep 16 10:05:26 physics30 NetworkManager: <info> nameserver '144.32.128.242' Sep 16 10:05:26 physics30 NetworkManager: <info> nameserver '144.32.128.243' Sep 16 10:05:26 physics30 NetworkManager: <info> domain name 'csrv.ad.york.ac.uk' Sep 16 10:05:26 physics30 NetworkManager: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) scheduled... Sep 16 10:05:26 physics30 NetworkManager: <info> Activation (eth0) Stage 4 of 5 (IP Configure Get) complete. Sep 16 10:05:26 physics30 NetworkManager: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) started... Sep 16 10:05:26 physics30 avahi-daemon[1117]: Joining mDNS multicast group on interface eth0.IPv4 with address 144.32.35.20. Sep 16 10:05:26 physics30 avahi-daemon[1117]: New relevant interface eth0.IPv4 for mDNS. Sep 16 10:05:26 physics30 avahi-daemon[1117]: Registering new address record for 144.32.35.20 on eth0.IPv4. Sep 16 10:05:26 physics30 dhclient: bound to 144.32.35.20 -- renewal in 6661 seconds. Sep 16 10:05:27 physics30 NetworkManager: <info> (eth0): device state change: 7 -> 8 (reason 0) Sep 16 10:05:27 physics30 NetworkManager: <info> Policy set 'Auto Ethernet' (eth0) as default for routing and DNS. Sep 16 10:05:27 physics30 NetworkManager: <info> Activation (eth0) successful, device activated. Sep 16 10:05:27 physics30 NetworkManager: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) complete. The disconnect/reconnect happens at frequent intervals when the network is heavily loaded. Version of kernel and NM are: 2.6.30.5-43.fc11.i686.PAE NetworkManager-0.7.1-8.git20090708.fc11.i586
Ping! Any news on this bug - it still plagues me!
If it is any help my NIC is: 03:08.0 Ethernet controller: Intel Corporation 82801G (ICH7 Family) LAN Controller (rev 01) Subsystem: Dell Device 01ab Flags: bus master, medium devsel, latency 64, IRQ 20 Memory at ebeff000 (32-bit, non-prefetchable) [size=4K] I/O ports at ccc0 [size=64] Capabilities: <access denied> Kernel driver in use: e100 Kernel modules: e100
The previous info was the ethernet controller - physical NIC is description: Ethernet interface product: BCM4401 100Base-T vendor: Broadcom Corporation physical id: 9 bus info: pci@0000:01:09.0 logical name: eth0 version: 01 However I don't know if this is something that is NM related, kernel related or NIC specific?
I have updated the kernel to the latest in F11 and this problem persists - I wonder if the component may be set wrong and it is possible the problem is a bug in NetworkManager?
this seems to work fine on the same hardware on fedora 12. Switching back to 11 shows the bug again. Presumably this has been fixed along the way so I'm closing this. I suspect there was some problem caused after the network hardware moved from some low power mode to a higher power mode used when traffic was high. Because there were always 2 disconnects, and one would always happen soon after commencing traffic and one soon after stopping it, no matter how long the transfer lasted.