Bug 1080709 - wired connection stops working when wireless is active, have to manually choose wired or wireless
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 21
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: fedora-kernel-wireless-ath
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-03-26 02:10 UTC by Tom Georgoulias
Modified: 2015-05-04 19:24 UTC
CC List: 13 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 981680
Environment:
Last Closed: 2015-01-31 16:50:20 UTC
Type: Bug
Embargoed:


Description Tom Georgoulias 2014-03-26 02:10:30 UTC
After upgrading from Fedora 19 to Fedora 20, bz #981680 has returned.  (I think the complete fix was documented in https://bugzilla.redhat.com/show_bug.cgi?id=995308)

My updated info:

# uname -r
3.13.6-200.fc20.i686+PAE

# modinfo atl1c
filename:       /lib/modules/3.13.6-200.fc20.i686+PAE/kernel/drivers/net/ethernet/atheros/atl1c/atl1c.ko
version:        1.0.1.1-NAPI
license:        GPL
description:    Qualcom Atheros 100/1000M Ethernet Network Driver
author:         Qualcomm Atheros Inc., <nic-devel>
author:         Jie Yang
srcversion:     77A3FAB53265A7C9ACA8A49
alias:          pci:v00001969d00001083sv*sd*bc*sc*i*
alias:          pci:v00001969d00001073sv*sd*bc*sc*i*
alias:          pci:v00001969d00002062sv*sd*bc*sc*i*
alias:          pci:v00001969d00002060sv*sd*bc*sc*i*
alias:          pci:v00001969d00001062sv*sd*bc*sc*i*
alias:          pci:v00001969d00001063sv*sd*bc*sc*i*
depends:        
intree:         Y
vermagic:       3.13.6-200.fc20.i686+PAE SMP mod_unload 686 
signer:         Fedora kernel signing key
sig_key:        F0:30:B5:8F:65:BF:2C:A6:B7:CC:81:E3:96:9E:CC:60:E0:42:E4:19
sig_hashalgo:   sha256



+++ This bug was initially created as a clone of Bug #981680 +++

Description of problem:
The NIC on my laptop does not keep a stable network connection if my wireless card is enabled and I am simultaneously plugged into a wired connection.  The wired connection works for a variable period of time, then all of my network connections start hanging.  If I pull the plug on my wired connection, it will switch back to wireless and start working again.  If I do the reverse and kill the radio via the keyboard killswitch, then switch to the wired connection, it also starts working.

Version-Release number of selected component (if applicable):
3.9.8-300.fc19.i686.PAE

Steps to Reproduce:
1. Plug a network patch cable into the onboard NIC while wifi is active, or boot the laptop with the cable plugged in (wifi starts automatically on boot)
2. Wait


Actual results:

The laptop will use the wired connection for some period of time (a few minutes, sometimes a little longer), then all networking operations hang until I pull the cable or kill the wifi with the switch on my keyboard.

Expected results:
Wired connection should just work.

Additional info:

I did not experience this behavior in Fedora 17 or any prior release; the NIC has "just worked" on this laptop since Fedora 14 or 15.

08:00.0 Ethernet controller: Qualcomm Atheros AR8151 v2.0 Gigabit Ethernet (rev c0)
	Subsystem: Hewlett-Packard Company Device 1650
	Flags: bus master, fast devsel, latency 0, IRQ 45
	Memory at c1400000 (64-bit, non-prefetchable) [size=256K]
	I/O ports at 2000 [size=128]
	Capabilities: [40] Power Management version 3
	Capabilities: [48] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [58] Express Endpoint, MSI 00
	Capabilities: [6c] Vital Product Data
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [180] Device Serial Number ff-f9-df-82-10-1f-74-ff
	Kernel driver in use: atl1c

# modinfo atl1c
filename:       /lib/modules/3.9.8-300.fc19.i686.PAE/kernel/drivers/net/ethernet/atheros/atl1c/atl1c.ko
version:        1.0.1.1-NAPI
license:        GPL
description:    Qualcom Atheros 100/1000M Ethernet Network Driver
author:         Qualcomm Atheros Inc., <nic-devel>
author:         Jie Yang
srcversion:     BDE702F64103375A65FCBB2
alias:          pci:v00001969d00001083sv*sd*bc*sc*i*
alias:          pci:v00001969d00001073sv*sd*bc*sc*i*
alias:          pci:v00001969d00002062sv*sd*bc*sc*i*
alias:          pci:v00001969d00002060sv*sd*bc*sc*i*
alias:          pci:v00001969d00001062sv*sd*bc*sc*i*
alias:          pci:v00001969d00001063sv*sd*bc*sc*i*
depends:        
intree:         Y
vermagic:       3.9.8-300.fc19.i686.PAE SMP mod_unload 686 
signer:         Fedora kernel signing key
sig_key:        F4:91:27:00:2C:8A:29:92:9F:C4:33:B7:AB:1A:3F:87:02:40:AB:A6
sig_hashalgo:   sha256

--- Additional comment from Josh Boyer on 2013-07-05 09:31:04 EDT ---

Neil, I think you were working on another bug that is this exact issue but I've stared at so many different bugs this week that I can't remember which it was.  Want to dup this to the other bug if I'm remembering correctly?

--- Additional comment from Tom Georgoulias on 2013-07-07 10:53:29 EDT ---

More data that might help isolate this.

This seems to work:  If I boot up my laptop w/o a cable plugged into the nic, log into my desktop session, then plug in to use the wired connection, it seems to be fine and I use wired the entire time.

As soon as my session is idle and the screensaver starts (or if I activate the screensaver manually), the wired connection won't recover when I log back in.  I have to pull and reseat the connection to get it working.

Also, if I put the laptop to sleep, it behaves the same way when I wake it back up.  I have to reseat the wired connection.

I am running the latest kernel, 3.9.9-301.fc19.i686.PAE.  It still has this bug.

--- Additional comment from Michele Baldessari on 2013-08-17 18:41:25 EDT ---

Tom,

could you check if by chance https://bugzilla.redhat.com/show_bug.cgi?id=995308 is related (see last comment there)?

thanks,
Michele

--- Additional comment from Tom Georgoulias on 2013-08-17 20:23:10 EDT ---

Thank you for the suggestion, I have installed kernel-PAE-3.10.7-200.fc19.i686.rpm and will report back with my findings.

--- Additional comment from Tom Georgoulias on 2013-08-18 21:09:26 EDT ---

So far so good.  I have been running this kernel for nearly a day and haven't had the wired connection drop on me once.  I have been running my laptop "normally", letting network manager handle the NIC and not using any toggle switches or the radio kill switch.  I'm still testing it and will post another update tomorrow or the next day if things are still going smoothly, or sooner if it fails on me.

--- Additional comment from Tom Georgoulias on 2013-08-19 21:26:35 EDT ---

The wired NIC under the new kernel continues to work w/o error.

--- Additional comment from Tom Georgoulias on 2013-08-19 21:26:56 EDT ---

The wired NIC using the new kernel continues to work w/o error.

--- Additional comment from Michele Baldessari on 2013-11-26 03:37:51 EST ---

Hi Tom,

is the issue gone? If yes, can you go ahead and close this BZ?

Thanks,
Michele

--- Additional comment from Tom Georgoulias on 2013-11-26 18:05:04 EST ---

Yes, I'm no longer suffering from this bug.  You can close this BZ.

--- Additional comment from Josh Boyer on 2013-11-26 18:17:33 EST ---

Thanks all.

Comment 1 Tom Georgoulias 2014-04-16 00:16:28 UTC
I've continued to see some form of this bug with the latest kernels, including the one I'm running now:

3.13.9-200.fc20.i686+PAE

Is there something I can provide to help diagnose this or test a fix?

Comment 2 Justin M. Forbes 2014-05-21 19:41:32 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 20 kernel bugs.

Fedora 20 has now been rebased to 3.14.4-200.fc20.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you experience different issues, please open a new bug report for those.

Comment 3 Tom Georgoulias 2014-05-21 23:24:08 UTC
This is still a problem, even with kernel 3.14.4-200.fc20.i686+PAE

Comment 4 Justin M. Forbes 2014-11-13 16:05:04 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 20 kernel bugs.

Fedora 20 has now been rebased to 3.17.2-200.fc20.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 21, and are still experiencing this issue, please change the version to Fedora 21.

If you experience different issues, please open a new bug report for those.

Comment 5 Tom Georgoulias 2014-12-07 14:58:50 UTC
This bug is still present in Fedora 21.  I just had the same behavior occur this morning on my freshly installed Fedora 21 OS.

I really wish whatever patch was applied when Michele Baldessari updated the other bz ticket were applied to the latest kernel.  That fixed it.  I don't know why it was pulled out; it seems like that patch could just be applied to the new kernels and this bug would go away again.

[root@crankarm ~]# uname -r
3.17.4-301.fc21.x86_64
[root@crankarm ~]# modinfo atl1c
filename:       /lib/modules/3.17.4-301.fc21.x86_64/kernel/drivers/net/ethernet/atheros/atl1c/atl1c.ko.xz
version:        1.0.1.1-NAPI
license:        GPL
description:    Qualcom Atheros 100/1000M Ethernet Network Driver
author:         Qualcomm Atheros Inc., <nic-devel>
author:         Jie Yang
srcversion:     4333D8ADEE755DD5ABDF0B8
alias:          pci:v00001969d00001083sv*sd*bc*sc*i*
alias:          pci:v00001969d00001073sv*sd*bc*sc*i*
alias:          pci:v00001969d00002062sv*sd*bc*sc*i*
alias:          pci:v00001969d00002060sv*sd*bc*sc*i*
alias:          pci:v00001969d00001062sv*sd*bc*sc*i*
alias:          pci:v00001969d00001063sv*sd*bc*sc*i*
depends:        
intree:         Y
vermagic:       3.17.4-301.fc21.x86_64 SMP mod_unload 
signer:         Fedora kernel signing key
sig_key:        4C:74:34:E0:6F:FA:84:0A:EA:AA:9E:91:F7:66:C5:FD:A0:77:12:60
sig_hashalgo:   sha256

Comment 6 Tom Georgoulias 2014-12-23 03:47:00 UTC
Any updates on this?  It wasn't always a bug: it was fixed in Fedora 19, then reintroduced, and has stuck around since.  If someone would look at it, I'd be glad to do any testing and give feedback on the fix.  You can probably apply the same fix that was used the first time around.

Comment 7 DO NOT USE account not monitored (old adamwill) 2014-12-29 22:06:50 UTC
The original report says "This should be fixed via 7b70176421993866e616f1cbc4d0dd4054f1bf78 that was included in v3.11-rc4 in Linus' tree." What that means is that the fix should have gone upstream as of 3.11. If you're still having problems, either it didn't, or it's been regressed upstream later somehow, or you're now seeing a different bug with a similar symptom.

I do see the fix in Linus' tree:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/ethernet/atheros/atl1c/atl1c_main.c?id=7b70176421993866e616f1cbc4d0dd4054f1bf78

so I think we can eliminate option 1. I don't see any later change that reverts that commit. It's *possible* that https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/ethernet/atheros/atl1c/atl1c_main.c?id=07641c8fa45774d5e99f4bdc8c37a7d174a2e973 or https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/ethernet/atheros/atl1c/atl1c_main.c?id=0f5c113c5adb56c1352c05155dd4a711b68a839b are somehow to blame, I guess, though given the dates, it doesn't seem likely.

I think the best thing to do may be to treat this as a new bug and file a report of the issue 'from scratch' with all necessary debugging info, rather than assuming it's simply a recurrence of the original bug.

Comment 8 Tom Georgoulias 2015-01-07 23:30:39 UTC
OK, I can do that.  I'm not exactly sure what debugging info is needed, so I'll start with the output of /var/log/messages before and after the link drops.  Once I unplug the cable and go back to wifi, my network connection is restored and everything is fine.

Jan  7 17:36:58 crankarm NetworkManager[853]: <error> [1420670218.996151] [devices/nm-device.c:2026] activation_source_schedule(): activation stage already scheduled
Jan  7 17:38:41 crankarm NetworkManager[853]: <error> [1420670321.091102] [devices/nm-device.c:2026] activation_source_schedule(): activation stage already scheduled
Jan  7 17:44:54 crankarm NetworkManager[853]: <error> [1420670694.953497] [devices/nm-device.c:2026] activation_source_schedule(): activation stage already scheduled
Jan  7 17:44:59 crankarm NetworkManager[853]: <error> [1420670699.663337] [devices/nm-device.c:2026] activation_source_schedule(): activation stage already scheduled
Jan  7 17:45:04 crankarm NetworkManager[853]: <error> [1420670704.576670] [devices/nm-device.c:2026] activation_source_schedule(): activation stage already scheduled
Jan  7 17:50:06 crankarm NetworkManager[853]: <error> [1420671006.548638] [devices/nm-device.c:2026] activation_source_schedule(): activation stage already scheduled
Jan  7 17:53:29 crankarm NetworkManager[853]: <error> [1420671209.001743] [devices/nm-device.c:2026] activation_source_schedule(): activation stage already scheduled


Jan  7 17:59:06 crankarm NetworkManager[853]: <error> [1420671546.000802] [devices/nm-device.c:2026] activation_source_schedule(): activation stage already scheduled
Jan  7 18:01:01 crankarm systemd: Starting Paths.
Jan  7 18:01:01 crankarm systemd: Reached target Paths.
Jan  7 18:01:01 crankarm systemd: Starting Timers.
Jan  7 18:01:01 crankarm systemd: Reached target Timers.
Jan  7 18:01:01 crankarm systemd: Starting Sockets.
Jan  7 18:01:01 crankarm systemd: Reached target Sockets.
Jan  7 18:01:01 crankarm systemd: Starting Basic System.
Jan  7 18:01:01 crankarm systemd: Reached target Basic System.
Jan  7 18:01:01 crankarm systemd: Starting Default.
Jan  7 18:01:01 crankarm systemd: Reached target Default.
Jan  7 18:01:01 crankarm systemd: Startup finished in 11ms.
Jan  7 18:01:01 crankarm systemd: Stopping Default.
Jan  7 18:01:01 crankarm systemd: Stopped target Default.
Jan  7 18:01:01 crankarm systemd: Stopping Basic System.
Jan  7 18:01:01 crankarm systemd: Stopped target Basic System.
Jan  7 18:01:01 crankarm systemd: Stopping Paths.
Jan  7 18:01:01 crankarm systemd: Stopped target Paths.
Jan  7 18:01:01 crankarm systemd: Stopping Timers.
Jan  7 18:01:01 crankarm systemd: Stopped target Timers.
Jan  7 18:01:01 crankarm systemd: Stopping Sockets.
Jan  7 18:01:01 crankarm systemd: Stopped target Sockets.
Jan  7 18:01:01 crankarm systemd: Starting Shutdown.
Jan  7 18:01:01 crankarm systemd: Reached target Shutdown.
Jan  7 18:01:01 crankarm systemd: Starting Exit the Session...
Jan  7 18:01:01 crankarm systemd: Received SIGRTMIN+24 from PID 2774 (kill).
Jan  7 18:01:01 crankarm kernel: [371545.616385] traps: polkitd[4443] general protection ip:7f53d8f1cde2 sp:7fffa0b6b050 error:0 in libmozjs-17.0.so[7f53d8ddd000+3ba000]
Jan  7 18:01:01 crankarm kernel: traps: polkitd[4443] general protection ip:7f53d8f1cde2 sp:7fffa0b6b050 error:0 in libmozjs-17.0.so[7f53d8ddd000+3ba000]
Jan  7 18:01:01 crankarm systemd: polkit.service: main process exited, code=killed, status=11/SEGV
Jan  7 18:01:01 crankarm systemd: Unit polkit.service entered failed state.
Jan  7 18:01:01 crankarm systemd: polkit.service failed.
Jan  7 18:01:01 crankarm NetworkManager[853]: <warn> error requesting auth for org.freedesktop.NetworkManager.wifi.share.open: (4) GDBus.Error:org.freedesktop.DBus.Error.NoReply: Message did not receive a reply (timeout by message bus)
Jan  7 18:01:01 crankarm NetworkManager[853]: <warn> error requesting auth for org.freedesktop.NetworkManager.wifi.share.protected: (4) GDBus.Error:org.freedesktop.DBus.Error.NoReply: Message did not receive a reply (timeout by message bus)
Jan  7 18:01:01 crankarm NetworkManager[853]: <warn> error requesting auth for org.freedesktop.NetworkManager.settings.modify.hostname: (4) GDBus.Error:org.freedesktop.DBus.Error.NoReply: Message did not receive a reply (timeout by message bus)


# modinfo atl1c
filename:       /lib/modules/3.17.7-300.fc21.x86_64/kernel/drivers/net/ethernet/atheros/atl1c/atl1c.ko.xz
version:        1.0.1.1-NAPI
license:        GPL
description:    Qualcom Atheros 100/1000M Ethernet Network Driver
author:         Qualcomm Atheros Inc., <nic-devel>
author:         Jie Yang
srcversion:     4333D8ADEE755DD5ABDF0B8
alias:          pci:v00001969d00001083sv*sd*bc*sc*i*
alias:          pci:v00001969d00001073sv*sd*bc*sc*i*
alias:          pci:v00001969d00002062sv*sd*bc*sc*i*
alias:          pci:v00001969d00002060sv*sd*bc*sc*i*
alias:          pci:v00001969d00001062sv*sd*bc*sc*i*
alias:          pci:v00001969d00001063sv*sd*bc*sc*i*
depends:        
intree:         Y
vermagic:       3.17.7-300.fc21.x86_64 SMP mod_unload 
signer:         Fedora kernel signing key
sig_key:        A3:DE:AA:E7:2F:85:C6:30:29:D7:87:90:41:C6:33:43:8A:40:E6:88
sig_hashalgo:   sha256

Comment 9 Tom Georgoulias 2015-01-12 01:53:12 UTC
I don't know what changed, but after I updated with the latest batch of errata this morning I've been able to use my wired connection non-stop all day long.

I'm running these now:

kernel-3.17.8-300.fc21.x86_64
systemd-216-14.fc21.x86_64
NetworkManager-0.9.10.1-1.git20150105.fc21.x86_64

I'm cautiously optimistic that this bug has been fixed, but I'll give it another day before I break out the champagne.

Comment 10 Penelope Fudd 2015-01-12 08:29:42 UTC
Hi...

I think I've figured out what's going wrong, at least in my situation.

My laptop has a wired and wireless connection to the same access point, and both networks are 192.168.116.0/24.  The apartment building gateway has the same ip address and mac address (192.168.116.5 and 24:a4:3c:05:29:82, vendor-id "Ubiquiti Networks") in both networks.  My wired interface is 192.168.116.115, and my wifi interface is 192.168.116.183.

To reproduce this, I have both wifi and ether up, then I unplug the ethernet cable, wait a few seconds, then plug it back in.  The network then goes down, waits a while, comes up for less than a minute, goes down, lather-rinse-repeat.

The ethernet is using the r8169 module, the wifi is brcmsmac.

I've got 
kernel 3.17.7-200.fc20.i686
systemd-208-28.fc20.i686
NetworkManager-0.9.9.0-46.git20131003.fc20.i686

When I first noticed this, I tried to debug it with tcpdump, but much to my surprise, the problem went away *while* I was running tcpdump (Heisenbug!).  Perhaps it's arp filtering that's causing the problem?

# dmesg | gawk '/brcmsmac/{sub("^[^]]*[]]","");print}' | sort -u
 brcmsmac bcma0:0: brcmsmac: brcms_ops_bss_info_changed: associated
 brcmsmac bcma0:0: brcmsmac: brcms_ops_bss_info_changed: disassociated
 brcmsmac bcma0:0: brcms_ops_bss_info_changed: arp filtering: 0 addresses (implement)
 brcmsmac bcma0:0: brcms_ops_bss_info_changed: arp filtering: 1 addresses (implement)
 brcmsmac bcma0:0: brcms_ops_bss_info_changed: qos enabled: false (implement)
 brcmsmac bcma0:0: brcms_ops_bss_info_changed: qos enabled: true (implement)
 brcmsmac bcma0:0: brcms_ops_config: change power-save mode: false (implement)
 brcmsmac bcma0:0: mfg 4bf core 812 rev 24 class 0 irq 17
 brcmsmac bcma0:0 wlp2s0: renamed from wlan0
 ieee80211 phy0: registered radio enabled led device: brcmsmac-phy0:radio gpio: 243

I'll update my kernel, and see if I get success too.

Cheers!

Comment 11 Tom Georgoulias 2015-01-13 02:22:16 UTC
Unfortunately, it has returned.  I was peacefully working away on my laptop and suddenly web pages stopped loading.  Pull the plug, wait, plug it back in and things are OK again (for a while).

I'll try to capture some data when it recurs so I have something to work with.  For now it just seems like the NIC gets dropped and it goes back to wifi.

[53008.520385] atl1c 0000:08:00.0: atl1c: eno1 NIC Link is Down
[57846.290646] atl1c 0000:08:00.0: atl1c: eno1 NIC Link is Up<100 Mbps Half Duplex>
[57846.290661] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready
[58801.815718] atl1c 0000:08:00.0: atl1c: eno1 NIC Link is Down
[58814.463440] atl1c 0000:08:00.0: atl1c: eno1 NIC Link is Up<100 Mbps Half Duplex>
[58814.463479] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready
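Link flaps like the ones above are easier to scan when reduced to one line per event, with half-duplex negotiation flagged. The following is only a sketch: `link_events` is a hypothetical helper name, and it assumes the driver keeps the `NIC Link is Up<...>` / `NIC Link is Down` message format shown in this log.

```shell
# link_events IFACE
# Reads dmesg-style text on stdin and prints one line per link event
# for IFACE. (Hypothetical helper; message format taken from the log above.)
link_events() {
    awk -v ifc="$1" '
        {
            # Keep the bracketed kernel timestamp, if the line has one.
            ts = substr($0, 1, index($0, "]"))
        }
        index($0, ifc " NIC Link is Down") { print ts, "down"; next }
        index($0, ifc " NIC Link is Up<") {
            mode = $0
            sub(/^.*NIC Link is Up</, "", mode)   # drop everything before the mode
            sub(/>.*$/, "", mode)                 # drop the closing bracket
            note = (mode ~ /Half Duplex/) ? " (half duplex)" : ""
            print ts, "up", mode note
        }
    '
}

# Usage (not run here):
#   dmesg | link_events eno1
```

A half-duplex result on a link that should be full duplex is worth noting in the report, since it points at negotiation rather than at whatever is choosing between wired and wifi.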

I can't tell if something is telling it to switch (like NetworkManager) or the kernel driver is failing and the eno1 becomes useless.

Comment 12 Penelope Fudd 2015-01-13 03:49:01 UTC
Try turning wifi off then on.

Do you have the weird network configuration that I do, where the router on both nets has the same ip and mac?

Comment 13 Penelope Fudd 2015-01-13 03:52:15 UTC
That is to say, in order to make network fail, I had to unplug and replug the ethernet cable.  In order to bring it back, I had to turn off and turn on wifi with the rf kill switch on the laptop.

It might also be helpful to watch the output of 'ip monitor' (which runs continuously) and to check 'ip neighbour' frequently.  I don't know what to look for, but there are events happening when the network goes down (and up).
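Because `ip monitor` runs until interrupted, it helps to stamp each event with wall-clock time so it can be lined up with syslog afterwards. A minimal sketch follows; `stamp_lines` is a hypothetical helper and the log path in the usage comment is only an example.

```shell
# stamp_lines
# Prefixes every line read from stdin with the current date and time.
# (Hypothetical helper, not part of iproute2.)
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%F %T')" "$line"
    done
}

# Usage (not run here, since ip monitor does not exit on its own):
#   ip monitor | stamp_lines >> /tmp/ip-monitor.log
#   ip neighbour | stamp_lines >> /tmp/ip-neigh.log
```

Running the `ip neighbour` line by hand just before and just after a failure gives two snapshots to compare against the continuous log.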

Comment 14 Tom Georgoulias 2015-01-15 13:17:37 UTC
I've experimented some more, monitoring syslog and ip monitor output.  (On my network, I have different IPs and MACs for each interface.) It kinda feels like this is a networkmanager issue.

I had lots of errors in ip monitor for IPv6 issues, presumably because it's trying to get an IPv6 connection and my LAN doesn't provide IPv6.  To rule out any ill effects, I disabled IPv6 in both the wired and wireless configs in the network settings.  I also added these to /etc/sysctl.conf and rebooted:

net.ipv6.conf.all.disable=1
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.eno1.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv6.conf.wlo1.disable_ipv6 = 1
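As far as I know, the kernel exposes this switch only as `disable_ipv6`, so the first line above (`net.ipv6.conf.all.disable=1`) probably names a key that does not exist and has no effect; the remaining lines do the actual work. A quick way to catch this kind of typo is to check each key against `/proc/sys`. The sketch below assumes a simple `key = value` file; `check_sysctl_keys` is a hypothetical helper, and it does not handle interface names that themselves contain dots.

```shell
# check_sysctl_keys FILE [ROOT]
# Reports, for every key in FILE, whether a matching entry exists under
# ROOT (default /proc/sys). (Hypothetical helper for illustration.)
check_sysctl_keys() {
    conf=$1
    root=${2:-/proc/sys}
    while IFS= read -r line; do
        # Skip blank lines and comments.
        case $line in ''|'#'*|';'*) continue ;; esac
        key=${line%%=*}
        key=$(printf '%s' "$key" | tr -d ' \t')
        # sysctl keys map to paths: net.ipv6.conf.all -> net/ipv6/conf/all
        path=$root/$(printf '%s' "$key" | tr '.' '/')
        if [ -e "$path" ]; then
            echo "ok      $key"
        else
            echo "missing $key"
        fi
    done < "$conf"
}

# Usage (not run here):
#   check_sysctl_keys /etc/sysctl.conf
```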

If I boot my laptop up with both wired and wifi enabled, it'll choose the wifi connection.  If I pull the cable, wait a few sec, then plug it back in, it uses wired until I put my laptop to sleep. 

Here's some example ip monitor output while successfully using the wired connection:

192.168.1.72 dev wlo1 lladdr b8:27:eb:ea:8f:b2 REACHABLE
192.168.1.72 dev wlo1 lladdr b8:27:eb:ea:8f:b2 STALE
192.168.1.72 dev wlo1 lladdr b8:27:eb:ea:8f:b2 REACHABLE
192.168.1.254 dev eno1 lladdr 74:9d:dc:db:4a:29 STALE
192.168.1.254 dev eno1 lladdr 74:9d:dc:db:4a:29 STALE
192.168.1.254 dev eno1 lladdr 74:9d:dc:db:4a:29 STALE
192.168.1.254 dev eno1 lladdr 74:9d:dc:db:4a:29 STALE
192.168.1.254 dev eno1 lladdr 74:9d:dc:db:4a:29 STALE
192.168.1.254 dev eno1 lladdr 74:9d:dc:db:4a:29 STALE
192.168.1.72 dev wlo1 lladdr b8:27:eb:ea:8f:b2 STALE
192.168.1.254 dev eno1 lladdr 74:9d:dc:db:4a:29 STALE
192.168.1.72 dev wlo1 lladdr b8:27:eb:ea:8f:b2 REACHABLE
192.168.1.254 dev eno1 lladdr 74:9d:dc:db:4a:29 STALE
3: wlo1: <BROADCAST,MULTICAST,UP,LOWER_UP> 
    link/ether 
192.168.1.254 dev eno1 lladdr 74:9d:dc:db:4a:29 REACHABLE
192.168.1.72 dev wlo1 lladdr b8:27:eb:ea:8f:b2 STALE
192.168.1.72 dev wlo1 lladdr b8:27:eb:ea:8f:b2 REACHABLE
192.168.1.72 dev wlo1 lladdr b8:27:eb:ea:8f:b2 STALE
192.168.1.72 dev wlo1 lladdr b8:27:eb:ea:8f:b2 REACHABLE
192.168.1.72 dev wlo1 lladdr b8:27:eb:ea:8f:b2 STALE
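Output like this is hard to compare by eye between a working and a failing period. One option is to count neighbour states per interface and compare the two summaries. This is a rough sketch that assumes the `ADDRESS dev IFACE lladdr MAC STATE` line format shown above; `neigh_summary` is a hypothetical helper.

```shell
# neigh_summary
# Reads 'ip monitor' / 'ip neighbour' style lines on stdin and prints
# "IFACE STATE COUNT", sorted. Lines in other formats are ignored.
neigh_summary() {
    awk '$2 == "dev" && $4 == "lladdr" { count[$3 " " $NF]++ }
         END { for (k in count) print k, count[k] }' | sort
}

# Usage (not run here):
#   neigh_summary < /tmp/ip-monitor.log
```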

Does this sound more like a NetworkManager issue than a kernel module issue?

Comment 15 Penelope Fudd 2015-01-22 07:55:44 UTC
Note: I'm just another user, not a Red Hat employee.

There have been so many tweaks to the ip stack (starting with the creation of the 'ip' command itself; hey you kids, get off my lawn) that I am not confident picking one or the other.

However, you can put NetworkManager to sleep and then do the test:
   kill -STOP `pidof NetworkManager`
Wake it up afterwards:
   kill -CONT `pidof NetworkManager`

Not sure whether you need to do the same with dhclient, as it calls /usr/libexec/nm-dhcp-helper when there are changes.  If worst comes to worst, rebooting will restore everything.

Comment 16 Tom Georgoulias 2015-01-23 03:20:54 UTC
I understand you don't work at Red Hat, no problem.  I was asking that in the bz just in case someone else wanted to weigh in.  I've gone back to using wifi only so that I don't have to deal with it anymore, at least that still works.

Comment 17 John Greene 2015-01-23 14:33:36 UTC
Between the two of you (Tom and Penelope) you have 2 totally different sets of cards (ethernet and wifi), and so the problem, while appearing to be the same kind of thing, *may* not be.

Tom: I've looked at your atl1c driver history upstream.  I see the commit referenced, then nearly immediately reverted as not relevant for this device, then something else in its place, all in the 3.11 timeframe.

7b70176 atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring  3.11-rc1-254
fafb6eb Revert "atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring"  3.11-rc1-241
ebe7fdb atl1c: Fix misuse of netdev_alloc_skb in refilling rx ring  v3.11-rc1-240-gebe7fdb

The only other issue for the atl1c currently is this, 3.13 vintage.
a4f6363 atl1c: Check return from pci_find_ext_capability() in atl1c_reset_pcie()  

So as to this being a driver issue, perhaps there is still a problem (alluded to in the notes for 7b70176421993866e616f1cbc4d0dd4054f1bf78) that remains at large.
As for now, nothing else is out there for the atl1c that would be helpful.

My instinct says that 
A> you guys may have separate problems, appearing similar.
B> have you tried updated network-manager, dbus, wpa_supplicant, etc?  This may help and would be something net-mgr guys would ask I assume?  

If it *is* a problem with the stack above drivers (sounds like it) that would be helpful possibly.  

It does appear odd that a stable (admittedly not sure of that) gigabit ethernet connection would be unused in preference to wifi like you see.  Has anyone posted a dmesg log with the problem occurring?  That would be good.

Comment 18 Tom Georgoulias 2015-01-24 02:13:04 UTC
I agree, I don't think we have the same issue.  And I don't think it's my driver; I've been able to use the wired connection for long periods of time without failure.  If I turn off wifi using the switch on my laptop, the wired NIC just works.  I have the problem when both are enabled and I'm letting NetworkManager handle the choice of which one to use.
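One way to see which interface is actually being used for outbound traffic when both are up is to compare the default routes: when several default routes exist, the kernel uses the one with the lowest metric. The sketch below assumes the usual `ip route show default` line format (`default via GW dev IFACE ... metric N`) and treats a missing metric as 0; `preferred_default` is a hypothetical helper, and whether this NetworkManager version installs one default route per device is not confirmed here.

```shell
# preferred_default
# Reads 'ip route show default' output on stdin and prints the interface
# of the default route with the lowest metric.
preferred_default() {
    awk '$1 == "default" {
             dev = ""; metric = 0
             for (i = 2; i < NF; i++) {
                 if ($i == "dev")    dev = $(i + 1)
                 if ($i == "metric") metric = $(i + 1)
             }
             if (best == "" || metric + 0 < bestmetric) {
                 best = dev; bestmetric = metric + 0
             }
         }
         END { if (best != "") print best }'
}

# Usage (not run here):
#   ip route show default | preferred_default
```

Capturing this once while the wired link works and once while it hangs would show whether the routing choice changes or whether traffic is still being sent to a wired link that has stopped passing packets.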

As for RPMs, I have all of the latest of everything in the F21 updates repo.  I run yum update almost daily and reboot each time a new kernel is installed.

I'll be glad to post dmesg output, run test pre-errata RPMs, rebuild RPMs with patches, you name it.  Just tell me what you need to see and I'll provide it.

Comment 19 Tom Georgoulias 2015-01-24 13:50:12 UTC
Here's the dmesg.  This section is when it works:

[81521.658670] Restarting tasks ... done.
[81521.700991] video LNXVIDEO:00: Restoring backlight state
[81521.722694] atl1c 0000:08:00.0: irq 29 for MSI/MSI-X
[81521.736592] IPv6: ADDRCONF(NETDEV_UP): eno1: link is not ready
[81522.199703] IPv6: ADDRCONF(NETDEV_UP): wlo1: link is not ready
[81522.354814] atl1c 0000:08:00.0: atl1c: eno1 NIC Link is Up<100 Mbps Full Duplex>
[81522.354914] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready
[81523.832313] wlo1: authenticate with 74:9d:dc:db:4a:29
[81523.852355] wlo1: send auth to 74:9d:dc:db:4a:29 (try 1/3)
[81523.857332] wlo1: authenticated
[81523.857492] rtl8192ce 0000:01:00.0 wlo1: disabling HT as WMM/QoS is not supported by the AP
[81523.857496] rtl8192ce 0000:01:00.0 wlo1: disabling VHT as WMM/QoS is not supported by the AP
[81523.857670] wlo1: associate with 74:9d:dc:db:4a:29 (try 1/3)
[81523.861648] wlo1: RX AssocResp from 74:9d:dc:db:4a:29 (capab=0x431 status=0 aid=1)
[81523.861820] wlo1: associated
[81523.861830] IPv6: ADDRCONF(NETDEV_CHANGE): wlo1: link becomes ready
[81523.862244] cfg80211: Calling CRDA for country: US
[81523.868010] cfg80211: Regulatory domain changed to country: US
[81523.868015] cfg80211:  DFS Master region: FCC
[81523.868016] cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp), (dfs_cac_time)
[81523.868019] cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (N/A, 3000 mBm), (N/A)
[81523.868023] cfg80211:   (5170000 KHz - 5250000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 1700 mBm), (N/A)
[81523.868026] cfg80211:   (5250000 KHz - 5330000 KHz @ 80000 KHz, 160000 KHz AUTO), (N/A, 2300 mBm), (0 s)
[81523.868028] cfg80211:   (5735000 KHz - 5835000 KHz @ 80000 KHz), (N/A, 3000 mBm), (N/A)
[81523.868030] cfg80211:   (57240000 KHz - 63720000 KHz @ 2160000 KHz), (N/A, 4000 mBm), (N/A)
[81701.495037] atl1c 0000:08:00.0: atl1c: eno1 NIC Link is Down
[81711.474620] atl1c 0000:08:00.0: atl1c: eno1 NIC Link is Up<100 Mbps Full Duplex>

Everything was fine for a while, then the wired NIC stopped working.  I ran dmesg again and here's what showed up next:


[82202.270097] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
[82772.843704] SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses genfs_contexts
[82772.875961] nr_pdflush_threads exported in /proc is scheduled for removal
[82772.876072] sysctl: The scan_unevictable_pages sysctl/node-interface has been disabled for lack of a legitimate use case.  If you have one, please send an email to linux-mm.

Here's ip link info when the problem is present:

[root@crankarm ~]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 10:1f:74:f9:df:82 brd ff:ff:ff:ff:ff:ff
3: wlo1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DORMANT group default qlen 1000
    link/ether ac:81:12:af:8e:42 brd ff:ff:ff:ff:ff:ff

Comment 20 Tom Georgoulias 2015-01-27 02:59:55 UTC
Cautiously optimistic here, but I may have made some progress on this.  I noticed that my wired NIC kept coming up at 10 or 100 Mbps, half duplex, so I updated ifcfg-eno1 with ETHTOOL_OPTS="autoneg off speed 100 duplex full".  Now my wired NIC is always in full duplex mode.
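To confirm that the forced setting actually took effect after each boot or resume, the negotiated speed and duplex can be read back from sysfs (the same values `ethtool eno1` reports). A minimal sketch, assuming the standard `/sys/class/net/<iface>/speed` and `duplex` files; `link_mode` is a hypothetical helper name.

```shell
# link_mode IFACE [ROOT]
# Prints the current speed and duplex of IFACE as reported under ROOT
# (default /sys/class/net). Falls back to "unknown" if a file is unreadable,
# which can happen while the link is down.
link_mode() {
    iface=$1
    root=${2:-/sys/class/net}
    speed=$(cat "$root/$iface/speed" 2>/dev/null) || speed=unknown
    duplex=$(cat "$root/$iface/duplex" 2>/dev/null) || duplex=unknown
    echo "$iface: $speed Mb/s, $duplex duplex"
}

# Usage (not run here):
#   link_mode eno1
```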

I ran all weekend with my wired NIC on and wifi off.  No issues with the link, it stayed up.

Today I've been running with wired connection plus wifi enabled, everything is OK.  I also updated to the new kernel-3.18.3-201.fc21.x86_64

Would I be wildly off base to think that NetworkManager might prefer wifi over wired if the wired was operating in half duplex mode?

Comment 21 Tom Georgoulias 2015-01-31 16:50:20 UTC
Forcing 100 Mbps full duplex on the wired NIC is still doing the trick.  Closing this bz since I have a workaround in place.

