Description of problem:
Qualcomm Atheros AR8131 NIC only runs at low speed. Asus ROG laptop with the following:

lspci reports:
06:00.0 Ethernet controller: Qualcomm Atheros AR8131 Gigabit Ethernet (rev c0)

root@billlaptop ~# lsmod | grep ath
ath9k                 141923  0
ath9k_common           13503  1 ath9k
ath9k_hw              443174  2 ath9k_common,ath9k
ath                    23142  3 ath9k_common,ath9k,ath9k_hw
mac80211              564808  1 ath9k
cfg80211              460310  3 ath,ath9k

Actual results:

How reproducible:
Plug NIC into switch. NIC always comes up at low speed. Testing reveals speed is snail slow.

root@billlaptop ~# ethtool p5p1
Settings for p5p1:
    Supported ports: [ TP ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes:  Not reported
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Speed: 100Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: internal
    Auto-negotiation: on
    MDI-X: Unknown
    Supports Wake-on: pg
    Wake-on: d
    Current message level: 0x0000003f (63)
                           drv probe link timer ifdown ifup
    Link detected: yes

root@billlaptop ~# ethtool -s p5p1 speed 1000
Cannot advertise speed 1000

root@billlaptop ~# lshw -C network
...
  *-network
       description: Ethernet interface
       product: AR8131 Gigabit Ethernet
       vendor: Qualcomm Atheros
       physical id: 0
       bus info: pci@0000:06:00.0
       logical name: p5p1
       version: c0
       serial: bc:ae:c5:13:b3:09
       size: 100Mbit/s
       capacity: 1Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress vpd bus_master cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=atl1c driverversion=1.0.1.1-NAPI duplex=full ip=192.168.10.168 latency=0 link=yes multicast=yes port=twisted pair speed=100Mbit/s
       resources: irq:55 memory:d6200000-d623ffff ioport:8000(size=128)

root@billlaptop ~# ethtool -s p5p1 speed 1000 autoneg off

This hangs the NIC, losing connectivity. Using systemctl to restart network.service does not revive the NIC. I have to bounce the box to regain the slow connection.
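The symptom above can be checked mechanically by comparing the negotiated speed with the fastest mode the NIC lists as supported. This is only a minimal sketch and not part of the original report: the helper names negotiated_speed, max_supported_speed and check_link are hypothetical, and the parsing assumes the ethtool output layout shown in this report.

```shell
# Print the negotiated speed in Mb/s from `ethtool <iface>` output on stdin.
negotiated_speed() {
    sed -n 's/^[[:space:]]*Speed:[[:space:]]*\([0-9][0-9]*\)Mb\/s.*/\1/p'
}

# Print the highest speed listed under "Supported link modes" (stdin).
max_supported_speed() {
    awk '
        /Supported link modes:/ { in_modes = 1; sub(/.*Supported link modes:/, "") }
        in_modes && /:/         { in_modes = 0 }   # next labelled line ends the list
        in_modes {
            for (i = 1; i <= NF; i++) {
                speed = $i + 0                     # numeric prefix of e.g. 1000baseT/Full
                if (speed > max) max = speed
            }
        }
        END { print max + 0 }
    '
}

# Report whether the link came up below the fastest supported mode.
# Returns 0 at full speed, 1 if slower, 2 if the speed is not reported.
check_link() {
    out=$(cat)
    cur=$(printf '%s\n' "$out" | negotiated_speed)
    max=$(printf '%s\n' "$out" | max_supported_speed)
    if [ -z "$cur" ]; then
        echo "negotiated speed unknown"
        return 2
    fi
    if [ "$cur" -lt "$max" ]; then
        echo "link at ${cur}Mb/s but NIC supports ${max}Mb/s"
        return 1
    fi
    echo "link at full speed (${cur}Mb/s)"
}

# Example (interface name taken from this report):
#   ethtool p5p1 | check_link
```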
What kind of switch? Have you tried on a different switch?
I only have 1 switch - Trendnet Gigabit TEG-S80g. I swapped cables and tried different switch ports all to no effect. I've noted that without a switch, when two machines are connected with a crossover cable, the first machine that comes up does so on low speed as there is no one to negotiate with, and the second also connects at low speed because the first is already there. The switch is what allows a machine coming up to "see" another active device and negotiate a speed. Therefore, I don't think connecting my box to another via a crossover cable is a valid test of anything. Am I wrong?
Yeah, probably true.. Fail on my part. Let's try again. Can you upload the dmesg output from when you connect the card to the switch? I'll take a look. If there is nothing helpful in it as is, we might need to turn on driver debugging.
Created attachment 915777: Comment (This comment was longer than 65,535 characters and has been moved to an attachment by Red Hat Bugzilla).
Yup, not much help there. We need some additional debug info enabled in the driver. Do you have the ability and comfort level to build your own kernels?
I can follow instructions. I built kernels 9 years ago and not one since. I think things have probably changed, and I've slept since then. I'm game if you are. Want to use email for communications?
Ok, cool. I can give you a couple of items to get you started. Would do email, but I sleep sometimes too.. and, as you can see, when I sleep too little I fail. ;) So I'd prefer to leave it here to benefit all that come later and provide me a history to help others. And get you fixed asap..

A couple of things first that may preclude the need to build a kernel (I'm just that lazy). Check the mount table for debugfs; it should be mounted by default.

# mount | grep debugfs
none on /sys/kernel/debug type debugfs (rw)

Here on f19 it's /sys/kernel/debug, on by default. Confirm you have that first. Next, we need to enable the driver debug. As root, do this:

cat <path to debugfs>/dynamic_debug/control | grep -i ath1c

If your kernel has dynamic debug enabled and the driver is loaded, you'll get a bunch of output listing the debug messages from the files below. If not, we gotta build you a kernel, somehow.

<path>/drivers/net/ethernet/atheros/atl1c/atl1c_ethtool.c
<path>/drivers/net/ethernet/atheros/atl1c/atl1c_hw.c
<path>/drivers/net/ethernet/atheros/atl1c/atl1c_main.c

Grab the list generated and post it to me and we'll march on.
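The two checks described here (is debugfs mounted, and does the driver expose any dynamic-debug entries) can be sketched as small helpers. This is an illustration only: the function names debugfs_mountpoint and dyndbg_entries are hypothetical, and the default paths assume debugfs sits at /sys/kernel/debug as on f19.

```shell
# Print the debugfs mount point, reading a mounts table (default /proc/mounts).
# Field 2 is the mount point, field 3 the filesystem type.
debugfs_mountpoint() {
    awk '$3 == "debugfs" { print $2; exit }' "${1:-/proc/mounts}"
}

# List the dynamic-debug entries that belong to one module.
# $1 = module name as it appears in brackets in the control file
# $2 = control file (defaults to the usual debugfs location)
dyndbg_entries() {
    grep -F "[$1]" "${2:-/sys/kernel/debug/dynamic_debug/control}"
}

# Example, as root:
#   debugfs_mountpoint
#   dyndbg_entries atl1c
```

Matching on the bracketed module name rather than on a free-text pattern avoids picking up unrelated entries whose file path or message merely contains the same letters.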
Probably not what you were hoping for. BTW, I changed the grep to atl1c instead of ath1c, as ath1c produced nothing.

root@billlaptop dynamic_debug# cat control | grep -i atl1c
drivers/net/ethernet/atheros/atl1c/atl1c_main.c:616 [atl1c]atl1c_mii_ioctl =_ "<atl1c_mii_ioctl> write %x %x"
drivers/net/ethernet/atheros/atl1c/atl1c_main.c:2640 [atl1c]atl1c_probe =_ "mac address : %pM\012"
drivers/net/ethernet/atheros/atl1c/atl1c_main.c:2443 [atl1c]atl1c_suspend =_ "phy power saving failed"
drivers/net/ethernet/atheros/atl1c/atl1c_main.c:2300 [atl1c]atl1c_request_irq =_ "atl1c_request_irq OK\012"
drivers/net/ethernet/atheros/atl1c/atl1c_main.c:437 [atl1c]atl1c_vlan_mode =_ "atl1c_vlan_mode\012"
drivers/net/ethernet/atheros/atl1c/atl1c_main.c:451 [atl1c]atl1c_restore_vlan =_ "atl1c_restore_vlan\012"
drivers/net/ethernet/atheros/atl1c/atl1c_hw.c:821 [atl1c]atl1c_power_saving =_ "%s: suspend MAC=%x,MASTER=%x,PHY=0x%x,WOL=%x\012"
drivers/net/ethernet/atheros/atl1c/atl1c_hw.c:814 [atl1c]atl1c_power_saving =_ "%s: write phy MII_IER failed.\012"
drivers/net/ethernet/atheros/atl1c/atl1c_hw.c:741 [atl1c]atl1c_phy_to_ps_link =_ "get speed and duplex failed\012"
drivers/net/ethernet/atheros/atl1c/atl1c_hw.c:727 [atl1c]atl1c_phy_to_ps_link =_ "phy autoneg failed\012"
Good catch.. Actually, that looks great: we should be able to get some debug turned on without a kernel build. A couple more commands to enable the prints in the driver:

echo 'file atl1c_main.c +p' > /sys/kernel/debug/dynamic_debug/control
echo 'file atl1c_hw.c +p' > /sys/kernel/debug/dynamic_debug/control

Then run your connect test again, maybe a couple of times to get both connect and disconnect. After that, turn the prints off again to keep your logs from getting clogged:

echo 'file atl1c_main.c -p' > /sys/kernel/debug/dynamic_debug/control
echo 'file atl1c_hw.c -p' > /sys/kernel/debug/dynamic_debug/control

Then send me the dmesg log. It works a little better to attach the file rather than paste it, unless it's pretty small. Hope we catch something to help debug the negotiation issue.
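The enable, test, disable sequence can be wrapped so the prints are not left on by accident. A minimal sketch, assuming debugfs is mounted at /sys/kernel/debug as confirmed earlier in this thread; the function name dyndbg_files, the CONTROL variable and the log file name are hypothetical.

```shell
# Dynamic-debug control file; override CONTROL to point somewhere else.
CONTROL=${CONTROL:-/sys/kernel/debug/dynamic_debug/control}

# dyndbg_files +p|-p FILE...
# Sends one "file FILE FLAG" query per source file to the control file.
dyndbg_files() {
    flag=$1
    shift
    for src in "$@"; do
        printf 'file %s %s\n' "$src" "$flag" > "$CONTROL" || return 1
    done
}

# Example session, as root:
#   dyndbg_files +p atl1c_main.c atl1c_hw.c
#   (unplug and replug the cable a couple of times)
#   dmesg > atl1c-debug.log
#   dyndbg_files -p atl1c_main.c atl1c_hw.c
```

Each query is written separately because every write to the control file is interpreted as its own command.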
Hang on! That control file is rather large and your two commands would normally wipe it out and only leave one line in it. I assume you meant >> instead of > - correct? I tried >> and did a tail after the first echo to see what happened - nothing happened. File length is 0 yet it has content, but the echo didn't touch it. Then I tried the simple > and again - no change. ???
Great, valid question, but nope.. > is right. For good background: http://lwn.net/Articles/434856/ debugfs is a virtual file system, and the control file is the interface used to control the debug prints, so you aren't clobbering anything. Cat the control file before and after and you'll see the flags on these entries go from "=_" to "=p". The dynamic debug support then logs the driver's debug messages without our having to build a driver, HOPEFULLY. No worries about clobbering anything permanently..
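To see that a write changes flags rather than file contents, it helps to print just the function name and flags column for the driver before and after the echo. A small sketch under the assumption that each control line has the form "path:line [module]function flags format", as in the listing posted earlier; the helper name dyndbg_flags is hypothetical.

```shell
# Print "function flags" for each dynamic-debug entry of one module.
# $1 = module name, $2 = control file (defaults to the usual debugfs location)
dyndbg_flags() {
    awk -v tag="[$1]" '
        index($2, tag) == 1 { print substr($2, length(tag) + 1), $3 }
    ' "${2:-/sys/kernel/debug/dynamic_debug/control}"
}

# Example, as root:
#   dyndbg_flags atl1c      # entries show =_ while the prints are off
#   echo 'file atl1c_main.c +p' > /sys/kernel/debug/dynamic_debug/control
#   dyndbg_flags atl1c      # atl1c_main.c entries should now show =p
```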
Stop the presses!

Yesterday, I connected my laptop to a wireless network I set up on another box using hostapd. During my testing, the gateway setting I had on the wired NIC was getting in my way, so I turned the wired NIC off via NetworkManager to only have the wireless NIC available. After completing my wireless testing, I turned the wireless NIC off and turned the wired NIC back on. I kept using the laptop for a few more hours and then powered it down.

This morning, I turned my laptop on and noted that the switch had the 1000Mbps light lit for my box.

bill@billlaptop ~$ ethtool p5p1
Settings for p5p1:
    Supported ports: [ TP ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Full
    Supported pause frame use: No
    Supports auto-negotiation: Yes
    Advertised link modes:  Not reported
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Speed: 1000Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: internal
    Auto-negotiation: on
    MDI-X: Unknown
Cannot get wake-on-lan settings: Operation not permitted
    Current message level: 0x0000003f (63)
                           drv probe link timer ifdown ifup
    Link detected: yes

Note the difference in this output versus the first one I uploaded. I executed it several times to see if it ever changed, but it's always the same. The only thing I know I changed was turning the NIC off and then back on. How could doing that have given me the 1000Mbps capability?
I don't know..Strange things happen like that at times. Are you able to reproduce the original problem now at all? What is the wireless NIC?
I've booted the box several times after power downs and can't reproduce the issue. I even powered off the switch, other boxes, etc. and now I get 1000Mbps by default. I have no idea what changed.

The wireless NIC is:

bill@billlaptop ~$ lspci -k | grep -A 3 -i "network"
03:00.0 Network controller: Qualcomm Atheros AR9285 Wireless Network Adapter (PCI-Express) (rev 01)
    Subsystem: AzureWave AW-NE785 / AW-NE785H 802.11bgn Wireless Full or Half-size Mini PCIe Card
    Kernel driver in use: ath9k

At end of day, I always powered down all my test gear by turning off a UPS. Now doing the same morning startup sequence no longer gives me a slow NIC. I'm stumped.
Frustrating! I hate bugs that hide when to turn lights on. Check your logs and see if we got anything on the negotiation at all. Sounds doubtful, but maybe.. If we can't reproduce, it will be hard to do much. I checked this driver upstream just now and I don't see any fixes later than 3.11-rc1, which you have. So, barring anything new showing up, there don't appear to be any fixes imminent. If that is where we are, I'd say I'll leave this open a few days for you to reproduce; if we can't do so in, say, a couple of weeks, we can close it as CANT REPRODUCE. Hope you can reproduce.. will help you if you do!
"I hate bugs that hide when to turn lights on." That's a very good line. I like it. The original problem has been there for a long time. I ignored the speed issue as 100Mbps was fine for most things. Only recently I needed to move several hundred GIG of data back and forth between machines and that literally took days. At 11MBps I did the math and knew I had to get the speed bumped up. That's why I started this thread. Now I have the 1000Mbps and transfers are going at about 34MBps. Not the 10 times as fast as I was hoping for, but certainly better. I've booted this box numerous times to see if I can get it to go back to 100Mbps but no such (bad) luck. How turning the NIC off and back on again fixed the issue is a mystery, and frankly I don't believe that did it. This box has one other issue I'm investigating but I don't think it has anything to do with the speed. In the morning, when I power up, the first boot gives me the Fedora balloon and I never get a login screen. I have to go to a terminal, login and issue a shutdown -r now. The second boot works. This problem is accelerating. This used to happen once a week, now its every day. dmesg comparisons of the boots don't highlight a difference. I'm still looking into it.
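The math behind these figures is just a division by 8 (bits per byte), ignoring protocol overhead. A small sketch: the helper names are hypothetical, and the 500 GB transfer size is an assumed stand-in for "several hundred GIG", not a number from this thread.

```shell
# Convert a line rate in Mb/s to MB/s (8 bits per byte, no protocol overhead).
mbps_to_MBps() {
    awk -v r="$1" 'BEGIN { printf "%.1f\n", r / 8 }'
}

# Hours needed to move GB gigabytes at RATE MB/s (1 GB taken as 1000 MB).
transfer_hours() {
    awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", gb * 1000 / rate / 3600 }'
}

mbps_to_MBps 100        # theoretical ceiling of a 100Mbps link
mbps_to_MBps 1000       # theoretical ceiling of a 1000Mbps link
transfer_hours 500 11   # assumed 500 GB at the observed 11MBps
transfer_hours 500 34   # assumed 500 GB at the observed 34MBps
```

The ceilings work out to 12.5 and 125 MB/s, so 11MBps was close to the limit of the 100Mbps link, while 34MBps is well under what a 1000Mbps link could carry.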
(In reply to Bill Gradwohl from comment #16)
> "I hate bugs that hide when to turn lights on."
>
> That's a very good line. I like it.

Thanks, but sticking with day job a while longer..

>
> The original problem has been there for a long time. I ignored the speed
> issue as 100Mbps was fine for most things. Only recently I needed to move
> several hundred GIG of data back and forth between machines and that
> literally took days. At 11MBps I did the math and knew I had to get the
> speed bumped up. That's why I started this thread.
>
> Now I have the 1000Mbps and transfers are going at about 34MBps. Not the 10
> times as fast as I was hoping for, but certainly better.
>
> I've booted this box numerous times to see if I can get it to go back to
> 100Mbps but no such (bad) luck. How turning the NIC off and back on again
> fixed the issue is a mystery, and frankly I don't believe that did it.
>

Many times this is par; throughput seldom reaches the media speed. It depends on a lot of things: CPU, memory, the tool used to copy the data..

I looked up your switch; it has no firmware upgrade capability. Price was good I bet though.. lol

The switch manual says it supports 802.3az DRAFT, but the chip hardware supports the "latest standard": wonder if there might be a mismatch? You might try booting with 'pcie_aspm=off' added to the kernel command line and see if this helps.

Beyond that.. we need to reproduce and get a log to continue, or await a fix upstream.

> This box has one other issue I'm investigating but I don't think it has
> anything to do with the speed. In the morning, when I power up, the first
> boot gives me the Fedora balloon and I never get a login screen. I have to
> go to a terminal, login and issue a shutdown -r now. The second boot works.
>
> This problem is accelerating. This used to happen once a week, now its every
> day. dmesg comparisons of the boots don't highlight a difference. I'm still
> looking into it.

Sounds like a real issue. Please put this into a new bug; it helps us keep things straight, if you don't mind. I suggest you add the logs you have and try disabling wifi and/or bluetooth (if you have it) and see if you can correlate that with the boot issue. Please post those results in the new BZ. Not sure it's a wifi issue, just a tip to help you get started. Certainly Red Hat will help out there too. As you enter the new BZ, you'll be presented with similar issues already out there; maybe it's fixed.

Let me know about the pcie_aspm=off (goes in the grub file). For now, though, let's close this as insufficient data. Feel free to reopen if you can reproduce the low speed issue!
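After editing the grub configuration and rebooting, it is worth confirming that pcie_aspm=off actually reached the running kernel before judging whether it made a difference. A minimal sketch; the helper name has_kernel_param is hypothetical, and it simply looks for an exact, space-separated match in /proc/cmdline.

```shell
# Succeed if the kernel command line contains the exact parameter $1.
# $2 = command-line file (defaults to /proc/cmdline)
has_kernel_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example:
#   has_kernel_param pcie_aspm=off && echo "pcie_aspm=off is active"
```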