I recently swapped network cards from a dual-port Solarflare SFN6122F (uses the sfc_siena driver) to a Solarflare SFN8522 (uses sfc). I shut down, swapped cards, rebooted, and have just spent the last 3 hours working out why my network card no longer has a link detected for more than a few seconds after power on. All cables are the same between the two adapter cards and I have also tried different cables.

kernel-6.4.4-200.fc38.x86_64
tuned-2.20.0-1.fc38.noarch

This is possibly a tuned bug or maybe it's a kernel bug. Thought I'd start with tuned, as not running it fixes the immediate problem.

Reproducible: Always

Steps to Reproduce:
1. Install a Solarflare SFN8522 and connect it using a Direct Attach cable to a switch or to another machine (I do both)
2. Install tuned and set it to use profile 'powersave'
3. May need a reboot to activate tuned

Actual Results:
A cold boot (power on) comes up as normal, link is detected, and the connection is established for anything between 1s and about 1 minute, then it goes away. Running `ethtool enp9s0f0np0` shows "Link detected: no".

Syslog shows

Jul 26 21:48:29 trevor4 kernel: [ 5.263140] sfc 0000:09:00.0: Solarflare NIC detected
Jul 26 21:48:29 trevor4 kernel: [ 5.269260] sfc 0000:09:00.0: Part Number : SFN8522
Jul 26 21:48:29 trevor4 kernel: [ 5.498217] sfc 0000:09:00.1: Solarflare NIC detected
Jul 26 21:48:29 trevor4 kernel: [ 5.501776] sfc 0000:09:00.1: Part Number : SFN8522
Jul 26 21:48:29 trevor4 kernel: [ 5.522425] sfc 0000:09:00.0 enp9s0f0np0: renamed from eth0
Jul 26 21:48:29 trevor4 kernel: [ 5.617192] sfc 0000:09:00.1 enp9s0f1np1: renamed from eth1
Jul 26 21:48:27 trevor4 kernel: sfc 0000:09:00.0: Solarflare NIC detected
Jul 26 21:48:27 trevor4 kernel: sfc 0000:09:00.0: Part Number : SFN8522
Jul 26 21:48:27 trevor4 kernel: sfc 0000:09:00.1: Solarflare NIC detected
Jul 26 21:48:27 trevor4 kernel: sfc 0000:09:00.1: Part Number : SFN8522
Jul 26 21:48:27 trevor4 kernel: sfc 0000:09:00.0 enp9s0f0np0: renamed from eth0
Jul 26 21:48:28 trevor4 kernel: sfc 0000:09:00.1 enp9s0f1np1: renamed from eth1
Jul 26 21:48:29 trevor4 kernel: [ 7.167012] sfc 0000:09:00.0 enp9s0f0np0: link up at 10000Mbps full-duplex (MTU 1500)
Jul 26 21:48:29 trevor4 kernel: sfc 0000:09:00.0 enp9s0f0np0: link up at 10000Mbps full-duplex (MTU 1500)
Jul 26 21:48:29 trevor4 kernel: [ 7.340488] sfc 0000:09:00.1 enp9s0f1np1: link up at 10000Mbps full-duplex (MTU 1500)
Jul 26 21:48:29 trevor4 kernel: sfc 0000:09:00.1 enp9s0f1np1: link up at 10000Mbps full-duplex (MTU 1500)
Jul 26 21:49:20 trevor4 kernel: [ 57.446270] sfc 0000:09:00.0 enp9s0f0np0: link down
Jul 26 21:49:20 trevor4 kernel: [ 57.446440] sfc 0000:09:00.0 enp9s0f0np0: link down
Jul 26 21:49:20 trevor4 kernel: [ 57.488974] sfc 0000:09:00.1 enp9s0f1np1: link down
Jul 26 21:49:20 trevor4 kernel: [ 57.489024] sfc 0000:09:00.1 enp9s0f1np1: link down

# sfctool enp9s0f0np0
Settings for enp9s0f0np0:
    Supported ports: [ FIBRE ]
    Supported link modes:   1000baseT/Full
                            1000baseX/Full
                            10000baseCR/Full
                            10000baseSR/Full
                            10000baseLR/Full
    Supported pause frame use: Symmetric Receive-only
    Supports auto-negotiation: Yes
    Supported FEC modes: Not reported
    Advertised link modes:  1000baseT/Full
                            1000baseX/Full
                            10000baseCR/Full
                            10000baseSR/Full
                            10000baseLR/Full
    Advertised pause frame use: Symmetric
    Advertised auto-negotiation: Yes
    Advertised FEC modes: Not reported
    Link partner advertised link modes: Not reported
    Link partner advertised pause frame use: No
    Link partner advertised auto-negotiation: No
    Link partner advertised FEC modes: Not reported
    Speed: 10000Mb/s
    Duplex: Full
    Port: FIBRE
    PHYAD: 255
    Transceiver: internal
    Auto-negotiation: on
    Supports Wake-on: d
    Wake-on: d
    Current message level: 0x000020f7 (8439)
                           drv probe link ifdown ifup rx_err tx_err hw
    Link detected: no

At this point I found that the only way to get "Link detected: yes" back was to cold boot the machine using the power button. Ctrl-Alt-Del sometimes seemed to work, but the only reliable way to get it working again was to power off/on.

This was repeatable on every boot: the link would connect and things would work for some time, never more than about one minute, sometimes going away before I could even log in to ping things.

I have two of these cards installed, one in a machine with tuned set to profile powersave, which exhibits the problem. The other is set to profile virtual-host and does not. I have swapped SFN8522 cards between the two systems; both work in the virtual-host system and both fail in the one in powersave mode.

After many, many reboots into single-user, multi-user, and emergency targets, activating the network manually with `ip` and then bringing up services one by one, I found that `systemctl mask tuned` will stop this.

I haven't experimented with tuned settings to see if I can get it to stop doing whatever it is that it's doing that breaks this. I'm just thankful to have a working network connection again!

Expected Results:
Network connection works reliably without needing power off/on!
Link detected: yes

While debugging this problem I used sfboot to reset the dual-port adapter to all default settings and upgraded to the latest Solarflare firmware:

Firmware version: v8.5.2
Controller type: Solarflare SFC9200 family
Controller version: v8.5.0.1002
Boot ROM version: v5.2.2.1006
UEFI ROM version: v2.9.6.3

I actually use a profile called powersave-nodisk, which is set up as follows:

# cat /etc/tuned/powersave-nodisk/tuned.conf
[main]
summary=Optimize for low power consumption but leave storage alone
include=powersave

[disk]
devices=sdz

I have no sdz; I just wanted it to stop powering down the two spinning-rust devices in my mdadm array.
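
For anyone trying to narrow this down further, the following is a minimal sketch (not something run as part of this report) of a watcher that timestamps every link state change by polling the kernel's sysfs `carrier` attribute, so the moment of the drop can be lined up against tuned's start time in the journal. The function names `link_state` and `watch_link`, the `SYSFS_NET` override, and the poll-count and interval parameters are all hypothetical names introduced here for illustration.

```shell
#!/bin/sh
# Sketch only: report the link state of an interface from sysfs.
# SYSFS_NET is a hypothetical override (not a standard variable) so the
# function can be pointed at a fake directory tree for testing.
link_state() {
    iface=$1
    carrier_file="${SYSFS_NET:-/sys/class/net}/$iface/carrier"
    # Reading carrier fails when the interface is missing or
    # administratively down; report that as "unknown".
    if state=$(cat "$carrier_file" 2>/dev/null); then
        case $state in
            1) echo up ;;
            0) echo down ;;
            *) echo unknown ;;
        esac
    else
        echo unknown
    fi
}

# Print a timestamped line whenever the state changes. Stops after
# max_polls polls (hypothetical parameter) so it cannot run forever.
watch_link() {
    iface=$1
    max_polls=${2:-600}
    interval=${3:-1}
    last=""
    n=0
    while [ "$n" -lt "$max_polls" ]; do
        now=$(link_state "$iface")
        if [ "$now" != "$last" ]; then
            printf '%s %s %s\n' "$(date '+%F %T')" "$iface" "$now"
            last=$now
        fi
        n=$((n + 1))
        if [ "$n" -lt "$max_polls" ]; then
            sleep "$interval"
        fi
    done
}
```

Running something like `watch_link enp9s0f0np0 120` immediately after boot, and comparing its output with `journalctl -b -u tuned`, should show whether the link drop follows tuned applying its profile or happens independently of it.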
Fedora Linux 38 entered end-of-life (EOL) status on 2024-05-21. Fedora Linux 38 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora Linux please feel free to reopen this bug against that version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see the version field. If you are unable to reopen this bug, please file a new report against an active release. Thank you for reporting this bug and we are sorry it could not be fixed.