Description of problem:

At boot, the Intel igb card fails with:

[ 35.883590] igb 0000:04:00.0 enp4s0: PCIe link lost, device now detached
[ 35.891333] br0: port 1(enp4s0) entered blocking state
[ 35.891338] br0: port 1(enp4s0) entered disabled state
[ 35.891645] device enp4s0 entered promiscuous mode
[ 35.904155] igb 0000:04:00.0 enp4s0: failed to initialize vlan filtering on this port
[ 35.915012] br0: port 1(enp4s0) entered blocking state
[ 35.915017] br0: port 1(enp4s0) entered disabled state
[ 35.931059] igb 0000:04:00.0 enp4s0: failed to initialize vlan filtering on this port

It was suggested to me that this indicates a hardware failure. However, this is unlikely, as simply reloading the igb module fixes the problem. I now have a script which does this after boot:

modprobe -r igb
sleep 1
modprobe igb
sleep 1
systemctl restart network

So it looks much more likely that the driver is just broken.

Version-Release number of selected component (if applicable):

Currently 4.11.0-0.rc4.git1.1.fc27.x86_64, but this has been happening since I bought the machine a year ago.

How reproducible:

100%

Steps to Reproduce:

1. Boot.
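The reload workaround described above could be made conditional, so that the driver is only reloaded when the kernel log actually shows the failure. The following is a minimal sketch, not the reporter's actual script: the function names igb_link_lost and reload_igb and the --run flag are hypothetical, and it assumes the same legacy "network" service that the report restarts.

```shell
#!/bin/sh

# Hypothetical helper: reads kernel log text on stdin and succeeds (exit 0)
# if it contains the igb "PCIe link lost" message quoted in the report.
igb_link_lost() {
    grep -q 'igb .*PCIe link lost'
}

# The reload sequence from the report, unchanged.
reload_igb() {
    modprobe -r igb
    sleep 1
    modprobe igb
    sleep 1
    systemctl restart network
}

# Only touch the driver when explicitly asked to (hypothetical --run flag),
# and only if the current boot's log shows the failure.
if [ "${1:-}" = "--run" ]; then
    if dmesg | igb_link_lost; then
        reload_igb
    fi
fi
```

Run as root with --run after boot (for example from a systemd unit or rc.local). Without the flag the script only defines the two functions, which makes the detection easy to try against a saved dmesg file.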
Hi Richard,

Is that the only output you got, or do you also have a splat like this:

[ 471.537833] ------------[ cut here ]------------
[ 471.537849] igb: Failed to read reg 0x8!
[ 471.537904] WARNING: CPU: 1 PID: 9497 at drivers/net/ethernet/intel/igb/igb_main.c:756 igb_rd32.cold+0x30/0x3b [igb]
[...]
[ 471.538638] Call Trace:
[ 471.538654] igb_get_link_ksettings+0x20/0x200 [igb]
[ 471.538674] duplex_show+0x6e/0xc0
[ 471.538689] dev_attr_show+0x19/0x40
[ 471.538704] sysfs_kf_seq_show+0x9b/0xf0
[ 471.538720] seq_read+0xcd/0x400
[ 471.538734] vfs_read+0x9d/0x150
[ 471.538746] ksys_read+0x5f/0xe0
[ 471.538761] do_syscall_64+0x5f/0x1a0
[ 471.538776] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 471.538795] RIP: 0033:0x7ff5a09383c2
[ 471.538808] Code: c0 e9 c2 fe ff ff 50 48 8d 3d c2 0d 0a 00 e8 b5 f1 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 471.538862] RSP: 002b:00007ffe3e6fd9d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 471.538887] RAX: ffffffffffffffda RBX: 00000000021442e0 RCX: 00007ff5a09383c2
[ 471.538910] RDX: 0000000000001000 RSI: 000000000215a350 RDI: 0000000000000004
[ 471.538932] RBP: 00007ff5a0a0a300 R08: 0000000000000004 R09: 0000000000000070
[ 471.538955] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000021442e0
[ 471.538977] R13: 00007ff5a0a09700 R14: 0000000000000d68 R15: 0000000000000d68
[ 471.539000] ---[ end trace 0aea06ceef9e275e ]---

Have you already had the opportunity to try kernel 5.3.7-301.fc31 without your workaround? I found a commit that touched that part of the code, 94bc1e522b32c866d85b5af0ede55026b585ae73, which may be relevant for you as well.
It still happens on this same hardware with every kernel I've tried since around 2016. This machine is using the Rawhide kernel. I don't know if there's something particular about 5.3.7-301.fc31, but the problem is still present with the latest Rawhide kernel (5.4.0-0.rc6.git0.1.fc32.x86_64). In case I missed something, I will attach the complete log.
Created attachment 1633038 [details] dmesg