Bug 1787026 - [REGRESSION] kernel 5.4.x breaks e1000e: can't connect to network
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 31
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Duplicates: 1787187 (view as bug list)
 
Reported: 2019-12-30 12:52 UTC by Vasiliy Glazov
Modified: 2020-03-10 06:48 UTC (History)
36 users

Last Closed: 2020-03-10 06:48:57 UTC
Type: Bug


Attachments
- journalctl -k (89.00 KB, text/plain), 2019-12-30 12:52 UTC, Vasiliy Glazov
- /var/log/messages excerpt (10.57 KB, text/plain), 2020-01-02 13:40 UTC, Eugene Mah
- Proposed revert from upstream backported to 5.4.x stable (6.00 KB, patch), 2020-01-06 09:36 UTC, Stefan Becker
- journalctl_b0 (291.39 KB, text/plain), 2020-01-07 04:58 UTC, Ronald Warsow
- journalctl -b -1 (546.65 KB, text/plain), 2020-01-13 14:33 UTC, Eugene Mah
- Add proposed patch to Fedora kernel package (7.98 KB, patch), 2020-01-26 14:23 UTC, Stefan Becker
- Add proposed patch to Fedora kernel package (8.02 KB, patch), 2020-02-05 08:03 UTC, Stefan Becker


Links
- Linux Kernel Bugzilla 205047, last updated 2020-01-06 08:18:58 UTC

Description Vasiliy Glazov 2019-12-30 12:52:56 UTC
Created attachment 1648586 [details]
journalctl -k

1. Please describe the problem:

I tried the 5.4.5 and 5.4.6 kernels for F31 from Koji and can't use the network.
The log repeats this message many times:
Dec 30 12:31:29 v-glazov kernel: e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 30 12:31:35 v-glazov kernel: e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 30 12:31:42 v-glazov kernel: e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 30 12:31:49 v-glazov kernel: e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Dec 30 12:31:56 v-glazov kernel: e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

The network stays down the whole time. After rebooting into any 5.3 kernel, everything works fine.

2. What is the Version-Release number of the kernel:

kernel-5.4.5-300.fc31.x86_64
kernel-5.4.6-200.fc31.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

Works fine with kernel-5.3.16-300.fc31.x86_64.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Just install the new kernel and reboot into it.
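The renegotiation loop shows up as the same "NIC Link is Up" line repeating every few seconds. A minimal sketch for confirming this from the kernel log (the helper name is made up for illustration; the interface name is whatever your system uses):

```shell
# count_link_flaps: count e1000e "NIC Link is Up" lines on stdin.
# A count that keeps growing (one new line every ~7 s) indicates the
# link is flapping rather than staying up.
count_link_flaps() {
    grep -c 'e1000e: .* NIC Link is Up'
}

# Typical use on an affected machine:
#   journalctl -k -b | count_link_flaps
```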

Comment 1 Ryan 2020-01-01 05:10:32 UTC
*** Bug 1787187 has been marked as a duplicate of this bug. ***

Comment 2 Eugene Mah 2020-01-02 13:39:56 UTC
Having the same issue on one of my systems with an Intel ethernet controller with the 5.4.x kernels.

lspci -v:
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-LM (rev 31)
	Subsystem: Dell Device 06b7
	Flags: bus master, fast devsel, latency 0, IRQ 125
	Memory at ef100000 (32-bit, non-prefetchable) [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: e1000e
	Kernel modules: e1000e

dmesg:
[   21.880554] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   28.662467] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   35.448386] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   42.360741] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   49.145497] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   55.994139] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   62.774053] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   69.563958] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   76.473873] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   83.386789] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[   90.101706] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx

Excerpt from /var/log/messages attached

Comment 3 Eugene Mah 2020-01-02 13:40:39 UTC
Created attachment 1649181 [details]
/var/log/messages excerpt

Comment 4 Ryan 2020-01-03 02:54:52 UTC
This is also affecting 5.4.7-200.fc31.x86_64 currently in Koji

Comment 5 Stefan Becker 2020-01-05 14:22:27 UTC
Same here with Dell Latitude E6420. Reverting back to 5.3.14.

[    3.583701] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[    3.583706] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[    3.583951] e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[    3.685730] e1000e 0000:00:19.0 0000:00:19.0 (uninitialized): registered PHC clock
[    3.778472] e1000e 0000:00:19.0 eth0: (PCI Express:2.5GT/s:Width x1) <MAC REMOVED>
[    3.778474] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection
[    3.778513] e1000e 0000:00:19.0 eth0: MAC: 10, PHY: 11, PBA No: 3041FF-0FF
[    3.781126] e1000e 0000:00:19.0 enp0s25: renamed from eth0
[   11.327370] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[   24.619723] e1000e: enp0s25 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
... and so on...

Comment 6 Andre Robatino 2020-01-05 17:24:23 UTC
Works for me with F31 and 5.4.7-200.fc31.x86_64 from updates-testing, so not affecting everyone.

00:19.0 Ethernet controller: Intel Corporation 82567LM-3 Gigabit Network Connection (rev 02)
	Subsystem: Lenovo Device 3048
	Flags: bus master, fast devsel, latency 0, IRQ 26
	Memory at fc500000 (32-bit, non-prefetchable) [size=128K]
	Memory at fc527000 (32-bit, non-prefetchable) [size=4K]
	I/O ports at 1820 [size=32]
	Capabilities: <access denied>
	Kernel driver in use: e1000e
	Kernel modules: e1000e

Comment 7 Stefan Becker 2020-01-05 20:24:06 UTC
Correct, it seems to affect only some systems (maybe older chips?).

Dell Latitude E6420: does not work

00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (Lewisville) (rev 04)
        Subsystem: Dell Device 0493
        Flags: bus master, fast devsel, latency 0, IRQ 33
        Memory at e6e00000 (32-bit, non-prefetchable) [size=128K]
        Memory at e6e80000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at 5080 [size=32]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [e0] PCI Advanced Features
        Kernel driver in use: e1000e
        Kernel modules: e1000e

Gigabyte motherboard B360M-D3H: works fine

00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (7) I219-V (rev 10)
        DeviceName: Onboard - Ethernet
        Subsystem: Gigabyte Technology Co., Ltd Device e000
        Flags: bus master, fast devsel, latency 0, IRQ 136
        Memory at a1300000 (32-bit, non-prefetchable) [size=128K]
        Capabilities: [c8] Power Management version 3
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Kernel driver in use: e1000e
        Kernel modules: e1000e

Comment 8 Ronald Warsow 2020-01-06 00:28:53 UTC
Quoting: "Correct, seems to only affect some systems (maybe older chips?)"

I don't know!

00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device 7a72
	Flags: bus master, fast devsel, latency 0, IRQ 135
	Memory at df200000 (32-bit, non-prefetchable) [size=128K]
	Capabilities: [c8] Power Management version 3
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [e0] PCI Advanced Features
	Kernel driver in use: e1000e



What is the revision of my I219-V? The box was bought on 30.06.2017.



I'm running vanilla 5.4.x kernels and have noticed since around 5.4.2 that the Ethernet controller sometimes doesn't come up: roughly every third clean start of the box (not resume from suspend), regardless of whether it's the first boot of the day.
An immediate reboot has fixed it so far.

additional info:
- I set my MTU to 1492 in NetworkManager and use fixed-address DHCP (my router, an AVM FB 7590, new since 11.2019, always gives the same IP to the box via DHCP).
- In the past with some, but not all, kernels (IIRC 5.2.x) a manual MTU setting caused trouble; the Ethernet controller was down, and switching the MTU back to default fixed it.
- I'm currently a few small steps away from setting the MTU back to default and investigating further with distro kernels, since a reboot has fixed it so far.
 
I wonder if this is an old problem, coming up again ...

Comment 9 Andre Robatino 2020-01-06 02:45:16 UTC
I've only booted into 5.4.7 once (after seeing the comments in https://bodhi.fedoraproject.org/updates/FEDORA-2020-8eb97ec77e indicating that e1000e might not work) so I don't know if there's an intermittent problem. My machine is a Lenovo ThinkCentre M58p which is around 10 years old.

Comment 10 Stefan Becker 2020-01-06 08:18:59 UTC
It seems upstream has identified one change that causes this, and a revert is queued for master: https://patchwork.ozlabs.org/patch/1217709/

Comment 11 Stefan Becker 2020-01-06 09:36:17 UTC
Created attachment 1650070 [details]
Proposed revert from upstream backported to 5.4.x stable

I can confirm that the proposed upstream revert fixes the issue for my system.

I'm attaching the patch, backported to stable 5.4.x (one minor change to make it apply cleanly).

Comment 12 Ryan 2020-01-06 21:05:28 UTC
Glad someone has found a possible fix for this in the patch above. I am running a fairly old ASUS motherboard (P8Z77-V Pro, from around 2012?). For the record, the output from lspci is:

00:19.0 Ethernet controller: Intel Corporation 82579V Gigabit Network Connection (rev 04)
        DeviceName:  Onboard LAN
        Subsystem: ASUSTeK Computer Inc. P8P67 Deluxe Motherboard
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 50
        Region 0: Memory at f7b00000 (32-bit, non-prefetchable) [size=128K]
        Region 1: Memory at f7b35000 (32-bit, non-prefetchable) [size=4K]
        Region 2: I/O ports at f080 [size=32]
        Capabilities: <access denied>
        Kernel driver in use: e1000e
        Kernel modules: e1000e

Comment 13 john getsoian 2020-01-07 04:06:57 UTC
I don't have the log from the network connect failure, as I have already backed out the kernel update (one place where Btrfs does shine!), but this is an ASUS Z170-Pro with an i7-6700, so it is not *that* old ;) and it hit here. Looking forward to the reversion.

thanks
jg

Comment 14 Ronald Warsow 2020-01-07 04:58:14 UTC
Created attachment 1650261 [details]
journalctl_b0

journalctl from an uninitialized eth0

see timestamp: Jan 07 01:24:16

Comment 15 Stefan Becker 2020-01-07 06:45:40 UTC
Another data point: Lenovo T490s, works fine

00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (6) I219-V (rev 30)
        Subsystem: Lenovo Device 2286
        Flags: bus master, fast devsel, latency 0, IRQ 129
        Memory at c9700000 (32-bit, non-prefetchable) [size=128K]
        Capabilities: [c8] Power Management version 3
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Kernel driver in use: e1000e
        Kernel modules: e1000e

Comment 16 Andy Mauragis 2020-01-08 13:23:34 UTC
Regression present with behavior as originally reported in 5.4.7-200.fc31 on a Z370 board (MSI Z370 Gaming Pro Carbon AC):
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V
        DeviceName: Onboard - Ethernet
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device 7b45
        Flags: bus master, fast devsel, latency 0, IRQ 136
        Memory at d9600000 (32-bit, non-prefetchable) [size=128K]
        Capabilities: [c8] Power Management version 3
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [e0] PCI Advanced Features
        Kernel driver in use: e1000e
        Kernel modules: e1000e

Reverting to 5.3.16-300.fc31 eliminates the issue.

Mentioning this because it is an I219-V like Stefan's (though potentially a different rev).

Comment 17 Ryan 2020-01-09 10:47:01 UTC
5.4.8-200.fc31.x86_64 also has this issue for me

Comment 18 JM 2020-01-10 12:41:44 UTC
Same problem here with Fedora 30 and kernel-5.4.7-100.fc30.x86_64

00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (7) I219-V (rev 10)
	DeviceName: Onboard - Ethernet
	Subsystem: ASRock Incorporation Device 15bc
	Flags: bus master, fast devsel, latency 0, IRQ 124
	Memory at ab200000 (32-bit, non-prefetchable) [size=128K]
	Capabilities: [c8] Power Management version 3
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Kernel driver in use: e1000e
	Kernel modules: e1000e

Kernel 5.3.16-200.fc30.x86_64 works.

Comment 19 Vasiliy Glazov 2020-01-10 13:23:13 UTC
My device:
Network:   Device-1: Intel Ethernet I217-LM vendor: Hewlett-Packard driver: e1000e v: 3.2.6-k port: f080 bus ID: 00:19.0 
           chip ID: 8086:153a

Comment 20 john getsoian 2020-01-11 05:01:22 UTC
Confirming, as per Ryan: no change with 5.4.8-200, but I assume the window for revisions to 5.4.8 had closed before this bug was posted...

Jan 10 23:33:39  kernel: e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jan 10 23:33:32  kernel: e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jan 10 23:33:20  kernel: e1000e 0000:00:1f.6 enp0s31f6: renamed from eth0
Jan 10 23:33:20  kernel: e1000e 0000:00:1f.6 eth0: MAC: 12, PHY: 12, PBA No: FFFFFF-0FF
Jan 10 23:33:20  kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
Jan 10 23:33:20  kernel: e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) f8:32:e4:74:a4:ed
Jan 10 23:33:20  kernel: e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized): registered PHC clock
Jan 10 23:33:20  kernel: e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
Jan 10 23:33:20  kernel: e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
Jan 10 23:33:20  kernel: e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k

ASUS Z170-Pro, Fedora 31/KDE


Lan adapter: Intel Corporation Ethernet Connection (2) I219-V

Comment 21 Ryan 2020-01-11 06:01:45 UTC
5.4.10-200.fc31.x86_64 also affected

Comment 22 P D 2020-01-11 06:13:52 UTC
Works fine on Thinkpad T510i.

00:19.0 Ethernet controller: Intel Corporation 82577LM Gigabit Network Connection (rev 06)
	Subsystem: Lenovo Device 2153
	Flags: bus master, fast devsel, latency 0, IRQ 31
	Memory at f2600000 (32-bit, non-prefetchable) [size=128K]
	Memory at f2625000 (32-bit, non-prefetchable) [size=4K]
	I/O ports at 1820 [size=32]
	Capabilities: <access denied>
	Kernel driver in use: e1000e
	Kernel modules: e1000e

Comment 23 Eugene Mah 2020-01-13 14:15:13 UTC
Still running into this problem on one of my machines with the 5.4.10 kernel

Comment 24 Eugene Mah 2020-01-13 14:33:20 UTC
Created attachment 1651880 [details]
journalctl -b -1

Output from journalctl -b -1

Comment 25 Marius 2020-01-14 14:34:39 UTC
A colleague and I are having the same problem here. As a workaround, you can disable autonegotiation for the connection to get it working. We tested it at 1000 Mbit/s with both full and half duplex.

Output from lspci
Ethernet controller: Intel Corporation Ethernet Connection (2) I219-LM (rev 31)
	Subsystem: Dell Device 06de
	Flags: bus master, fast devsel, latency 0, IRQ 127
	Memory at ef200000 (32-bit, non-prefetchable) [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: e1000e
	Kernel modules: e1000e
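For reference, the workaround Marius describes can be applied via NetworkManager's nmcli. A sketch, assuming the connection profile is named "Wired connection 1" (yours may differ; check `nmcli connection show`):

```shell
# Disable autonegotiation and force 1000 Mbit/s full duplex on the
# profile; with auto-negotiate off, NetworkManager requires speed and
# duplex to be set explicitly.
nmcli connection modify "Wired connection 1" \
    802-3-ethernet.auto-negotiate no \
    802-3-ethernet.speed 1000 \
    802-3-ethernet.duplex full

# Re-activate the profile so the change takes effect.
nmcli connection up "Wired connection 1"
```

As noted in later comments, this forces the link parameters rather than fixing the underlying driver regression, so it is a stopgap, not a solution.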

Comment 26 Mr E 2020-01-14 15:52:27 UTC
Also afflicted, but Marius' workaround also worked for me.

>lspci -v
Ethernet controller: Intel Corporation Ethernet Connection (2) I219-V (rev 31)
Subsystem: Micro-Star International Co., Ltd. [MSI] Device 7998
Flags: bus master, fast devsel, latency 0, IRQ 137
Memory at df400000 (32-bit, non-prefetchable) [size=128K]
Capabilities: [c8] Power Management version 3
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [e0] PCI Advanced Features
Kernel driver in use: e1000e
Kernel modules: e1000e

Intel C236 chipset circa 2017.

Comment 27 Eugene Mah 2020-01-15 13:45:06 UTC
Unfortunately for me, disabling autonegotiation didn't work on the one machine I'm having a problem with (unless I did it wrong, which is entirely possible)

Comment 28 Ryan 2020-01-16 09:37:31 UTC
The fix from Marius above (setting negotiation manually) worked for me, after testing full and half duplex a couple of times.

Comment 29 Ryan 2020-01-16 09:47:30 UTC
I also had to set the MTU; otherwise it would only connect at 1 Gbps half duplex or 100 Mbps full/half.

Comment 30 Ryan 2020-01-16 10:32:43 UTC
After further testing, I found that only setting the MTU was required. Once I did that it would automatically connect to the network using auto negotiate.

Comment 31 Marius 2020-01-16 12:39:59 UTC
As an update to my earlier comment: after a restart of Fedora, a connection with full duplex and 1 Gbps does not work and results in the same error as before. As a workaround, I can set duplex to half and connect. Afterwards I disconnect and set duplex back to full. Then the connection works again.

Comment 32 john getsoian 2020-01-16 14:36:32 UTC
Played with this a little also. My connection will hold with auto-negotiate turned on if the MTU is set down a bit, in my case to 1472, which is apparently a sweet spot for the way AT&T fiber service does IPv6. However, in a sample too small to be strictly statistically valid (2 tries), I still obtained somewhat better data rates through the router (800 vs 750) with MTU=1472 and the connection speed set manually to 1000 Mbps.

Comment 33 Stefan Becker 2020-01-16 18:05:03 UTC
On my system switching off autonegotiation *and* then switching the speed made the connection stable, at least until you disconnect the device and try to reconnect. I wouldn't call that a solution...

Comment 34 Ryan 2020-01-17 02:38:00 UTC
(In reply to Stefan Becker from comment #33)
> I wouldn't call that a solution...

You're right. I thought about how I called it a 'fix' after I wrote that and decided it's more a workaround than a fix, but at least we have a workaround for now. Hopefully they'll resolve this; the kernel.org changelog for 5.4.12 mentions MTU, so I'm hopeful, but I haven't tested 5.4.12 yet. I will be doing so this evening.

Comment 35 Ryan 2020-01-17 02:41:28 UTC
(In reply to Marius from comment #31)
> As an update to myself: After a restart of fedora, a connection with full
> duplex and 1Gbps does not work and results in the same error as before. As a
> workaround for that, I can set duplex to half and connect. Afterwards I
> disconnect and set duplex to full. Then the connection works again.

I personally only got it stable after setting the MTU to 1492 (which is what my LAN uses by default). I tested rebooting and was still able to reconnect afterwards. So for me, setting the MTU to the correct value was the workaround I needed to maintain a stable auto-negotiated reconnection through reboots.

Comment 36 Marius 2020-01-17 09:52:58 UTC
(In reply to Ryan from comment #35)
> I personally only got it stable after setting the MTU to 1492 (which is what
> my LAN is using by default). I tested rebooting and was still able to
> reconnect post this. So for me, setting the MTU to the correct value was the
> workaround I needed to maintain a stable auto negotiate reconnection through
> reboots

I checked this as well. I enabled autonegotiation again and set the MTU from automatic to 1500; I also tried 500 and 10000. However, I was not able to get a connection with any of those. I then disabled autonegotiation again (MTU stayed at 1500), and now I am able to connect again. So in my case it is definitely the autonegotiation setting.

Comment 37 Eugene Mah 2020-01-17 13:49:29 UTC
Setting the MTU to 1492 also seems to work around the issue on my system.  Tested with autonegotiation off as well as on.
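The MTU workaround from the comments above can be applied either temporarily with iproute2 or persistently through NetworkManager. A sketch, with the interface name `eno1` and the profile name as assumptions:

```shell
# Temporary (lost on reboot): lower the MTU on the live interface.
ip link set dev eno1 mtu 1492

# Persistent: store the MTU in the NetworkManager profile and
# re-activate the connection so it is applied.
nmcli connection modify "Wired connection 1" 802-3-ethernet.mtu 1492
nmcli connection up "Wired connection 1"
```

Note that commenters disagree on whether the MTU or the autonegotiation setting is the decisive factor, so it may take trying both on a given machine.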

Comment 38 Panu Matilainen 2020-01-17 17:19:28 UTC
Similar story here: 5.3.16 was okay, but 5.4.7 and 5.4.8 loop, trying to connect and then claiming a carrier change (in the NetworkManager log):

00:19.0 Ethernet controller: Intel Corporation 82579V Gigabit Network Connection (rev 05)
        DeviceName: Intel(R) 82579LM Gigabit Network Device
        Subsystem: Intel Corporation Device 2006
        Flags: bus master, fast devsel, latency 0, IRQ 34
        Memory at fe700000 (32-bit, non-prefetchable) [size=128K]
        Memory at fe728000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at f040 [size=32]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [e0] PCI Advanced Features
        Kernel driver in use: e1000e
        Kernel modules: e1000e

The loop occurs with autonegotiation when connected to a gigabit switch, but not if connected to a 100M powerline adapter, and forcing link to 100M/FD restores functionality with the switch too.
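Forcing the link to 100M/FD as described above can also be done with ethtool; a sketch, with the interface name assumed:

```shell
# Force 100 Mbit/s full duplex with autonegotiation off. This is not
# persistent across reboots; NetworkManager may also override it when
# it re-activates the connection.
ethtool -s eno1 speed 100 duplex full autoneg off
```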

Comment 39 Panu Matilainen 2020-01-17 17:28:14 UTC
Oh, here's another box using e1000e which is *not* affected:

00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-LM (rev 04)
        Subsystem: Lenovo ThinkPad T440p
        Flags: bus master, fast devsel, latency 0, IRQ 29
        Memory at f0600000 (32-bit, non-prefetchable) [size=128K]
        Memory at f063f000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at 3080 [size=32]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [e0] PCI Advanced Features
        Kernel driver in use: e1000e
        Kernel modules: e1000e

Comment 40 Panu Matilainen 2020-01-17 17:47:43 UTC
FWIW, problem still present in 5.4.10-200.fc31.

Comment 41 Ryan 2020-01-18 00:48:51 UTC
still happening in latest updates-testing kernel 5.4.12-200.x86_64.fc31

Comment 42 john getsoian 2020-01-21 13:21:58 UTC
Another quirk: Wake-on-LAN fails with auto-negotiation turned off. It works with auto-negotiation turned on and the MTU set manually (as per comment 32). I don't know if this was normal behavior before this bug, and at this point I have no easy way to test.

Comment 43 Eugene Mah 2020-01-22 17:55:56 UTC
Still having problems with the 5.4.13 kernel currently in updates-testing. The network connection flaps when the MTU is set to Automatic, but is fine when I set it to 1492.

Comment 44 Stefan Becker 2020-01-22 19:29:52 UTC
The good news is that the commit has been reverted in upstream master, i.e. with 5.5 at the latest this issue will be gone for good: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.5-rc7&id=d5ad7a6a7f3c87b278d7e4973b65682be4e588dd

Unfortunately, this change hasn't been backported to the linux-5.4.y stable branch yet: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/drivers/net/ethernet/intel/e1000e/e1000.h?h=linux-5.4.y

I wonder what it would take to get the package maintainers to include the patch in the Fedora kernel package?

Comment 45 Ryan 2020-01-25 23:10:10 UTC
still impacted in 5.4.14-200.fc31 if I set MTU to auto. Remains stable if set to 1492.

Comment 46 Stefan Becker 2020-01-26 14:23:05 UTC
Created attachment 1655426 [details]
Add proposed patch to Fedora kernel package

I've taken the proposed patch and integrated it into the f31 git branch of the Fedora kernel package.

A scratch build for x86_64 is running here: https://koji.fedoraproject.org/koji/taskinfo?taskID=41038832
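For anyone wanting to reproduce such a scratch build, the usual Fedora dist-git flow looks roughly like this. A sketch only: the patch file name is hypothetical, and an authenticated Koji/FAS setup is assumed for the build step:

```shell
# Clone the Fedora kernel dist-git anonymously and switch to the
# Fedora 31 branch.
fedpkg clone -a kernel && cd kernel
fedpkg switch-branch f31

# Add the backported revert (e.g. e1000e-revert.patch) and reference
# it from kernel.spec, then build a source RPM.
fedpkg srpm

# Submit a scratch build of that SRPM to Koji.
koji build --scratch f31-candidate kernel-*.src.rpm
```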

Comment 47 Ed 2020-01-29 12:35:44 UTC
I can also confirm on F31 with Linux kernel 5.4.13-201.fc31.x86_64

Setting the MTU to 1492 with negotiation set to Automatic works.
This specific machine suffered from a flapping network port.

First I went back to a previous kernel, 5.3.12-300.fc31.x86_64, which worked fine for the user, and I finally had some time to dig into this a bit.
So we're not the only ones having these issues.
Other PCs with the same revision of the Intel Ethernet chip don't seem to suffer from this (all quite recent Dell Optiplex 5060 and 5070 machines; I had this issue on an Optiplex 5060).

[    2.011399] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[    2.011400] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[    2.011924] e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[    2.406870] e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized): registered PHC clock
[    2.473185] e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) 54:bf:64:81:8e:51
[    2.473186] e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
[    2.473339] e1000e 0000:00:1f.6 eth0: MAC: 13, PHY: 12, PBA No: FFFFFF-0FF
[    2.474026] e1000e 0000:00:1f.6 eno1: renamed from eth0
[    8.677276] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[   13.082953] e1000e 0000:00:1f.6 eno1: changing MTU from 1500 to 1492
[   19.095031] e1000e 0000:00:1f.6 eno1: changing MTU from 1492 to 1500
[   19.828194] e1000e: eno1 NIC Link is Up 1000 Mbps Half Duplex, Flow Control: None
[   19.930859] e1000e 0000:00:1f.6 eno1: changing MTU from 1500 to 1492
[   26.049761] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None

Optiplex 5060
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (7) I219-V (rev 10) ( Intel 8th gen Core i5 8500 )
Optiplex 5070
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (7) I219-V (rev 10) ( Intel 9th gen Core i5 9500 )

Comment 48 Ryan 2020-02-05 07:03:04 UTC
(In reply to Stefan Becker from comment #46)
> Created attachment 1655426 [details]
> Add proposed patch to Fedora kernel package
> 
> I've taken the proposed patch and integrated it into the f31 git branch of
> the Fedora kernel package.
> 
> A scratch build for x86_64 is running here:
> https://koji.fedoraproject.org/koji/taskinfo?taskID=41038832

That build failed; have you managed to get it to build successfully?

Comment 49 Ryan 2020-02-05 07:14:24 UTC
It doesn't appear to be applied to the latest kernel release (5.4.17) in updates-testing, so I imagine the issue still stands...

Comment 50 Stefan Becker 2020-02-05 08:03:53 UTC
Created attachment 1657800 [details]
Add proposed patch to Fedora kernel package

(In reply to Ryan from comment #48)
> that build failed, have you managed to get it to build successfully?

It seems a kernel build nowadays requires a builder with large disk space (15+ GB; I got the same failure in a local mock build).

I've rebased my package patch to 5.4.17 and started another scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=41386340

Comment 51 Stefan Becker 2020-02-05 17:49:36 UTC
(In reply to Stefan Becker from comment #50)
> 
> I've rebased my package patch to 5.4.17 and started another scratch build:
> https://koji.fedoraproject.org/koji/taskinfo?taskID=41386340

After booting into the kernel from the scratch build, e1000e works like a charm again with default settings (MTU: automatic, auto-negotiation: enabled).

Comment 52 Vasiliy Glazov 2020-02-06 06:52:13 UTC
(In reply to Stefan Becker from comment #50)
> I've rebased my package patch to 5.4.17 and started another scratch build:
> https://koji.fedoraproject.org/koji/taskinfo?taskID=41386340

Works fine. Please submit this as an update.

Comment 53 Ryan 2020-02-06 10:16:24 UTC
YASS! https://lwn.net/Articles/811637/

Comment 54 Ryan 2020-02-06 10:17:20 UTC
Fixed upstream in 5.4.18

Comment 55 Vasiliy Glazov 2020-02-07 08:30:12 UTC
Fixed in 5.5.2
https://koji.fedoraproject.org/koji/buildinfo?buildID=1457411

Comment 56 Eugene Mah 2020-02-07 22:43:41 UTC
kernel-5.4.18-200.fc31 fixes the issue on my machine

Comment 57 Ryan 2020-02-08 01:41:06 UTC
This build fixes the issue on my machine: https://koji.fedoraproject.org/koji/buildinfo?buildID=1457404

Comment 58 Ryan 2020-02-08 01:44:14 UTC
The build above is 5.4.18-200.fc31, but obviously there's also the 5.5.2-200.fc31 build in Koji... so maybe we're all going to be moving to the 5.5 kernel soon.

Comment 59 JM 2020-02-17 17:32:48 UTC
kernel-5.4.18 has solved the problem for me.

Comment 60 Justin M. Forbes 2020-03-03 16:20:31 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 31 kernel bugs.

Fedora 31 has now been rebased to 5.5.7-200.fc31. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 32, and are still experiencing this issue, please change the version to Fedora 32.

If you experience different issues, please open a new bug report for those.

Comment 61 Vasiliy Glazov 2020-03-10 06:48:57 UTC
Everything works now.

