Bug 2306163 - Regression in Kernel 6.10 causes e1000e to prevent suspend if there is no active ethernet connection on Meteor Lake CPU
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 40
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL: https://discussion.fedoraproject.org/...
Whiteboard:
Depends On:
Blocks:
Reported: 2024-08-20 18:22 UTC by oirnoir
Modified: 2024-09-26 14:55 UTC

Fixed In Version:
Doc Type: ---
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: ---
Embargoed:



Description oirnoir 2024-08-20 18:22:40 UTC
1. Please describe the problem:
When I close my laptop or run `systemctl suspend`, I expect it to enter suspend mode. On kernel 6.9.x, it does this correctly. However, on kernel 6.10.3+, it fails to suspend. Thereafter, the display will freeze every few seconds until a restart.

2. What is the Version-Release number of the kernel:
6.10.5

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
First appeared: 6.10.3
Last version that worked: 6.9.12

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:
- Use kernel 6.10.3 or higher
- Suspend by using `sudo systemctl suspend` or by closing the lid

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:
Yes it does, exactly as listed. uname -r: `6.11.0-0.rc3.20240814git6b0f8db921ab.32.fc42.x86_64`

6. Are you running any modules that are not shipped directly with Fedora's kernel?:
I don't think so

(Other information:)
Output of $ sudo lspci -nn -vv -s 00:1f.6:
00:1f.6 Ethernet controller [0200]: Intel Corporation Device [8086:550a] (rev 20)
        DeviceName: Ethernet controller
        Subsystem: CLEVO/KAPOK Computer Device [1558:a743]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin D routed to IRQ 194
        IOMMU group: 15
        Region 0: Memory at b54a0000 (32-bit, non-prefetchable) [size=128K]
        Capabilities: [c8] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee00c18  Data: 0000
        Kernel driver in use: e1000e
        Kernel modules: e1000e

Output of $ inxi -MCnzxx:
Machine:
  Type: Laptop System: Notebook product: V54x_6x_TU v: V540TU
    serial: <superuser required> Chassis: type: 10 serial: <superuser required>
  Mobo: Notebook model: V54x_6x_TU v: V540TU serial: <superuser required>
    UEFI: 3mdeb v: Dasharo (coreboot+UEFI) v0.9.0 date: 07/17/2024
CPU:
  Info: 16-core (6-mt/10-st) model: Intel Core Ultra 7 155H bits: 64
    type: MST AMCP arch: Meteor Lake rev: 4 cache: 24 MiB note: check
  Speed (MHz): avg: 987 high: 2003 min/max: 400/4500:4800:3800:2500 cores:
    1: 2003 2: 1700 3: 400 4: 400 5: 1208 6: 1571 7: 1924 8: 400 9: 1770 10: 400
    11: 1952 12: 400 13: 999 14: 999 15: 1000 16: 1000 17: 1002 18: 400
    19: 1000 20: 400 21: 400 22: 400 bogomips: 131788
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Network:
  Device-1: Intel Meteor Lake PCH CNVi WiFi driver: iwlwifi v: kernel
    bus-ID: 00:14.3 chip-ID: 8086:7e40
  IF: wlp0s20f3 state: up mac: <filter>
  Device-2: Intel vendor: CLEVO/KAPOK driver: e1000e v: kernel port: N/A
    bus-ID: 00:1f.6 chip-ID: 8086:550a
  IF: eno0 state: down mac: <filter>

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Could not attach the full journalctl output due to the character limit. Will try to provide it soon.

Reproducible: Always

Comment 1 oirnoir 2024-08-20 18:26:29 UTC
Because I couldn't post the journalctl logs here, I've placed them in a gist: https://gist.github.com/OIRNOIR/4ffeb6243966475fe0f423d5fde7c2d4

Comment 2 oirnoir 2024-08-23 11:09:03 UTC
Kernel 6.10.6 just rolled out to me. I can confirm that the issue is still present.

Comment 3 anotheruser 2024-08-23 14:09:04 UTC
I think it could be related to https://bugzilla.kernel.org/show_bug.cgi?id=218940

The last activity there was to exclude all non-Meteor Lake systems (if I understand that ticket correctly).

Here the user has a Meteor Lake system with a unique PCI ID, 8086:550a, that still shows the same symptoms (failed s2idle) described in the upstream ticket.

Maybe this PCI ID needs to be excluded as well?
Or are all Meteor Lake systems generally affected on 6.10 and newer?
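
For anyone who wants to check whether their own machine has the same device, here is a rough sketch that scans `lspci -nn -k` style output for the PCI ID from this report bound to the e1000e driver. The function name `find_e1000e_devices` and the constant `TARGET_ID` are made up for illustration, and the parsing assumes the stanza layout shown in the reporter's lspci output above. It only identifies the hardware; it does not tell you whether suspend actually fails.

```python
import re

# Assumption: the vendor:device ID taken from the lspci output in this report.
TARGET_ID = "8086:550a"


def find_e1000e_devices(lspci_text, pci_id=TARGET_ID):
    """Return the PCI slot addresses whose header line carries pci_id and
    whose stanza reports 'Kernel driver in use: e1000e'."""
    matches = []
    slot = None
    has_id = False
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # An unindented line starts a new device stanza, e.g. "00:1f.6 ...".
            m = re.match(r"^([0-9a-f:.]+)\s", line)
            slot = m.group(1) if m else None
            has_id = f"[{pci_id}]" in line.lower()
        elif slot and has_id and line.strip() == "Kernel driver in use: e1000e":
            matches.append(slot)
    return matches
```

One way to feed it real data would be `find_e1000e_devices(subprocess.run(["lspci", "-nn", "-k"], capture_output=True, text=True).stdout)`; on the reporter's laptop this would be expected to list slot 00:1f.6.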

Comment 4 oirnoir 2024-09-05 07:51:18 UTC
Note: I can also reproduce the issue on kernel 6.10.7.
I have been sticking with kernel 6.9.12 for daily use because this bug is too disruptive for me. It causes short freezes and duplicated keystrokes that make typing anything after a failed suspend a nightmare.

Comment 5 oirnoir 2024-09-26 06:23:41 UTC
Another note: The issue seems to be fixed on the latest Rawhide 6.12.0-rc0 kernel. I cannot, however, verify whether it is fixed on 6.11, because I haven't been able to install 6.11 on my Fedora 40 stable build.
The Rawhide kernel has several other problems that don't fall under the scope of this issue, so I obviously won't be daily-driving that one.

Comment 7 anotheruser 2024-09-26 14:55:39 UTC
I was wrong.  It's not in 6.11 yet.

