Bug 2253789 - [regression] erratic state with kernel version 6.6.5
Status: CLOSED DUPLICATE of bug 2253756
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 39
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Depends On:
Blocks: Framework
Reported: 2023-12-09 18:41 UTC by Baptiste Mille-Mathias
Modified: 2023-12-11 12:40 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Last Closed: 2023-12-11 12:40:31 UTC
Type: ---

Attachments
journal kernel log (541.83 KB, text/plain)
2023-12-09 18:41 UTC, Baptiste Mille-Mathias

Description Baptiste Mille-Mathias 2023-12-09 18:41:02 UTC
1. Please describe the problem:
At some point my machine loses its network access. NetworkManager takes 100% and many processes are killed by the OOM killer.
Model: Framework laptop with AMD CPU. After that the laptop is no longer usable; a proper shutdown takes 7 minutes because no unit stops properly and each has to be killed when it reaches its shutdown timeout.
➜  ~ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         48 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  16
  On-line CPU(s) list:   0-15
Vendor ID:               AuthenticAMD
  Model name:            AMD Ryzen 7 7840U w/ Radeon  780M Graphics

2. What is the Version-Release number of the kernel:

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :
Yes, it worked until I received the new kernel version 6.6.5.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:
Yes, I get the same behaviour after a while each time.

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.
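
Once a kernel log has been saved with the journalctl command above, a short script can summarize which processes the OOM killer terminated. The following is a minimal sketch, not part of the original report: the function name `count_oom_kills` is hypothetical, the default file name `dmesg.txt` is taken from the command above, and the message pattern assumes the kernel's usual "Out of memory: Killed process PID (name)" wording, which may differ between kernel versions.

```python
import re
import sys
from collections import Counter

# Assumed format of the kernel's OOM report line, for example:
# "Out of memory: Killed process 1234 (NetworkManager) total-vm:..."
OOM_RE = re.compile(r"Out of memory: Killed process (\d+) \(([^)]+)\)")


def count_oom_kills(lines):
    """Return a Counter mapping process name to number of OOM kills."""
    kills = Counter()
    for line in lines:
        match = OOM_RE.search(line)
        if match:
            kills[match.group(2)] += 1
    return kills


if __name__ == "__main__":
    # Default to the file name used in the journalctl example.
    path = sys.argv[1] if len(sys.argv) > 1 else "dmesg.txt"
    with open(path, encoding="utf-8", errors="replace") as fh:
        for name, count in count_oom_kills(fh).most_common():
            print(f"{count:4d}  {name}")
```

Running it as `python3 oom_summary.py dmesg.txt` (the script name is arbitrary) prints one line per killed process name, most frequent first.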


I saw that another Framework AMD owner with the same kernel reported an issue in bug #2253756.

Reproducible: Always

Comment 1 Baptiste Mille-Mathias 2023-12-09 18:41:29 UTC
Created attachment 2003449
journal kernel log

Comment 2 Dimitris 2023-12-09 20:09:33 UTC
I have a very similar symptom on the same hardware, reported in bug 2253756. In my case I get into this takes-forever-to-shut-down, resort-to-holding-the-power-switch state after an attempted suspend fails. I may just not have been running 6.6.5 long enough to see this manifest before the suspend attempt.

My stack trace for wpa_supplicant looks the same as yours here.

So I suspect we're hitting the same issue.

I've tried 6.7.0-0.rc4.20231208git5e3f5b81de80.38.fc40 from rawhide and my issue seemed resolved. I'm now going through the kernel bugzilla looking for already-fixed issues that might be worth backporting to 6.6.

Comment 3 Justin M. Forbes 2023-12-11 12:40:31 UTC

*** This bug has been marked as a duplicate of bug 2253756 ***
