Bug 1899805
Summary: | [5.8.9 -> 5.9 REGRESSION] Constant hard freezes with "BUG: Bad page state in process swapper/8", works fine with previous kernel | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | ell1e <el> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED WORKSFORME | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | ||
Version: | 33 | CC: | acaringi, adscvr, airlied, arnik, bskeggs, hdegoede, itamar, jarodwilson, jdobes, jeremy, jglisse, jonathan, josef, kernel-maint, lgoncalv, linville, masami256, mchehab, mjg59, msandova, ptalbert, steved |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Last Closed: | 2021-09-16 19:50:32 UTC | Type: | Bug |
Attachments: |
Description
ell1e
2020-11-20 04:57:14 UTC
Created attachment 1731159 [details]
journalctl -r (includes the "Bad page state" error at the bottom, then the follow-up reboot info above)
I am also affected by this bug. Without my doing anything in particular, the machine freezes, and in the logs I can see:
nov 20 16:14:32 alpha kernel: BUG: Bad page state in process swapper/8 pfn:3aa55a
This started happening today after an update in Fedora Silverblue 33 (I don't know when I last updated, but it was probably not more than a week ago). I have a Ryzen 2700X CPU; from talking with ell1e, it seems they are also on a Ryzen CPU.

Created attachment 1731290 [details]
/proc/cpuinfo output on affected machine
Since this seems to be possibly AMD CPU-related, I'm hereby also attaching my /proc/cpuinfo output.

OK, I have been using 5.8.18-300.fc33.x86_64 for roughly 12 hours now, including the packagekitd runs that previously were good at triggering the freeze, and so far nothing. Given that I didn't see this a few days ago, neither did msandova apparently, and Fedora upgraded to 5.9 a few days ago, I feel pretty confident suggesting that this is a 5.9 regression.

I think at this point it might also be interesting to know how many AMD CPUs are really affected, and whether there is any point in reverting the main repos to an older kernel until the cause of this is found.

This bug also happened for me. There is no specific trigger that I can guess, but the system crashes randomly after an unknown number of minutes. It happened after I upgraded to kernel 5.9.

Was there ever any consideration of a rollback until this is investigated? This seems like quite a disruptive regression, especially for an innocuous mid-cycle update.

I have now tested an upstream kernel, where it also happened, and filed an upstream ticket: https://bugzilla.kernel.org/show_bug.cgi?id=211317

This also happens to me (CPU Ryzen 3200G) after upgrading to F33. It's not always the swapper process; I also saw a chrome process causing this.

Now I've updated to 5.10.10-200.fc33.x86_64. No freeze yet, but I'm getting a kerneloops with reason "BUG: Bad page state in process swapper/0" about every minute now. Definitely some change here, even if it doesn't seem fully fixed yet.

My exact CPU is AMD Ryzen 5 1600, and I tested 5.11.0-0.rc4.129.vanilla.1.fc33.x86_64, which I assume is newer? So your seeing no freeze yet might have just been coincidence, who knows. (Or maybe it affects different Ryzen generations differently?) I do hope there will be some sort of kernel bugzilla response soon.

Will a package pin on 5.8 in dnf get removed by an upgrade to Fedora 34 (since I assume that one no longer ships 5.8 officially)? If yes, then this is actually a significant upgrade blocker... I'm actually not sure if even the installer would run without a freeze, possibly messing up the install, since it boots into its own initramfs :(

Just a reminder that this basically renders computers unusable. I might as well switch to FreeBSD or Debian Stable once Fedora 33 support runs out. Is there ANY plan to have this looked into?

This appears to have been fixed in 5.11.11-200.fc33.x86_6 (or earlier); I can no longer reproduce it after running the machine for days. Unless somebody protests soon, I will close the ticket.

I have been running this kernel for several weeks and can no longer reproduce it either.
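
For anyone who wants to check whether their own saved logs contain the same error, here is a minimal sketch that scans log text for the "Bad page state" line quoted earlier in this ticket and extracts the process name and page frame number. The function name `find_bad_page_reports` and the regular expression are illustrative assumptions based only on the log line format quoted in this report; they are not part of any kernel or Fedora tool, and process names containing spaces would not be matched by this pattern.

```python
import re

# Pattern modeled on the line quoted in this report (an assumption about the format):
#   nov 20 16:14:32 alpha kernel: BUG: Bad page state in process swapper/8 pfn:3aa55a
BAD_PAGE_RE = re.compile(
    r"BUG: Bad page state in process (?P<process>\S+)\s+pfn:(?P<pfn>[0-9a-fA-F]+)"
)


def find_bad_page_reports(log_text):
    """Return a list of (process, pfn) tuples, one per matching log line.

    The pfn is parsed from hexadecimal into an int.
    """
    return [
        (match.group("process"), int(match.group("pfn"), 16))
        for match in BAD_PAGE_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    # Sample input: the single log line quoted in this ticket.
    sample = (
        "nov 20 16:14:32 alpha kernel: BUG: Bad page state in process "
        "swapper/8 pfn:3aa55a\n"
    )
    for process, pfn in find_bad_page_reports(sample):
        print(f"{process} pfn=0x{pfn:x}")
```

The text passed in could be, for example, the contents of a file saved from `journalctl -k`. Lines that mention the error without a `pfn:` field, such as the kerneloops reason string quoted above, are deliberately not matched.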