Bug 1365917
Summary: kernel panic at boot - x2apic_cluster_probe+0x33/0x70
Product: [Fedora] Fedora
Component: kernel
Version: rawhide
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Peter Gervase <pgervase>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: awilliam, byodlows, gansalmon, itamar, jforbes, jonathan, kernel-maint, kparal, labbott, madhu.chinakonda, mchehab, pgervase, plautrba, pschindl, robatino
Whiteboard: RejectedBlocker AcceptedFreezeException
Fixed In Version: kernel-4.8.0-0.rc2.git3.1.fc25
Last Closed: 2016-08-22 22:07:58 UTC
Type: Bug
Bug Blocks: 1277284, 1277285
Description
Peter Gervase
2016-08-10 13:48:34 UTC
Created attachment 1189632
screen shot showing the new panic message two minutes after the first one
Can you test the following scratch build? It contains a probable fix from the upstream developers:
http://koji.fedoraproject.org/koji/taskinfo?taskID=15217136

I saw the same/similar kernel panic problem on my Lenovo x240 with kernel-4.8.0-0.rc1.git3.1.fc25.x86_64
http://koji.fedoraproject.org/koji/taskinfo?taskID=15217136 build fixes it. Thanks!

*** Bug 1367396 has been marked as a duplicate of this bug. ***

kernel from koji build linked in comment 3 works for me. System boots normally with it. I tested it with Fedora 24 with kernel-4.8.0-0.rc0.git3.1.fc25.x86_64 installed and it didn't boot (kernel panic). Then I installed kernel from koji and it booted normally.

Can you confirm that kernel-4.8.0-0.rc1.git0.1.fc25 does not display this behaviour?

for blocker / release engineering purposes: labbott states she's certain that kernel-4.8.0-0.rc1.git0.1.fc25 - which is the current 'stable' f25 kernel build, i.e. the one in the 'fedora' repo and which is included in composes - *would* be affected by this bug. That means that if we decide the bug is a blocker, we must find a fix for it before we can ship Alpha.

But, she and jforbes also believe this is fixed in upstream kernel by commit d52c0569bab4edc888832df44dc7ac28517134f6, and that furthermore that means the bug should be fixed by these Fedora builds:

f25: http://koji.fedoraproject.org/koji/buildinfo?buildID=792279 (kernel-4.8.0-0.rc2.git1.1.fc25)
Rawhide: http://koji.fedoraproject.org/koji/buildinfo?buildID=792280 (kernel-4.8.0-0.rc2.git1.1.fc26)

that build is not currently submitted as an update for F25. It would be good if reporters could confirm the fix.

labbott also states she'd vote -1 blocker / +1 FE for this bug, given the range of hardware affected.

jforbes says "1365917 could theoretically impact any modern intel machine". The upstream commit can be seen at https://lkml.org/lkml/2016/8/11/516 , describing the issue, if anyone feels up to evaluating its impact themselves.

"any modern intel machine" is quite scary to me, I might be more inclined to go +1 blocker for this one, I'm definitely +1 FE.

To clarify the "Any modern intel machine": x2apic was introduced with Nehalem, so about 6 years ago. It can also be "opted out" of by firmware, and frequently is. I don't know the percentages of machines that do or don't opt out. I know by a quick look at 3 machines here, 2 have it turned off, 1 has it turned on. You can check by looking at a dmesg after boot: you will either see "x2apic enabled" or "DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit" with instructions on how to override the opt out. A quick google search shows that several people have seen this bug, but it is still hard to determine because no one shipped a kernel to masses of users with the bug.

Bit more discussion about the range of hardware likely affected by this:

<jwb> jforbes: eh... i won't disagree but that might be stretching it
<jforbes> jwb: theoretically. x2apic came in with Nehalem, and it is basically a race condition with CPU state change. realistically it is probably a smaller subset, but a quick google search says it is non trivial
<jwb> jforbes: yeah, but i thought there was a firmware component to x2apic support too. i might be thinking of something else
<jforbes> jwb: there is, thus the theoretical part
<jwb> right. so the stretch is that most laptop class hardware doesn't have the firmware bits for x2apic. at least not that i've seen. but desktop/larger servers are certainly a possibility
now if we only could tell for certainty what most Fedora users have for machines.
IN A WORLD
<jforbes> Well, that would certainly be nice. only 1 out of 3 machines here has it enabled. I could power on and check others I suppose. But even in the ones that disable by default, it can be overridden

For those that have dep issues installing:

    $ sudo rpm -ivh kernel-4.8.0-0.rc2.git1.1.fc26.x86_64.rpm
    error: Failed dependencies:
        kernel-core-uname-r = 4.8.0-0.rc2.git1.1.fc26.x86_64 is needed by kernel-4.8.0-0.rc2.git1.1.fc26.x86_64
        kernel-modules-uname-r = 4.8.0-0.rc2.git1.1.fc26.x86_64 is needed by kernel-4.8.0-0.rc2.git1.1.fc26.x86_64

I made https://bugzilla.redhat.com/show_bug.cgi?id=1367929 to clean up the dep checking - "uname -r" not getting parsed. I'll test booting to that rc2 kernel...

er...you're reading that wrong. you have to install at least the kernel, kernel-core and kernel-modules packages when manually installing a kernel build. The package called 'kernel' is basically just a metapackage and doesn't contain anything. The actual kernel is in 'kernel-core', the modules are in 'kernel-modules'. You may also need 'kernel-modules-extra' depending on your hardware.

Right, you need all three, but the error shouldn't say "uname-r" in the failed deps. kernel-core-4.8.0-0.rc2.git1.1.fc26.x86_64 and kernel-modules-4.8.0-0.rc2.git1.1.fc26.x86_64 are what should be specified, not "kernel-core-uname-r" or "kernel-modules-uname-r".

    $ sudo rpm -ivh kernel-4.8.0-0.rc2.git1.1.fc26.x86_64.rpm kernel-core-4.8.0-0.rc2.git1.1.fc26.x86_64.rpm kernel-modules-4.8.0-0.rc2.git1.1.fc26.x86_64.rpm
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:kernel-core-4.8.0-0.rc2.git1.1.fc################################# [ 33%]
       2:kernel-modules-4.8.0-0.rc2.git1.1################################# [ 67%]
       3:kernel-4.8.0-0.rc2.git1.1.fc26   ################################# [100%]

nah, the Provides: are explicitly named that way in the spec, the spec clearly doesn't expect the 'uname-r' to be interpreted as a command:

http://pkgs.fedoraproject.org/cgit/rpms/kernel.git/tree/kernel.spec#n633
http://pkgs.fedoraproject.org/cgit/rpms/kernel.git/tree/kernel.spec#n824
http://pkgs.fedoraproject.org/cgit/rpms/kernel.git/tree/kernel.spec#n847

etc. I dunno why the kernel team decided to use those names, but it's a conscious choice.

Per Paul Whalen: "adding 'nox2apic' (on Fedora-25-20160807.n.0) got the installer booting on an x220 laptop". Given that there's a relatively straightforward workaround on the kernel boot command line, I'm inclined to say -1 blocker, +1 FE here.

Discussed at 2016-08-18 go/no-go meeting, functioning as a blocker review meeting: https://meetbot-raw.fedoraproject.org/fedora-meeting/2016-08-18/f25-alpha-go_no_go-meeting.2016-08-18-17.00.html . Given our best estimate as to the range of hardware affected, and on the basis there's a simple documentable workaround, we decided to reject it as an Alpha blocker, but accept it as a freeze exception issue.

kernel-4.8.0-0.rc2.git2.1.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-0dd1a509c8

I can confirm that with 'nox2apic' I can boot (installer and installed system).

kernel-4.8.0-0.rc2.git2.1.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-0dd1a509c8

kernel-4.8.0-0.rc2.git2.1.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-0dd1a509c8

kernel-4.8.0-0.rc2.git3.1.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-0dd1a509c8

kernel-4.8.0-0.rc2.git3.1.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

kernel-4.8.0-0.rc2.git3.1.fc25 really solves problem for me.
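
For anyone trying to work out whether a given machine is in the affected set, the dmesg check and the 'nox2apic' workaround mentioned earlier in this bug can be roughly scripted. This is only a sketch: the two message strings are the ones quoted in the comments above, while the function names (x2apic_status, has_nox2apic) and the output words are made up for illustration and are not part of any Fedora or kernel tool. It assumes that a log containing "x2apic enabled" means x2apic is in use even if the firmware opt-out message is also present (i.e. the opt out was overridden).

```shell
#!/bin/sh
# Classify x2apic state from kernel log text supplied on stdin.
# Prints one of: enabled, opt-out, unknown (illustrative labels only).
x2apic_status() {
    log=$(cat)
    case "$log" in
        # Assumption: "x2apic enabled" wins, since an overridden opt out
        # would still end with x2apic in use.
        *"x2apic enabled"*)
            echo "enabled" ;;
        *"x2apic is disabled because BIOS sets x2apic opt out bit"*)
            echo "opt-out" ;;
        *)
            echo "unknown" ;;
    esac
}

# Succeed if the kernel command line passed as $1 contains the
# standalone parameter nox2apic (the workaround from this bug).
has_nox2apic() {
    case " $1 " in
        *" nox2apic "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a booted system this would be used as `dmesg | x2apic_status` and `has_nox2apic "$(cat /proc/cmdline)"`. Reading dmesg may need root depending on how the system is configured, and "unknown" only means neither message was found in the log that was supplied.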