Bug 1905962
| Summary: | RHEL8 kernel has acpi_lapic breakage with 4.18.0-257 | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Jason A. Donenfeld <Jason> |
| Component: | kernel | Assignee: | Al Stone <ahs3> |
| kernel sub component: | ACPI | QA Contact: | Jiri Dluhos <jdluhos> |
| Status: | CLOSED WONTFIX | Docs Contact: | |
| Severity: | unspecified | | |
| Priority: | unspecified | CC: | ahs3, bstinson, carl, darcari, jwboyer, mail, ngompa13, rvr |
| Version: | CentOS Stream | Keywords: | Triaged |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | 8.5 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-06-05 00:07:09 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

If you want me to write a patch, let me know and I'll provide that. But I've noticed half the time I file these bugs, the reply is, "oh we already fixed that for the next version," so I figure this time I'll wait before proceeding.

I understand the frustration but please believe, a bugreport with a patch is extremely useful, and very appreciated. At the very least, such a patch-from-outside is an additional proof that the appropriate fix is correct - or, sometimes, a warning that it might be incorrect, or that there is another aspect to the problem that the developer might have overlooked. Al, do we have a fix already? :)

It's not frustration; I'm just being practical. Just let me know if you need a patch (i.e. that this has not already been fixed) and I'll provide it.

(In reply to Jason A. Donenfeld from comment #3)
> It's not frustration; I'm just being practical. Just let me know if you need
> a patch (i.e. that this has not already been fixed) and I'll provide it.

Howdy, Jason. I'm still in the middle of a backport for ACPI up to about Linux 5.9. I suspect a fix would be included, but so I can be positive, would you mind attaching a config that causes the problem so I can see how this ended up not being defined? I can use that as a test case to make sure we don't break it again.

If you already have a patch, awesome. I'd love to see it. If you don't, no worries, I think I'd rather have the config file as root cause anyway; there should be very few circumstances where LAPIC code does not compile (well, maybe none) so this has piqued my curiosity. Thanks.

Created attachment 1738997 [details]
Broken config
Here's a broken config.

(In reply to Jason A. Donenfeld from comment #5)
> Created attachment 1738997 [details]
> Broken config
>
> Here's a broken config.

Excellent. Thanks!

Created attachment 1739001 [details]
And here's the patch to fix all the issues.
Let me know if you need anything else. This should handle all the breakage I'm seeing.

I was waiting for this bug to be closed before releasing, but I couldn't wait longer. https://lists.zx2c4.com/pipermail/wireguard/2020-December/006210.html mentions this bug. Could you let me know whether the patches in https://bugzilla.redhat.com/attachment.cgi?id=1739001&action=diff have been applied?

Looks like almost all of this is covered by one missing upstream patch:

13c01139b17163c9b2aa543a9c39f8bbc875b625 x86/headers: Remove APIC headers from <asm/smp.h>

Add in the #ifdef for kvm.c, and all is well. Assuming testing goes well, this should show up in RHEL8.4. Thanks for finding this; sorry it caused a detour.

*** Bug 1912433 has been marked as a duplicate of this bug. ***

Any new Stream kernels going to be released with this? I can't test wireguard in the CI until this is fixed.

Feel free to participate in the discussion if you'd like to be more reassuring: https://lore.kernel.org/wireguard/CAHmME9pzOrvHX2bVcHVHh44_j1P_bYaz+o-wCnjnEgxoMeZCyA@mail.gmail.com/

But in all likelihood with no visible progress for a month I'll have to drop support for the platform if things don't turn around pretty quickly.

(In reply to Jason A. Donenfeld from comment #12)
> Any new Stream kernels going to be released with this? I can't test
> wireguard in the CI until this is fixed.
>
> Feel free to participate in the discussion if you'd like to be more
> reassuring:
> https://lore.kernel.org/wireguard/CAHmME9pzOrvHX2bVcHVHh44_j1P_bYaz+o-wCnjnEgxoMeZCyA@mail.gmail.com/
>
> But in all likelihood with no visible progress for a month I'll have to drop
> support for the platform if things don't turn around pretty quickly.

This is turning out to be non-trivial. The straightforward fix damages kABI compatibility for other existing Customers and Partners. Still working out an approach that safely maintains kABI but still provides for the config wireguard needs.

How does this have anything at all to do with kABI? The issue is that you forgot some includes and an ifdef.
The fix is here: https://bugzilla.redhat.com/attachment.cgi?id=1739001&action=diff How does that have anything to do with kABI? I'm not buying that as a rationale here.

Yes, I have the fix. And yes, it fixes the one specific situation. However, it requires at least one additional patch to compile a RHEL kernel properly with the RHEL config; both alter the header files being included (or not included). When those patches are added, they change the checksums that are calculated for multiple kABI variables since those checksums will take into account all of the header files used in a declaration or definition. No functions, structs or variables changed; their checksums changed dramatically though -- so the kABI checks fail when building a kernel. This is readily fixable, just not straightforward.

That's very strange. I hope you can fix it soon. Barring that, I'll probably have to drop support for RHEL8. With no vendor support, if my CI is broken without intention to fix it, it's an uphill battle for me.

Patch posted to internal review list for 8.4.

Thanks! I really appreciate it. Hopefully this will be in Stream before 8.4 so I can test it out.

Patch(es) available on kernel-4.18.0-287.el8.dt4

Is that on stream? How can I test that kernel?

(In reply to Jason A. Donenfeld from comment #24)
> Is that on stream? How can I test that kernel?

A kernel will be made available in Stream once it passes internal testing.

Hey guys it's been over three months of breakage. I thought stream was supposed to track your work a bit more closely than a >3 month lag. What's going on? CI being broken is worrisome to me, as delivering high quality software is important.

Paging again... How can I actually try out this kernel? Being able to do so before the next version drops would be quite important, lest my CI remain broken. Do you want it to be impossible to support software on your kernel? Letting this linger like this all but ensures that.

And let me be abundantly clear here: if you continue to make my job impossible to do, I *will* give up on RHEL and drop support for it. You might figure, "sure, it doesn't make a difference to us," and that's an understandable and fine position to take, and we can go our separate ways. But if you do have any desire for me to continue supporting your users, you'll need to make that possible to do. This is just the boring vanilla reality of the situation.

Jason, it's already been explained to you once a kernel with this fix passes internal testing it will be pushed to git.centos.org and will be built for CS8. I've also explained to you that the Stream development model is a work in progress and isn't fully implemented in the 8 cycle. There are many business and development processes that need to be modified, and it won't happen overnight. Your expectations are not realistic at this point of the process.

Allow me to reset those expectations. Before Stream, contributing to RHEL could often be a multi-year process. That's where we're starting from. We've already had many contributions in Stream 8 that took 2-6 months each, which is a massive improvement. We hope to get even better at this with Stream 9. Even then, you need to be patient. Keep these things in mind:

- Your bug is not the only bug in front of the kernel developers.
- Other bugs have a higher priority, often because they are associated with customer tickets or other business-related priorities.
- Red Hat does not support custom kernel configurations, nor does it ordinarily test them. While upstream does as part of making patches, Red Hat only tests with their kernel configurations for its kernel source tree.

No one is trying to make your job impossible to do, they are just busy doing their own jobs. It's your choice if you decide to "give up on RHEL" because you're not happy with the process at this point. If you do I hope you take another look in 9.

The repository for the 9 kernel is already visible if you'd like to take a peek. https://gitlab.com/redhat/centos-stream/rpms/kernel

After thinking about this for far longer than I should have, I am going to close this as WONTFIX. My apologies for the length of time this took. It was a difficult decision.

Basically, this is a request to enable a kernel configuration we do not test and do not support. Getting it to work is not pretty, and has the potential to be disruptive to kABI (the upstream patches cannot be used without modifications and workarounds to maintain kABI) because of the header files that are involved. That's a bit too risky for our installed base, in my estimation.
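
Editorial note: the kABI concern raised in the comments above (checksums changing even though no function, struct or variable changed) can be illustrated with a toy model. The sketch below is not how the kernel's symbol-versioning tooling works internally; it only assumes, as the comments describe, that a symbol's checksum is computed over its fully expanded declaration, so a header change that makes a struct visible or opaque changes the checksum. The hash (FNV-1a), the function names, the `register_thing` prototype and both signature strings are hypothetical and chosen for illustration.

```c
#include <stdint.h>

/* Toy FNV-1a hash: a stand-in for the real symbol-version checksum,
 * which uses a different algorithm. */
uint32_t toy_symbol_checksum(const char *expanded_signature)
{
    uint32_t hash = 2166136261u;              /* FNV offset basis */
    const unsigned char *p = (const unsigned char *)expanded_signature;

    for (; *p != '\0'; ++p) {
        hash ^= *p;
        hash *= 16777619u;                    /* FNV prime */
    }
    return hash;
}

/* Hypothetical exported function: int register_thing(struct thing *);
 * The prototype is identical in both builds. Only the amount of type
 * information visible at the point of export differs. */

/* Build A: a header defining struct thing happens to be included. */
const char *const sig_struct_visible =
    "int register_thing(struct thing { int id; long flags; } *)";

/* Build B: that include was removed, so struct thing is only
 * forward-declared and its body is unknown at this point. */
const char *const sig_struct_opaque =
    "int register_thing(struct thing { UNKNOWN } *)";

/* Returns 1 when the rebuilt symbol still matches the reference. */
int kabi_check_passes(const char *reference_sig, const char *rebuilt_sig)
{
    return toy_symbol_checksum(reference_sig) ==
           toy_symbol_checksum(rebuilt_sig);
}
```

In this model, `kabi_check_passes(sig_struct_visible, sig_struct_opaque)` fails even though the source of `register_thing` is untouched, which mirrors the situation described in the thread: changing which headers are included is enough to invalidate recorded checksums.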

It looks like something wasn't backported correctly, and certain configurations are now erroring out with:

arch/x86/kernel/apic/apic.c: In function ‘__apic_intr_mode_select’:
arch/x86/kernel/apic/apic.c:1308:8: error: ‘acpi_lapic’ undeclared (first use in this function)
 1308 |   if (!acpi_lapic) {
      |        ^~~~~~~~~~
arch/x86/kernel/apic/apic.c:1308:8: note: each undeclared identifier is reported only once for each function it appears in
arch/x86/kernel/apic/apic.c: In function ‘init_apic_mappings’:
arch/x86/kernel/apic/apic.c:2020:8: error: ‘acpi_lapic’ undeclared (first use in this function)
 2020 |   if (!acpi_lapic && !smp_found_config)
      |        ^~~~~~~~~~

This affects my CI infrastructure.
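
Editorial note: the error above is the usual symptom of a source file using a symbol whose declaration never reaches it through its includes. A minimal sketch of the pattern follows, assuming the symbol comes from a header that offers either a real variable or a compile-time fallback depending on configuration. `SKETCH_CONFIG_ACPI` and `no_lapic_and_no_mp_config` are hypothetical names for this illustration; this is not the RHEL source or the attached patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a Kconfig option. Left undefined here to
 * model a build with ACPI support configured out. */
/* #define SKETCH_CONFIG_ACPI 1 */

/* --- What the providing header is assumed to contain --- */
#ifdef SKETCH_CONFIG_ACPI
extern int acpi_lapic;      /* real variable, defined elsewhere */
#else
#define acpi_lapic 0        /* constant fallback when ACPI is off */
#endif

/* --- Consumer, shaped like the condition quoted in the error --- */
bool no_lapic_and_no_mp_config(bool smp_found_config)
{
    /* This compiles in either configuration, but only while the
     * header block above is visible in this translation unit.
     * Remove it and the compiler reports acpi_lapic as undeclared. */
    return !acpi_lapic && !smp_found_config;
}
```

If the consumer previously received the declaration only indirectly, through some other header that stopped including it, the fix is an explicit include in the consumer (or a guard around the use), which is consistent with the "some includes and an ifdef" described in the comments.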