Bug 804347
| Summary: | Crash early in xen dom0 boot with 3.2.10-3 kernel | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Michael Young <m.a.young> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 16 | CC: | bill-bugzilla.redhat.com, gansalmon, greno, itamar, joerg, jonathan, kernel-maint, ketuzsezr, madhu.chinakonda, myroslav, rtc, vcputtini, vgulch |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | kernel-3.3.0-8.fc17 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 806245 | Environment: | |
| Last Closed: | 2012-04-01 00:27:17 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Michael Young
2012-03-18 00:26:20 UTC
I did a bit of hacking to get the kernel warning without the crash. The context for the warning is

[ 0.000000] ACPI: PM-Timer IO Port: 0x1008
[ 0.000000] ACPI: Local APIC address 0xfee00000
[ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[ 0.000000] BIOS bug: APIC version is 0 for CPU 0/0x0, fixing up to 0x10
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
[ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
[ 0.000000] I/O APIC 0xfec00000 regs return all ones, skipping!
[ 0.000000] IOAPIC[0]: apic_id 2, version 255, address 0xfec00000, GSI 0-255
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.000000] ACPI: IRQ0 used by override.
[ 0.000000] ACPI: IRQ2 used by override.
[ 0.000000] ACPI: IRQ9 used by override.
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[ 0.000000] SMP: Allowing 2 CPUs, 0 hotplug CPUs
[ 0.000000] nr_irqs_gsi: 272

I suspect the crash occurs a bit later when the pirqs are mapped

[ 0.000000] NR_IRQS:16640 nr_irqs:512 16
[ 0.000000] xen: sci override: global_irq=9 trigger=0 polarity=0
[ 0.000000] xen: registering gsi 9 triggering 0 polarity 0
[ 0.000000] xen: --> pirq=9 -> irq=9 (gsi=9)
[ 0.000000] xen: acpi sci 9
[ 0.000000] xen: --> pirq=1 -> irq=1 (gsi=1)
[ 0.000000] xen: --> pirq=2 -> irq=2 (gsi=2)
[ 0.000000] xen: --> pirq=3 -> irq=3 (gsi=3)
[ 0.000000] xen: --> pirq=4 -> irq=4 (gsi=4)
[ 0.000000] xen: --> pirq=5 -> irq=5 (gsi=5)
[ 0.000000] xen: --> pirq=6 -> irq=6 (gsi=6)
[ 0.000000] xen: --> pirq=7 -> irq=7 (gsi=7)
[ 0.000000] xen: --> pirq=8 -> irq=8 (gsi=8)
[ 0.000000] xen_map_pirq_gsi: returning irq 9 for gsi 9
[ 0.000000] xen: --> pirq=9 -> irq=9 (gsi=9)
[ 0.000000] xen: --> pirq=10 -> irq=10 (gsi=10)
[ 0.000000] xen: --> pirq=11 -> irq=11 (gsi=11)
[ 0.000000] xen: --> pirq=12 -> irq=12 (gsi=12)
[ 0.000000] xen: --> pirq=13 -> irq=13 (gsi=13)
[ 0.000000] xen: --> pirq=14 -> irq=14 (gsi=14)
[ 0.000000] xen: --> pirq=15 -> irq=15 (gsi=15)

I tried a scratch build without the x86-ioapic-add-register-checks-for-bogus-io-apic-entries.patch patch and that boots successfully as dom0.

Adding Konrad to CC. This patch is already queued for upstream, so I've emailed Suresh as well. Thanks.

Responded on the email thread. Will instrument it a bit to see if my theory holds true.

Hi,

Unfortunately the problem continues to happen. Looking at the changelog of the new kernel, I didn't see any reference to the patch cited by Michael Young.

kernel-3.3.0-4.fc16.x86_64
xen-4.1.2-6.fc16.x86_64

Is there any estimate of when this problem will be solved?

Thanks a lot

Waiting on Ingo to Ack these patches: https://lkml.org/lkml/2012/3/21/632
There is also the quick-n-dirty hack: https://lkml.org/lkml/2012/3/20/349

[mass update] kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository. Please retest with this update.

It crashes the same way with kernel-3.3.0-4.fc16 (as expected).

[mass update] kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository. Please retest with this update.

I have done another scratch build - http://koji.fedoraproject.org/koji/taskinfo?taskID=3923761 - with the patches from https://lkml.org/lkml/2012/3/21/632 - this boots successfully as a dom0.

*** Bug 806245 has been marked as a duplicate of this bug. ***

Created attachment 572774 [details]
x86: add io_apic_ops to allow interception
Created attachment 572775 [details]
x86/apic_ops: Replace apic_ops with x86_apic_ops
Created attachment 572776 [details]
xen/x86: Implement x86_apic_ops
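
For readers following along, here is a toy model of what the three patches above appear to address, based only on the patch titles and the "regs return all ones, skipping!" line in the description. It is a Python sketch, not kernel code: `IoApicOps`, `native_read`, `intercepted_read` and the register values are all made up for illustration. The assumption is that a direct register read in dom0 comes back as all ones (which would also explain "version 255" and "GSI 0-255" in the log), so the bogus-entry check skips the IO-APIC, and that routing reads through an ops table lets Xen supply sane values instead.

```python
from dataclasses import dataclass
from typing import Callable

ALL_ONES = 0xFFFFFFFF  # assumed result of a 32-bit read from an unmapped region


def native_read(apic: int, reg: int) -> int:
    """Stand-in for a direct register read. In this toy model the region
    is not reachable, so every read returns all ones."""
    return ALL_ONES


def intercepted_read(apic: int, reg: int) -> int:
    """Hypothetical interception hook. The constants are illustrative
    only and are not taken from the actual patches."""
    if reg == 0x0:
        return apic << 24      # ID register: APIC id in the top byte
    if reg == 0x1:
        return 0x00170020      # version register: made-up plausible value
    return 0


@dataclass
class IoApicOps:
    """Toy ops table: every caller reads registers through `read`."""
    read: Callable[[int, int], int] = native_read


def bad_ioapic_register(ops: IoApicOps, apic: int) -> bool:
    """Model of the sanity check: if registers 0, 1 and 2 all read back
    as all ones, the entry is treated as bogus and skipped."""
    return all(ops.read(apic, reg) == ALL_ONES for reg in (0, 1, 2))


native = IoApicOps()
xen_like = IoApicOps(read=intercepted_read)
print(bad_ioapic_register(native, 2))    # True: the entry would be skipped
print(bad_ioapic_register(xen_like, 2))  # False: the entry is kept
```

The point of the ops table in this sketch is that the check itself does not change; only the function behind `read` does.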
I've committed these patches to F16 now. Should be in the next update.

*** Bug 807401 has been marked as a duplicate of this bug. ***

kernel-3.3.0-8.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/kernel-3.3.0-8.fc16

kernel-3.3.0-8.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/kernel-3.3.0-8.fc17

Package kernel-3.3.0-8.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.3.0-8.fc17'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-4888/kernel-3.3.0-8.fc17
then log in and leave karma (feedback).

Did kernel-PAE also get submitted?

On F16 I just did a 'yum --enablerepo=updates-testing list kernel-PAE' but do not see any newer kernel-PAE.

(In reply to comment #22)
> Did kernel-PAE also get submitted?
>
> On F16 I just did a 'yum --enablerepo=updates-testing list kernel-PAE' but do
> not see any newer kernel-PAE.

They are all built together, so yes. Maybe your mirror is a bit stale. It can be found here:

http://dl.fedoraproject.org/pub/fedora/linux/updates/testing/17/i386/kernel-PAE-3.3.0-8.fc17.i686.rpm

I don't believe the F16 version has been pushed yet.

kernel-3.3.0-8.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.

Created attachment 574485 [details]
Xeon - xm dmesg
Created attachment 574486 [details]
E5700 - xm dmesg
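
When comparing captures like these against the kernel log in the description, a small scanner can save some eyeballing. The following is a minimal sketch that assumes the message formats are exactly the ones quoted in the description ("regs return all ones" and the `xen: --> pirq=... -> irq=... (gsi=...)` lines); `scan_dmesg` and the `sample` text are hypothetical names for illustration, and other kernels may word these messages differently.

```python
import re

# Patterns copied from the log lines quoted in the bug description.
PIRQ_RE = re.compile(r"xen: --> pirq=(\d+) -> irq=(\d+) \(gsi=(\d+)\)")
ALL_ONES_RE = re.compile(r"I/O APIC (0x[0-9a-fA-F]+) regs return all ones")


def scan_dmesg(text: str) -> dict:
    """Return the IO-APIC addresses flagged as all ones and the
    (pirq, irq, gsi) mappings found in a dmesg capture."""
    skipped = ALL_ONES_RE.findall(text)
    mappings = [
        tuple(int(group) for group in match.groups())
        for match in PIRQ_RE.finditer(text)
    ]
    return {"skipped_ioapics": skipped, "pirq_mappings": mappings}


# Three lines taken from the description, used here as sample input.
sample = (
    "[ 0.000000] I/O APIC 0xfec00000 regs return all ones, skipping!\n"
    "[ 0.000000] xen: --> pirq=9 -> irq=9 (gsi=9)\n"
    "[ 0.000000] xen: --> pirq=1 -> irq=1 (gsi=1)\n"
)
report = scan_dmesg(sample)
print(report["skipped_ioapics"])
print(report["pirq_mappings"])
```

An empty `skipped_ioapics` list on a retested kernel would suggest the warning no longer fires, though it says nothing about whether the boot completed.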
Hi, thanks a lot. Great job! I am attaching my dmesg and xm dmesg output for your review, from an Intel Xeon and an E5700.

kernel-3.3.0-8.fc17 has been pushed to the Fedora 17 stable repository. If problems still persist, please make note of it in this bug report.

This is still a problem with current Fedora 15 (still in maintenance). It looks like Suresh's patch didn't get reverted until 3.2.15, which we don't have. I can confirm that 2.6.43.2-6.fc15.x86_64 in testing boots OK, though I don't know if the decision to jump to 3.3 on f15 has been made yet.

I have the impression that the F15 3.3 kernels are ready but they don't get enough testing before they are obsoleted by the next update. If you want to give the 2.6.43.2-6.fc15 kernel positive karma you can do so at https://admin.fedoraproject.org/updates/FEDORA-2012-6406/kernel-2.6.43.2-6.fc15

(In reply to comment #30)
> I have the impression that the F15 3.3 kernels are ready but they don't get
> enough testing before they are obsoleted by the next update. If you want to
> give the 2.6.43.2-6.fc15 kernel positive karma you can do so at
> https://admin.fedoraproject.org/updates/FEDORA-2012-6406/kernel-2.6.43.2-6.fc15

Yes, that's exactly what is happening. It's rather frustrating.

Thanks for the explanation guys.

>billmcgonigle - 2012-04-25 04:20:47
>stable for 24 hours on a Xen server. Fixes Xen crash on boot regression
>currently shipping in F15!
>bodhi - 2012-04-25 04:20:48
>Critical path update approved
>bodhi - 2012-04-25 04:20:50
>This update has reached the stable karma threshold and will be pushed to the
>stable updates repository
oh, so this one's gonna be my fault. ;)