Bug 745961

Summary: Xen 4.1.1 won't boot 3.1-rc9 Dom0, kernel panic instead on Dell Mini 10.
Product: [Fedora] Fedora
Reporter: Adam Miller <maxamillion>
Component: xen
Assignee: Xen Maintainance List <xen-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 16
CC: berrange, jforbes, ketuzsezr, kraxel, m.a.young, virt-maint, xen-maint
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-03-02 19:06:35 EST
Type: ---
Attachments:
Description                    Flags
dmesg output on bare metal     none
lspci output on bare metal     none
dsdt.dsl out from iasl         none
Screenshot of fault            none

Description Adam Miller 2011-10-13 10:18:56 EDT
Description of problem:
I get about 5 seconds worth of Xen kernel output and then it will either kernel panic with the following error or it will reboot the machine:
https://picasaweb.google.com/103451550643082159750/Fail#5662979668130020322

NOTE: It will only display that error message for roughly 5 seconds or so before auto-rebooting the system. Not entirely sure that's relevant but I thought I'd mention it.

Version-Release number of selected component (if applicable):
xen-4.1.1-6.f16.x86_64

How reproducible:
Every time

Steps to Reproduce:
1. Install Fedora 16 Beta
2. Run 'yum -y update'
3. Reboot
4. Run 'yum -y install @virtualization xen'
5. Reboot and select 'Xen 4.1.1 -> Fedora Linux, with Xen 4.1.1 and Linux 3.1.0-rc9.git.0.0.fc16.x86_64' from grub2 menu
  
Actual results:
Kernel Panic

Expected results:
Functional Xen dom0

Additional info:
Comment 1 Konrad Rzeszutek Wilk 2011-10-13 12:24:07 EDT
Adam,

We need your full dmesg output and lspci from a bare-metal boot. Also, is it possible for you to set up a serial console (to get the full output when running with Xen)? If not, there are also options on the Xen command line to use the VGA buffer to print out the debug information (this wiki page, http://wiki.xen.org/xenwiki/XenSerialConsole, has a wealth of information on how to set up the debug options).

And what kind of motherboard/CPU is this?

Ah, and did the earlier kernel (3.1.0-rc8.git) work for you? Or is this the first time you have tried this?
Comment 2 Adam Miller 2011-10-13 15:04:44 EDT
Unfortunately this machine lacks a serial port, it's just a netbook I have (only spare machine I had lying around to try out F16 Xen).

dmesg and lspci attached.

It's an Intel Atom N450 ... not sure about the motherboard though, whatever is in the Dell Mini 1012.

I tried with the 3.1.0-rc8.git and got the same crash/failure.
Comment 3 Adam Miller 2011-10-13 15:06:32 EDT
Created attachment 528082 [details]
dmesg output on bare metal
Comment 4 Adam Miller 2011-10-13 15:07:16 EDT
Created attachment 528083 [details]
lspci output on bare metal
Comment 5 Konrad Rzeszutek Wilk 2011-10-13 16:09:28 EDT
Nothing jumps out at me from the logs. Can you boot with Xen and on the Linux command line append: debug loglevel=8 vga=ask

and when it asks what VGA mode you want, pick the 80x60 (or something suitably large). When you get to the crash, take a picture and hopefully there will be more data.

For extra credit, you can add on the Xen hypervisor line: 'vga=text-80x60,keep console=vga console_to_ring sync_console loglvl=all noreboot' and on the Linux line just append 'debug loglevel=8'. That should give a nice view of what is happening.
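
If editing the entries by hand at the grub2 menu gets tedious, the same options could be made persistent. This is only a sketch: it assumes the installed grub2 Xen helper script honors GRUB_CMDLINE_XEN_DEFAULT (not verified for this Fedora 16 package), and the grub2-mkconfig output path in the comment is also an assumption.

```shell
# Fragment for /etc/default/grub (grub2-mkconfig sources this file as shell).
# ASSUMPTION: the installed grub2 Xen helper honors GRUB_CMDLINE_XEN_DEFAULT.
# Hypervisor options, copied from the suggestion above:
GRUB_CMDLINE_XEN_DEFAULT="vga=text-80x60,keep console=vga console_to_ring sync_console loglvl=all noreboot"

# Dom0 kernel options; appended to whatever was set earlier in the file.
GRUB_CMDLINE_LINUX="${GRUB_CMDLINE_LINUX:-} debug loglevel=8"

# Then regenerate the menu as root, for example (path is an assumption):
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```

Note that GRUB_CMDLINE_LINUX also applies to the bare-metal entries, so remove the extra options once the debugging is done.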
Comment 6 Adam Miller 2011-10-13 17:51:24 EDT
ok, so round 1 with the Linux kernel line using the 'debug loglevel=8 vga=ask' ... I never got asked for the vga mode and I get two different failures now and it promptly reboots (took me a few tries to get an in-focus snap of both):

https://picasaweb.google.com/103451550643082159750/Fail#5663094447036365778
https://picasaweb.google.com/103451550643082159750/Fail#5663094837565972210

When I modify both the Xen and the Linux kernel lines I get the following considerably nicer output and it does in fact not auto-reboot:

https://picasaweb.google.com/103451550643082159750/Fail#5663096077355442898

Many thanks for the quick replies and please let me know if there's any more info I can provide! :)

-AdamM
Comment 7 Konrad Rzeszutek Wilk 2011-10-15 08:37:21 EDT
That looks suspiciously like the ACPI DSDT is trying to fiddle with the APIC bits. Could you retrieve the DSDT and attach it to this bug please? The way to do that is to install 'iasl' and do:

cat /sys/firmware/acpi/tables/DSDT > /tmp/dsdt
[root@tst006 ~]# iasl -d /tmp/dsdt

Intel ACPI Component Architecture
AML Disassembler version 20100528 [Feb  9 2011]
Copyright (c) 2000 - 2010 Intel Corporation
Supports ACPI Specification Revision 4.0a

Loading Acpi table from file /tmp/dsdt
Acpi table [DSDT] successfully installed and loaded
Pass 1 parse of [DSDT]
Pass 2 parse of [DSDT]
Parsing Deferred Opcodes (Methods/Buffers/Packages/Regions)
..................................................................................................................................................................................................................................................................................................................................................
Parsing completed
Disassembly completed, written to "/tmp/dsdt.dsl"

And attach the dsdt.dsl file here please.
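
The two commands above can be wrapped in a small helper that fails early when iasl is missing or the table is not readable. A minimal sketch: dump_dsdt is a made-up name, and the default paths are simply the ones used above.

```shell
# dump_dsdt [SRC] [OUT]: copy the DSDT out of sysfs and disassemble it.
# dump_dsdt is a hypothetical helper; the cat and iasl calls are the ones
# shown above.
dump_dsdt() {
    src="${1:-/sys/firmware/acpi/tables/DSDT}"
    out="${2:-/tmp/dsdt}"

    # iasl comes from the 'iasl' package.
    if ! command -v iasl >/dev/null 2>&1; then
        echo "iasl not found; install the 'iasl' package first" >&2
        return 1
    fi

    # The sysfs table is normally readable by root only.
    if [ ! -r "$src" ]; then
        echo "cannot read $src (try running as root)" >&2
        return 1
    fi

    cat "$src" > "$out" || return 1
    iasl -d "$out"
}
```

Run 'dump_dsdt' as root with no arguments to use the default paths.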
Comment 8 Adam Miller 2011-10-17 09:57:29 EDT
Created attachment 528542 [details]
dsdt.dsl out from iasl
Comment 9 Konrad Rzeszutek Wilk 2011-10-25 22:32:10 EDT
Hmm, I was thinking it might be http://lists.xensource.com/archives/html/xen-devel/2011-03/msg01149.html and .. the original analysis that Jan did: http://www.gossamer-threads.com/lists/xen/devel/175555

But boy, reading ACPI code is not yet my forte.
Comment 10 Konrad Rzeszutek Wilk 2011-10-25 22:56:51 EDT
Is there some excerpt from the crash before this: https://picasaweb.google.com/103451550643082159750/Fail#5663096077355442898

I am wondering if you see any of this when you boot it under Xen?

bio: create slab <bio-0> at 0
ACPI: Added _OSI(Module Device)
ACPI: Added _OSI(Processor Device)
ACPI: Added _OSI(3.0 _SCP Extensions)
ACPI: Added _OSI(Processor Aggregator Device)
ACPI: EC: Look up EC in DSDT
ACPI: SSDT 000000007f5c7244 00203 (v02  PmRef  Cpu0Ist 00003000 INTL 20050624) 
ACPI: Dynamic OEM Table Load:
ACPI: SSDT           (null) 00203 (v02  PmRef  Cpu0Ist 00003000 INTL 20050624) 
ACPI: SSDT 000000007f5c6bb6 00609 (v02  PmRef  Cpu0Cst 00003001 INTL 20050624) 
ACPI: Dynamic OEM Table Load:
ACPI: SSDT           (null) 00609 (v02  PmRef  Cpu0Cst 00003001 INTL 20050624) 
ACPI: SSDT 000000007f5c7447 000D4 (v02  PmRef  Cpu1Ist 00003000 INTL 20050624) 
ACPI: Dynamic OEM Table Load:
ACPI: SSDT           (null) 000D4 (v02  PmRef  Cpu1Ist 00003000 INTL 20050624) 
ACPI: SSDT 000000007f5c71bf 00085 (v02  PmRef  Cpu1Cst 00003000 INTL 20050624) 
ACPI: Dynamic OEM Table Load:
ACPI: SSDT           (null) 00085 (v02  PmRef  Cpu1Cst 00003000 INTL 20050624) 
ACPI: Interpreter enabled
ACPI: (supports S0 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: EC: GPE = 0x19, I/O: command/status = 0x66, data = 0x62
ACPI: No dock devices found. 
HEST: Table not found.
PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
ACPI Error: [CAPB] Namespace lookup failure, AE_ALREADY_EXISTS (20110623/dsfield-143)
ACPI Error: Method parse/execution failed [\_SB_.PCI0._OSC] (Node ffff88007b743848), AE_ALREADY_EXISTS (20110623/psparse-536)
ACPI: Marking method _OSC as Serialized because of AE_ALREADY_EXISTS error
ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-3f])
Comment 11 Adam Miller 2011-10-26 22:17:05 EDT
Yes, I have very similar messages but they scroll by too quickly for me to be able to capture a photo.

I also updated to 4.1.2 that's in updates-testing and I still have the same issue.

Please let me know if there's any more info I can provide.

-AdamM
Comment 12 Konrad Rzeszutek Wilk 2011-10-27 10:37:20 EDT
(In reply to comment #11)
> Yes, I have very similar messages but they scroll by too quickly for me to be
> able to capture a photo.
> 

And there is an awesome solution for that. On the Linux line add 'boot_delay=X'.

        boot_delay=     Milliseconds to delay each printk during boot.
                        Values larger than 10 seconds (10000) are changed to
                        no delay (0). 
                        Format: integer
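
To pick a value for X, it may help to estimate how much time a given delay adds. The sketch below only models the rule quoted above (anything over 10000 becomes no delay). Both function names are made up for illustration, and the number of printk lines is something you have to guess.

```shell
# effective_boot_delay MS: the delay the kernel would use per printk,
# following the documented rule: values larger than 10000 become 0.
effective_boot_delay() {
    ms="$1"
    if [ "$ms" -gt 10000 ]; then
        echo 0
    else
        echo "$ms"
    fi
}

# approx_boot_seconds MS LINES: rough extra boot time in whole seconds
# for LINES printk lines. LINES is an estimate supplied by the caller.
approx_boot_seconds() {
    echo $(( $(effective_boot_delay "$1") * $2 / 1000 ))
}
```

For example, 'approx_boot_seconds 100 500' estimates the extra time for boot_delay=100 with roughly 500 lines of output.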
Comment 13 Konrad Rzeszutek Wilk 2011-11-02 14:49:00 EDT
Created attachment 531424 [details]
Screenshot of fault

I was able to get one of these machines and with some fiddling in the hypervisor got it to slow down the output to capture it on video.

The issue does not look to be ACPI related on the Dell Mini 1012 I have here - and it seems to be something .. well, different.
Comment 14 Konrad Rzeszutek Wilk 2011-11-11 18:22:34 EST
Well, not sure what had caused it - but I installed Linux 3.2-rc1 on the laptop, and voilà - it booted!
Comment 15 Konrad Rzeszutek Wilk 2011-11-11 18:34:02 EST
(In reply to comment #14)
> Well, not sure what had caused it - but I installed Linux 3.2-rc1 on the
> laptop, and voilà - it booted!

With Xen 4.1 on it obviously
Comment 16 Michael Young 2012-03-02 19:06:35 EST
As this works for Konrad I am assuming it is fixed.