Bug 240331 - Running 32-bit PV rawhide on 64-bit rawhide HV crashes
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel-xen
Version: rawhide
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Xen Maintainance List
QA Contact: Virtualization Bugs
Whiteboard: bzcl34nup
Depends On:
Reported: 2007-05-16 16:03 UTC by Daniel Berrangé
Modified: 2009-12-14 20:40 UTC

Fixed In Version: F8
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2008-04-04 01:04:22 UTC
Type: ---

Attachments
Kernel boot logs (7.49 KB, text/plain)
2007-05-16 16:03 UTC, Daniel Berrangé

Description Daniel Berrangé 2007-05-16 16:03:30 UTC
Description of problem:
I have a host running

# uname -r
# uname -m

I attempt to provision a 32-bit guest

# virt-install --name fc7x32 \
    --location http://reducto.boston.redhat.com/pungi/dev16.0/7/Fedora/i386/os/ \
    --ram 500 --file /var/lib/xen/images/fc7x32.img --file-size 4 \
    --nographics --paravirt

And it panics with

Oops: 0000 [#1]
last sysfs file: 
Modules linked in: xenblk xennet iscsi_tcp libiscsi scsi_transport_iscsi sr_mod
sd_mod scsi_mod cdrom ipv6 ext2 ext3 mbcache jbd squashfs pcspkr loop nfs
nfs_acl lockd sunrpc vfat fat cramfs
CPU:    0
EIP:    0061:[<e0965928>]    Not tainted VLI
EFLAGS: 00010887   (2.6.20-2925.8.fc7xen #1)
EIP is at blkif_int+0x5a/0x18c [xenblk]
eax: 18009c00   ebx: deea2000   ecx: c12c02e0   edx: ca000100
esi: 00000000   edi: dee870ac   ebp: c135ffc8   esp: c135ff9c
ds: 007b   es: 007b   ss: 0069
Process swapper (pid: 0, ti=c135f000 task=c12c02e0 task.ti=c130a000)
Stack: 00000000 c135ffc8 c10385d2 00000001 00000002 00000001 ca000100 c12fd738 
       dee8f660 00000000 00000000 c135ffe0 c1048ee9 00000106 c12fd700 00000106 
       dee8f660 c135fff8 c104a2e4 c12fd728 c130af18 00000106 c104a24a c130af34 
Call Trace:
 [<c1005d9e>] show_trace_log_lvl+0x1a/0x2f
 [<c1005e4e>] show_stack_log_lvl+0x9b/0xa3
 [<c1005fea>] show_registers+0x194/0x26a
 [<c10061f1>] die+0x131/0x246
 [<c11f9831>] do_page_fault+0xafc/0xc80
 [<c11f7b25>] error_code+0x35/0x3c
 [<c1048ee9>] handle_IRQ_event+0x1a/0x4a
 [<c104a2e4>] handle_level_irq+0x9a/0xea
 [<c10070a8>] do_IRQ+0xc6/0xee
Code: e8 66 90 8b 43 20 89 45 e0 e9 ed 00 00 00 8b 43 24 31 f6 48 23 45 e0 6b c0
6c 8d 78 40 03 7b 28 8b 17 69 c2 9c 00 00 00 89 55 ec <8b> 94 18 c8 00 00 00 8d
44 18 5c 89 45 f0 89 55 dc eb 11 8b 55 
EIP: [<e0965928>] blkif_int+0x5a/0x18c [xenblk] SS:ESP 0069:c135ff9c
 <0>Kernel panic - not syncing: Fatal exception in interrupt

After this, there are also two dodgy-looking messages in the HV logs:

# xm dmesg | tail -2
(XEN) grant_table.c:230:d0 Bad ref (2688768).
(XEN) grant_table.c:230:d0 Bad ref (2688768).

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Run the virt-install command shown above
2. Watch it boot...
Actual results:
Crashes during boot

Expected results:

Additional info:

Comment 1 Daniel Berrangé 2007-05-16 16:03:30 UTC
Created attachment 154838
Kernel boot logs

Comment 2 Daniel Berrangé 2007-05-16 16:46:38 UTC
I'm guessing this is a result of our kernel-xen still being based on the 3.0.4
patches. There was this changeset in the 3.1.0 tree, which I'm sure would be a
prerequisite for 32-on-64 to work:

changeset:   13573:c9ac0bace498
user:        kfraser@localhost.localdomain
date:        Wed Jan 24 10:38:17 2007 +0000
files:       linux-2.6-xen-sparse/drivers/xen/blkback/blkback.c
             linux-2.6-xen-sparse/drivers/xen/blkback/xenbus.c
             linux-2.
             linux-2.6-xen-sparse/drivers/xen/blktap/xenbus.c
             linux-2.6-xen-sparse/include/xen/blkif.h
             xen/include/public/io/blkif.h
bimodal blkback: Support multiple ring protocols.

This is needed for 32-on-64 support.  Right now there are three
protocols: native, x86_32 and x86_64.  If needed it can be extended.

Interface changes (io/blkif.h)
 * Define the x86_32 and x86_64 structs in addition to the native one.
 * Add helper functions to convert these requests to native.

Backend changes:
 * Look at the "protocol" name of the frontend and switch ring
   handling accordingly.  If the protocol node isn't present it
   assumes native protocol.
 * As the request struct is copied anyway before being processed (for
   security reasons) it is converted to native at that point so most
   backend code doesn't need to know what the frontend speaks.
 * In case of blktap this is completely transparent to userspace, the
   kernel/userspace ring is always native no matter what the frontend
   speaks.

Signed-off-by: Gerd Hoffmann <kraxel@suse.de>
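
The layout problem this changeset addresses can be sketched as follows. This
is a minimal Python model, not code from the Xen tree: the field layouts are
written from memory of the public blkif interface (an 11-segment request and
a 64-byte shared-ring header are assumptions), and the names REQ_X86_32,
STRIDES, backend_stride, write_responses and read_ids are hypothetical. It
assumes the only difference between the two ABIs is that a 64-bit field is
aligned to 4 bytes on x86_32 and to 8 bytes on x86_64, which makes a ring
slot 108 bytes on one side and 112 bytes on the other.

```python
import struct

HEADER = 64      # shared-ring header size (assumed: 4 x u32 indices + 48 pad bytes)
SEGMENTS = 11    # segments per request (assumed from the public blkif header)

# "<" turns off implicit padding, so each ABI's alignment is spelled out by hand.
SEG = "IBBxx"                                   # gref u32, first_sect u8, last_sect u8, pad
REQ_X86_32 = "<BBH" + "QQ" + SEG * SEGMENTS     # u64 aligned to 4: no gap before id
REQ_X86_64 = "<BBH4x" + "QQ" + SEG * SEGMENTS   # u64 aligned to 8: 4-byte gap before id
RSP_X86_32 = "<QBxh"                            # id u64, operation u8, pad, status s16
RSP_X86_64 = "<QBxh4x"                          # same fields plus tail padding to 16

RESPONSES = {"x86_32": RSP_X86_32, "x86_64": RSP_X86_64}
# A ring slot is a union of request and response, so its size is the larger one.
STRIDES = {
    "x86_32": max(struct.calcsize(REQ_X86_32), struct.calcsize(RSP_X86_32)),
    "x86_64": max(struct.calcsize(REQ_X86_64), struct.calcsize(RSP_X86_64)),
}


def backend_stride(protocol, native="x86_64"):
    """Pick the slot size from the frontend's protocol, or native if absent."""
    return STRIDES[protocol or native]


def write_responses(ids, abi):
    """Lay out one response per slot the way a backend built for `abi` would."""
    stride = STRIDES[abi]
    ring = bytearray(HEADER + stride * len(ids))
    for slot, req_id in enumerate(ids):
        struct.pack_into(RESPONSES[abi], ring, HEADER + slot * stride, req_id, 0, 0)
    return ring


def read_ids(ring, count, abi):
    """Read back the id of each slot the way a frontend built for `abi` would."""
    stride = STRIDES[abi]
    return [struct.unpack_from("<Q", ring, HEADER + slot * stride)[0]
            for slot in range(count)]


if __name__ == "__main__":
    ring = write_responses([7, 8, 9], "x86_64")   # 64-bit backend, no conversion
    print(STRIDES)                                # slot sizes per ABI
    print(read_ids(ring, 3, "x86_64"))            # matching frontend: ids intact
    print(read_ids(ring, 3, "x86_32"))            # 32-bit frontend: wrong stride
```

In this model only slot 0 lines up when the strides differ; every later slot
yields a bogus id. That would be consistent with a 32-bit frontend indexing
its own bookkeeping with garbage inside blkif_int and faulting, and the
multiply by 0x6c (108) in the Code: bytes of the oops above appears to be the
32-bit slot size. The backend_stride helper mirrors the "protocol" lookup
described in the backend changes, falling back to native when the node is
missing.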

There are a couple of other changesets too, so I think we should wait until
kernel-xen is updated to 3.0.5 before investigating this further.

Comment 3 Red Hat Bugzilla 2007-07-25 01:41:10 UTC
change QA contact

Comment 4 Bug Zapper 2008-04-04 00:44:07 UTC
Based on the date this bug was created, it appears to have been reported
against rawhide during the development of a Fedora release that is no
longer maintained. In order to refocus our efforts as a project we are
flagging all of the open bugs for releases which are no longer
maintained. If this bug remains in NEEDINFO thirty (30) days from now,
we will automatically close it.

If you can reproduce this bug in a maintained Fedora version (7, 8, or
rawhide), please change this bug to the respective version and change
the status to ASSIGNED. (If you're unable to change the bug's version
or status, add a comment to the bug and someone will change it for you.)

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we're following is outlined here:

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

Comment 5 Daniel Berrangé 2008-04-04 01:04:22 UTC
This is addressed in F8.
