Bug 438000 - F-9 pv_ops xen: x86_64 guest install aborts during formatting
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel-xen
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Eduardo Habkost
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: PvOpsTracker
Reported: 2008-03-18 16:33 UTC by Orion Poplawski
Modified: 2009-12-14 20:41 UTC

Fixed In Version: kernel-xen-2.6.25-0.15.rc8.fc9
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-04-04 08:25:32 UTC
Type: ---
Embargoed:


Attachments
guest log (13.36 KB, text/plain), 2008-03-18 17:40 UTC, Orion Poplawski
xend.log (544.69 KB, text/plain), 2008-03-18 17:41 UTC, Orion Poplawski
virt-install.log (3.64 KB, text/plain), 2008-03-18 17:42 UTC, Orion Poplawski

Description Orion Poplawski 2008-03-18 16:33:55 UTC
Description of problem:

While installing a rawhide x86_64 guest via kickstart on an F8 x86_64 host:

Welcome to Fedora for x86_64

          ┌─────────────────────┤ Formatting ├──────────────────────┐
          │                                                         │
          │ Formatting / file system...                             │
          │                                                         │
          │                           70%                           │
Guest installation complete... restarting guest.
libvir: Xen Daemon error : POST operation failed: (xend.err "Boot loader didn't
return any data!")
virDomainCreate() failed POST operation failed: (xend.err "Boot loader didn't
return any data!")
Domain installation may not have been
 successful.  If it was, you can restart your domain
 by running 'virsh start xenfdev64'; otherwise, please
 restart your installation.
Tue, 18 Mar 2008 10:27:50 ERROR    virDomainCreate() failed POST operation
failed: (xend.err "Boot loader didn't return any data!")
Traceback (most recent call last):
  File "/usr/sbin/virt-install", line 517, in <module>
    main()
  File "/usr/sbin/virt-install", line 503, in main
    dom.create()
  File "/usr/lib64/python2.5/site-packages/libvirt.py", line 240, in create
    if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
libvirtError: virDomainCreate() failed POST operation failed: (xend.err "Boot
loader didn't return any data!")

Version-Release number of selected component (if applicable):
11.4.0.54

How reproducible:
Every time

Comment 1 Jeremy Katz 2008-03-18 17:18:08 UTC
That looks like the kernel causing a reboot...

Comment 2 Daniel Berrangé 2008-03-18 17:21:59 UTC
On the Dom0 host, please edit /etc/sysconfig/xend and turn on guest console logging

  XENCONSOLED_LOG_GUESTS=yes

Then run 'service xend restart'.

Then try the guest installation again and attach the log file from
/var/log/xen/console/guest-XXXXXX.log, where XXXXXX is the name of your guest.

Also provide /var/log/xen/xend.log

And provide the /root/.virtinst/virt-install.log file and details of the
parameters supplied to virt-install.


Comment 3 Orion Poplawski 2008-03-18 17:40:50 UTC
Created attachment 298428
guest log

Installing with the following:

# virt-install -n xenfdev64 -r 512 --vcpus=2 \
    -f /export/data1/xen_disk_xenfdev64 -m 40:00:00:00:00:03 \
    --nographics --accelerate -p \
    -x ks=nfs:saga:/export/data1/ks/rawhide-nox64.cfg --arch=x86_64 \
    -l http://fedora.cora.nwra.com/fedora/linux/development/x86_64/os

Comment 4 Orion Poplawski 2008-03-18 17:41:30 UTC
Created attachment 298429
xend.log

Comment 5 Orion Poplawski 2008-03-18 17:42:09 UTC
Created attachment 298430
virt-install.log

Comment 6 James Laska 2008-03-20 17:56:24 UTC
Also seeing this on F9 rawhide (kernel-xen-2.6.25-0.2.rc4.fc9.x86_64).

Log from a text-mode PV install, captured during disk format:
http://jlaska.fedorapeople.org/f9-pv-install-panic.txt

Comment 7 Mark McLoughlin 2008-03-20 18:47:53 UTC
jlaska's oops:

BUG: unable to handle kernel paging request at 0000000000001000
IP: [<ffffffff80491490>] _cond_resched+0x0/0x38
PGD 546b067 PUD 5447067 PMD 0
Oops: 0002 [1] DEBUG_PAGEALLOC
CPU 0
Modules linked in: sha256_generic aes_generic cbc dm_crypt crypto_blkcipher
dm_emc dm_round_robin dm_multipath dm_snapshot dm_mirror dm_zero dm_mod xfs jfs
reiserfs lock_nolock gfs2 msdos linear raid10 raid456 async_xor async_memcpy
async_tx xor raid1 raid0 xen_netfront xen_blkfront iscsi_tcp libiscsi
scsi_transport_iscsi scsi_mod ext2 ext3 jbd ext4dev jbd2 mbcache crc16 squashfs
pcspkr edd loop nfs lockd nfs_acl sunrpc vfat fat cramfs
Pid: 1108, comm: mke2fs Not tainted 2.6.25-0.2.rc4.fc9xen #1
RIP: e030:[<ffffffff80491490>]  [<ffffffff80491490>] _cond_resched+0x0/0x38
RSP: e02b:ffff88000c5a9be0  EFLAGS: 00010206
RAX: 0000000000001000 RBX: 0000000000001000 RCX: 000000000000003c
RDX: 0000000000000000 RSI: ffff880003e3cfb8 RDI: ffff88000c5a9ce8
RBP: ffff88000c5a9ce8 R08: 0000000000001000 R09: ffff880003e3cfb8
R10: 0000000000000000 R11: ffff88001f823b90 R12: 0000000000001000
R13: 0000000000001000 R14: 00000000701b8000 R15: ffff88000c5a9c88
FS:  00007fb0e7a577e0(0000) GS:ffffffff80612000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000001000 CR3: 000000000c504000 CR4: 0000000000002620
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Process mke2fs (pid: 1108, threadinfo ffff88000c5a8000, task ffff880005404000)
Stack:  ffffffff8027997a ffff88000c5a9ce8 ffffffff802bbc4e 0000000000000001
 ffff88001f823b78 0000000000000000 ffff88000c5a9e68 00000000701b2000
 ffff88000c5a9de8 ffff88000d18f000 ffff88001f823b78 ffffffff804b3f00
Call Trace:
 [<ffffffff8027997a>] ? generic_file_buffered_write+0x1cb/0x6c6
 [<ffffffff802bbc4e>] ? xattr_getsecurity+0x30/0x7e
 [<ffffffff802376a3>] ? current_fs_time+0x22/0x29
 [<ffffffff80300fa5>] ? security_inode_need_killpriv+0x11/0x13
 [<ffffffff8027a3df>] __generic_file_aio_write_nolock+0x35c/0x390
 [<ffffffff802b3750>] ? file_update_time+0xb1/0xea
 [<ffffffff802a7564>] ? pipe_write+0x53e/0x550
 [<ffffffff8027a513>] generic_file_aio_write_nolock+0x3b/0x8d
 [<ffffffff802a096e>] do_sync_write+0xe7/0x12d
 [<ffffffff802c6568>] ? block_llseek+0x35/0x8b
 [<ffffffff80246193>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff80308c30>] ? selinux_file_permission+0x10f/0x118
 [<ffffffff80301028>] ? security_file_permission+0x11/0x13
 [<ffffffff802a12a0>] vfs_write+0xae/0x157
 [<ffffffff802a140d>] sys_write+0x47/0x6f
 [<ffffffff80211179>] tracesys+0xbe/0xc3


Comment 8 Mark McLoughlin 2008-03-25 16:59:00 UTC
Confirmed with 2.6.25-0.4.rc4.fc9xen on x86_64; oops as soon as anaconda starts
to format the disk.

Seems not to happen on i386; re-assigning to Eduardo

Comment 9 Eduardo Habkost 2008-03-31 17:35:20 UTC
From the Oops dump:

Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

I am investigating what could be zeroing the memory where the code is stored. Bug
#438392 seems to be caused by the same problem: the kmem_cache pointer is zero in
kmem_cache_alloc(), which seems to be caused by radix_tree_node_cachep being
zeroed.
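
For anyone reading the oops in comment 7 alongside the Code: bytes, here is a rough sketch of how the numbers line up. It decodes the x86 page-fault error code printed after "Oops:" and compares RAX with CR2. The helper name decode_oops_error_code is made up for illustration; the register values are copied from the dump. A pair of zero bytes (00 00) decodes as add %al,(%rax), so executing zeroed text with RAX = 0x1000 would be expected to fault as a write to 0x1000, which is consistent with what the oops reports.

```python
# Bits of the x86 page-fault error code, as printed (in hex) after "Oops:".
PF_PROT = 0x1   # 0: page not present, 1: protection violation
PF_WRITE = 0x2  # 0: read access, 1: write access
PF_USER = 0x4   # 0: fault in kernel mode, 1: fault in user mode


def decode_oops_error_code(code):
    """Describe an x86 page-fault error code (hypothetical helper)."""
    return {
        "cause": "protection violation" if code & PF_PROT else "page not present",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "kernel",
    }


# Values copied from the oops in comment 7.
error_code = int("0002", 16)
rax = 0x1000
cr2 = 0x1000

info = decode_oops_error_code(error_code)

# Zeroed text reads as "00 00", i.e. add %al,(%rax): a write to the address
# held in RAX. If that is what ran, the faulting address should equal RAX.
zeroed_insn_target = rax
matches_fault_address = zeroed_insn_target == cr2

print(info)
print(hex(zeroed_insn_target), matches_fault_address)
```

This is only a reading aid under those assumptions; it does not say what zeroed the memory.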

Comment 10 Eduardo Habkost 2008-04-03 22:29:01 UTC
I have just found the cause: kernel text and data segments were not being 
reserved at boot. A fix should go to Rawhide soon.
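
To illustrate why a missing reservation can show up as zeroed code, here is a toy model, not the actual kernel patch. BootAllocator, its methods, the 0xC0DE marker and the page-frame layout are all invented for this sketch. The assumption is simply that an early allocator hands out and zeroes the lowest free page frame unless that frame has been marked reserved.

```python
class BootAllocator:
    """Toy early-boot page allocator: hands out the lowest free page frame."""

    def __init__(self, total_pages):
        self.total_pages = total_pages
        self.reserved = set()    # page frames that must never be handed out
        self.allocated = set()   # page frames already handed out

    def reserve(self, start, end):
        """Mark page frames in [start, end) as off limits."""
        self.reserved.update(range(start, end))

    def alloc_zeroed_page(self, memory):
        """Allocate the lowest free frame and zero its contents."""
        for pfn in range(self.total_pages):
            if pfn not in self.reserved and pfn not in self.allocated:
                self.allocated.add(pfn)
                memory[pfn] = 0
                return pfn
        raise MemoryError("no free pages")


def boot(reserve_kernel):
    # Hypothetical layout: frames 0-3 hold kernel text and data (0xC0DE),
    # frames 4-7 are free.
    memory = [0xC0DE] * 4 + [None] * 4
    allocator = BootAllocator(len(memory))
    if reserve_kernel:
        allocator.reserve(0, 4)
    allocator.alloc_zeroed_page(memory)
    return memory[:4]  # what is left of the kernel image


print(boot(reserve_kernel=False))  # first kernel frame gets zeroed
print(boot(reserve_kernel=True))   # kernel frames are left alone
```

In the unreserved case the very first allocation lands on a frame holding kernel contents, which is the general shape of the symptom in comment 9.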

Comment 11 Mark McLoughlin 2008-04-04 08:25:32 UTC
Should be fixed with kernel-xen-2.6.25-0.15.rc8.fc9

