Bug 438000

Summary: F-9 pv_ops xen: x86_64 guest install aborts during formatting
Product: [Fedora] Fedora
Reporter: Orion Poplawski <orion>
Component: kernel-xen
Assignee: Eduardo Habkost <ehabkost>
Status: CLOSED RAWHIDE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: low
Docs Contact:
Priority: low
Version: rawhide
CC: jlaska, xen-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: kernel-xen-2.6.25-0.15.rc8.fc9
Doc Type: Bug Fix
Last Closed: 2008-04-04 04:25:32 EDT
Type: ---
Bug Blocks: 434756    
Attachments:
Description         Flags
guest log           none
xend.log            none
virt-install.log    none

Description Orion Poplawski 2008-03-18 12:33:55 EDT
Description of problem:

While installing rawhide x86_64 guest via kickstart on F8 x86_64 host:

Welcome to Fedora for x86_64

          ┌─────────────────────┤ Formatting ├──────────────────────┐
          │                                                         │
          │ Formatting / file system...                             │
          │                                                         │
          │                           70%                           │
Guest installation complete... restarting guest.
libvir: Xen Daemon error : POST operation failed: (xend.err "Boot loader didn't
return any data!")
virDomainCreate() failed POST operation failed: (xend.err "Boot loader didn't
return any data!")
Domain installation may not have been
 successful.  If it was, you can restart your domain
 by running 'virsh start xenfdev64'; otherwise, please
 restart your installation.
Tue, 18 Mar 2008 10:27:50 ERROR    virDomainCreate() failed POST operation
failed: (xend.err "Boot loader didn't return any data!")
Traceback (most recent call last):
  File "/usr/sbin/virt-install", line 517, in <module>
    main()
  File "/usr/sbin/virt-install", line 503, in main
    dom.create()
  File "/usr/lib64/python2.5/site-packages/libvirt.py", line 240, in create
    if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
libvirtError: virDomainCreate() failed POST operation failed: (xend.err "Boot
loader didn't return any data!")

Version-Release number of selected component (if applicable):
11.4.0.54

How reproducible:
Every time
Comment 1 Jeremy Katz 2008-03-18 13:18:08 EDT
That looks like the kernel causing a reboot...
Comment 2 Daniel Berrange 2008-03-18 13:21:59 EDT
On the Dom0 host, please edit /etc/sysconfig/xend and turn on guest console
logging:

  XENCONSOLED_LOG_GUESTS=yes

Then run 'service xend restart'.

Then try the guest installation again and attach the log file from
/var/log/xen/console/guest-XXXXXX.log, where XXXXXX is the name of your guest.

Also provide /var/log/xen/xend.log, the /root/.virtinst/virt-install.log file,
and details of the parameters supplied to virt-install.
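
For anyone scripting the steps above, here is a minimal sketch in Python. The function names are hypothetical; only the setting name, the file it lives in (/etc/sysconfig/xend), and the console log location come from this comment. The code works on the file's text rather than the file itself, so it can be tried without touching a real host.

```python
import re

SETTING = "XENCONSOLED_LOG_GUESTS=yes"

def enable_guest_console_logging(sysconfig_text: str) -> str:
    """Return the sysconfig text with XENCONSOLED_LOG_GUESTS set to yes."""
    # Match an existing assignment on its own line, commented out or not.
    pattern = re.compile(
        r"^[ \t]*#?[ \t]*XENCONSOLED_LOG_GUESTS=.*$", re.MULTILINE
    )
    if pattern.search(sysconfig_text):
        return pattern.sub(SETTING, sysconfig_text, count=1)
    # No existing line: append one, making sure the file ends with a newline.
    if sysconfig_text and not sysconfig_text.endswith("\n"):
        sysconfig_text += "\n"
    return sysconfig_text + SETTING + "\n"

def console_log_path(guest_name: str) -> str:
    """Path of the per-guest console log, following the pattern given above."""
    return f"/var/log/xen/console/guest-{guest_name}.log"
```

After writing the result back to /etc/sysconfig/xend, xend still has to be restarted ('service xend restart') before the setting takes effect.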
Comment 3 Orion Poplawski 2008-03-18 13:40:50 EDT
Created attachment 298428 [details]
guest log

Installing with the following:

# virt-install -n xenfdev64 -r 512 --vcpus=2 -f
/export/data1/xen_disk_xenfdev64 -m 40:00:00:00:00:03 --nographics --accelerate
-p -x ks=nfs:saga:/export/data1/ks/rawhide-nox64.cfg --arch=x86_64 -l
http://fedora.cora.nwra.com/fedora/linux/development/x86_64/os
Comment 4 Orion Poplawski 2008-03-18 13:41:30 EDT
Created attachment 298429 [details]
xend.log
Comment 5 Orion Poplawski 2008-03-18 13:42:09 EDT
Created attachment 298430 [details]
virt-install.log
Comment 6 James Laska 2008-03-20 13:56:24 EDT
Also seeing this on F9 rawhide (kernel-xen-2.6.25-0.2.rc4.fc9.x86_64).

Panic log from a text-mode PV install, captured during disk format:
http://jlaska.fedorapeople.org/f9-pv-install-panic.txt
Comment 7 Mark McLoughlin 2008-03-20 14:47:53 EDT
jlaska's oops:

BUG: unable to handle kernel paging request at 0000000000001000
IP: [<ffffffff80491490>] _cond_resched+0x0/0x38
PGD 546b067 PUD 5447067 PMD 0
Oops: 0002 [1] DEBUG_PAGEALLOC
CPU 0
Modules linked in: sha256_generic aes_generic cbc dm_crypt crypto_blkcipher
dm_emc dm_round_robin dm_multipath dm_snapshot dm_mirror dm_zero dm_mod xfs jfs
reiserfs lock_nolock gfs2 msdos linear raid10 raid456 async_xor async_memcpy
async_tx xor raid1 raid0 xen_netfront xen_blkfront iscsi_tcp libiscsi
scsi_transport_iscsi scsi_mod ext2 ext3 jbd ext4dev jbd2 mbcache crc16 squashfs
pcspkr edd loop nfs lockd nfs_acl sunrpc vfat fat cramfs
Pid: 1108, comm: mke2fs Not tainted 2.6.25-0.2.rc4.fc9xen #1
RIP: e030:[<ffffffff80491490>]  [<ffffffff80491490>] _cond_resched+0x0/0x38
RSP: e02b:ffff88000c5a9be0  EFLAGS: 00010206
RAX: 0000000000001000 RBX: 0000000000001000 RCX: 000000000000003c
RDX: 0000000000000000 RSI: ffff880003e3cfb8 RDI: ffff88000c5a9ce8
RBP: ffff88000c5a9ce8 R08: 0000000000001000 R09: ffff880003e3cfb8
R10: 0000000000000000 R11: ffff88001f823b90 R12: 0000000000001000
R13: 0000000000001000 R14: 00000000701b8000 R15: ffff88000c5a9c88
FS:  00007fb0e7a577e0(0000) GS:ffffffff80612000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000001000 CR3: 000000000c504000 CR4: 0000000000002620
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Process mke2fs (pid: 1108, threadinfo ffff88000c5a8000, task ffff880005404000)
Stack:  ffffffff8027997a ffff88000c5a9ce8 ffffffff802bbc4e 0000000000000001
 ffff88001f823b78 0000000000000000 ffff88000c5a9e68 00000000701b2000
 ffff88000c5a9de8 ffff88000d18f000 ffff88001f823b78 ffffffff804b3f00
Call Trace:
 [<ffffffff8027997a>] ? generic_file_buffered_write+0x1cb/0x6c6
 [<ffffffff802bbc4e>] ? xattr_getsecurity+0x30/0x7e
 [<ffffffff802376a3>] ? current_fs_time+0x22/0x29
 [<ffffffff80300fa5>] ? security_inode_need_killpriv+0x11/0x13
 [<ffffffff8027a3df>] __generic_file_aio_write_nolock+0x35c/0x390
 [<ffffffff802b3750>] ? file_update_time+0xb1/0xea
 [<ffffffff802a7564>] ? pipe_write+0x53e/0x550
 [<ffffffff8027a513>] generic_file_aio_write_nolock+0x3b/0x8d
 [<ffffffff802a096e>] do_sync_write+0xe7/0x12d
 [<ffffffff802c6568>] ? block_llseek+0x35/0x8b
 [<ffffffff80246193>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff80308c30>] ? selinux_file_permission+0x10f/0x118
 [<ffffffff80301028>] ? security_file_permission+0x11/0x13
 [<ffffffff802a12a0>] vfs_write+0xae/0x157
 [<ffffffff802a140d>] sys_write+0x47/0x6f
 [<ffffffff80211179>] tracesys+0xbe/0xc3
Comment 8 Mark McLoughlin 2008-03-25 12:59:00 EDT
Confirmed with 2.6.25-0.4.rc4.fc9xen on x86_64; it oopses as soon as anaconda
starts to format the disk.

Seems not to happen on i386; re-assigning to Eduardo
Comment 9 Eduardo Habkost 2008-03-31 13:35:20 EDT
From the Oops dump:

Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

I am investigating what could be zeroing the memory where the code is stored.
Bug #438392 seems to be caused by the same problem: the kmem_cache pointer
passed to kmem_cache_alloc() is zero, which seems to be caused by
radix_tree_node_cachep being zeroed.
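
The check behind this observation can be sketched in a few lines of Python. The helper names are hypothetical. The only format knowledge assumed is that the oops "Code:" line lists hex bytes around the instruction pointer and wraps the byte at the faulting address in angle brackets.

```python
def parse_code_dump(code_text: str):
    """Return (code_bytes, fault_index) parsed from an oops 'Code:' dump.

    fault_index is the position of the byte marked <xx>, or None if no byte
    is marked.
    """
    tokens = code_text.replace("Code:", "", 1).split()
    data = bytearray()
    fault_index = None
    for tok in tokens:
        if tok.startswith("<") and tok.endswith(">"):
            fault_index = len(data)  # position of the byte at the fault
            tok = tok[1:-1]
        data.append(int(tok, 16))
    return bytes(data), fault_index

def is_all_zero(data: bytes) -> bool:
    """True if the dump is non-empty and contains nothing but zero bytes."""
    return len(data) > 0 and not any(data)
```

Applied to the dump quoted in this comment, the parser yields 64 bytes with the marked byte at index 43, and is_all_zero() returns True, which is what one would expect if the kernel text had been overwritten with zeros.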
Comment 10 Eduardo Habkost 2008-04-03 18:29:01 EDT
I have just found the cause: kernel text and data segments were not being 
reserved at boot. A fix should go to Rawhide soon.
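
To illustrate why a missing reservation is consistent with zeroed kernel code, here is a deliberately simplified toy model in Python. It is not the kernel's boot allocator and not the actual fix: the page count, the location of the kernel image, and the fill values are all invented for the example.

```python
PAGE_COUNT = 8
KERNEL_PAGES = range(2, 5)  # hypothetical pages holding kernel text and data

def boot(reserve_kernel: bool, allocations: int) -> list:
    """Simulate early boot and return the final contents of each page.

    0xCC stands in for kernel text/data, 0xFF for untouched free memory,
    and 0x00 for a page that was allocated and zeroed by its new owner.
    """
    memory = [0xCC if p in KERNEL_PAGES else 0xFF for p in range(PAGE_COUNT)]
    # Pages that are not reserved end up on the allocator's free list.
    reserved = set(KERNEL_PAGES) if reserve_kernel else set()
    free = [p for p in range(PAGE_COUNT) if p not in reserved]
    for _ in range(min(allocations, len(free))):
        page = free.pop(0)
        memory[page] = 0x00  # freshly allocated pages get zeroed
    return memory

def kernel_intact(memory: list) -> bool:
    """True if every kernel page still holds its original contents."""
    return all(memory[p] == 0xCC for p in KERNEL_PAGES)
```

In this model, kernel_intact(boot(True, 5)) holds, while kernel_intact(boot(False, 5)) does not: once the allocator works its way up to the unreserved kernel pages, it hands them out and they are zeroed.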
Comment 11 Mark McLoughlin 2008-04-04 04:25:32 EDT
Should be fixed with kernel-xen-2.6.25-0.15.rc8.fc9