Description of problem:
When trying to install rhel4 paravirt guests on rhel5.3 dom0, installation crashes with kernel backtrace:

Oops: 0003 [#1]
SMP
Modules linked in: dm_snapshot dm_mirror dm_zero dm_mod ext3 jbd msdos raid6 raid5 xor raid1 raid0 xenblk xennet sr_mod sd_mod scsi_mod cdrom loop nfs nfs_acl lockd sunrpc vfat fat cramfs
CPU: 0
EIP: 0061:[<c0112510>] Not tainted VLI
EFLAGS: 00010246 (2.6.9-78.ELxenU)
| <Space> selects | <F12> next screen
EIP is at pgd_free+0x11b/0x158
eax: 00000000 ebx: d40bb000 ecx: 00000400 edx: 80000001
esi: 00000000 edi: d40bb000 ebp: 00000003 esp: d41ede4c
ds: 007b es: 007b ss: 0068
Process make_fonts_map. (pid: 2289, threadinfo=d41ed000 task=d42118f0)
Stack: eb50c300 eb50c300 00007ff0 eb50c300 eb50c300 c011ad70 d4530000 d41ede78
       c0165240 eb50c300 d42118f0 00000005 001c801e 00000080 c015b6a7 d40c6544
       d40c6040 d41ed000 c0000000 00000000 d4527900 d41ed000 ec1cd400 c016530f
Call Trace:
 [<c011ad70>] __mmdrop+0x21/0x3a
 [<c0165240>] exec_mmap+0x1df/0x200
 [<c015b6a7>] vfs_read+0xcf/0xd8
 [<c016530f>] flush_old_exec+0x46/0x24b
 [<c0181cc4>] load_elf_binary+0x385/0xd3d
 [<c0181ed3>] load_elf_binary+0x594/0xd3d
 [<c0148e7b>] kmap_high+0x19/0x21c
 [<c0149091>] kunmap_high+0x13/0x95
 [<c01490f6>] kunmap_high+0x78/0x95
 [<c0164b87>] copy_strings+0x22f/0x23a
 [<c018193f>] load_elf_binary+0x0/0xd3d
 [<c0165dfa>] search_binary_handler+0xb8/0x257
 [<c0166111>] do_execve+0x178/0x210
 [<c0105da0>] sys_execve+0x2c/0x8e
 [<c010740f>] syscall_call+0x7/0xb
Code: 8b 04 98 89 f1 c1 e0 0c 81 e1 ff 0f 00 00 89 c6 09 ce 6a 00 8d 9e ff ff ff bf 89 df 53 e8 55 01 00 00 59 31 c0 b9 00 04 00 00 5e <f3> ab 53 ff 35 44 31 36 c0 e8 b6 26 03 00 80 3d 04 77 2f c0 00
<0>Fatal exception: panic in 5 seconds
Kernel panic - not syncing: Fatal exception

Guest installation complete... restarting guest.
virDomainCreate() failed POST operation failed: (xend.err "Error creating domain: Boot loader didn't return any data!")
Domain installation may not have been successful. If it was, you can restart your domain by running 'virsh start rhel4.7_i386_pv_guest'; otherwise, please restart your installation.
Mon, 06 Oct 2008 18:52:26 ERROR virDomainCreate() failed POST operation failed: (xend.err "Error creating domain: Boot loader didn't return any data!")
Traceback (most recent call last):
  File "/usr/sbin/virt-install", line 559, in ?
    main()
  File "/usr/sbin/virt-install", line 545, in main
    dom.create()
  File "/usr/lib/python2.4/site-packages/libvirt.py", line 228, in create
    if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
libvirtError: virDomainCreate() failed POST operation failed: (xend.err "Error creating domain: Boot loader didn't return any data!")

Version-Release number of selected component (if applicable):
# uname -a
Linux intel-s3ea2-03.rhts.bos.redhat.com 2.6.18-117.el5xen #1 SMP Mon Sep 29 22:57:44 EDT 2008 i686 i686 i386 GNU/Linux
# rpm -qa | grep xen
xen-3.0.3-73.el5
xen-devel-3.0.3-73.el5
kernel-xen-2.6.18-117.el5
xen-libs-3.0.3-73.el5

How reproducible:
Very. This (intel-s3ea2-03.rhts.bos.redhat.com) is an rhts machine so you can reserve it and reproduce it there too.

Steps to Reproduce:
1. Install a rhel5.3 tree.
2. virt-install --name rhel4.7_i386_pv_guest --location nfs:bigpapi.bos.redhat.com:/vol/engineering/redhat/released/RHEL-4/U7/AS/i386/tree --nonsparse --paravirt --file /var/lib/xen/images/rhel4.7_i386_pv_guest.img -s 5 -r 1024 --nographics
3. Continue with installation process

Actual results:
Crashes.

Expected results:
should complete

Additional info:
this is an intel box, hardware info about the box can be seen here:
http://rhts.redhat.com/cgi-bin/rhts/system.cgi?id=224

Also from xend.log:
[2008-10-06 18:52:11 xend.XendDomainInfo 4413] WARNING (XendDomainInfo:931) Domain has crashed: name=rhel4.7_i386_pv_guest id=1.
[2008-10-06 18:52:25 xend.XendDomainInfo 4413] DEBUG (XendDomainInfo:1568) XendDomainInfo.destroy: domid=1
[2008-10-06 18:52:25 xend.XendDomainInfo 4413] DEBUG (XendDomainInfo:1576) XendDomainInfo.destroyDomain(1)
[2008-10-06 18:52:26 xend.XendDomainInfo 4413] DEBUG (XendDomainInfo:200) XendDomainInfo.create(['vm', ['name', 'rhel4.7_i386_pv_guest'], ['memory', '1024'], ['maxmem', '1024'], ['vcpus', '1'], ['uuid', '8e252747-1594-aade-8767-4359dfe26704'], ['bootloader', '/usr/bin/pygrub'], ['on_poweroff', 'destroy'], ['on_reboot', 'restart'], ['on_crash', 'restart'], ['device', ['tap', ['dev', 'xvda'], ['uname', 'tap:aio:/var/lib/xen/images/rhel4.7_i386_pv_guest.img'], ['mode', 'w']]], ['device', ['vif', ['mac', '00:16:3e:09:9b:b2'], ['bridge', 'xenbr1']]]])
[2008-10-06 18:52:26 xend.XendDomainInfo 4413] DEBUG (XendDomainInfo:312) parseConfig: config is ['vm', ['name', 'rhel4.7_i386_pv_guest'], ['memory', '1024'], ['maxmem', '1024'], ['vcpus', '1'], ['uuid', '8e252747-1594-aade-8767-4359dfe26704'], ['bootloader', '/usr/bin/pygrub'], ['on_poweroff', 'destroy'], ['on_reboot', 'restart'], ['on_crash', 'restart'], ['device', ['tap', ['dev', 'xvda'], ['uname', 'tap:aio:/var/lib/xen/images/rhel4.7_i386_pv_guest.img'], ['mode', 'w']]], ['device', ['vif', ['mac', '00:16:3e:09:9b:b2'], ['bridge', 'xenbr1']]]]
[2008-10-06 18:52:26 xend.XendDomainInfo 4413] DEBUG (XendDomainInfo:417) parseConfig: result is {'shadow_memory': None, 'start_time': None, 'uuid': '8e252747-1594-aade-8767-4359dfe26704', 'on_crash': 'restart', 'on_reboot': 'restart', 'localtime': None, 'image': None, 'on_poweroff': 'destroy', 'bootloader_args': None, 'cpus': None, 'name': 'rhel4.7_i386_pv_guest', 'backend': [], 'vcpus': 1, 'cpu_weight': None, 'features': None, 'vcpu_avail': None, 'memory': 1024, 'device': [('tap', ['tap', ['dev', 'xvda'], ['uname', 'tap:aio:/var/lib/xen/images/rhel4.7_i386_pv_guest.img'], ['mode', 'w']]), ('vif', ['vif', ['mac', '00:16:3e:09:9b:b2'], ['bridge', 'xenbr1']])], 'bootloader': '/usr/bin/pygrub', 'cpu': None, 'maxmem': 1024}
[2008-10-06 18:52:26 xend.XendDomainInfo 4413] DEBUG (XendDomainInfo:1358) XendDomainInfo.construct: None
[2008-10-06 18:52:26 xend 4413] DEBUG (balloon:143) Balloon: 1048772 KiB free; need 2048; done.
[2008-10-06 18:52:26 xend.XendDomainInfo 4413] DEBUG (XendDomainInfo:1406) XendDomainInfo.initDomain: 2 1.0
[2008-10-06 18:52:26 xend 4413] ERROR (XendBootloader:84) Boot loader didn't return any data!
[2008-10-06 18:52:26 xend.XendDomainInfo 4413] ERROR (XendDomainInfo:212) Domain construction failed
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 205, in create
    vm.initDomain()
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1411, in initDomain
    self.configure_bootloader()
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1932, in configure_bootloader
    self.info['image'])
  File "/usr/lib/python2.4/site-packages/xen/xend/XendBootloader.py", line 85, in bootloader
    raise VmError, msg
VmError: Boot loader didn't return any data!
[2008-10-06 18:52:26 xend.XendDomainInfo 4413] DEBUG (XendDomainInfo:1568) XendDomainInfo.destroy: domid=2
[2008-10-06 18:52:26 xend.XendDomainInfo 4413] DEBUG (XendDomainInfo:1576) XendDomainInfo.destroyDomain(2)
[2008-10-06 18:52:26 xend 4413] ERROR (SrvBase:88) Request create failed.
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/web/SrvBase.py", line 85, in perform
    return op_method(op, req)
  File "/usr/lib/python2.4/site-packages/xen/xend/server/SrvDomainDir.py", line 82, in op_create
    raise XendError("Error creating domain: " + str(ex))
XendError: Error creating domain: Boot loader didn't return any data!
I tested this out on an Intel machine locally:

Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz

With kernel -118 and xen -73, I wasn't able to reproduce the problem. I also tried with kernel -117 and xen -73, and also wasn't able to reproduce the problem. So it's probably hardware specific. We'll have to jump on the RHTS machine that Gurhan mentioned in the initial description and reproduce it there.

Chris Lalancette
Well, the good news here is that I'm pretty sure this is a RHEL-4 Xen guest bug, not a dom0 bug. I wasn't able to start the install *at all* with 5.2 on this particular hardware, and with 5.3 I get the crash described here. So it will have to be looked at for 4.8, but I don't think (at the moment) this is a RHEL-5 blocker. I'm going to update the component to reflect this.

Chris Lalancette
OK, I found it. We've been missing a patch that's been in upstream Xen dom kernels basically forever; RHEL-5 has this patch, but we do not. It basically makes it so that we can take a "spurious" page fault, when the hypervisor has changed the pte mapping underneath us from R/O -> R/W; that's exactly what is happening in this sequence in arch/i386/mm/pgtable-xen.c:

    make_lowmem_page_writable(
            pmd, XENFEAT_writable_page_tables);
    memset(pmd, 0, PTRS_PER_PMD*sizeof(pmd_t));

It all makes sense; the only thing I don't understand is why we've gotten away with it up until this point. Maybe this processor family changes something with the way TLBs are done, or something like that, which is why we only see it here. Anyway, I'll attach a backport of upstream Xen c/s 10425 which fixes the issue for me; I've only tested it on i386 so far, but I'll also need to test it on x86_64.

Chris Lalancette
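For anyone reading along, here is a rough user-space model of the idea, not the attached backport itself. It decodes the "Oops: 0003" error code from the original report using the architectural x86 page-fault error-code bits, then asks the question a spurious-fault check has to ask: does the pte, as it stands now, already permit the access that faulted? The function and constant names are made up for illustration, and the exact conditions used by c/s 10425 are an assumption here.

```python
# Rough model only; names are hypothetical and this is not kernel code.

# x86 page-fault error-code bits (architectural definitions).
PF_PROT = 0x1   # 1 = protection violation on a present page, 0 = not present
PF_WRITE = 0x2  # 1 = the faulting access was a write
PF_USER = 0x4   # 1 = the fault was taken in user mode
PF_RSVD = 0x8   # 1 = a reserved bit was set in a paging entry

# x86 page-table-entry flag bits.
PTE_PRESENT = 0x1
PTE_RW = 0x2


def describe_error_code(error_code):
    """Break a page-fault error code into its individual meanings."""
    return {
        "protection_violation": bool(error_code & PF_PROT),
        "write": bool(error_code & PF_WRITE),
        "user_mode": bool(error_code & PF_USER),
        "reserved_bit": bool(error_code & PF_RSVD),
    }


def is_spurious_fault(error_code, current_pte_flags):
    """Return True if the pte, re-read after the fault, allows the access.

    Assumption: user-mode and reserved-bit faults are never treated as
    spurious in this model; only kernel accesses are considered.
    """
    if error_code & (PF_USER | PF_RSVD):
        return False
    if not (current_pte_flags & PTE_PRESENT):
        return False
    if (error_code & PF_WRITE) and not (current_pte_flags & PTE_RW):
        return False
    return True


if __name__ == "__main__":
    oops_code = 0x0003  # from "Oops: 0003" in the report
    print(describe_error_code(oops_code))
    # pte already flipped to R/W by the hypervisor: the fault is stale.
    print(is_spurious_fault(oops_code, PTE_PRESENT | PTE_RW))
    # pte still read-only: a genuine protection fault.
    print(is_spurious_fault(oops_code, PTE_PRESENT))
```

Error code 0003 is a kernel-mode write to a present page, which fits the memset() in pgd_free faulting on a page that was still seen as read-only. When a check like this says the fault is spurious, the handler can simply return and let the write be retried instead of oopsing.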
Created attachment 319854 [details] Backport of upstream Xen c/s 10425, to fix the RHEL-4 crash
*** Bug 466932 has been marked as a duplicate of this bug. ***
Committed in 78.16.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
Just as an update, I was able to manually install a 4.8 guest on the rhel5.3 release kernel without an issue. I won't verify the bug just yet; however, so far things are looking good:

[root@dhcp71-25 ~]# uname -a
Linux dhcp71-25.rhts.bos.redhat.com 2.6.9-80.ELxenU #1 SMP Fri Jan 23 16:57:22 EST 2009 i686 i686 i386 GNU/Linux
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html