First of all: I know it's a misconfiguration, but the result could be more graceful. We'd like to use the same OS images for Xen FV and KVM testing.

Description of problem:
Our RHEL 4.7 FV guest images were installed on Xen, and we enabled the xen-vnif driver to include it in our testing ('alias eth0 xen-vnif' in /etc/modprobe.conf). However, starting those images on KVM fails horribly.

Version-Release number of selected component (if applicable):
RHEL 4.7 kernel 2.6.9-78

How reproducible:
Every time on a recent KVM (we use daily builds for testing).

Actual results:
The guest boots up to the hardware initialization stage. While initializing the network interface, the following kernel panic occurs:

Welcome to Red Hat Enterprise Linux ES
Press 'I' to enter interactive startup.
Setting clock (localtime): Fri Feb 13 13:01:28 CET 2009 [ OK ]
Starting udev: [ OK ]
Initializing hardware... storageUnable to handle kernel NULL pointer dereference at virtual address 00000048
 printing eip:
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: xen_vnif floppy ext3 jbd
CPU: 0
EIP: 0060:[<c01ec482>] Not tainted VLI
EFLAGS: 00010246 (2.6.9-78.EL)
EIP is at kobject_add+0x55/0xd7
eax: 00000048 ebx: 00000000 ecx: 00000000 edx: ffff0001
esi: 00000048 edi: f8824478 ebp: 00000000 esp: c2624f34
ds: 007b es: 007b ss: 0068
Process modprobe (pid: 1482, threadinfo=c2624000 task=f7c10680)
Stack: f8824478 ffffffea f8824448 00000000 c01ec51d f8824478 c0397d94 c0258700
       f8824448 f8824458 c037d800 c2624000 c0258b5c 1d244b3c 00000000 0000000a
       c033252a 00000000 00000000 f8824420 c0397c80 f8824448 c0397c80 c037d800
Call Trace:
 [<c01ec51d>] kobject_register+0x19/0x39
 [<c0258700>] bus_add_driver+0x36/0x97
 [<c0258b5c>] driver_register+0x51/0x58
 [<c026b307>] xenbus_register_driver_common+0x60/0x71
 [<c026b32f>] xenbus_register_frontend+0x17/0x26
 [<f8826028>] netif_init+0x28/0x2a [xen_vnif]
 [<c01420cf>] sys_init_module+0xe9/0x20d
 [<c03277cb>] syscall_call+0x7/0xb
Code: 89 c5 8b 47 28 85 c0 74 6d 8b 18 31 c9 ba 42 00 00 00 b8 5c 2c 33 c0 e8 75 4d f3 ff 8d 73 48 e8 f1 99 13 00 ba 01 00 ff ff 89 f0 <0f> c1 10 85 d2 0f 85 06 04 00 00 85 ed 75 0d 8b 47 28 83 c0 10
 <0>Fatal exception: panic in 5 seconds
Kernel panic - not syncing: Fatal exception

Expected results:
At least a running guest, even without a working NIC, so that the misconfiguration can be fixed without a rescue disk.
Ug, yes. That clearly won't work, but it definitely should just die gracefully, fail to register the xen* devices, and finish booting.

Chris Lalancette
Created attachment 333510
Patch to fix panic while modprobing under KVM guest

The attached patch seems to fix the issue for me. There were actually two issues here: the first was that we weren't properly returning an error when we couldn't load, and the second was that we weren't properly unregistering things when we did fail. The first is fixed by upstream c/s 14622; the second was still a problem upstream, so I've sent a patch there.
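For anyone following along, the two failure modes described in this comment can be illustrated with a small userspace sketch. This is not the attached patch and not upstream c/s 14622; every name here (fake_bus_register, fake_driver_register, fake_late_setup, fake_module_init, fake_reset, and the state flags) is hypothetical and only models the general pattern: return an error when the platform is not there, and undo earlier registrations when a later step fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not taken from the real driver. */
static bool on_xen;             /* pretend result of platform detection */
static bool bus_registered;     /* has the (fake) bus been set up? */
static int drivers_registered;  /* number of (fake) drivers on the bus */

/* Test helper: start from a clean state on a chosen platform. */
static void fake_reset(bool xen)
{
    on_xen = xen;
    bus_registered = false;
    drivers_registered = 0;
}

/* Issue 1: report failure instead of pretending the bus exists. */
static int fake_bus_register(void)
{
    if (!on_xen)
        return -ENODEV;
    bus_registered = true;
    return 0;
}

static void fake_bus_unregister(void)
{
    bus_registered = false;
}

/* Refuse to register against a bus that was never set up. */
static int fake_driver_register(void)
{
    if (!bus_registered)
        return -ENODEV;
    drivers_registered++;
    return 0;
}

static void fake_driver_unregister(void)
{
    assert(drivers_registered > 0);
    drivers_registered--;
}

/* A later init step that can be forced to fail. */
static int fake_late_setup(bool fail)
{
    return fail ? -EIO : 0;
}

/* Issue 2: on any failure, unwind what was registered, in reverse order. */
static int fake_module_init(bool late_fail)
{
    int err;

    err = fake_bus_register();
    if (err)
        return err;

    err = fake_driver_register();
    if (err)
        goto out_bus;

    err = fake_late_setup(late_fail);
    if (err)
        goto out_driver;

    return 0;

out_driver:
    fake_driver_unregister();
out_bus:
    fake_bus_unregister();
    return err;
}
```

With on_xen set to false, fake_module_init() returns -ENODEV and leaves nothing registered, which corresponds to the behaviour asked for in this bug: the module load fails cleanly and the rest of the boot can continue.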
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
Committed in 83.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
~~ Attention! Snap 4 Released ~~

RHEL 4.8 Snapshot 4 has been released on partners.redhat.com. There should be a fix present that addresses this bug. NOTE: there is only a short time left to test, so please test and report back results on this bug ASAP.

The latest kernel build can be obtained here: http://people.redhat.com/vgoyal/rhel4/

If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. If you have found a NEW bug, clone this bug and describe the issues you encountered. Further questions can be directed to your Red Hat Partner Manager.

If you have VERIFIED the bug fix, please select your PartnerID from the Verified field above and leave a comment with your test result details. Include which arches were tested, the package version, and any applicable logs.
This particular issue is fixed. Not sure about the other xen_* modules getting loaded.

Test details and results:

KVM host setup
  KVM module version: kvm-84-6624-gdefdf1e
  KVM userspace version: kvm-84-512-ga1075de
  Kernel version: 2.6.27.21-170.2.56.fc10.x86_64
  Base OS: Fedora release 10 (Cambridge)
  Processor: Fam: 15, Model: 107, Stepping: 1 (AMD Athlon 64 X2)
  Memory: 4096 MB
  Main Board: ASUS M2N-MX SE Plus

Guest setup

Guest 1
  OS: Red Hat Enterprise Linux 4.8 Beta
  Kernel version: 2.6.9-82.ELhugemem
  Virtual CPUs: 2
  Memory: 1792 MB
  Test: Boot test
  Result: Failed
  Comments:
  * /etc/modprobe.conf contains the line 'alias eth0 xen-vnif'
  * Kernel panic while initializing hardware
  * Unable to handle kernel NULL pointer dereference

Guest 2
  OS: Red Hat Enterprise Linux 4.8 Snapshot 3
  Kernel version: 2.6.9-86.ELhugemem
  Virtual CPUs: 2
  Memory: 1792 MB
  Test: Boot test
  Result: Succeeded
  Comments:
  * /etc/modprobe.conf contains the line 'alias eth0 xen-vnif'
  * xen-vnif doesn't get loaded (checked with lsmod)
  * xen_balloon and xen_platform_pci modules got loaded, but they don't seem to cause trouble
  * Guest NIC is working as expected: module e1000 got loaded and the guest received an address
  * Used bridging (-net nic,model=e1000 -net tap,ifname=tap0)
Created attachment 339887
Log of the failing RHEL 4.8 Beta guest
Created attachment 339888
Log of the working RHEL 4.8 Snapshot 3 guest
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html