Bug 485421 - Kernel panic when running xen-vnif enabled FV guest image on KVM
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.7
Hardware: All Linux
Priority: low
Severity: medium
Target Milestone: rc
Assigned To: Chris Lalancette
QA Contact: Red Hat Kernel QE team

Reported: 2009-02-13 09:38 EST by Frank Arnold
Modified: 2009-05-18 15:35 EDT (History)

Doc Type: Bug Fix
Last Closed: 2009-05-18 15:35:52 EDT


Attachments
Patch to fix panic while modprobing under KVM guest (3.14 KB, patch)
2009-02-27 12:39 EST, Chris Lalancette
Log of the failing RHEL 4.8 Beta guest (11.83 KB, text/plain)
2009-04-16 13:48 EDT, Frank Arnold
Log of the working RHEL 4.8 Snapshot 3 guest (12.21 KB, text/plain)
2009-04-16 13:49 EDT, Frank Arnold

Description Frank Arnold 2009-02-13 09:38:13 EST
First of all:
I know it's a misconfiguration, but the result could be more graceful.
We'd like to use the same OS images for Xen FV and KVM testing.

Description of problem:
Our RHEL 4.7 FV guest images were installed on Xen, and we enabled the xen-vnif driver to include it in our testing ('alias eth0 xen-vnif' in /etc/modprobe.conf). However, starting those images on KVM fails with a kernel panic.

Version-Release number of selected component (if applicable):
RHEL 4.7 kernel 2.6.9-78

How reproducible:
Every time on a recent KVM (we use daily builds for testing).

Actual results:
The guest boots until it reaches the hardware initialization stage.
While initializing the network interface, the following kernel panic occurs:

                Welcome to Red Hat Enterprise Linux ES
                Press 'I' to enter interactive startup.
Setting clock  (localtime): Fri Feb 13 13:01:28 CET 2009 [  OK  ]
Starting udev:  [  OK  ]
Initializing hardware...  storageUnable to handle kernel NULL pointer dereference at virtual address 00000048
 printing eip:
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: xen_vnif floppy ext3 jbd
CPU:    0
EIP:    0060:[<c01ec482>]    Not tainted VLI
EFLAGS: 00010246   (2.6.9-78.EL)
EIP is at kobject_add+0x55/0xd7
eax: 00000048   ebx: 00000000   ecx: 00000000   edx: ffff0001
esi: 00000048   edi: f8824478   ebp: 00000000   esp: c2624f34
ds: 007b   es: 007b   ss: 0068
Process modprobe (pid: 1482, threadinfo=c2624000 task=f7c10680)
Stack: f8824478 ffffffea f8824448 00000000 c01ec51d f8824478 c0397d94 c0258700
       f8824448 f8824458 c037d800 c2624000 c0258b5c 1d244b3c 00000000 0000000a
       c033252a 00000000 00000000 f8824420 c0397c80 f8824448 c0397c80 c037d800
Call Trace:
 [<c01ec51d>] kobject_register+0x19/0x39
 [<c0258700>] bus_add_driver+0x36/0x97
 [<c0258b5c>] driver_register+0x51/0x58
 [<c026b307>] xenbus_register_driver_common+0x60/0x71
 [<c026b32f>] xenbus_register_frontend+0x17/0x26
 [<f8826028>] netif_init+0x28/0x2a [xen_vnif]
 [<c01420cf>] sys_init_module+0xe9/0x20d
 [<c03277cb>] syscall_call+0x7/0xb
Code: 89 c5 8b 47 28 85 c0 74 6d 8b 18 31 c9 ba 42 00 00 00 b8 5c 2c 33 c0 e8 75 4d f3 ff 8d 73 48 e8 f1 99 13 00 ba 01 00 ff ff 89 f0 <0f> c1 10 85 d2 0f 85 06 04 00 00 85 ed 75 0d 8b 47 28 83 c0 10
 <0>Fatal exception: panic in 5 seconds
Kernel panic - not syncing: Fatal exception


Expected results:
At least a running guest, even without a working NIC, so that the misconfiguration can be fixed without a rescue disk.
Comment 1 Chris Lalancette 2009-02-13 10:10:01 EST
Ug, yes.  That clearly won't work, but it definitely should just die gracefully, fail to register the xen* devices, and finish booting.

Chris Lalancette
Comment 2 Chris Lalancette 2009-02-27 12:39:51 EST
Created attachment 333510
Patch to fix panic while modprobing under KVM guest

The attached patch seems to fix the issue for me.  There were actually two issues here; the first one was that we weren't properly returning an error when we couldn't load, and the second was that we weren't properly unregistering things when we did fail.  The first is fixed by upstream c/s 14622; the second was still a problem upstream, so I've sent a patch there.
Comment 3 RHEL Product and Program Management 2009-02-27 13:21:04 EST
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 5 Vivek Goyal 2009-03-11 10:11:02 EDT
Committed in 83.EL. RPMs are available at http://people.redhat.com/vgoyal/rhel4/
Comment 7 Chris Ward 2009-04-16 12:08:21 EDT
~~ Attention! Snap 4 Released ~~
RHEL 4.8 Snapshot 4 has been released on partners.redhat.com. There should
be a fix present that addresses this bug. NOTE: there is only a short time
left to test, please test and report back results on this bug ASAP.

The latest kernel build can be obtained here:
http://people.redhat.com/vgoyal/rhel4/

If you encounter any issues, please set the bug back to the ASSIGNED state and
describe the issues you encountered. If you have found a NEW bug, clone this
bug and describe the issues you encountered. Further questions can be
directed to your Red Hat Partner Manager.

If you have VERIFIED the bug fix, please select your PartnerID from the
Verified field above. Please leave a comment with details of your test results.
Include which arches tested, package version and any applicable logs.
Comment 8 Frank Arnold 2009-04-16 13:45:06 EDT
This particular issue is fixed. Not sure about the other xen_* modules
getting loaded.


Test details and results:

KVM host setup
  KVM module version:    kvm-84-6624-gdefdf1e
  KVM userspace version: kvm-84-512-ga1075de
  Kernel version:        2.6.27.21-170.2.56.fc10.x86_64
  Base OS:               Fedora release 10 (Cambridge)
  Processor:             Fam: 15, Model: 107, Stepping: 1 (AMD Athlon 64 X2)
  Memory:                4096 MB
  Main Board:            ASUS M2N-MX SE Plus

Guest setup
  Guest 1
    OS:               Red Hat Enterprise Linux 4.8 Beta
    Kernel version:   2.6.9-82.ELhugemem
    Virtual CPUs:     2
    Memory:           1792 MB
    Test:             Boot test
    Result:           Failed
    Comments:
    * /etc/modprobe.conf contains the line 'alias eth0 xen-vnif'
    * Kernel panic while initializing hardware
    * Unable to handle kernel NULL pointer dereference

  Guest 2
    OS:               Red Hat Enterprise Linux 4.8 Snapshot 3
    Kernel version:   2.6.9-86.ELhugemem
    Virtual CPUs:     2
    Memory:           1792 MB
    Test:             Boot test
    Result:           Succeeded
    Comments:
    * /etc/modprobe.conf contains the line 'alias eth0 xen-vnif'
    * xen-vnif doesn't get loaded (checked with lsmod)
    * xen_balloon and xen_platform_pci modules got loaded, but they don't seem
      to cause trouble
    * Guest NIC is working as expected, module e1000 got loaded, guest
      received an address
    * Used bridging (-net nic,model=e1000 -net tap,ifname=tap0)
Comment 9 Frank Arnold 2009-04-16 13:48:08 EDT
Created attachment 339887
Log of the failing RHEL 4.8 Beta guest
Comment 10 Frank Arnold 2009-04-16 13:49:18 EDT
Created attachment 339888
Log of the working RHEL 4.8 Snapshot 3 guest
Comment 12 errata-xmlrpc 2009-05-18 15:35:52 EDT
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1024.html
