Bug 504018 - kernel oops at kvm_get_intr_delivery_bitmask+0x4e/0x86
Summary: kernel oops at kvm_get_intr_delivery_bitmask+0x4e/0x86
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Eduardo Habkost
QA Contact: Lawrence Lim
URL:
Whiteboard:
Duplicates: 503886
Depends On:
Blocks:
Reported: 2009-06-03 20:35 UTC by Eduardo Habkost
Modified: 2016-04-26 16:44 UTC (History)

Fixed In Version: kvm-83-65.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-02 09:33:28 UTC
Target Upstream Version:
Embargoed:


Attachments
Relevant chunks from upstream commit 25b94fba003685051fa7c5e51f293ef1bfdb5ed5 (1.95 KB, patch)
2009-06-03 22:33 UTC, Eduardo Habkost
Complete cherry-pick of upstream commit 25b94fba003685051fa7c5e51f293ef1bfdb5ed5 (6.33 KB, patch)
2009-06-03 22:34 UTC, Eduardo Habkost


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHEA-2009:1272 | 0 | normal | SHIPPED_LIVE | New package: kvm | 2009-09-01 09:34:32 UTC

Description Eduardo Habkost 2009-06-03 20:35:14 UTC
Description of problem:
virtlab7.virt.bos.redhat.com login: Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
 [<ffffffff882d09a4>] :kvm:kvm_get_intr_delivery_bitmask+0x4e/0x86
PGD 2267ec067 PUD 22e61f067 PMD 0
Oops: 0000 [1] SMP
last sysfs file: /class/misc/kvm/dev
CPU 3
Modules linked in: kvm_amd(U) kvm(U) ipt_MASQUERADE iptable_nat ip_nat xt_state ip_conntrack nfnetlink ipt_REJECT iptable_filter ip_tables autofs4 hidp nfs lockd fscache nfs_acl rfcomm l2cap bluetooth sunrpc bridge ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sg tpm_infineon i2c_piix4 tpm tpm_bios serio_raw i2c_core ide_cd tg3 pcspkr floppy cdrom dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod sata_svw libata shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 6168, comm: qemu-system-x86 Tainted: G      2.6.18-151.el5 #1
RIP: 0010:[<ffffffff882d09a4>]  [<ffffffff882d09a4>] :kvm:kvm_get_intr_delivery_bitmask+0x4e/0x86
RSP: 0018:ffff81012df35bb8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff81012df35be8 RCX: ffffffff80306c28
RDX: ffffffff80306c28 RSI: 0000000000000001 RDI: ffffffff80306c20
RBP: ffff81012df35bd8 R08: ffffffff80306c28 R09: 0000000000000001
R10: 0000000000000002 R11: 0000000000100100 R12: ffff81012ddcf000
R13: ffff81012ddcf020 R14: 0000000000000001 R15: 0000000000000001
FS:  00002b8017417740(0000) GS:ffff81010439edc0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000028 CR3: 0000000224a21000 CR4: 00000000000006e0
Process qemu-system-x86 (pid: 6168, threadinfo ffff81012df34000, task ffff81012ff287e0)
Stack:  ffff81012f922f00 ffff810112900000 ffff81012ddcf000 ffffffff882cf18f
 0000000000000001 ffff810126bc56c0 0100000000000931 ffffffff882cd88b
 ffff81012f922f00 ffff810112900000 00000000ffffffff 0000000000000001
Call Trace:
 [<ffffffff882cf18f>] :kvm:ioapic_service+0x55/0x12a
 [<ffffffff882cd88b>] :kvm:kvm_pic_set_irq+0xc6/0xd0
 [<ffffffff882d0b8c>] :kvm:kvm_set_irq+0x65/0xa3
 [<ffffffff882c1d6e>] :kvm:kvm_arch_vm_ioctl+0x37e/0x62e
 [<ffffffff8003263b>] sock_common_recvmsg+0x2d/0x43
 [<ffffffff80030ead>] sock_recvmsg+0x107/0x15f
 [<ffffffff882bb6c4>] :kvm:kvm_vm_ioctl+0xa79/0xad0
 [<ffffffff800e8326>] core_sys_select+0x234/0x265
 [<ffffffff8006ed86>] do_gettimeofday+0x40/0x8f
 [<ffffffff8002c011>] sys_recvfrom+0x116/0x130
 [<ffffffff80042691>] do_ioctl+0x21/0x6b
 [<ffffffff80030ae3>] vfs_ioctl+0x457/0x4b9
 [<ffffffff800b706c>] audit_syscall_entry+0x180/0x1b3
 [<ffffffff8004cd38>] sys_ioctl+0x59/0x78
 [<ffffffff8005e28d>] tracesys+0xd5/0xe0


Code: 8b 40 28 0f ab 45 00 eb 2a e8 8a 19 dc f7 85 c0 74 19 40 b6
RIP  [<ffffffff882d09a4>] :kvm:kvm_get_intr_delivery_bitmask+0x4e/0x86
 RSP <ffff81012df35bb8>
CR2: 0000000000000028
 <0>Kernel panic - not syncing: Fatal exception


Version-Release number of selected component (if applicable):
kernel-2.6.18-151.el5
kvm-83-59.el5


How reproducible:
Always.


Steps to Reproduce:
1. qemu-kvm -hda teste.img -cdrom /mnt/data/autotest/iso/linux/Fedora-11-x86_64-DVD.iso -boot d -vnc :5

  
Actual results:
Kernel Oops.

Expected results:
No kernel oops.  :)

Additional info:
Reproduced on virtlab7 virtlab machine - AMD Barcelona
Quad-Core AMD Opteron(tm) Processor 2358 SE @ 1.5GHz

Comment 1 Eduardo Habkost 2009-06-03 22:32:41 UTC
The bug seems to have been introduced by:

commit fb38b4af83f3100db3995afdbe3dbc1257fb096c
Author: Sheng Yang <sheng.com>
Date:   Thu May 21 17:09:49 2009 -0700

    KVM: bit ops for deliver_bitmap

    It's also convenient when we extend KVM supported vcpu number in the future.

    Signed-off-by: Sheng Yang <sheng.com>
    Signed-off-by: Avi Kivity <avi>
    (cherry picked from commit 59e499fa1a06d9995dc908e5c24fe3b2dac4c2da)
    Signed-off-by: Chris Wright <chrisw>
    Bugzilla: 498084
    Message-Id: <1242950989-30198-14-git-send-email-chrisw>
    Signed-off-by: Eduardo Habkost <ehabkost>
    RH-Upstream-status: applied
    Acked-by: Juan Quintela <quintela>
    Acked-by: Marcelo Tosatti <mtosatti>
    Acked-by: Don Dutile <ddutile>


The issues in the commit above are addressed by the following upstream commit:

commit 25b94fba003685051fa7c5e51f293ef1bfdb5ed5
Author: Sheng Yang <sheng.com>
Date:   Wed Mar 4 13:33:02 2009 +0800

    KVM: Merge kvm_ioapic_get_delivery_bitmask into kvm_get_intr_delivery_bitmask

    Gleb fixed bitmap ops usage in kvm_ioapic_get_delivery_bitmask.

    Sheng merged two functions, as well as fixed several issues in
    kvm_get_intr_delivery_bitmask
    1. deliver_bitmask is a bitmap rather than an unsigned long integer.
    2. Lowest priority target bitmap wrongly calculated by mistake.
    3. Prevent potential NULL reference.
    4. Declaration in include/kvm_host.h caused powerpc compilation warning.
    5. Add warning for guest broadcast interrupt with lowest priority delivery mode.
    6. Removed duplicate bitmap clean up in caller of kvm_get_intr_delivery_bitmask.

    Signed-off-by: Gleb Natapov <gleb>
    Signed-off-by: Sheng Yang <sheng.com>
    Signed-off-by: Marcelo Tosatti <mtosatti>


We can either pull only the chunks that are obviously correct and prevent the oops, or cherry-pick the whole commit. I will attach patches for both approaches here.

Comment 2 Eduardo Habkost 2009-06-03 22:33:46 UTC
Created attachment 346468 [details]
Relevant chunks from upstream commit 25b94fba003685051fa7c5e51f293ef1bfdb5ed5

Comment 3 Eduardo Habkost 2009-06-03 22:34:48 UTC
Created attachment 346469 [details]
Complete cherry-pick of upstream commit 25b94fba003685051fa7c5e51f293ef1bfdb5ed5

Comment 4 Eduardo Habkost 2009-06-04 14:17:47 UTC
*** Bug 503886 has been marked as a duplicate of this bug. ***

Comment 10 Miya Chen 2009-07-01 09:43:16 UTC
michen --> Eduardo: I do not have an AMD Barcelona machine, so I need your help to verify this bug.

On an Intel host, the following problem still exists:
In Red Hat Enterprise Virtualization Hypervisor release 5.4-2.0.99 (8.2) (kvm-83-81.el5), resetting the guest with system_reset in the monitor during guest installation causes a guest kernel panic.

Steps to Reproduce:
1. Install the F11 x86_64 guest from CD-ROM.
2. Type 'system_reset' into the qemu monitor.
3. The VM restarts; redo the installation.

CLI:
/usr/libexec/qemu-kvm -drive file=fedora11-64.qcow2,if=ide,cache=off,index=0 -net nic,macaddr=20:20:20:78:69:23 -net tap,script=/etc/qemu-ifup -rtc-td-hack -no-hpet -usbdevice tablet -drive file=Fedora-11-x86_64-DVD.iso,media=cdrom,index=2 -cpu qemu64,+sse2 -smp 2 -m 2048 -boot d -vnc :21 -monitor stdio

Actual results:
Guest: Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(202,17)

host dmesg:
vcpu not ready for apic_round_robin
vcpu not ready for apic_round_robin
vcpu not ready for apic_round_robin
vcpu not ready for apic_round_robin
vcpu not ready for apic_round_robin
vcpu not ready for apic_round_robin
vcpu not ready for apic_round_robin

host CPU:
processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz

Comment 13 Eduardo Habkost 2009-07-02 19:11:34 UTC
(In reply to comment #10)
> michen-->Eduardo, I do not have AMD Barcelona machine, need your help to verify
> this bug.

I no longer see this bug on the machine where I found it, so it seems to be properly fixed.


> 
> Try in intel host, the following problem still exists:
> in Red Hat Enterprise Virtualization Hypervisor release 5.4-2.0.99
> (8.2)(kvm-83-81.el5), restart guest by system_reset in monitor during guest
> installation, guest got kernel panic.

Please open a new BZ for this problem if it has not been reported yet. It is a completely different issue, unrelated to this BZ.

Comment 14 Miya Chen 2009-07-03 02:34:10 UTC
(In reply to comment #13)
> (In reply to comment #10)
> > michen-->Eduardo, I do not have AMD Barcelona machine, need your help to verify
> > this bug.
> 
> I don't see this bug happening on the machine where I've found it, anymore, so
> it seems to be properly solved.
> 
> 
Based on the above, changing the bug status to "verified".
> > 
> > Try in intel host, the following problem still exists:
> > in Red Hat Enterprise Virtualization Hypervisor release 5.4-2.0.99
> > (8.2)(kvm-83-81.el5), restart guest by system_reset in monitor during guest
> > installation, guest got kernel panic.
> 
> Please open a new BZ for this problem, if it was not reported yet. It is
> completely different issue, unrelated to this BZ.

Comment 16 errata-xmlrpc 2009-09-02 09:33:28 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1272.html

