Bug 696478

Summary: RHEL5.6 guest kernel panic during boot on AMD host
Product: Red Hat Enterprise Linux 6
Component: qemu-kvm
Version: 6.1
Status: CLOSED WORKSFORME
Severity: medium
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Reporter: Qingtang Zhou <qzhou>
Assignee: Gleb Natapov <gleb>
QA Contact: Virtualization Bugs <virt-bugs>
CC: juzhang, knoel, michen, mkenneth, tburke, virt-maint
Doc Type: Bug Fix
Last Closed: 2011-04-20 08:44:56 UTC
Attachments:
Description: guest dmesg; Flags: none

Description Qingtang Zhou 2011-04-14 06:14:47 UTC
Created attachment 491965
guest dmesg

Description of problem:
The RHEL5.6 guest kernel crashes during boot.

Version-Release number of selected component (if applicable):
kernel version: 2.6.32-130.el6.x86_64
qemu version: qemu-kvm-0.12.1.2-2.156.el6

How reproducible:
sometimes

Steps to Reproduce:
1. Run commands on host:
grep -q el5 /proc/version && ([ -e /dev/ksm ] && true || modprobe ksm && ksmctl start 5000 50 ) || (echo 1 > /sys/kernel/mm/ksm/run && echo 5000 > /sys/kernel/mm/ksm/pages_to_scan && echo 50 > /sys/kernel/mm/ksm/sleep_millisecs); echo 'never' > /sys/kernel/mm/redhat_transparent_hugepage/enabled

2. Start the VM with the command:
qemu -name 'vm1' \
-chardev socket,id=qmp_monitor_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20110411-111144-vdyn,server,nowait \
-mon chardev=qmp_monitor_id_qmpmonitor1,mode=control \
-chardev socket,id=serial_id_20110411-111144-vdyn,path=/tmp/serial-20110411-111144-vdyn,server,nowait \
-device isa-serial,chardev=serial_id_20110411-111144-vdyn \
-drive file='RHEL-Server-5.6-32-virtio.qcow2',index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,snapshot=on,format=qcow2,aio=native \
-device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 \
-device virtio-net-pci,netdev=id4N6P0v,mac=9a:6e:6b:60:81:17,id=ndev00id4N6P0v,bus=pci.0,addr=0x3 \
-netdev tap,id=id4N6P0v,vhost=on,ifname='t0-111144-vdyn',script='scripts/qemu-ifup-vbr0',downscript='no' \
-m 3100 -smp 2,cores=1,threads=1,sockets=2 -cpu cpu64-rhel6,+sse2,+x2apic \
-spice port=8000,disable-ticketing -vga qxl -rtc base=utc,clock=host,driftfix=none \
-M rhel6.1.0 -boot order=cdn,once=c,menu=off   -usbdevice tablet \
-no-kvm-pit-reinjection -pidfile /tmp/vm1-20110411-111144-VSaQ.pid -enable-kvm 

3. The guest kernel panics.
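
For readability, the host-side setup from step 1 can be sketched as separate functions. This is only an illustrative restatement, not the script that was actually run: the function names and the KSM_DIR / THP_FILE override variables are hypothetical, while the paths and the values 5000 and 50 are taken from the one-liner in step 1.

```shell
#!/bin/sh
# Hypothetical override variables so the functions can be dry-run against a
# scratch directory; the defaults are the paths used in step 1.
KSM_DIR=${KSM_DIR:-/sys/kernel/mm/ksm}
THP_FILE=${THP_FILE:-/sys/kernel/mm/redhat_transparent_hugepage/enabled}

# RHEL6-style host: KSM is controlled through sysfs.
setup_ksm_el6() {
    echo 1    > "$KSM_DIR/run"              # start ksmd
    echo 5000 > "$KSM_DIR/pages_to_scan"    # pages scanned per wake-up
    echo 50   > "$KSM_DIR/sleep_millisecs"  # sleep between scans
}

# RHEL5-style host: KSM is a module driven by ksmctl.
setup_ksm_el5() {
    [ -e /dev/ksm ] || modprobe ksm
    ksmctl start 5000 50
}

# Turn transparent hugepages off, as the last command in step 1 does.
disable_thp() {
    echo never > "$THP_FILE"
}
```

As root, something like `if grep -q el5 /proc/version; then setup_ksm_el5; else setup_ksm_el6; fi; disable_thp` would then apply it. Unlike the original one-liner, this form does not fall through to the sysfs branch when the el5 branch fails.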
  
Actual results:
guest kernel panic during boot.

Expected results:
guest boots successfully.

Additional info:
guest kernel panic log:
2011-04-11 11:13:18: BUG: unable to handle kernel paging request at virtual address 00100100
2011-04-11 11:13:18:  printing eip:
2011-04-11 11:13:18: c05a40da
2011-04-11 11:13:18: *pde = b9ac7067
2011-04-11 11:13:18: Oops: 0000 [#1]
2011-04-11 11:13:18: SMP
2011-04-11 11:13:18: last sysfs file: /devices/pci0000:00/0000:00:01.2/usb1/1-1/1-1:1.0/bInterfaceProtocol
2011-04-11 11:13:18: Modules linked in: rfcomm l2cap bluetooth lockd sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables ip6_tables x_tables loop dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi ac lp floppy joydev pcspkr parport_pc i2c_piix4 i2c_core parport tpm_tis tpm serio_raw tpm_bios ide_cd cdrom virtio_net dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod virtio_blk virtio_pci virtio_ring virtio ext3 jbd uhci_hcd ohci_hcd ehci_hcd
2011-04-11 11:13:18: CPU:    0
2011-04-11 11:13:18: EIP:    0060:[<c05a40da>]    Tainted: G S   ---- VLI
2011-04-11 11:13:18: EFLAGS: 00010006   (2.6.18-256.el5 #1)
2011-04-11 11:13:18: EIP is at evdev_event+0xf5/0x12f
2011-04-11 11:13:18: eax: f6d17c08   ebx: 000ffcf0   ecx: 00020001   edx: 0000001d
2011-04-11 11:13:18: esi: f7dd50c0   edi: 00000000   ebp: 00000000   esp: c076fe94
2011-04-11 11:13:18: ds: 007b   es: 007b   ss: 0068
2011-04-11 11:13:18: Process swapper (pid: 0, ti=c076f000 task=c068f3c0 task.ti=c070a000)
2011-04-11 11:13:18: Stack: 00000000 c06ba1c0 c2ac8800 f7dd50dc 00000000 c05a1da0 00000000 00000000
2011-04-11 11:13:18:        c2ac8800 c2a28e60 c2a32000 f6e00e80 c059d00d 00000000 00000003 f6e00e84
2011-04-11 11:13:18:        f7da94a8 c05987d8 00000000 00000000 00000100 00000000 00000000 00000001
2011-04-11 11:13:18: Call Trace:
2011-04-11 11:13:18:  [<c05a1da0>] input_event+0x3e6/0x407
2011-04-11 11:13:18:  [<c059d00d>] hidinput_report_event+0x1d/0x40
2011-04-11 11:13:18:  [<c05987d8>] hid_input_report+0x34e/0x399
2011-04-11 11:13:18:  [<c0599a41>] hid_irq_in+0x49/0xcc
2011-04-11 11:13:18:  [<c058dd3e>] usb_hcd_giveback_urb+0x28/0x53
2011-04-11 11:13:18:  [<f88335f9>] uhci_giveback_urb+0x10d/0x135 [uhci_hcd]
2011-04-11 11:13:18:  [<c04e1206>] elv_next_request+0x127/0x134
2011-04-11 11:13:18:  [<f8833bcc>] uhci_scan_schedule+0x4db/0x730 [uhci_hcd]
2011-04-11 11:13:18:  [<f883559a>] uhci_irq+0x118/0x12e [uhci_hcd]
2011-04-11 11:13:18:  [<c058e634>] usb_hcd_irq+0x23/0x50
2011-04-11 11:13:18:  [<c04500fd>] handle_IRQ_event+0x45/0x8c
2011-04-11 11:13:18:  [<c04501c8>] __do_IRQ+0x84/0xd6
2011-04-11 11:13:18:  [<c0450144>] __do_IRQ+0x0/0xd6
2011-04-11 11:13:18:  [<c04074d8>] do_IRQ+0x9b/0xc3
2011-04-11 11:13:18:  [<c040597a>] common_interrupt+0x1a/0x20
2011-04-11 11:13:18:  [<c0403c1c>] default_idle+0x0/0x59
2011-04-11 11:13:18:  [<c0403c4d>] default_idle+0x31/0x59
2011-04-11 11:13:18:  [<c0403d14>] cpu_idle+0x9f/0xb9
2011-04-11 11:13:18:  [<c070f9fc>] start_kernel+0x37b/0x383
2011-04-11 11:13:18:  =======================
2011-04-11 11:13:18: Code: 83 00 04 00 00 ba 1d 00 00 00 40 83 e0 3f 89 83 00 04 00 00 8d 83 08 04 00 00 e8 d8 34 ee ff 8b 9b 10 04 00 00 81 eb 10 04 00 00 <8b> 83 10 04 00 00 0f 18 00 90 8d 93 10 04 00 00 8d 46 50 39 c2
2011-04-11 11:13:18: EIP: [<c05a40da>] evdev_event+0xf5/0x12f SS:ESP 0068:c076fe94
2011-04-11 11:13:18:  <0>Kernel panic - not syncing: Fatal exception in interrupt
2011-04-11 11:13:18:  WARNING: at kernel/panic.c:137 panic()
2011-04-11 11:13:18:  [<c0425610>] panic+0x147/0x15b
2011-04-11 11:13:18:  [<c040650c>] die+0x240/0x274
2011-04-11 11:13:18:  [<c0623088>] do_page_fault+0x41e/0x531
2011-04-11 11:13:18:  [<c0622c6a>] do_page_fault+0x0/0x531
2011-04-11 11:13:19:  [<c0405abd>] error_code+0x39/0x40
2011-04-11 11:13:19:  [<c05a40da>] evdev_event+0xf5/0x12f
2011-04-11 11:13:19:  [<c05a1da0>] input_event+0x3e6/0x407
2011-04-11 11:13:19:  [<c059d00d>] hidinput_report_event+0x1d/0x40
2011-04-11 11:13:19:  [<c05987d8>] hid_input_report+0x34e/0x399
2011-04-11 11:13:19:  [<c0599a41>] hid_irq_in+0x49/0xcc
2011-04-11 11:13:19:  [<c058dd3e>] usb_hcd_giveback_urb+0x28/0x53
2011-04-11 11:13:19:  [<f88335f9>] uhci_giveback_urb+0x10d/0x135 [uhci_hcd]
2011-04-11 11:13:19:  [<c04e1206>] elv_next_request+0x127/0x134
2011-04-11 11:13:19:  [<f8833bcc>] uhci_scan_schedule+0x4db/0x730 [uhci_hcd]
2011-04-11 11:13:19:  [<f883559a>] uhci_irq+0x118/0x12e [uhci_hcd]
2011-04-11 11:13:19:  [<c058e634>] usb_hcd_irq+0x23/0x50
2011-04-11 11:13:19:  [<c04500fd>] handle_IRQ_event+0x45/0x8c
2011-04-11 11:13:19:  [<c04501c8>] __do_IRQ+0x84/0xd6
2011-04-11 11:13:19:  [<c0450144>] __do_IRQ+0x0/0xd6
2011-04-11 11:13:19:  [<c04074d8>] do_IRQ+0x9b/0xc3
2011-04-11 11:13:19:  [<c040597a>] common_interrupt+0x1a/0x20
2011-04-11 11:13:19:  [<c0403c1c>] default_idle+0x0/0x59
2011-04-11 11:13:19:  [<c0403c4d>] default_idle+0x31/0x59
2011-04-11 11:13:19:  [<c0403d14>] cpu_idle+0x9f/0xb9
2011-04-11 11:13:19:  [<c070f9fc>] start_kernel+0x37b/0x383
2011-04-11 11:13:19:  =======================
2011-04-11 11:13:19: WARNING: at drivers/input/serio/i8042.c:846 i8042_panic_blink()
2011-04-11 11:13:19:  [<c05a0414>] i8042_panic_blink+0xce/0x206
2011-04-11 11:13:19:  [<c04255d2>] panic+0x109/0x15b
2011-04-11 11:13:19:  [<c040650c>] die+0x240/0x274
2011-04-11 11:13:19:  [<c0623088>] do_page_fault+0x41e/0x531
2011-04-11 11:13:19:  [<c0622c6a>] do_page_fault+0x0/0x531
2011-04-11 11:13:19:  [<c0405abd>] error_code+0x39/0x40
2011-04-11 11:13:19:  [<c05a40da>] evdev_event+0xf5/0x12f
2011-04-11 11:13:19:  [<c05a1da0>] input_event+0x3e6/0x407
2011-04-11 11:13:19:  [<c059d00d>] hidinput_report_event+0x1d/0x40
2011-04-11 11:13:19:  [<c05987d8>] hid_input_report+0x34e/0x399
2011-04-11 11:13:19:  [<c0599a41>] hid_irq_in+0x49/0xcc
2011-04-11 11:13:19:  [<c058dd3e>] usb_hcd_giveback_urb+0x28/0x53
2011-04-11 11:13:19:  [<f88335f9>] uhci_giveback_urb+0x10d/0x135 [uhci_hcd]
2011-04-11 11:13:19:  [<c04e1206>] elv_next_request+0x127/0x134
2011-04-11 11:13:19:  [<f8833bcc>] uhci_scan_schedule+0x4db/0x730 [uhci_hcd]
2011-04-11 11:13:19:  [<f883559a>] uhci_irq+0x118/0x12e [uhci_hcd]
2011-04-11 11:13:19:  [<c058e634>] usb_hcd_irq+0x23/0x50
2011-04-11 11:13:19:  [<c04500fd>] handle_IRQ_event+0x45/0x8c
2011-04-11 11:13:19:  [<c04501c8>] __do_IRQ+0x84/0xd6
2011-04-11 11:13:19:  [<c0450144>] __do_IRQ+0x0/0xd6
2011-04-11 11:13:19:  [<c04074d8>] do_IRQ+0x9b/0xc3
2011-04-11 11:13:19:  [<c040597a>] common_interrupt+0x1a/0x20
2011-04-11 11:13:19:  [<c0403c1c>] default_idle+0x0/0x59
2011-04-11 11:13:19:  [<c0403c4d>] default_idle+0x31/0x59
2011-04-11 11:13:19:  [<c0403d14>] cpu_idle+0x9f/0xb9
2011-04-11 11:13:19:  [<c070f9fc>] start_kernel+0x37b/0x383
2011-04-11 11:13:19:  =======================
2011-04-11 11:13:19: WARNING: at drivers/input/serio/i8042.c:849 i8042_panic_blink()
2011-04-11 11:13:19:  [<c05a04c6>] i8042_panic_blink+0x180/0x206
2011-04-11 11:13:19:  [<c04255d2>] panic+0x109/0x15b
2011-04-11 11:13:19:  [<c040650c>] die+0x240/0x274
2011-04-11 11:13:19:  [<c0623088>] do_page_fault+0x41e/0x531
2011-04-11 11:13:19:  [<c0622c6a>] do_page_fault+0x0/0x531
2011-04-11 11:13:19:  [<c0405abd>] error_code+0x39/0x40
2011-04-11 11:13:19:  [<c05a40da>] evdev_event+0xf5/0x12f
2011-04-11 11:13:19:  [<c05a1da0>] input_event+0x3e6/0x407
2011-04-11 11:13:19:  [<c059d00d>] hidinput_report_event+0x1d/0x40
2011-04-11 11:13:19:  [<c05987d8>] hid_input_report+0x34e/0x399
2011-04-11 11:13:19:  [<c0599a41>] hid_irq_in+0x49/0xcc
2011-04-11 11:13:19:  [<c058dd3e>] usb_hcd_giveback_urb+0x28/0x53
2011-04-11 11:13:19:  [<f88335f9>] uhci_giveback_urb+0x10d/0x135 [uhci_hcd]
2011-04-11 11:13:19:  [<c04e1206>] elv_next_request+0x127/0x134
2011-04-11 11:13:19:  [<f8833bcc>] uhci_scan_schedule+0x4db/0x730 [uhci_hcd]
2011-04-11 11:13:19:  [<f883559a>] uhci_irq+0x118/0x12e [uhci_hcd]
2011-04-11 11:13:19:  [<c058e634>] usb_hcd_irq+0x23/0x50
2011-04-11 11:13:19:  [<c04500fd>] handle_IRQ_event+0x45/0x8c
2011-04-11 11:13:19:  [<c04501c8>] __do_IRQ+0x84/0xd6
2011-04-11 11:13:19:  [<c0450144>] __do_IRQ+0x0/0xd6
2011-04-11 11:13:19:  [<c04074d8>] do_IRQ+0x9b/0xc3
2011-04-11 11:13:19:  [<c040597a>] common_interrupt+0x1a/0x20
2011-04-11 11:13:19:  [<c0403c1c>] default_idle+0x0/0x59
2011-04-11 11:13:19:  [<c0403c4d>] default_idle+0x31/0x59
2011-04-11 11:13:19:  [<c0403d14>] cpu_idle+0x9f/0xb9
2011-04-11 11:13:19:  [<c070f9fc>] start_kernel+0x37b/0x383
2011-04-11 11:13:19:  =======================
2011-04-11 11:13:19: WARNING: at drivers/input/serio/i8042.c:851 i8042_panic_blink()
2011-04-11 11:13:19:  [<c05a0523>] i8042_panic_blink+0x1dd/0x206
2011-04-11 11:13:19:  [<c04255d2>] panic+0x109/0x15b
2011-04-11 11:13:19:  [<c040650c>] die+0x240/0x274
2011-04-11 11:13:19:  [<c0623088>] do_page_fault+0x41e/0x531
2011-04-11 11:13:19:  [<c0622c6a>] do_page_fault+0x0/0x531
2011-04-11 11:13:19:  [<c0405abd>] error_code+0x39/0x40
2011-04-11 11:13:19:  [<c05a40da>] evdev_event+0xf5/0x12f
2011-04-11 11:13:19:  [<c05a1da0>] input_event+0x3e6/0x407
2011-04-11 11:13:19:  [<c059d00d>] hidinput_report_event+0x1d/0x40
2011-04-11 11:13:19:  [<c05987d8>] hid_input_report+0x34e/0x399
2011-04-11 11:13:19:  [<c0599a41>] hid_irq_in+0x49/0xcc
2011-04-11 11:13:19:  [<c058dd3e>] usb_hcd_giveback_urb+0x28/0x53
2011-04-11 11:13:19:  [<f88335f9>] uhci_giveback_urb+0x10d/0x135 [uhci_hcd]
2011-04-11 11:13:19:  [<c04e1206>] elv_next_request+0x127/0x134
2011-04-11 11:13:19:  [<f8833bcc>] uhci_scan_schedule+0x4db/0x730 [uhci_hcd]
2011-04-11 11:13:19:  [<f883559a>] uhci_irq+0x118/0x12e [uhci_hcd]
2011-04-11 11:13:19:  [<c058e634>] usb_hcd_irq+0x23/0x50
2011-04-11 11:13:19:  [<c04500fd>] handle_IRQ_event+0x45/0x8c
2011-04-11 11:13:19:  [<c04501c8>] __do_IRQ+0x84/0xd6
2011-04-11 11:13:19:  [<c0450144>] __do_IRQ+0x0/0xd6
2011-04-11 11:13:19:  [<c04074d8>] do_IRQ+0x9b/0xc3
2011-04-11 11:13:19:  [<c040597a>] common_interrupt+0x1a/0x20
2011-04-11 11:13:19:  [<c0403c1c>] default_idle+0x0/0x59
2011-04-11 11:13:19:  [<c0403c4d>] default_idle+0x31/0x59
2011-04-11 11:13:19:  [<c0403d14>] cpu_idle+0x9f/0xb9
2011-04-11 11:13:19:  [<c070f9fc>] start_kernel+0x37b/0x383
2011-04-11 11:13:19:  =======================
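
A side note on the oops above, offered as a reading of the log rather than a confirmed diagnosis: the faulting address 00100100 equals LIST_POISON1 (0x00100100 on 32-bit x86 kernels of this era), the value the kernel writes into a list entry's next pointer when the entry is deleted. The marked bytes in the Code: line, `<8b> 83 10 04 00 00`, decode as a load from 0x410(%ebx); that decoding is an assumption made here, not something stated in the report. With ebx = 000ffcf0 from the register dump, the arithmetic can be checked as follows (variable names are illustrative only):

```shell
#!/bin/sh
# Values copied from the oops output above.
ebx=0x000ffcf0            # ebx in the register dump
disp=0x410                # displacement decoded from "<8b> 83 10 04 00 00" (assumed decoding)
list_poison1=0x00100100   # LIST_POISON1 on 32-bit kernels of this era

# Address the faulting load would touch: ebx + displacement.
fault_addr=$(printf '%08x' $(( $ebx + $disp )))
echo "computed fault address: $fault_addr"

if [ $(( $ebx + $disp )) -eq $(( $list_poison1 )) ]; then
    verdict="matches LIST_POISON1"
else
    verdict="does not match LIST_POISON1"
fi
echo "$verdict"
```

The computed address matches the one reported in the "unable to handle kernel paging request" line, which would be consistent with evdev_event() walking a list that contains an already-deleted entry. The report itself does not establish this.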

Comment 2 Dor Laor 2011-04-14 08:12:07 UTC
Is this reproducible?

Comment 3 Qingtang Zhou 2011-04-14 08:26:25 UTC
(In reply to comment #2)
> Is this reproducible?
I am trying to reproduce this bug with autotest and hope to have a result tomorrow.

Comment 4 Gleb Natapov 2011-04-14 09:06:50 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > Is this reproducible?
> I am trying to reproduce this bug with autotest and hope to have a result
> tomorrow.

If you can reproduce it, verify that it is still reproducible (1) with KSM disabled and (2) without the USB tablet. Note: change only one of these at a time, not both simultaneously.

Comment 5 Qingtang Zhou 2011-04-15 03:01:44 UTC
(In reply to comment #4)
> If you can reproduce it, verify that it is still reproducible (1) with KSM
> disabled and (2) without the USB tablet. Note: change only one of these at a
> time, not both simultaneously.

Hi, Gleb,
I am trying to reproduce this problem as you advised, but it seems it will take
a little more time; it is hard to reproduce.

Comment 6 Qingtang Zhou 2011-04-18 07:05:28 UTC
Hi,
I tried booting the RHEL4.9 guest about 200 times, but no kernel panic occurred again.

Comment 7 Gleb Natapov 2011-04-18 07:32:37 UTC
(In reply to comment #6)
> Hi,
> I tried booting the RHEL4.9 guest about 200 times, but no kernel panic occurred
> again.

Why RHEL4.9? The bug title talks about RHEL5.6.

Comment 8 Qingtang Zhou 2011-04-18 09:04:08 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > Hi,
> > I tried booting the RHEL4.9 guest about 200 times, but no kernel panic occurred
> > again.
> 
> Why RHEL4.9? The bug title talks about RHEL5.6.

Sorry, that was a typo, my fault. I meant RHEL5.6.

Comment 10 Gleb Natapov 2011-04-20 08:44:56 UTC
Closing the bug since it is not reproducible. Reopen if you see it again.