Bug 491631

Summary: Fedora11 Kernel BUG while installing kvm guest os using qcow2
Product: [Fedora] Fedora
Reporter: IBM Bug Proxy <bugproxy>
Component: kvm
Assignee: Glauber Costa <gcosta>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: rawhide
CC: berrange, clalance, ehabkost, gcosta, markmc, quintela, virt-maint
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: All
Doc Type: Bug Fix
Last Closed: 2009-03-25 10:16:23 UTC
Attachments: total boot log

Description IBM Bug Proxy 2009-03-23 12:30:26 UTC
=Comment: #0=================================================
Pavan Naregundi <pavan.naregundi.com> - 

Installing a KVM guest OS (F11Alpha) on a qcow2 image, with F11Alpha as the host OS, produced a
kernel BUG. The call trace is below.
Attachment: total boot log

Machine: x3650
# uname -a
Linux mx3650.in.ibm.com 2.6.29-0.66.rc3.fc11.x86_64 #1 SMP Thu Jan 29 14:44:32 EST 2009 x86_64 x86_64 x86_64 GNU/Linux

==================
detecting hardware...
------------[ cut here ]------------
kernel BUG at include/linux/mm.h:302!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0C0F:02/uevent
CPU 0 
Modules linked in: iscsi_ibft iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ext2 ext4 jbd2 crc16 squashfs pcspkr floppy nfs lockd nfs_acl auth_rpcgss sunrpc vfat fat cramfs
Pid: 663, comm: udevd Not tainted 2.6.29-0.66.rc3.fc11.x86_64 #1
RIP: 0010:[<ffffffff810bb33e>]  [<ffffffff810bb33e>] do_wp_page+0x324/0x6d8
RSP: 0000:ffff880018ae7a88  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880018adf5b8 RCX: ffff880018adf5b8
RDX: ffffe200009f3ec8 RSI: 00007fffe1ab7ce0 RDI: 8000000000000065
RBP: ffff880018ae7af8 R08: ffff880018ade868 R09: ffffe20000a06aa8
R10: ffff880018ae5f00 R11: ffffffff813861d4 R12: ffffe200009f3ec8
R13: 80000000187fd065 R14: ffff880018ae5f00 R15: ffffe20000a06aa8
FS:  00007f47d9ab7790(0000) GS:ffffffff81934000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffe1ab7ce0 CR3: 0000000018ad0000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process udevd (pid: 663, threadinfo ffff880018ae6000, task ffff88001898a390)
Stack:
 0000000000000246 ffff880018ade868 ffff880018adf5b8 00007fffe1ab7ce0
 ffff880018ae5f00 ffff880018a80540 ffff880018ae5f00 00007fffe1ab7ce0
 ffff880018ae7af8 ffff880018adf5b8 ffffe20000a06aa8 80000000187fd065
Call Trace:
 [<ffffffff810bd1f1>] handle_mm_fault+0x7d7/0x88b
 [<ffffffff813861d4>] ? do_page_fault+0x58a/0xa35
 [<ffffffff81386280>] do_page_fault+0x636/0xa35
 [<ffffffff8102a24b>] ? kvm_clock_read+0x1c/0x1e
 [<ffffffff81016ea3>] ? sched_clock+0x9/0xc
 [<ffffffff8106bc68>] ? lock_release_holdtime+0x2c/0x123
 [<ffffffff8138338e>] ? _spin_unlock_irqrestore+0x40/0x57
 [<ffffffff8102ad8c>] ? pvclock_clocksource_read+0x42/0x7e
 [<ffffffff8106c0c4>] ? register_lock_class+0x20/0x35c
 [<ffffffff8102ad8c>] ? pvclock_clocksource_read+0x42/0x7e
 [<ffffffff8106d137>] ? mark_lock+0x22/0x3ad
 [<ffffffff8102ad8c>] ? pvclock_clocksource_read+0x42/0x7e
 [<ffffffff8105f2cb>] ? remove_wait_queue+0x2f/0x38
 [<ffffffff8102a24b>] ? kvm_clock_read+0x1c/0x1e
 [<ffffffff81016ea3>] ? sched_clock+0x9/0xc
 [<ffffffff8106bc68>] ? lock_release_holdtime+0x2c/0x123
 [<ffffffff81383395>] ? _spin_unlock_irqrestore+0x47/0x57
 [<ffffffff8106d719>] ? trace_hardirqs_on_caller+0x12f/0x153
 [<ffffffff8106d74a>] ? trace_hardirqs_on+0xd/0xf
 [<ffffffff8105f2cb>] ? remove_wait_queue+0x2f/0x38
 [<ffffffff8102ad8c>] ? pvclock_clocksource_read+0x42/0x7e
 [<ffffffff81383f20>] ? error_sti+0x5/0x6
 [<ffffffff81382f73>] ? trace_hardirqs_off_thunk+0x3a/0x3c
 [<ffffffff81383ce5>] page_fault+0x25/0x30
Code: 48 89 c1 41 bd 08 00 00 00 e8 75 95 f7 ff e9 43 03 00 00 49 8b 04 24 4c 89 e2 f6 c4 40 74 05 49 8b 54 24 10 8b 42 08 85 c0 75 04 <0f> 0b eb fe 48 8d 42 08 3e ff 42 08 4c 89 ff e8 53 80 2c 00 48 
RIP  [<ffffffff810bb33e>] do_wp_page+0x324/0x6d8
 RSP <ffff880018ae7a88>
---[ end trace 675aea73f911c202 ]---
waiting for hardware to initialize...
BUG: soft lockup - CPU#0 stuck for 61s! [ud

==============================

I followed these steps to get the error:
1. Installed F11Alpha on x3650
2. qemu-img create -f qcow2 Fedora-11-Alpha.qcow2
3. qemu-kvm -cdrom Fedora-11-Alpha-x86_64-DVD.iso -drive file=Fedora-11-Alpha.qcow2,if=scsi -vnc :2 -m 512 -serial stdio
The last step started the installation, which produced the above call trace.
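
The steps above can be collected into a small script. This is only a sketch: the qemu-img line in the report gives no image size, so the IMG_SIZE default below is an assumption, and the variable names, the run helper and the DRY_RUN switch are hypothetical names introduced for illustration. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the reproduction steps from the report.
# IMG_SIZE is an assumption: the report does not state the image size.
ISO="${ISO:-Fedora-11-Alpha-x86_64-DVD.iso}"
IMG="${IMG:-Fedora-11-Alpha.qcow2}"
IMG_SIZE="${IMG_SIZE:-10G}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 prints commands, 0 runs them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 2: create the qcow2 image.
run qemu-img create -f qcow2 "$IMG" "$IMG_SIZE"

# Step 3: boot the installer with the image attached as a SCSI drive.
run qemu-kvm -cdrom "$ISO" -drive "file=$IMG,if=scsi" \
    -vnc :2 -m 512 -serial stdio
```

Setting DRY_RUN=0 executes the two commands instead of printing them.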

This call trace is specifically produced when "if=scsi" and "qcow2" are used.

=Comment: #1=================================================
Anoop V. Chakkalakkal <anoop.vijayan.com> - 
There are problems reported with the KVM paravirt clock
[https://bugzilla.redhat.com/show_bug.cgi?id=475598], but the patch
[http://www.redhat.com/archives/fedora-kernel-list/2009-January/msg00109.html] for disabling it
seems to be present in the kernel.

Is it on the guest that the error message shows up? or in the base machine? Accordingly, could you
please try booting the respective kernel with clocksource=acpi_pm?

Another thing to try is to disable frequency scaling:
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
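
The command above only changes cpu0. On a multi-CPU machine the same setting would likely need to be applied to every CPU. A minimal sketch, assuming the usual cpufreq sysfs layout; the function name set_governor and its optional root-directory argument are hypothetical and exist only so the loop can be tried against a scratch directory:

```shell
#!/bin/sh
# Sketch: write one cpufreq governor to every CPU below a sysfs-style tree.
# The second argument is a hypothetical override of the sysfs CPU directory.
set_governor() {
    governor="$1"
    root="${2:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip CPUs without a writable governor file (also covers an unmatched glob).
        [ -w "$f" ] || continue
        echo "$governor" > "$f"
    done
}
```

Run as root, `set_governor performance` would then cover all CPUs that expose a governor file.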

=Comment: #2=================================================
Pavan Naregundi <pavan.naregundi.com> - 
> Is it on the guest that the error message shows up? or in the base machine?
The error message was seen in the guest machine, while booting up the guest installation. The call
trace is not 100% reproducible; however, I have seen it around 5 times.

> Accordingly, could you please try booting the respective kernel with clocksource=acpi_pm?
I tried multiple times with this option in the guest and have not seen the problem.
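
To confirm that a guest booted with clocksource=acpi_pm is really using that clocksource, the kernel's clocksource sysfs files can be read from inside the guest. A minimal sketch, assuming those files are present in the guest; the function name show_clocksource and its optional directory argument are hypothetical and only there so the function can be tried against a scratch directory:

```shell
#!/bin/sh
# Sketch: print the clocksource in use and the ones the kernel offers.
# The argument is a hypothetical override of the sysfs clocksource directory.
show_clocksource() {
    dir="${1:-/sys/devices/system/clocksource/clocksource0}"
    echo "current:   $(cat "$dir/current_clocksource")"
    echo "available: $(cat "$dir/available_clocksource")"
}
```

Calling `show_clocksource` in the guest before and after adding the boot option shows whether the option took effect.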

Comment 1 IBM Bug Proxy 2009-03-23 12:30:36 UTC
Created attachment 336279 [details]
total boot log

Comment 2 Mark McLoughlin 2009-03-25 10:16:23 UTC
Pretty sure this is a guest pvmmu issue in the F11Alpha kernel which has since been fixed. Please re-open if you see it again in the beta.

See also:

  http://www.mail-archive.com/kvm@vger.kernel.org/msg10312.html

*** This bug has been marked as a duplicate of bug 480822 ***