Bug 1052861 - Double fault panic in L2 upon v2v conversion
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Virtualization Tools
Classification: Community
Component: virt-v2v
Version: unspecified
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Matthew Booth
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-01-14 09:12 UTC by rom
Modified: 2014-09-10 12:56 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2014-09-10 12:56:49 UTC
Embargoed:



Description rom 2014-01-14 09:12:23 UTC
Description of problem:
During v2v conversion of a Fedora OVA image inside a VM, a double fault panic occurs and L2 crashes during the libguestfs conversion.

The crash happens at different stages, but usually under memory pressure in L0.

There are no error logs in L1, and I cannot find a strong correlation with the patches that were added to L0 KVM to avoid an L0 crash when running a nested VM under high memory pressure: http://git.kernel.org/cgit/virt/kvm/kvm.git/patch/arch/x86/kvm/mmu.c?id=989c6b34f6a9480e397b170cc62237e89bf4fdb9.

Command run within L1 to perform the conversion (fedora.ova, a VMDK image of Fedora, was placed on the VM in advance):

LIBGUESTFS_TRACE=1 LIBGUESTFS_DEBUG=1 /usr/bin/virt-v2v -i ova -os default -oc qemu:///system -of qcow2 -n default /var/tmp/fedora-v2v.ova


Version-Release number of selected component (if applicable):
L0: 
Kernel: 3.11.8-200.fc19 + nested crash patches
libvirtd: 1.0.5.8
qemu: 1.6.1
libguestfs-test-tool 1.22.7fedora=19,release=1.fc19,libvirt
L1:
Kernel: 3.11.8-200..fc19.x86_64
libvirtd: 1.0.5.8
qemu: 1.6.1 + v2v patch (skip vmdk version verification)
libguestfs-test-tool 1.22.7fedora=19,release=2.fc19,libvirt
virt-v2v 0.9.0

L2:
Kernel: 3.11.10-301.fc20

How reproducible:
LIBGUESTFS_TRACE=1 LIBGUESTFS_DEBUG=1 /usr/bin/virt-v2v -i ova -os default -oc qemu:///system -of qcow2 -n default /var/tmp/fedora-v2v.ova

Steps to Reproduce:
1. Upload the OVA image to the L1 VM
2. Run virt-v2v in L1 to perform the conversion
3. Add some memory pressure on L0 (dd if=/dev/urandom of=/tmp/bigfile count=6M)
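
The steps above could be scripted roughly as follows. This is only a sketch: the virt-v2v and dd arguments are copied verbatim from this report, while the function names and the v2v.log file name are hypothetical and chosen for illustration. The conversion is meant to run inside L1 and the dd command on L0.

```python
import os
import subprocess

# Path taken from the command line in this report.
OVA_PATH = "/var/tmp/fedora-v2v.ova"


def v2v_command(ova_path=OVA_PATH):
    """Return the virt-v2v argv used in this report (to be run inside L1)."""
    return ["/usr/bin/virt-v2v", "-i", "ova", "-os", "default",
            "-oc", "qemu:///system", "-of", "qcow2", "-n", "default",
            ova_path]


def v2v_env(base=None):
    """Return an environment with libguestfs tracing and debugging enabled."""
    env = dict(os.environ if base is None else base)
    env["LIBGUESTFS_TRACE"] = "1"
    env["LIBGUESTFS_DEBUG"] = "1"
    return env


def pressure_command(path="/tmp/bigfile", count="6M"):
    """Return the dd argv from step 3 (to be run on L0)."""
    return ["dd", "if=/dev/urandom", f"of={path}", f"count={count}"]


def run_conversion(log_path="v2v.log"):  # hypothetical log file name
    """Run the conversion, saving stdout and stderr for later inspection."""
    with open(log_path, "w") as log:
        return subprocess.call(v2v_command(), env=v2v_env(),
                               stdout=log, stderr=subprocess.STDOUT)
```

Calling run_conversion() in L1 while the dd command runs on L0 approximates the reproduction; the saved log should then contain the libguestfs trace and the L2 console output.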

Actual results:
guestfsd: main_loop: new request, len 0x3c
mount -o ro /dev/sdb /sysroot/
[   12.645305] PANIC: double fault, error_code: 0x0
[   12.645305] CPU: 0 PID: 141 Comm: mount Not tainted 3.11.8-200.strato0002.fc19.strato.c3850ae03e9d.x86_64 #1
[   12.645305] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   12.645305] task: ffff88001cc816e0 ti: ffff88001cde6000 task.ti: ffff88001cde6000
[   12.645305] RIP: 0033:[<00007fa602c5b99b>]  [<00007fa602c5b99b>] 0x7fa602c5b99a
[   12.645305] RSP: 002b:00007fff4f5884a0  EFLAGS: 00010216
[   12.645305] RAX: 00007fa602008ff8 RBX: 00007fa601ff0000 RCX: 00007fa601ff0000
[   12.645305] RDX: 00000000003b7068 RSI: 00007fff4f588560 RDI: 00007fa601ff3d18
[   12.645305] RBP: 00007fff4f5885d0 R08: 00007fa60200f310 R09: 0000000000000000
[   12.645305] R10: 0000000000000022 R11: 00007fa60200f310 R12: 00007fa60200e9b0
[   12.645305] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fa602e6e990
[   12.645305] FS:  00007fa602e69880(0000) GS:ffff88001f000000(0000) knlGS:0000000000000000
[   12.645305] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   12.645305] CR2: 0000000000000000 CR3: 000000001d7fb000 CR4: 00000000000006f0
[   12.645305] 
[   12.645305] Kernel panic - not syncing: Machine halted.
[   12.645305] CPU: 0 PID: 141 Comm: mount Not tainted 3.11.8-200.strato0002.fc19.strato.c3850ae03e9d.x86_64 #1
[   12.645305] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   12.645305]  ffff88001f005f58 ffff88001f005e90 ffffffff8164024b ffffffff819e89dc
[   12.645305]  ffff88001f005f08 ffffffff8163c272 0000000000000008 ffff88001f005f18
[   12.645305]  ffff88001f005eb8 ffffffff8163c8e5 0000000000000046 00000000000000b1
[   12.645305] Call Trace:
[   12.645305]  <#DF>  [<ffffffff8164024b>] dump_stack+0x45/0x56
[   12.645305]  [<ffffffff8163c272>] panic+0xc8/0x1d7
[   12.645305]  [<ffffffff8163c8e5>] ? printk+0x67/0x69
[   12.645305]  [<ffffffff81048ae1>] df_debug+0x31/0x40
[   12.645305]  [<ffffffff810132ed>] do_double_fault+0x5d/0x80
[   12.645305]  [<ffffffff81650b88>] double_fault+0x28/0x30
[   12.645305]  <<EOE>> 
[   12.645305] Rebooting in 1 seconds..libguestfs: child_cleanup: 0x3a05f50: child process died
libguestfs: sending SIGTERM to process 1526
libguestfs: trace: mount_ro = -1 (error)
libguestfs: trace: vfs_type "/dev/sda1"
libguestfs: trace: vfs_type = NULL (error)
libguestfs: check_for_filesystem_on: /dev/sda1 (failed to get vfs type)
libguestfs: trace: internal_parse_mountable "/dev/sda1"
libguestfs: trace: internal_parse_mountable = NULL (error)
libguestfs: trace: inspect_os = NULL (error)
libguestfs: trace: close
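
To check many runs for this failure, a log like the one above can be scanned automatically. The following is a minimal sketch that assumes the log format shown in this report (kernel timestamps in square brackets, and libguestfs trace lines ending in "(error)"); the function name summarize is hypothetical.

```python
import re

# Matches the kernel line "[   12.645305] PANIC: double fault, error_code: 0x0".
PANIC_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+PANIC: double fault, error_code: (0x[0-9a-fA-F]+)")

# Matches libguestfs trace results such as "mount_ro = -1 (error)".
TRACE_ERR_RE = re.compile(r"libguestfs: trace: (\w+) = (?:-1|NULL) \(error\)")


def summarize(log_text):
    """Return (panics, failed_calls) found in a virt-v2v debug log.

    panics is a list of (timestamp, error_code) tuples;
    failed_calls is a list of libguestfs API names that returned an error.
    """
    panics = []
    failed_calls = []
    for line in log_text.splitlines():
        m = PANIC_RE.match(line)
        if m:
            panics.append((float(m.group(1)), m.group(2)))
        m = TRACE_ERR_RE.search(line)
        if m:
            failed_calls.append(m.group(1))
    return panics, failed_calls
```

A non-empty panics list indicates that L2 hit the double fault during that run; failed_calls lists the libguestfs calls that failed afterwards.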


Additional info:
The same crash also happens, though more rarely, when L0 runs 3.11.9 (with the KVM patch to avoid an L0 crash in a nested environment: http://git.kernel.org/cgit/virt/kvm/kvm.git/patch/arch/x86/kvm/mmu.c?id=989c6b34f6a9480e397b170cc62237e89bf4fdb9).

Comment 1 Richard W.M. Jones 2014-09-10 12:56:49 UTC
This is likely to be a general kernel problem.  However I would
suggest turning off nested KVM as things will be slower but much
more stable.
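
Before disabling nested KVM, it may help to confirm whether it is currently enabled on the host. The sketch below reads the "nested" module parameter that the kvm_intel and kvm_amd modules expose under /sys/module; the function name and the sysfs_root argument are hypothetical and exist only to make the example testable.

```python
from pathlib import Path


def nested_kvm_status(sysfs_root="/sys"):
    """Return {module_name: bool} for each loaded KVM vendor module.

    The parameter file contains "Y"/"N" or "1"/"0" depending on the
    kernel version; modules that are not loaded are omitted.
    """
    status = {}
    for module in ("kvm_intel", "kvm_amd"):
        param = Path(sysfs_root) / "module" / module / "parameters" / "nested"
        if param.is_file():
            status[module] = param.read_text().strip() in ("Y", "1")
    return status
```

If a module reports True, nested virtualization is on. It is typically turned off with a modprobe option such as "options kvm_intel nested=0" followed by reloading the module, though the exact procedure depends on the distribution.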

