Bug 922075 - nested VMX L2 virtual machine immediately paused after running (fixed in 3.12)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-03-15 13:57 UTC by Pavel Zhukov
Modified: 2014-06-18 07:21 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2014-01-08 14:25:25 UTC
Type: Bug
Embargoed:


Attachments
L1 qemu log (32.18 KB, text/plain)
2013-03-15 14:02 UTC, Pavel Zhukov
blank nested VM (14.51 KB, text/plain)
2013-03-15 14:03 UTC, Pavel Zhukov
L0 cpuinfo (1.48 KB, text/plain)
2013-03-15 14:03 UTC, Pavel Zhukov
L1 cpuinfo (673 bytes, application/octet-stream)
2013-03-15 14:04 UTC, Pavel Zhukov

Description Pavel Zhukov 2013-03-15 13:57:41 UTC
Description of problem:
L2 virtual machine paused with error:
kvm: unhandled exit 0
kvm_run returned -22

Version-Release number of selected component (if applicable):
L0 (Fedora 18): qemu-kvm-1.2.2-6.fc18.x86_64
L1 (RHEV-H): QEMU PC emulator version 0.12.1 (qemu-kvm-0.12.1.2), Copyright (c) 2003-2008 Fabrice Bellard
L2: blank VM


How reproducible:
Reproducible on the Core2 host; not reproducible on the i7 host
  
Actual results:
The VM is paused, or is reported as running but is in fact paused

Additional info:
cpuinfo and qemu string are in attachments

Comment 1 Pavel Zhukov 2013-03-15 14:02:00 UTC
Created attachment 710672 [details]
L1 qemu log

Comment 2 Pavel Zhukov 2013-03-15 14:03:00 UTC
Created attachment 710673 [details]
blank nested VM

Comment 3 Pavel Zhukov 2013-03-15 14:03:35 UTC
Created attachment 710674 [details]
L0 cpuinfo

Comment 4 Pavel Zhukov 2013-03-15 14:04:09 UTC
Created attachment 710675 [details]
L1 cpuinfo

Comment 5 Paolo Bonzini 2013-03-15 15:35:20 UTC
Can you provide reproduction instructions, starting from bringing up the L1 guest all the way to the error?

Comment 6 Pavel Zhukov 2013-03-15 15:51:33 UTC
I've checked with 2 instances of RHEV-H isos. 

0) Enable nested KVM
1) Install RHEV-H with default settings.
2) virsh capabilities ->  copy <cpu> section -> virsh edit rhevh1 -> paste cpu section 
3) Start RHEV-H VM. 
4) create new VM using RHEV-M interface
5) Try to launch the VM on all RHEV-H hosts (2 are on the affected host, 1 is on my laptop)
6) On the affected host the VM becomes paused immediately, or sometimes it looks "running" from the RHEV point of view but is actually paused (the SPICE console opens but is blank). On my laptop (just checked: it's an F17 machine, not F18) the same VM works perfectly.
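
Step 0 does not say how nested KVM was enabled or verified. A minimal sketch for checking it on L0, assuming an Intel host that uses the kvm_intel module; `nested_state` is a hypothetical helper name, and its optional argument exists only so the function can be pointed at a different file:

```shell
#!/bin/sh
# Report whether nested VMX is enabled on L0 (assumes an Intel host with kvm_intel).
# nested_state is a hypothetical helper; by default it reads the module's
# "nested" parameter from sysfs.
nested_state() {
    param=${1:-/sys/module/kvm_intel/parameters/nested}
    if [ ! -r "$param" ]; then
        echo "unavailable"   # kvm_intel not loaded, or not an Intel KVM host
        return 0
    fi
    case "$(cat "$param")" in
        Y|1) echo "enabled" ;;
        *)   echo "disabled" ;;
    esac
}

nested_state
```

If this reports "disabled", the usual approach is to put `options kvm_intel nested=1` in a file under /etc/modprobe.d/ and reload the module while no guests are running.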

Comment 7 Cole Robinson 2013-04-01 22:08:25 UTC
Orit, any hints on what to look for here?

Comment 8 Orit Wasserman 2013-04-02 06:32:13 UTC
(In reply to comment #7)
> Orit, any hints on what to look for here?
The error means we have an invalid value in the VMCS (we don't have L0 logs, but it is usually an invalid guest state error). It happens at the very beginning of running the guest, so the guest is probably in real mode.
I'm taking a guess here, but as it only happens on Core2 it may be related to unrestricted guest mode.
You can try disabling it in the L0 KVM and see if it helps.
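
The comment does not spell out how to disable unrestricted guest mode on L0. A minimal sketch, assuming the `unrestricted_guest` parameter of the kvm_intel module is the knob meant here; `kvm_intel_options` is a hypothetical helper that only prints a modprobe.d line and changes nothing on the system:

```shell
#!/bin/sh
# Print a modprobe.d options line for kvm_intel on L0.
# kvm_intel_options is a hypothetical helper; its arguments are the desired
# values (0 or 1) for the nested and unrestricted_guest parameters.
kvm_intel_options() {
    nested=${1:-1}
    unrestricted=${2:-0}
    printf 'options kvm_intel nested=%s unrestricted_guest=%s\n' "$nested" "$unrestricted"
}

# Keep nested VMX on, turn unrestricted guest off.
kvm_intel_options 1 0

# To apply it (as root, with every guest shut down), the line would typically
# be saved in a file under /etc/modprobe.d/ and the module reloaded:
#   modprobe -r kvm_intel && modprobe kvm_intel
```

Whether this helps is untested in this report; it only illustrates the experiment being proposed.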

Comment 9 Orit Wasserman 2013-04-02 06:36:12 UTC
Hi,
what are L0 qemu and kernel version?
L1 qemu and kernel version?

Comment 10 Pavel Zhukov 2013-04-03 12:41:35 UTC
(In reply to comment #9)
> Hi,
> what are L0 qemu and kernel version?
L0 (Fedora 18): qemu-kvm-1.2.2-6.fc18.x86_64
kernel-3.8.2-206.fc18.x86_64
> L1 qemu and kernel version?
L1 (RHEV-H): QEMU PC emulator version 0.12.1 (qemu-kvm-0.12.1.2), Copyright (c) 2003-2008 Fabrice Bellard
qemu-kvm-rhev-0.12.1.2-2.385.el6_4
kernel-2.6.32-358.2.1.el6

Comment 11 Pavel Zhukov 2013-06-05 09:33:05 UTC
tested:

L0: F18
L1: rhevh
L2: F18 or F17 or RHEL6
Failed (machine is paused)

L0: gentoo (3.8.13 qemu-1.4.2)
L1: rhevh
L2: F17
Working without issues

Comment 12 Cole Robinson 2013-07-11 21:15:25 UTC
Pavel, can you list the kernel versions for the latest tested L0 F18? The kernel is on a different major version now compared with your info in comment #10.

Comment 13 Pavel Zhukov 2013-07-15 08:55:33 UTC
Tested with: 3.9.4-301.fc19.x86_64
Changing to F19

Comment 14 Anil Vettathu 2013-07-15 16:31:38 UTC
The L1 VM also hangs on kernel-3.9.9-302.fc19.x86_64.

L0: F19
L1: RHEV-H 6.4
L2: RHEL6

Comment 15 Ken Sugawara 2013-09-25 03:35:23 UTC
Same thing just happened on my f18 box:

L0: F18 kernel-3.10.12-100.fc18 / qemu-kvm-1.2.2-14.fc18
L1: RHEL 6 kernel-2.6.32-358.18.1.el6 / qemu-kvm-0.12.1.2-2.355.el6_4.7
L2: RHEL 6.4 install (pauses shortly after started; resume won't work complaining "libvirtError: internal error unable to execute QEMU command 'cont': Resetting the Virtual Machine is required")

Just my 2c,

Comment 16 Joe Giordano 2013-10-16 15:53:38 UTC
Seeing the same behavior:

L0 - Lenovo Laptop T530i - 
F19 - 3.11.3-201.fc19.x86_64
libvirt-1.0.5.6-2.fc19.x86_64
qemu-kvm-1.4.2-11.fc19.x86_64

L1 -
1) mgr -  rhevm-3.1.0-43.el6ev.noarch / RHEL 6.4 - 2.6.32-358.el6.x86_64 
2) rhev-h 6.4-20130815.0.el6_4

Comment 17 Kashyap Chamarthy 2013-10-16 16:41:19 UTC
Hi,

I'm using the versions below and am not hitting the described issue of L2 pausing.

That said, I should state my environment info clearly:

1/ My L0 and L1 are running Kernels compiled from 'kvm.git queue'. That's the commit ID:

  $ git log | head -1
  commit 32024367cabf4c90fada531b949d2b109afc755c

2/ I'm running minimal F19 (@core) on L0, L1, L2

3/ (Even the below versions work just fine)

  $ rpm -q libvirt-daemon-kvm qemu-kvm
  libvirt-daemon-kvm-1.0.5.5-1.fc19.x86_64
  qemu-kvm-1.4.2-7.fc19.x86_64

4/ I'm using Intel Haswell

5/ libvirt qemu-system-x86_64 command line:

L1:
---
$ ps -ef | grep qemu
qemu      1521     1 99 11:23 ?        01:22:19 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name regular-guest -S -machine pc-i440fx-1.4,accel=kvm,usb=off -cpu host -m 10240 -smp 4,sockets=4,cores=1,threads=1 -uuid 4ed9ac0b-7f72-dfcf-68b3-e6fe2ac588b2 -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/regular-guest.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot c -usb -drive file=/home/test/vmimages/regular-guest.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:80:c1:34,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

L2:
---
$ ps -ef | grep -i qemu
qemu      1186     1 99 11:24 ?        01:21:38 /usr/bin/qemu-system-x86_64 -machine accel=kvm -name nguest-01 -S -machine pc-i440fx-1.4,accel=kvm,usb=off -m 2048 -smp 2,sockets=2,cores=1,threads=1 -uuid b47c5cbb-b320-ce9d-c595-4e083b0e465d -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/nguest-01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/home/test/vmimages/nguest-01.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:be:d5:8e,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

Comment 18 Joe Giordano 2013-10-16 18:39:40 UTC
In my environment I was able to stop the VM pausing by changing the L1 hypervisor's CPU model from SandyBridge to Nehalem in virt-manager. I then successfully installed a RHEL VM.

L0 - Lenovo Laptop T530i - 
F19 - 3.11.3-201.fc19.x86_64
libvirt-1.0.5.6-2.fc19.x86_64
qemu-kvm-1.4.2-11.fc19.x86_64

L1 -
1) mgr -  rhevm-3.1.0-43.el6ev.noarch / RHEL 6.4 - 2.6.32-358.el6.x86_64 
2) rhev-h 6.4-20130815.0.el6_4
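
To confirm which CPU model an L1 guest is actually configured with after a change like this, the domain XML can be inspected. A minimal sketch, assuming the model appears as a `<model>...</model>` element on a single line, as libvirt normally writes it; `cpu_model` is a hypothetical helper and the sample XML below is illustrative, not taken from this report:

```shell
#!/bin/sh
# Print the first CPU model named in libvirt domain XML read from stdin.
# cpu_model is a hypothetical helper. Self-closing <model .../> elements
# (used for NICs and video devices) have no closing tag and are not matched.
cpu_model() {
    sed -n 's:.*<model[^>]*>\([^<]*\)</model>.*:\1:p' | head -n 1
}

# Illustrative input; on a real host the XML would come from
# `virsh dumpxml <domain>` instead of a here-document.
cpu_model <<'EOF'
<domain type='kvm'>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>Nehalem</model>
  </cpu>
</domain>
EOF
```

On L0 this could be used as `virsh dumpxml rhevh1 | cpu_model`, where rhevh1 is the domain name mentioned in comment 6.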

Comment 19 Anil Vettathu 2013-10-17 08:00:05 UTC
(In reply to Joe Giordano from comment #18)
> In my environment I was able to stop the VM pausing by Setting the L1
> Hypervisor from SandyBridge to Nehalem in Virtmanager. I successfully
> installed a RHEL VM. 
> 
> L0 - Lenovo Laptop T530i - 
> F19 - 3.11.3-201.fc19.x86_64
> libvirt-1.0.5.6-2.fc19.x86_64
> qemu-kvm-1.4.2-11.fc19.x86_64
> 
> L1 -
> 1) mgr -  rhevm-3.1.0-43.el6ev.noarch / RHEL 6.4 - 2.6.32-358.el6.x86_64 
> 2) rhev-h 6.4-20130815.0.el6_4

In my case, even with Nehalem the nested VM freezes if I try to start a VM inside it.

Comment 20 Cole Robinson 2013-10-31 20:45:04 UTC
*** Bug 1016748 has been marked as a duplicate of this bug. ***

Comment 21 Cole Robinson 2013-10-31 20:47:07 UTC
When I talked to the KVM maintainers, they told me '3.12 should be much better for nested VMX' :)

If you are still hitting issues, please try kernel-3.12 from rawhide (if it's easily installable) and report back here

sudo yum install fedora-release-rawhide
sudo yum --enablerepo=rawhide update kernel
reboot
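
After the reboot it is worth confirming that L0 really came up on a 3.12 or newer kernel before retesting. A minimal sketch, assuming GNU coreutils (`sort -V`) is available; `kernel_at_least` is a hypothetical helper name:

```shell
#!/bin/sh
# Succeed if a kernel release string is at least a given minimum version.
# kernel_at_least is a hypothetical helper; it relies on GNU sort's -V
# (version sort). The second argument defaults to the running kernel.
kernel_at_least() {
    min=$1
    rel=${2:-$(uname -r)}
    ver=${rel%%-*}   # drop the package release: 3.12.6-200.fc19.x86_64 -> 3.12.6
    lowest=$(printf '%s\n%s\n' "$min" "$ver" | sort -V | head -n 1)
    [ "$lowest" = "$min" ]
}

if kernel_at_least 3.12; then
    echo "running kernel is 3.12 or newer"
else
    echo "running kernel is older than 3.12"
fi
```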

Comment 22 Luc de Louw 2013-11-17 14:28:50 UTC
Can confirm the following as working:

L0: F19 with rawhide Kernel 3.13.0-0.rc0.git4.1.fc21.x86_64
L1: RHEL 6.5 beta
L2: RHEL 6.4

Comment 23 Cole Robinson 2013-11-17 19:21:54 UTC
Thanks Luc. Setting to POST and moving to kernel, this should be auto-closed when 3.12 is pushed to F19

Comment 24 Justin M. Forbes 2014-01-03 22:10:16 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 19 kernel bugs.

Fedora 19 has now been rebased to 3.12.6-200.fc19.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 20, and are still experiencing this issue, please change the version to Fedora 20.

If you experience different issues, please open a new bug report for those.

Comment 25 Josh Boyer 2014-01-08 14:25:25 UTC
Closing now that 3.12.6 is in F19 per comment #23.

