Bug 614438

Summary: kvm pauses due to unhandled exit
Product: Red Hat Enterprise Linux 6 Reporter: Haim <hateya>
Component: qemu-kvm Assignee: Karen Noel <knoel>
Status: CLOSED DUPLICATE QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: low    
Version: 6.0 CC: aarcange, berrange, danken, eblake, hateya, mgoldboi, mkenneth, tburke, virt-maint, xen-maint, yeylon
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-07-22 21:05:16 UTC Type: ---

Description Haim 2010-07-14 13:49:37 UTC
Description of problem:

We have some conditions (unhandled conditions) where VMs go into a paused state, and the only event we see in libvirtd.log is:

16:10:08.953: debug : qemuHandleDomainStop:1307 : Transitioned guest libvirt-rhel54-033 to paused state due to unknown event

The problem with these cases is that we can't understand or deal with the underlying problem, if any.
vdsm listens for STOP and PAUSE error reasons; however, as you can see, in this case the reason doesn't say much.

This happens on the latest libvirt-0.8.1-15.el6.x86_64.
I don't have exact repro steps; I just started to load my setup a bit (each server hosted about 25 VMs).

16:10:08.953: debug : virJSONValueFromString:962 : result=0x7f4228001a30
16:10:08.953: debug : qemuMonitorJSONIOProcessEvent:86 : mon=0x98a940 obj=0x7f4228001a30
16:10:08.953: debug : qemuMonitorJSONIOProcessEvent:99 : handle STOP handler=0x477f00 data=(nil)
16:10:08.953: debug : qemuMonitorEmitStop:808 : mon=0x98a940
16:10:08.953: debug : qemuHandleDomainStop:1307 : Transitioned guest libvirt-rhel54-033 to paused state due to unknown event
16:10:08.959: debug : qemuMonitorJSONIOProcess:188 : Total used 81 bytes out of 81 available in buffer
16:10:08.959: debug : remoteRelayDomainEventLifecycle:118 : Relaying domain lifecycle event 3 0


18959 ?        Rl    10:13 /usr/libexec/qemu-kvm -S -M rhel6.0.0 -cpu qemu64,-svm -enable-kvm -m 512 -smp 1,sockets=1,cores=1,threads=1 -name libvirt-rhel54-033 -uuid 71f0923d-20b9-4652-b873-6e8331fe14e9 -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/libvirt-rhel54-033.monitor,server,nowait -mon chardev=monitor,mode=control -rtc base=2010-6-14T5:12:49 -boot c -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x7 -drive file=/rhev/data-center/841af73a-d3bf-4bb8-9985-0603fdcf302e/aaac4a9b-ae1f-4e4b-9c71-d25eb10bc83f/images/85dda418-8b5a-462c-b66d-0f86f887f704/1ff0a695-8489-4908-87c0-9d30c7f74feb,if=none,id=drive-virtio-disk0,boot=on,format=qcow2,serial=2c-b66d-0f86f887f704,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -drive file=/rhev/data-center/841af73a-d3bf-4bb8-9985-0603fdcf302e/aaac4a9b-ae1f-4e4b-9c71-d25eb10bc83f/images/95e2c849-eea2-488e-b5c5-db32ddb5a9d7/1e04df65-ab99-4bc5-b0be-93085dede825,if=none,id=drive-virtio-disk1,format=qcow2,serial=8e-b5c5-db32ddb5a9d7,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 -netdev tap,fd=60,id=hostnet0,vhost=on,vhostfd=61 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:23:71:4f,bus=pci.0,addr=0x6 -chardev socket,id=channel0,path=/var/lib/libvirt/qemu/channels/libvirt-rhel54-033.org.linux-kvm.port.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=0,chardev=channel0,name=org.linux-kvm.port.0 -usb -device usb-tablet,id=input0 -vnc 0:37,password -k en-us -vga cirrus -incoming exec:cat -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3
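
A minimal sketch of how the log lines above could be scanned for paused-guest events, assuming the libvirtd debug log format shown in this report. The function names are hypothetical and are not part of libvirt or vdsm. The numeric tables mirror libvirt's public virDomainEventType and suspended-detail enums as I understand them; treat them as an assumption and check them against the libvirt headers in use.

```python
import re

# Matches the qemuHandleDomainStop debug line quoted in this report.
STOP_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}): debug : qemuHandleDomainStop:\d+ : "
    r"Transitioned guest (?P<guest>\S+) to paused state due to (?P<reason>.+)$"
)
# Matches the "Relaying domain lifecycle event <type> <detail>" debug line.
LIFECYCLE_RE = re.compile(r"Relaying domain lifecycle event (\d+) (\d+)$")

# Assumed to mirror libvirt's lifecycle event enums (subset only).
EVENT_TYPES = {0: "DEFINED", 1: "UNDEFINED", 2: "STARTED",
               3: "SUSPENDED", 4: "RESUMED", 5: "STOPPED"}
SUSPENDED_DETAILS = {0: "PAUSED", 1: "MIGRATED", 2: "IOERROR", 3: "WATCHDOG"}


def find_stop_events(lines):
    """Return (time, guest, reason) for every 'paused state' line."""
    events = []
    for line in lines:
        m = STOP_RE.match(line.strip())
        if m:
            events.append((m.group("time"), m.group("guest"), m.group("reason")))
    return events


def decode_lifecycle(line):
    """Return (event_name, detail_name) for a lifecycle relay line, else None."""
    m = LIFECYCLE_RE.search(line.strip())
    if m is None:
        return None
    event, detail = int(m.group(1)), int(m.group(2))
    name = EVENT_TYPES.get(event, "UNKNOWN")
    if name == "SUSPENDED":
        return name, SUSPENDED_DETAILS.get(detail, "UNKNOWN")
    return name, str(detail)


SAMPLE_LOG = [
    "16:10:08.953: debug : qemuMonitorEmitStop:808 : mon=0x98a940",
    "16:10:08.953: debug : qemuHandleDomainStop:1307 : Transitioned guest "
    "libvirt-rhel54-033 to paused state due to unknown event",
    "16:10:08.959: debug : remoteRelayDomainEventLifecycle:118 : "
    "Relaying domain lifecycle event 3 0",
]

if __name__ == "__main__":
    print(find_stop_events(SAMPLE_LOG))
    print(decode_lifecycle(SAMPLE_LOG[2]))
```

Under these assumptions, the relayed event "3 0" reads as a suspended guest with the generic paused detail, which carries no more information than the "unknown event" reason in the log line.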

Comment 1 Daniel Berrangé 2010-07-14 13:58:18 UTC
Can you provide the XML configuration and the full debug logs for libvirtd, starting from the time the guest was booted?

Comment 3 Haim 2010-07-14 14:04:32 UTC
xml configuration: 

[root@silver-vdse tmp]# virsh dumpxml 142
<domain type='kvm' id='142'>
  <name>libvirt-rhel54-033</name>
  <uuid>71f0923d-20b9-4652-b873-6e8331fe14e9</uuid>
  <memory>524288</memory>
  <currentMemory>524288</currentMemory>
  <vcpu>1</vcpu>
  <os>
    <type arch='x86_64' machine='rhel6.0.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
  </features>
  <cpu match='exact'>
    <model>qemu64</model>
    <topology sockets='1' cores='1' threads='1'/>
    <feature policy='require' name='sse2'/>
    <feature policy='disable' name='svm'/>
  </cpu>
  <clock offset='variable' adjustment='-25200'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source dev='/rhev/data-center/841af73a-d3bf-4bb8-9985-0603fdcf302e/aaac4a9b-ae1f-4e4b-9c71-d25eb10bc83f/images/85dda418-8b5a-462c-b66d-0f86f887f704/1ff0a695-8489-4908-87c0-9d30c7f74feb'/>
      <target dev='hda' bus='virtio'/>
      <serial>2c-b66d-0f86f887f704</serial>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source dev='/rhev/data-center/841af73a-d3bf-4bb8-9985-0603fdcf302e/aaac4a9b-ae1f-4e4b-9c71-d25eb10bc83f/images/95e2c849-eea2-488e-b5c5-db32ddb5a9d7/1e04df65-ab99-4bc5-b0be-93085dede825'/>
      <target dev='hdb' bus='virtio'/>
      <serial>8e-b5c5-db32ddb5a9d7</serial>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <controller type='virtio-serial' index='0' ports='16'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='00:1a:4a:23:71:4f'/>
      <source bridge='rhevm'/>
      <target dev='vnet37'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </interface>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/libvirt-rhel54-033.org.linux-kvm.port.0'/>
      <target type='virtio' name='org.linux-kvm.port.0'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='0'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='5937' autoport='yes' listen='0' keymap='en-us'/>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
  </devices>
  <seclabel type='dynamic' model='selinux'>
    <label>system_u:system_r:svirt_t:s0:c718,c777</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c718,c777</imagelabel>
  </seclabel>
</domain>
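
For triage it can help to pull the disk settings out of a dump like the one above. The following is a sketch only, using Python's standard xml.etree.ElementTree; list_disks is a hypothetical helper and the sample is a trimmed copy of the XML in this comment.

```python
import xml.etree.ElementTree as ET


def list_disks(domain_xml):
    """Return (target dev, bus, image format, cache mode) for each <disk>."""
    root = ET.fromstring(domain_xml)
    disks = []
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        driver = disk.find("driver")
        disks.append((
            target.get("dev") if target is not None else None,
            target.get("bus") if target is not None else None,
            driver.get("type") if driver is not None else None,
            driver.get("cache") if driver is not None else None,
        ))
    return disks


# Trimmed from the virsh dumpxml output in this comment.
SAMPLE_XML = """<domain type='kvm' id='142'>
  <name>libvirt-rhel54-033</name>
  <devices>
    <disk type='block' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <target dev='hda' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <target dev='hdb' bus='virtio'/>
    </disk>
  </devices>
</domain>"""

if __name__ == "__main__":
    for entry in list_disks(SAMPLE_XML):
        print(entry)
```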


LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -S -M rhel6.0.0 -cpu qemu64,-svm -enable-kvm -m 512 -smp 1,sockets=1,cores=1,threads=1 -name libvirt-rhel54-033 -uuid 71f0923d-20b9-4652-b873-6e8331fe14e9 -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/libvirt-rhel54-033.monitor,server,nowait -mon chardev=monitor,mode=control -rtc base=2010-6-14T5:12:49 -boot c -device virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x7 -drive file=/rhev/data-center/841af73a-d3bf-4bb8-9985-0603fdcf302e/aaac4a9b-ae1f-4e4b-9c71-d25eb10bc83f/images/85dda418-8b5a-462c-b66d-0f86f887f704/1ff0a695-8489-4908-87c0-9d30c7f74feb,if=none,id=drive-virtio-disk0,boot=on,format=qcow2,serial=2c-b66d-0f86f887f704,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -drive file=/rhev/data-center/841af73a-d3bf-4bb8-9985-0603fdcf302e/aaac4a9b-ae1f-4e4b-9c71-d25eb10bc83f/images/95e2c849-eea2-488e-b5c5-db32ddb5a9d7/1e04df65-ab99-4bc5-b0be-93085dede825,if=none,id=drive-virtio-disk1,format=qcow2,serial=8e-b5c5-db32ddb5a9d7,cache=none -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 -netdev tap,fd=60,id=hostnet0,vhost=on,vhostfd=61 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:23:71:4f,bus=pci.0,addr=0x6 -chardev socket,id=channel0,path=/var/lib/libvirt/qemu/channels/libvirt-rhel54-033.org.linux-kvm.port.0,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=0,chardev=channel0,name=org.linux-kvm.port.0 -usb -device usb-tablet,id=input0 -vnc 0:37,password -k en-us -vga cirrus -incoming exec:cat -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3 
cat: write error: Broken pipe
kvm: unhandled exit 31
kvm_run returned -22
kvm: unhandled exit 31
kvm_run returned -22
kvm: unhandled exit 31
kvm_run returned -22
kvm: unhandled exit 31
kvm_run returned -22
kvm: unhandled exit 31
kvm_run returned -22


Is it enough?
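
A rough sketch of how the per-guest qemu log could be summarized, assuming the two message formats shown in this comment. summarize_kvm_errors is a hypothetical name. The exit value is kept as a string because the log does not say whether it is decimal or hexadecimal.

```python
import re
from collections import Counter

EXIT_RE = re.compile(r"^kvm: unhandled exit (\S+)$")
RUN_RE = re.compile(r"^kvm_run returned (-?\d+)$")


def summarize_kvm_errors(lines):
    """Count 'unhandled exit' values and 'kvm_run returned' codes."""
    exits = Counter()
    returns = Counter()
    for line in lines:
        line = line.strip()
        m = EXIT_RE.match(line)
        if m:
            exits[m.group(1)] += 1  # keep as text; base is not stated
            continue
        m = RUN_RE.match(line)
        if m:
            returns[int(m.group(1))] += 1
    return exits, returns


SAMPLE_QEMU_LOG = [
    "cat: write error: Broken pipe",
    "kvm: unhandled exit 31",
    "kvm_run returned -22",
    "kvm: unhandled exit 31",
    "kvm_run returned -22",
]

if __name__ == "__main__":
    exits, returns = summarize_kvm_errors(SAMPLE_QEMU_LOG)
    print(dict(exits), dict(returns))
```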

Comment 4 Daniel Berrangé 2010-07-14 14:17:26 UTC
This last KVM error message

  kvm_run returned -22

comes from this code:

  int kvm_cpu_exec(CPUState *env)
  {
    int r;

    r = kvm_run(env);
    if (r < 0) {
        printf("kvm_run returned %d\n", r);
        vm_stop(0);
    }

    return 0;
  }


which explains why libvirt is getting a 'PAUSE' event notification for the guest that is not related to an I/O error.

This previous error message

  kvm: unhandled exit 31

indicates some kind of flaw in KVM.
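
As a side note on the -22 value: if kvm_run follows the usual Linux convention of returning a negated errno (an assumption; this thread does not show that code path), the value can be decoded with Python's standard errno module. describe_kvm_run_result is a hypothetical helper name.

```python
import errno
import os


def describe_kvm_run_result(r):
    """Map a negative return value to an errno name, assuming r == -errno."""
    if r >= 0:
        return "ok"
    code = -r
    name = errno.errorcode.get(code, "unknown")
    return "%s (%s)" % (name, os.strerror(code))


if __name__ == "__main__":
    print(describe_kvm_run_result(-22))
```

On Linux, errno 22 is EINVAL, so under that assumption the message means the run call was rejected as invalid rather than failing on I/O.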

Comment 5 Dor Laor 2010-07-15 11:55:11 UTC
Can you turn KSM off and retry? We had a similar exit message due to a KSM bug.

Comment 6 RHEL Program Management 2010-07-15 14:17:20 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release. It has
been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 7 Haim 2010-07-18 12:05:05 UTC
Dor - I removed KSM from the cluster (using rhevm) and I still cannot resume the VM.
I get the following error in libvirtd.log:

15:06:23.245: debug : qemuHandleDomainStop:1307 : Transitioned guest libvirt-rhel54-44 to paused state due to unknown event

Comment 8 Dor Laor 2010-07-18 12:26:59 UTC
Are you using the latest kernel (.44) and qemu-kvm (qemu-kvm-0.12.1.2-2.96.el6)?

Comment 9 Haim 2010-07-18 12:40:33 UTC
2.6.32-44.el6.x86_64
qemu-kvm-0.12.1.2-2.96.el6.x86_64

Comment 10 Andrea Arcangeli 2010-07-19 12:02:19 UTC
I would suggest checking that it doesn't happen with this build (especially if the host hits swapping):

https://brewweb.devel.redhat.com/taskinfo?taskID=2603333


There is no need to disable KSM or THP with this build.

Comment 11 Dor Laor 2010-07-19 12:49:03 UTC
QE, please test the above

Comment 12 Avi Kivity 2010-07-22 16:06:11 UTC
Isn't this #606131?

Comment 13 Andrea Arcangeli 2010-07-22 18:09:05 UTC
Yes, it's bug #606131. I asked to test the build to be sure, but we can probably safely mark it as a dup already.

Comment 14 Dor Laor 2010-07-22 21:05:16 UTC

*** This bug has been marked as a duplicate of bug 606131 ***