Bug 844378 - error messages may not be propagated properly during failed migration
Summary: error messages may not be propagated properly during failed migration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Jiri Denemark
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-07-30 12:51 UTC by Haim
Modified: 2016-04-26 14:38 UTC
CC List: 10 users

Fixed In Version: libvirt-0.10.2-32.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-10-14 04:14:03 UTC
Target Upstream Version:
Embargoed:


Attachments
libvirt logs (19.27 MB, application/x-tar), 2012-07-30 12:53 UTC, Haim
libvirt client log from source host (165.76 KB, application/octet-stream), 2012-08-02 07:58 UTC, Jiri Denemark
libvirtd log from source host (262.03 KB, application/octet-stream), 2012-08-02 07:59 UTC, Jiri Denemark
libvirtd log from destination host (521.51 KB, application/octet-stream), 2012-08-02 07:59 UTC, Jiri Denemark
/var/log/libvirt/qemu/vm.log from destination host (2.11 KB, application/octet-stream), 2012-08-02 08:00 UTC, Jiri Denemark


Links
Red Hat Product Errata RHBA-2014:1374 (normal, SHIPPED_LIVE): libvirt bug fix and enhancement update, last updated 2014-10-14 08:11:54 UTC

Description Haim 2012-07-30 12:51:17 UTC
Description of problem:

- start a VM with one preallocated virtio disk
- unplug the disk
- migrate the VM from the source to the destination

error (vdsm log):

Thread-25957::ERROR::2012-07-30 19:00:46,217::vm::176::vm.Vm::(_recover) vmId=`3f6cb042-2be6-4c81-8429-56e6d803f438`::Domain not found: no domain with matching name 'vm2'
Thread-25957::ERROR::2012-07-30 19:00:46,395::vm::240::vm.Vm::(run) vmId=`3f6cb042-2be6-4c81-8429-56e6d803f438`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 223, in run
    self._startUnderlyingMigration()
  File "/usr/share/vdsm/libvirtvm.py", line 451, in _startUnderlyingMigration
    None, maxBandwidth)
  File "/usr/share/vdsm/libvirtvm.py", line 491, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 82, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1104, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
libvirtError: Domain not found: no domain with matching name 'vm2'

libvirt error: 

2012-07-30 16:07:48.611+0000: 2731: debug : qemuProcessAutoDestroyAdd:4187 : vm=vm2, conn=0x7fef000023f0
2012-07-30 16:07:48.611+0000: 2731: debug : qemuDriverCloseCallbackSet:533 : vm=vm2, uuid=3f6cb042-2be6-4c81-8429-56e6d803f438, conn=0x7fef000023f0, cb=0x48e6f0
2012-07-30 16:07:48.611+0000: 2731: debug : qemuProcessStart:3662 : Writing domain status to disk
2012-07-30 16:07:48.650+0000: 2731: debug : qemuMigrationPrepareAny:1340 : Received no lockstate
2012-07-30 16:07:48.651+0000: 2731: debug : qemuMigrationBakeCookie:729 : cookielen=394 cookie=<qemu-migration>
  <name>vm2</name>
  <uuid>3f6cb042-2be6-4c81-8429-56e6d803f438</uuid>
  <hostname>nott-vds2.qa.lab.tlv.redhat.com</hostname>
  <hostuuid>35303738-3635-4742-3830-333334344130</hostuuid>
  <graphics type='spice' port='5900' listen='0' tlsPort='5901'>
    <cert info='subject' value='O=qa.lab.tlv.redhat.com,CN=nott-vds2.qa.lab.tlv.redhat.com'/>
  </graphics>
</qemu-migration>

2012-07-30 16:07:48.651+0000: 2731: debug : qemuDomainCleanupAdd:1873 : vm=vm2, cb=0x494500
2012-07-30 16:07:48.651+0000: 2724: debug : virDomainFree:2313 : dom=0x1398020, (VM: name=vm2, uuid=3f6cb042-2be6-4c81-8429-56e6d803f438)
2012-07-30 16:07:49.349+0000: 2724: debug : qemuMonitorRef:201 : QEMU_MONITOR_REF: mon=0x7feefc0cc5e0 refs=3
2012-07-30 16:07:49.349+0000: 2724: error : qemuMonitorIO:574 : internal error End of file from monitor
2012-07-30 16:07:49.349+0000: 2724: debug : qemuMonitorIO:607 : Error on monitor internal error End of file from monitor
2012-07-30 16:07:49.349+0000: 2724: debug : qemuMonitorUnref:210 : QEMU_MONITOR_UNREF: mon=0x7feefc0cc5e0 refs=2
2012-07-30 16:07:49.349+0000: 2724: debug : qemuMonitorIO:630 : Triggering EOF callback
2012-07-30 16:07:49.349+0000: 2724: debug : qemuProcessHandleMonitorEOF:285 : Received EOF on 0x7feefc13bd00 'vm2'
2012-07-30 16:07:49.349+0000: 2724: debug : qemuProcessHandleMonitorEOF:304 : Monitor connection to 'vm2' closed without SHUTDOWN event; assuming the domain crashed

vm xml:


<domain type='kvm' id='4'>
  <name>vm2</name>
  <uuid>3f6cb042-2be6-4c81-8429-56e6d803f438</uuid>
  <memory unit='KiB'>524288</memory>
  <currentMemory unit='KiB'>524288</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <cputune>
    <shares>1020</shares>
  </cputune>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>Red Hat</entry>
      <entry name='product'>RHEV Hypervisor</entry>
      <entry name='version'>6Server-6.3.0.3.el6</entry>
      <entry name='serial'>38373035-3536-4247-3830-333334343930_78:E7:D1:E4:8C:70</entry>
      <entry name='uuid'>3f6cb042-2be6-4c81-8429-56e6d803f438</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='rhel6.3.0'>hvm</type>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
  </features>
  <cpu mode='custom' match='exact'>
    <model fallback='allow'>Conroe</model>
    <topology sockets='1' cores='1' threads='1'/>
  </cpu>
  <clock offset='variable' adjustment='0'>
    <timer name='rtc' tickpolicy='catchup'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source startupPolicy='optional'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <serial></serial>
      <alias name='ide0-1-0'/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <controller type='ide' index='0'>
      <alias name='ide0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <controller type='usb' index='0'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='00:1a:4a:23:61:66'/>
      <source bridge='rhevm'/>
      <target dev='vnet1'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <console type='pty' tty='/dev/pts/2'>
      <source path='/dev/pts/2'/>
      <target type='virtio' port='0'/>
      <alias name='console0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channels/vm2.com.redhat.rhevm.vdsm'/>
      <target type='virtio' name='com.redhat.rhevm.vdsm'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <alias name='channel1'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <input type='mouse' bus='ps2'/>
    <graphics type='spice' port='5902' tlsPort='5903' autoport='yes' listen='0' keymap='en-us' passwdValidTo='1970-01-01T00:00:01'>
      <listen type='address' address='0'/>
      <channel name='main' mode='secure'/>
      <channel name='display' mode='secure'/>
      <channel name='inputs' mode='secure'/>
      <channel name='cursor' mode='secure'/>
      <channel name='playback' mode='secure'/>
      <channel name='record' mode='secure'/>
    </graphics>
    <video>
      <model type='qxl' vram='65536' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c279,c690</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c279,c690</imagelabel>
  </seclabel>
</domain>

Comment 2 Haim 2012-07-30 12:53:15 UTC
Created attachment 601259 [details]
libvirt logs

Comment 3 Jiri Denemark 2012-07-30 14:06:03 UTC
I've already been looking at this...

Comment 5 Itamar Heim 2012-07-31 11:21:46 UTC
Is it only for unplug, or also for a VM started without disks to begin with?

Comment 6 Haim 2012-07-31 11:28:27 UTC
(In reply to comment #5)
> Is it only for unplug, or also for a VM started without disks to begin with?

It appears to be related to hot-plug: I started a VM without disks, booted it from the network, and managed to migrate it.

Comment 7 Jiri Denemark 2012-08-01 14:07:18 UTC
Several migration failures can be found in the logs. Most of them were caused
by bug 807023; that is, the disk was in fact not unplugged, but libvirt didn't
know about it. Thus libvirt removed the disk from the domain XML and started
qemu on the destination without that disk, while the source qemu still saw the
disk attached and sent its state to the destination. Once qemu on the
destination host saw that, it immediately aborted with the

    "Unknown savevm section or instance '0000:00:05.0/virtio-blk' 0"

error message. Libvirtd on the source detected this during the Finish phase and reported

    "Domain not found: no domain with matching name 'vm'".


In the last failure, however, qemu didn't even have a chance to fail because
someone explicitly killed the domain using virDomainDestroy. Thus the Prepare
phase failed, although the error was not properly propagated:

    "An error occurred, but the cause is unknown".

Since the root cause is already tracked by the other bug, we can use this bug
to track the issue with error message propagation.
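
The generic message arises because libvirt's error reporting is per-thread:
the monitor EOF handler reports the real cause on the event-loop thread,
while the migration API call inspects its own thread's error slot and finds
it empty. A minimal self-contained sketch of that failure mode (toy names
only, not libvirt's actual internals):

    #include <pthread.h>
    #include <stdio.h>

    /* Toy model of per-thread error state, analogous to libvirt's
     * thread-local virError. All names are illustrative. */
    static __thread const char *thread_error;

    static void *event_loop(void *arg)
    {
        (void)arg;
        /* The monitor EOF handler reports the real cause here, but only
         * into the event-loop thread's slot. */
        thread_error = "internal error End of file from monitor";
        return NULL;
    }

    int main(void)
    {
        pthread_t loop;

        pthread_create(&loop, NULL, event_loop, NULL);
        pthread_join(loop, NULL);

        /* The API thread's slot is still empty, hence the generic text. */
        printf("API thread sees: %s\n",
               thread_error ? thread_error
                            : "An error occurred, but the cause is unknown");
        return 0;
    }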

Comment 8 Jiri Denemark 2012-08-02 07:58:41 UTC
Created attachment 601889 [details]
libvirt client log from source host

Comment 9 Jiri Denemark 2012-08-02 07:59:09 UTC
Created attachment 601890 [details]
libvirtd log from source host

Comment 10 Jiri Denemark 2012-08-02 07:59:38 UTC
Created attachment 601891 [details]
libvirtd log from destination host

Comment 11 Jiri Denemark 2012-08-02 08:00:09 UTC
Created attachment 601892 [details]
/var/log/libvirt/qemu/vm.log from destination host

Comment 12 Jiri Denemark 2012-08-02 08:02:34 UTC
Time on destination is approximately 50 seconds ahead of source.

Comment 13 zhpeng 2012-08-06 03:25:52 UTC
I can reproduce it with libvirt-0.9.10-21.el6.x86_64 and libvirt-0.10.0-0rc0.el6
Steps:
1. change the log level to 1 on the source and destination hosts and restart libvirtd
2. prepare a guest with a preallocated virtio disk
3. hot unplug the disk
4. migrate the guest; the migration fails
5. check the error info and logs

Comment 20 Jiri Denemark 2014-03-20 22:17:09 UTC
This is fixed upstream by v1.2.2-291-gcfa7cea:

commit cfa7ceab7735410c0427136236bf8bad10670816
Author: Jiri Denemark <jdenemar>
Date:   Mon Mar 17 11:04:07 2014 +0100

    qemu: Return meaningful error when qemu dies early
    
    https://bugzilla.redhat.com/show_bug.cgi?id=844378
    
    When qemu dies early after connecting to its monitor but before we
    actually try to read something from the monitor, we would just fail
    domain start with useless message:
    
        "An error occurred, but the cause is unknown"
    
    This is because the real error gets reported in a monitor EOF handler
    executing within libvirt's event loop.
    
    The fix is to take any error set in qemuMonitor structure and propagate
    it into the thread-local error when qemuMonitorClose is called and no
    thread-local error is set.
    
    Signed-off-by: Jiri Denemark <jdenemar>

Notes:
    To reproduce the issue, just apply the following patch and kill
    qemu process while libvirtd is sleeping:
    
    diff --git i/src/qemu/qemu_process.c w/src/qemu/qemu_process.c
    index 400625a..3b77ecc 100644
    --- i/src/qemu/qemu_process.c
    +++ w/src/qemu/qemu_process.c
    @@ -1415,6 +1415,9 @@ qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int logfd)
         if (mon)
             ignore_value(qemuMonitorSetDomainLog(mon, logfd));
    
    +    VIR_DEBUG("Sleeping");
    +    sleep(5);
    +
         virObjectLock(vm);
         virObjectUnref(vm);
         priv->monStart = 0;
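
A minimal self-contained sketch of the propagation step the commit message
describes: the EOF handler records the error on the monitor object, and the
close path copies it into the calling thread's error slot if that slot is
still empty. Toy names again; libvirt's real change operates on its internal
qemuMonitor structure and the thread-local error mentioned in the commit
message.

    #include <pthread.h>
    #include <stdio.h>

    static __thread const char *thread_error;   /* per-thread error slot */

    struct monitor {
        const char *last_error;   /* error recorded by the EOF handler */
    };

    static void *event_loop(void *arg)
    {
        struct monitor *mon = arg;
        /* Instead of reporting only into its own thread's slot, the EOF
         * handler now stores the cause on the monitor object. */
        mon->last_error = "internal error End of file from monitor";
        return NULL;
    }

    static void monitor_close(struct monitor *mon)
    {
        /* Propagate the stored error into the calling thread's slot,
         * unless that thread already reported something more specific. */
        if (mon->last_error && !thread_error)
            thread_error = mon->last_error;
    }

    int main(void)
    {
        struct monitor mon = { 0 };
        pthread_t loop;

        pthread_create(&loop, NULL, event_loop, &mon);
        pthread_join(loop, NULL);

        monitor_close(&mon);   /* the fix: the error reaches this thread */
        printf("API thread sees: %s\n",
               thread_error ? thread_error
                            : "An error occurred, but the cause is unknown");
        return 0;
    }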

Comment 24 zhe peng 2014-04-15 09:02:08 UTC
Can reproduce this with build libvirt-0.10.2-29.el6.x86_64:
# error: Failed to start domain rhel6.5
error: An error occurred, but the cause is unknown

Verified with build libvirt-0.10.2-32.el6.x86_64:

Steps:
1. apply Jiri's reproducer patch to libvirt
2. rebuild the libvirt package and install it
3. # virsh start rhel6.5 & sleep 3; killall -9 qemu-kvm
error: internal error End of file from monitor

This is the expected error message; moving to VERIFIED.

Comment 26 errata-xmlrpc 2014-10-14 04:14:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1374.html

