Bug 615941
Summary: | Migrate fail with error 'An undefined error has ocurred'. | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | wangyimiao <yimwang> | |
Component: | libvirt | Assignee: | Osier Yang <jyang> | |
Status: | CLOSED DUPLICATE | QA Contact: | Virtualization Bugs <virt-bugs> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 6.0 | CC: | berrange, clalance, dallan, ddumas, dyuan, eblake, ivars.strazdins, jdenemar, llim, veillard, weizhan, xen-maint, yoyzhang, zhpeng | |
Target Milestone: | rc | Keywords: | TestOnly | |
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 647090 696948 698141 | Environment: | ||
Last Closed: | 2011-09-22 01:40:44 UTC | Type: | --- | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 584077, 670727, 698496 | |||
Bug Blocks: | 647090, 696948, 698141 |
Description
wangyimiao
2010-07-19 10:25:54 UTC
This issue has been proposed when we are only considering blocker issues in the current Red Hat Enterprise Linux release. It has been denied for the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current release, ask your support representative to file as a blocker on your behalf. Otherwise ask that it be considered for the next Red Hat Enterprise Linux release. **

I logged on to 10.66.93.211 and could see the errors in /var/log/messages:

Jul 19 08:34:28 dhcp-93-211 libvirtd: 08:34:28.016: error : qemuMonitorJSONCheckError:316 : internal error unable to execute QEMU command 'migrate': An undefined error has ocurred

There was nothing suspicious in /var/log/libvirt/qemu/vm1.log. The guest is launched with:

LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -S -M rhel6.0.0 -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -name vm1 -uuid 99df8f8c-ee31-3848-d220-80a3147b444f -nodefconfig -nodefaults -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/vm1.monitor,server,nowait -mon chardev=monitor,mode=control -rtc base=utc -boot c -drive file=/mnt/wang/yimwang/RHEL-Server-6-64-virtio.qcow2,if=none,id=drive-ide0-0-0,boot=on,format=qcow2 -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=24,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:c9:67:21,bus=pci.0,addr=0x3 -chardev pty,id=serial0 -device isa-serial,chardev=serial0 -usb -vnc 127.0.0.1:0 -k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

That looks normal to me. But based on the report, the source machine, i.e. where vm1 was running, is 10.66.93.197, and that machine is not available to check, so I could not try to reproduce this.

Daniel

Do you still have the 10.66.93.197 state? Or can you reproduce this from another source machine and give us access to that machine?
The information is likely to be saved in the source logs.

Daniel

Hi DV,

I did the same operations on 10.66.93.211 (defined a VM as 'vm1' and migrated it to 10.66.93.211), so I only kept machine 10.66.93.211 alive. From now on I will keep both 10.66.93.197 and 10.66.93.211 available. 10.66.93.197 is not mine; if it becomes unavailable, please tell me via BZ or IRC.

Thanks!

Tried to replicate the problem here using the same kernel, qemu, and libvirt versions, but no luck. :(

Yimiao, can you please rebuild these hosts (and make sure the problem still occurs), so I can access them remotely?

dyuan was kind enough to give access to the hosts, and the problem still remains.

I've not been able to get the migration to work successfully using virt-manager though. What settings did you use for that?

As described in the 'Additional info' of the original description, you should select 'New host' and also fill in the same IP address under 'Advanced options' -> 'Address'; the migration then works successfully. If 'Advanced options' -> 'Address' is not filled in, the error appears after clicking 'Migrate'.

I'm not sure what the advanced settings do additionally, because I can't find any difference in virt-manager.log or /var/log/libvirt/qemu/guest.log.

The cause of this bug is proving painful to find. I have been debugging this for several days without yet establishing a clear reason for the failure.

Qemu reports an unknown failure when the migration is performed with virsh. However, the migration succeeds when virt-manager (on one host) is used following dyuan's steps.
(tested a few minutes ago)

The bug itself can be reproduced when attempting to migrate from 10.66.92.154 (source host) to 10.66.93.205 (destination host), using the steps Yimiao has given in this BZ.

The migration DOES work when using virt-manager, following the steps Yimiao gave in the "Additional Info" part of this BZ (and repeated by Dyuan in response to my question).

Additionally, the migration DOES succeed when using virsh in the opposite direction.

I'm still investigating. The cause is proving time consuming and tricky to find out though. :( (so far)

weizhang, that's a different bug. Please file a BZ (if the bug isn't already reported). :)

The cause of this problem has turned out to be name resolution (thanks DV). Adding an entry for the destination host to /etc/hosts, on the source server, allows the migration to work.

    10.66.93.205 dhcp-93-205.nay.redhat.com

I'll discuss with the rest of the libvirt team if this is something we should write a check for, so people aren't caught by this in future.

Huh, very odd. I thought we should have caught that with our FQDN check in qemudDomainMigratePrepare2:

    /* Get hostname */
    if ((hostname = virGetHostname(NULL)) == NULL)
        goto cleanup;

    if (STRPREFIX(hostname, "localhost")) {
        qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("hostname on destination resolved to localhost, but migration requires an FQDN"));
        goto cleanup;
    }

As you suggest, it's probably worth seeing what the output from virGetHostname() was, and to add a check to reject whatever bogus value it is reporting.

Chris Lalancette

Since the problem here is really that it's difficult to troubleshoot migration when name resolution is broken, I'm going to move this BZ to 6.1.
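For anyone wanting to rule out name resolution before attempting a migration, the check discussed above could be sketched roughly as follows. This is only an illustration, not libvirt code: the helper names looks_like_localhost() and check_resolves() are hypothetical, and the sketch assumes a POSIX system where getaddrinfo(3) is available. The idea is to run it on the source host against the hostname that the destination reports.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of libvirt): returns 1 if `name` starts
 * with "localhost", similar in spirit to the STRPREFIX() test quoted above. */
int looks_like_localhost(const char *name)
{
    assert(name != NULL);
    return strncmp(name, "localhost", strlen("localhost")) == 0;
}

/* Hypothetical helper: try to resolve `host` the way a TCP client would.
 * Returns 0 on success, otherwise the getaddrinfo() error code, with a
 * human-readable reason written to `errbuf`. */
int check_resolves(const char *host, char *errbuf, size_t errlen)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    assert(host != NULL);
    assert(errbuf != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* migration here uses a TCP stream */

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0) {
        /* Report the resolver's own reason instead of a generic failure. */
        snprintf(errbuf, errlen, "cannot resolve '%s': %s",
                 host, gai_strerror(rc));
        return rc;
    }

    freeaddrinfo(res);
    if (errlen > 0)
        errbuf[0] = '\0';
    return 0;
}
```

Calling check_resolves("dhcp-93-205.nay.redhat.com", buf, sizeof(buf)) on the source host would be expected to fail until the /etc/hosts entry (or working DNS) is in place, and the message in buf would then name the resolver error rather than "An undefined error".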
Initial patches submitted upstream to solve this problem:

http://www.redhat.com/archives/libvir-list/2010-August/msg00138.html
http://www.redhat.com/archives/libvir-list/2010-August/msg00139.html

To get the destination QEMU URI, we execute hostname() on the destination host to get the hostname, and then use it to form the QEMU URI (e.g. tcp://10.66.70.83). So if the destination host is misconfigured, it will give us an invalid address that the source host cannot actually reach, and the migration will fail.

But we can't do much more on the libvirt side (guessing the hostname on the destination side would make things more complex and confusing), because qemu discards the errors for migrate (actually all of the errors related to getaddrinfo(3)/connect(2) failing?). If qemu were able to report any error, we could then have a better diagnostic log. So I created a dependent bug against qemu-kvm:

https://bugzilla.redhat.com/show_bug.cgi?id=670727

*** Bug 680162 has been marked as a duplicate of this bug. ***

*** Bug 681109 has been marked as a duplicate of this bug. ***

*** Bug 618562 has been marked as a duplicate of this bug. ***

test on
kernel-2.6.32-166.el6.x86_64
qemu-kvm-0.12.1.2-2.169.el6.x86_64
libvirt-0.9.3-7.el6.x86_64

without adding hostname and ip on /etc/hosts

still report:
error: internal error unable to execute QEMU command 'migrate': An undefined error has ocurred

test on
virt-manager-0.9.0-5.el6.x86_64
libvirt-0.9.4-1.el6.x86_64
qemu-kvm-0.12.1.2-2.175.el6.x86_64
kernel-2.6.32-175.el6.x86_64

without adding hostname and ip on /etc/hosts

report:
Unable to migrate guest: out of memory

(In reply to comment #34)
> test on
> virt-manager-0.9.0-5.el6.x86_64
> libvirt-0.9.4-1.el6.x86_64
> qemu-kvm-0.12.1.2-2.175.el6.x86_64
> kernel-2.6.32-175.el6.x86_64
>
> without adding hostname and ip on /etc/hosts
>
> report:
> Unable to migrate guest: out of memory

The difference from this bug is that the hostname here is without dnsdomainname. Do we need to file a new bug?
It's unlikely the testing box is really out of memory, but please confirm that first. If it really is out of memory, that's not a bug; otherwise please file a new bug with the debug log. Thanks.

(In reply to comment #35)
> (In reply to comment #34)
> > test on
> > virt-manager-0.9.0-5.el6.x86_64
> > libvirt-0.9.4-1.el6.x86_64
> > qemu-kvm-0.12.1.2-2.175.el6.x86_64
> > kernel-2.6.32-175.el6.x86_64
> >
> > without adding hostname and ip on /etc/hosts
> >
> > report:
> > Unable to migrate guest: out of memory
>
> The difference from this bug is that the hostname here without dnsdomainname.
> Do we need to file a new bug?

You are most likely hitting a bug fixed by upstream commit 63e4af45f274adf1821498970dfa3902caf1bc8c. But we don't (AFAIK) have a BZ for that yet. Please file a new BZ with the steps needed for reproducing the bug.

This is fixed after we changed to use the qemu fd: protocol for migration, see BZ https://bugzilla.redhat.com/show_bug.cgi?id=720269, so I am closing this as a DUPLICATE of 720269.

*** This bug has been marked as a duplicate of bug 720269 ***