Bug 729567 - Out of memory when DNS resolving fails during migration
Summary: Out of memory when DNS resolving fails during migration
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.2
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Assignee: Jiri Denemark
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-08-10 08:29 UTC by zhpeng
Modified: 2013-09-09 00:03 UTC (History)
7 users

Fixed In Version: libvirt-0.9.4-3.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-06 11:26:01 UTC


Attachments
libvirt.log and valgrind log (1.55 KB, text/plain)
2011-08-10 08:29 UTC, zhpeng


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2011:1513 normal SHIPPED_LIVE libvirt bug fix and enhancement update 2011-12-06 01:23:30 UTC

Description zhpeng 2011-08-10 08:29:32 UTC
Created attachment 517541 [details]
libvirt.log and valgrind log

Description of problem:
When migrating a VM to another host with virt-manager or virsh, an error is reported and the migration fails.

Version-Release number of selected component (if applicable):
virt-manager-0.9.0-5.el6.x86_64
libvirt-0.9.4-1.el6.x86_64
qemu-kvm-0.12.1.2-2.175.el6.x86_64
kernel-2.6.32-175.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. Create an NFS pool on the two nodes.
2. Create a new VM using the new pool.
3. Open the new VM, right-click it, and select migrate to a new host,
   or: # virsh migrate --live mig1 qemu+ssh://root@10.66.6.209/system


Actual results:
An error dialog pops up and the migration fails. It shows:
Unable to migrate guest: out of memory

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 44, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/migrate.py", line 557, in _async_migrate
    vm.migrate(dstconn, migrate_uri, rate, live, secure, meter=meter)
  File "/usr/share/virt-manager/virtManager/domain.py", line 1123, in migrate
    self._backend.migrate(destconn.vmm, flags, newname, interface, rate)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 741, in migrate
    if ret is None:raise libvirtError('virDomainMigrate() failed', dom=self)
libvirtError: out of memory


Expected results:
1. Migration succeeds.

Additional info:

Comment 2 zhpeng 2011-08-10 09:55:02 UTC
I tested in the newest environment:

kernel-2.6.32-178.el6.x86_64
libvirt-0.9.4-2.el6.x86_64
virt-manager-0.9.0-5.el6.x86_64
qemu-kvm-0.12.1.2-2.177.el6.x86_64

I ran a new migration test in two situations:
1. hostname added to /etc/hosts
2. hostname not added

Test 1: success
Test 2: failed

The error is:
Unable to migrate guest: internal error unable to execute QEMU command
'migrate': An undefined error has ocurred

It looks like this bug:

https://bugzilla.redhat.com/show_bug.cgi?id=615941#c36
"Bug 615941 - Migrate fail with error 'An undefined error has ocurred'."

Comment 3 zhpeng 2011-08-10 10:59:06 UTC


Two different bugs. 

-------------------------------------------------------------
this bug environment:
virt-manager-0.9.0-5.el6.x86_64
libvirt-0.9.4-1.el6.x86_64
qemu-kvm-0.12.1.2-2.175.el6.x86_64
kernel-2.6.32-175.el6.x86_64

node A: zhpeng              10.66.6.209
node B: pengzhimoutest      10.66.5.246

/etc/hosts not edited

error:
Unable to migrate guest: out of memory

This bug seems to report the wrong error message.
------------------------------------------------------------
And I tested in the newest environment:
virt-manager-0.9.0-5.el6.x86_64
libvirt-0.9.4-2.el6.x86_64
qemu-kvm-0.12.1.2-2.177.el6.x86_64
kernel-2.6.32-178.el6.x86_64

node A: zhpeng              10.66.6.209
node B: pengzhimoutest      10.66.5.246

/etc/hosts not edited

The error is:
Unable to migrate guest: internal error unable to execute QEMU command
'migrate': An undefined error has ocurred

Comment 4 Jiri Denemark 2011-08-10 14:18:49 UTC
Hmm, I'm seriously confused now. The "Unable to migrate guest: out of memory" error was fixed upstream but not in RHEL (and definitely not in libvirt-0.9.4-2.el6.x86_64) so it's strange you don't see the same error in newer environment. Are you sure nothing else changed? Especially wrt. DNS resolving on both machines? You should be able to reproduce the bug even with 0.9.4-2 version of libvirt.

Comment 5 Jiri Denemark 2011-08-10 14:23:54 UTC
Patch sent for review:
http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-August/msg00341.html

Comment 6 zhpeng 2011-08-11 01:34:22 UTC
Sorry about that.

I ran it again; the earlier result was a copy-paste mistake.

I can reproduce the bug with libvirt-0.9.4-2.el6.x86_64.

The same error:


Unable to migrate guest: out of memory

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 44, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/migrate.py", line 557, in _async_migrate
    vm.migrate(dstconn, migrate_uri, rate, live, secure, meter=meter)
  File "/usr/share/virt-manager/virtManager/domain.py", line 1123, in migrate
    self._backend.migrate(destconn.vmm, flags, newname, interface, rate)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 741, in migrate
    if ret is None:raise libvirtError('virDomainMigrate() failed', dom=self)
libvirtError: out of memory

Comment 8 zhpeng 2011-08-17 10:45:33 UTC
Hi,
I have reproduced the bug with:
virt-manager-0.9.0-5.el6.x86_64
libvirt-0.9.4-1.el6.x86_64
qemu-kvm-0.12.1.2-2.175.el6.x86_64
kernel-2.6.32-175.el6.x86_64

And I have verified it with the latest versions:
kernel-2.6.32-189.el6.x86_64
libvirt-0.9.4-4.el6.x86_64
qemu-kvm-0.12.1.2-2.183.el6.x86_64
virt-manager-0.9.0-5.el6.x86_64

The steps are as follows:
there are two nodes:
nodeA hostname: zhpeng ip: 10.66.6.209
nodeB hostname: pengzhimoutest ip: 10.66.5.246

1. Prepare an NFS server for the two nodes:
   # cat /etc/exports
   /tmp/libvirt *(rw,no_root_squash)
   # service nfs start
2. Create a new NFS pool on the two nodes:
   # pool-list --all
   Name State Autostart
   -----------------------------------------
   boot-scratch active yes
   default active yes

   1. touch /root/nfs.xml
   2. cat /root/nfs.xml
      <pool type="netfs">
      <name>pool-mig</name>
      <source>
      <host name="10.66.6.209"/>
      <dir path="/tmp/libvirt"/>
      </source>
      <target>
      <path>/var/lib/libvirt/images/new</path>
      </target>
      </pool>

   1. pool-define /root/nfs.xml
      Pool pool-mig defined from /root/nfs.xml
   2. pool-start pool-mig
      Pool pool-mig started
   3. pool-autostart pool-mig
      Pool pool-mig marked as autostarted
   4. pool-list --all
      Name State Autostart
      -----------------------------------------
      boot-scratch active yes
      default active yes
      pool-mig active yes
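The pool definition from step 2 can be scripted. A minimal sketch, using mktemp instead of /root/nfs.xml so it runs unprivileged; the virsh calls need a libvirt host, so they are shown as comments. The XML content itself is taken from the bug report:

```shell
# Sketch of step 2: write the pool XML and sanity-check it before
# defining the pool. mktemp stands in for /root/nfs.xml here.
xml=$(mktemp)
cat > "$xml" <<'EOF'
<pool type="netfs">
  <name>pool-mig</name>
  <source>
    <host name="10.66.6.209"/>
    <dir path="/tmp/libvirt"/>
  </source>
  <target>
    <path>/var/lib/libvirt/images/new</path>
  </target>
</pool>
EOF
grep -q '<name>pool-mig</name>' "$xml" && echo "pool XML ready: $xml"
# On a real host, then run:
#   virsh pool-define "$xml" && virsh pool-start pool-mig && virsh pool-autostart pool-mig
```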

3. Create a new guest using pool-mig, or define a new guest from pool-mig, on nodeA:
   # touch /root/mig1.xml
   # cat mig1.xml
<domain type='kvm'>
<name>mig1</name>
<uuid>21b9be2b-fd3f-bea0-f36f-a0dc7a7cde12</uuid>
<memory>262144</memory>
<currentMemory>262144</currentMemory>
<vcpu>1</vcpu>
<os>
<type arch='x86_64' machine='rhel6.2.0'>hvm</type>
<boot dev='hd'/>
</os>
<features>
<acpi/>
<apic/>
<pae/>
</features>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>restart</on_crash>
<devices>
<emulator>/usr/libexec/qemu-kvm</emulator>
<disk type='file' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source file='/var/lib/libvirt/images/new/aaa.img'/>
<target dev='vda' bus='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</disk>
<interface type='network'>
<mac address='52:54:00:c9:02:bc'/>
<source network='default'/>
<model type='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
<input type='tablet' bus='usb'/>
<input type='mouse' bus='ps2'/>
<graphics type='vnc' port='-1' autoport='yes'/>
<video>
<model type='cirrus' vram='9216' heads='1'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</video>
<memballoon model='virtio'>
<address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</memballoon>
</devices>
</domain>

   1. define /root/mig1.xml
      Domain mig1 defined from /root/mig1.xml
   2. list --all
      Id Name State
      ----------------------------------
      - mig1 shut off
   3. start mig1
      Domain mig1 started

4. Ensure the two nodes have no FQDN.
on nodeA: # hostname
zhpeng
on nodeB: # hostname
pengzhimoutest

5. Ensure /etc/hosts has no hostname entries on either node:
# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

6. Migrate from nodeA to nodeB:
[root@zhpeng libvirt]# virsh migrate --live mig1 qemu+ssh://root@10.66.5.246/system
root@10.66.5.246's password:
error: internal error getaddrinfo failed for 'zhpeng': Name or service not known
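With the fix, the failure surfaces as a getaddrinfo error instead of "out of memory". A hypothetical pre-flight check (not part of the original steps) can confirm whether a name resolves the way getaddrinfo would see it, since getent consults /etc/hosts as well as DNS per nsswitch.conf:

```shell
# Hypothetical helper, not from the bug report: report whether a
# hostname resolves via getent (checks /etc/hosts and DNS).
check_resolves() {
    if getent hosts "$1" > /dev/null; then
        echo "ok: $1 resolves"
    else
        echo "fail: $1 does not resolve; add it to /etc/hosts"
    fi
}

check_resolves localhost
```

Running `check_resolves "$(hostname)"` on both nodes would fail for 'zhpeng' and 'pengzhimoutest' until step 7 adds the /etc/hosts entries.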

7. Add hostname entries to /etc/hosts on both nodes:
   # cat /etc/hosts
    10.66.6.209    zhpeng
    10.66.5.246    pengzhimoutest

8. Repeat step 6:
 [root@zhpeng libvirt]# virsh migrate --live mig1 qemu+ssh://root@10.66.5.246/system
  there is no error now.

9. Check that the guest migrated.
   On nodeB:
   # virsh list --all
    Id Name                 State
   ----------------------------------
     7 mig1                running

10. Migrate the guest back to nodeA and check on nodeA:
 [root@pengzhimoutest libvirt]# virsh migrate --live mig1 qemu+ssh://root@10.66.6.209/system
 [root@zhpeng libvirt]# virsh list --all
    Id Name                 State
   ----------------------------------
     11 mig1                running


So the error message in step 6 is now correct. This is the expected result; the test passes.

Comment 9 dyuan 2011-08-18 10:39:09 UTC
Will re-check this bug once bug 731243 is fixed, then move it to VERIFIED.

Comment 10 zhpeng 2011-08-29 08:15:09 UTC
I rechecked the migration with:
kernel-2.6.32-192.el6.x86_64
libvirt-0.9.4-6.el6.x86_64
qemu-kvm-0.12.1.2-2.184.el6.x86_64
virt-manager-0.9.0-5.el6.x86_64

The test passes.

Comment 11 min zhan 2011-08-29 08:42:22 UTC
According to Comment 10 and Comment 8, move it to VERIFIED.

Comment 12 errata-xmlrpc 2011-12-06 11:26:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html

