Bug 348861
| Summary: | libvirt error message (virt-manager) "out of memory/invalid argument in __VirtGetDomain" on guest save/migration | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Sanjay Rao <srao> | ||||||
| Component: | libvirt | Assignee: | Daniel Veillard <veillard> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> | ||||||
| Severity: | low | Docs Contact: | |||||||
| Priority: | low | ||||||||
| Version: | 5.1 | CC: | llange, llim, nzhang, syeghiay, virt-maint, xen-maint | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | All | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2009-09-02 09:22:51 UTC | Type: | --- | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
Description
Sanjay Rao
2007-10-23 15:07:33 UTC
I did some more testing, and it seems the message can easily be reproduced by
using either 'migrate' or 'save' against a running guest. In either case we will see
the messages:
libvir: Xen Store error : out of memory
libvir: error : invalid argument in __virGetDomain
A quick strace of the virt-manager process (extract from a 10k trace).
Note that vBlade1 is the guest which is being "shutdown" as part of a
"virsh save vBlade1 vBlade1.sav" command:
rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
write(16, "\2\0\0\0\0\0\0\0\0\0\0\0\26\0\0\0", 16) = 16
write(16, "/local/domain/45/name\0", 22) = 22
read(16, "\2\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0", 16) = 16
read(16, "WinXP", 5) = 5
rt_sigaction(SIGPIPE, {SIG_IGN}, NULL, 8) = 0
rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
write(16, "\2\0\0\0\0\0\0\0\0\0\0\0\26\0\0\0", 16) = 16
write(16, "/local/domain/46/name\0", 22) = 22
read(16, "\2\0\0\0\0\0\0\0\0\0\0\0\7\0\0\0", 16) = 16
read(16, "vBlade1", 7) = 7
rt_sigaction(SIGPIPE, {SIG_IGN}, NULL, 8) = 0
rt_sigaction(SIGPIPE, {SIG_IGN}, {SIG_IGN}, 8) = 0
write(16, "\n\0\0\0\0\0\0\0\0\0\0\0\3\0\0\0", 16) = 16
write(16, "46\0", 3) = 3
read(16, "\n\0\0\0\0\0\0\0\0\0\0\0\21\0\0\0", 16) = 16
read(16, "/local/domain/46\0", 17) = 17
rt_sigaction(SIGPIPE, {SIG_IGN}, NULL, 8) = 0
write(2, "libvir: error : invalid argument"..., 51) = 51
write(2, "libvir: Xen Store error : out of"..., 40) = 40
time(NULL) = 1193151043
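The 16-byte writes and reads on fd 16 in the trace above look like xenstore wire-protocol headers followed by their payloads. As a rough sketch for reading them, the snippet below decodes two of the headers copied from the trace. It assumes the header layout from Xen's public xs_wire.h (four 32-bit fields: type, req_id, tx_id, len) and little-endian byte order, and it maps only the two message types seen here; the function name decode_header is made up for illustration.

```python
import struct

# Message type numbers as defined in Xen's public xs_wire.h.
# Only the two types that appear in the trace are listed (assumption:
# the socket on fd 16 speaks the standard xenstore wire protocol).
XS_TYPES = {2: "XS_READ", 10: "XS_GET_DOMAIN_PATH"}

# Header layout: type, req_id, tx_id, len. Little-endian is assumed (x86 host).
HEADER = struct.Struct("<IIII")


def decode_header(raw):
    """Decode one 16-byte xenstore message header into a dict."""
    msg_type, req_id, tx_id, length = HEADER.unpack(raw)
    return {
        "type": XS_TYPES.get(msg_type, str(msg_type)),
        "req_id": req_id,
        "tx_id": tx_id,
        "len": length,
    }


# Headers copied verbatim from the strace output (strace prints octal escapes,
# which Python bytes literals also accept).
read_request = b"\2\0\0\0\0\0\0\0\0\0\0\0\26\0\0\0"
path_request = b"\n\0\0\0\0\0\0\0\0\0\0\0\3\0\0\0"

print(decode_header(read_request))
print(decode_header(path_request))
```

Read this way, the first header announces a 22-byte payload, which matches the "/local/domain/45/name\0" write that follows it, and the second announces the 3-byte "46\0" domain-ID payload. The libvirt errors are written right after the domain-path reply for ID 46, the guest being saved.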
Created attachment 292113 [details]
xsat xen config
the config for my domU
Created attachment 292115 [details]
my xend config
my xend config
My comments again (I got confused with the attachments :( ):
I have the same error messages. The odd thing is that live migration worked.
I use:
rhel 5.1
xen-3.0.3-41.el5
libvirt-0.2.3-9.el5
The migration was started with the xm command:
xm migrate xsat node2
The problem is not reproducible after fixing the xsat xen config and restarting the nodes.
I removed the r from the disk line, so it reads w now:
before fix:
disk = [ 'phy:/dev/mapper/VGSAN-master_root,xvda,wr', 'phy:/dev/mapper/VGSAN-master_satellite,xvdb,w' ]
after fix:
disk = [ 'phy:/dev/mapper/VGSAN-master_root,xvda,w', 'phy:/dev/mapper/VGSAN-master_satellite,xvdb,w' ]

Okay, this message is being printed by libvirt, so it is not specific to virt-manager. 'virsh save' would cause the same result.
However, with the libvirt version queued up for 5.4, I can no longer reproduce this. So we will probably get this for free. It could even be fixed on stock 5.3, but I haven't tested.
Reassigning to libvirt.

Yes, that looks fixed to me with the current rebase; at least I wasn't able to reproduce the problem anymore with 0.6.2 built for RHEL-5:
[root@test2 ~]# virsh list
Id Name State
----------------------------------
0 Domain-0 running
4 migr5 idle
[root@test2 ~]# virsh save migr5 /tmp/migr5.save
Domain migr5 saved to /tmp/migr5.save
[root@test2 ~]# rpm -q libvirt
libvirt-0.6.2-1.el5
[root@test2 ~]# virsh restore /tmp/migr5.save
Domain restored from /tmp/migr5.save
[root@test2 ~]# virsh list
Id Name State
----------------------------------
0 Domain-0 running
5 migr5 idle
[root@test2 ~]#
Daniel

Tested on libvirt 0.6.3-3; the FV guest is running with 2G of memory, and both dom-0 systems have 16G of physical memory on rhel-5.4.
virt-manager error info:
Error migrating domain: POST operation failed: xend_post: error from xen daemon: (xend.err '/usr/lib64/xen/bin/xc_save 17 6 0 0 5 failed')
Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/engine.py", line 561, in migrate_domain
    vm.migrate(destconn)
  File "/usr/share/virt-manager/virtManager/domain.py", line 1378, in migrate
    self.vm.migrate(self.connection.vmm, flags, None, dictcon.get_short_hostname(), 0)
  File "/usr/lib64/python2.4/site-packages/libvirt.py", line 378, in migrate
    if ret is None:raise libvirtError('virDomainMigrate() failed', dom=self)
libvirtError: POST operation failed: xend_post: error from xen daemon: (xend.err '/usr/lib64/xen/bin/xc_save 17 6 0 0 5 failed')
virsh error info:
[root@intel-5130-32-1 ~]# virsh list --all
Id Name State
----------------------------------
0 Domain-0 running
1 foo idle
[root@intel-5130-32-1 ~]# virsh migrate foo xen+ssh://10.66.83.192
root@10.66.83.192's password:
error: POST operation failed: xend_post: error from xen daemon: (xend.err "can't connect: Name or service not known")
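"Name or service not known" is the usual wording of a failed hostname lookup. The report does not say which name xend was trying to resolve, so the following is only a rough diagnostic sketch: it pulls the destination host out of a migration URI like the one used above and checks whether a given name resolves on the local machine. The helper names destination_host and resolves are made up for illustration.

```python
import socket
from urllib.parse import urlsplit


def destination_host(uri):
    """Return the host part of a libvirt-style connection URI."""
    return urlsplit(uri).hostname


def resolves(host):
    """Return True if the local resolver can turn `host` into an address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        # Raised when the lookup fails, e.g. "Name or service not known".
        return False


uri = "xen+ssh://10.66.83.192"  # URI taken from the virsh command above
host = destination_host(uri)
print(host, "resolves" if resolves(host) else "does not resolve")
```

Since the URI here already uses an IP address, the failing lookup presumably involves some other name (for example a node hostname exchanged between the two xend daemons). Running resolves() on each node against the other node's hostname would be one way to check that guess.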
Comment #11 looks like a migration error. But as #1 pointed out, that bug could be reproduced with the save command, and in #8 it disappeared.
If migration now fails, that could be related to a variety of other reasons, so let's not mix things. We need to check migration again, but I think it should be a different bug, especially since the reported error, "Name or service not known", is clearly something completely different from the original bug, which to me is still in MODIFIED state.
Daniel

Actually, this bug was already fixed, but the error "Name or service not known" still occurs; a new bug will be opened for that issue later.
[root@intel-5130-32-1 ~]# virsh start demo
Domain demo started
[root@intel-5130-32-1 ~]# virsh migrate demo xen+ssh://10.66.83.192
root@10.66.83.192's password:
error: POST operation failed: xend_post: error from xen daemon: (xend.err "can't connect: Name or service not known")

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being closed
with a resolution of ERRATA. For more information on the solution
and/or where to find the updated files, please follow the link below.
You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHEA-2009-1269.html