Description of problem:
I can't migrate a VM.

Version-Release number of selected component (if applicable):
Kernel: 2.6.18-53.el5xen

How reproducible:
root@xen1# xm migrate squid xen2

Steps to Reproduce:

Actual results:
migrating-Squid    1    511    1    ---s--    3403.1

Expected results:
Squid              1    511    1    r-----    3403.1

Additional info:
xend log:
[2007-10-31 16:30:52 xend.XendDomainInfo 13238] INFO (XendDomainInfo:941) Domain has shutdown: name=migrating-Nagios id=1 reason=suspend.
[2007-10-31 16:30:52 xend.XendDomainInfo 13238] INFO (XendDomainInfo:941) Domain has shutdown: name=migrating-Nagios id=1 reason=suspend.

And the VM never finishes migrating.
This is not nearly enough information to diagnose the problem. Please provide:
- /var/log/xen/xend.log from the source host
- /var/log/xen/xend-error.log from the source host
- /var/log/xen/xend.log from the destination host
- /var/log/xen/xend-error.log from the destination host
- The /etc/xen/[DOMAIN NAME] config file for the guest in question
- The /etc/xen/xend-config.sxp file from both hosts
/var/log/xen/xend.log from the destination host:

  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1321, in check_name
    raise VmError("VM name '%s' already in use by domain %d" %
VmError: VM name 'Nagios' already in use by domain 3

But "xm list" on the destination host shows:

Name                ID  Mem(MiB)  VCPUs  State   Time(s)
Domain-0             0       733      8  r-----    438.4

The /etc/xen/xend-config.sxp file from both hosts. Both hosts have the same config, with different IP addresses:

(xend-unix-server yes)
(xend-relocation-server yes)
(xend-port 8000)
(xend-relocation-port 8002)
(xend-address '10.10.1.147')
(xend-relocation-address '10.10.1.147')
(xend-relocation-hosts-allow '')
(network-script network-bridge)
(vif-script vif-bridge)
(dom0-min-mem 196)
(dom0-cpus 0)
(vnc-listen '10.10.1.147')
(vncpasswd 'xxxxxx')
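Since the config above relies on the relocation listener being reachable on port 8002, one quick sanity check when a migration hangs is to probe that port from the source host. This is a generic sketch, not part of the ticket; the function name is mine, and the host/port values are taken from the xend-config.sxp quoted above:

```python
import socket

def relocation_port_open(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run from the source host, using the destination's relocation address:
# relocation_port_open("10.10.1.147", 8002)
```

A True result only proves TCP connectivity; the (xend-relocation-hosts-allow '') setting must still permit the source host at the xend level.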
I need the *FULL* log files I asked for, not merely a couple of lines. Please attach the full logs to this ticket. I also still need the guest configuration file.
I get the same problem: since the upgrade to 5.1, I am unable to successfully migrate from one server to another. The setup has not changed; only the 5.0 -> 5.1 upgrade was done (and a reboot under kernel 2.6.18-53.el5xen). During the migration, I always get a failure during xm save (from /var/log/xen/xend.log):

[2007-12-24 09:53:18 xend 7487] DEBUG (XendCheckpoint:89) [xc_save]: /usr/lib/xen/bin/xc_save 22 3 0 0 1
[2007-12-24 09:53:18 xend 7487] INFO (XendCheckpoint:351) ERROR Internal error: Couldn't enable shadow mode
[2007-12-24 09:53:18 xend 7487] INFO (XendCheckpoint:351) Save exit rc=1
[2007-12-24 09:53:18 xend 7487] ERROR (XendCheckpoint:133) Save failed on domain cube1 (3).
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 110, in save
    forkHelper(cmd, fd, saveInputHandler, False)
  File "/usr/lib/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 339, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib/xen/bin/xc_save 22 3 0 0 1 failed
[2007-12-24 09:53:18 xend.XendDomainInfo 7487] DEBUG (XendDomainInfo:1598) XendDomainInfo.resumeDomain(3)
[2007-12-24 09:53:18 xend.XendDomainInfo 7487] WARNING (XendDomainInfo:923) Domain has crashed: name=migrating-cube1 id=3.
[2007-12-24 09:53:18 xend.XendDomainInfo 7487] INFO (XendDomainInfo:1719) Dev 51712 still active, looping...
I have successfully solved this problem by adding a dom0_mem parameter to the Xen kernel line on the domain 0 servers:

title CentOS (2.6.18-92.1.6.el5xen)
        root (hd0,0)
        kernel /xen.gz-2.6.18-92.1.6.el5 dom0_mem=256M com2=57600,8n1 console=com2
        module /vmlinuz-2.6.18-92.1.6.el5xen ro root=/dev/vg00/root xencons=xvc console=xvc0
        module /initrd-2.6.18-92.1.6.el5xen.img

It seems that without this parameter, Xen tries to balloon physical memory automatically between dom0 and domU, and this seems to fail during migration. With the parameter, dom0 receives only the memory it needs (here 256M) and I no longer get migration failures. Limiting dom0 memory this way is probably a better setting in any case.
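The grub.conf edit above can be scripted when many dom0 hosts need the same change. This is a hypothetical helper (the function name and behavior are mine, not from the ticket) that appends dom0_mem only to hypervisor lines that lack it; run it on a copy of /boot/grub/grub.conf and diff the result before installing:

```python
def add_dom0_mem(grub_text, mem="256M"):
    """Append dom0_mem=<mem> to each 'kernel /xen.gz...' line that
    does not already carry it; all other lines pass through unchanged."""
    out = []
    for line in grub_text.splitlines():
        if line.lstrip().startswith("kernel /xen.gz") and "dom0_mem=" not in line:
            line = line + " dom0_mem=" + mem
        out.append(line)
    return "\n".join(out)
```

Because already-patched lines are skipped, the helper is idempotent and safe to rerun on a grub.conf that was partially edited by hand.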
Artificially limiting dom0 is not an acceptable fix for this issue. To fix this properly we need the complete log files and config files from a time immediately after the migration failed:
- /var/log/xen/xend.log from the source host
- /var/log/xen/xend-error.log from the source host
- /var/log/xen/xend.log from the destination host
- /var/log/xen/xend-error.log from the destination host
- The /etc/xen/[DOMAIN NAME] config file for the guest in question
- The /etc/xen/xend-config.sxp file from both hosts
We also need 'xm info' output from both nodes, and 'xm list --long' output from both nodes.
Actually, we recently put a patch into RHEL-5.4 that reduces the likelihood of live migration failing due to memory fragmentation (https://bugzilla.redhat.com/show_bug.cgi?id=469130). Given that limiting dom0 memory helped, this could well explain the situation. Is there any chance one of the original reporters can boot their dom0 with the latest kernel here: http://people.redhat.com/dzickus/el5/ and see if it improves the situation?

Chris Lalancette
No response from the reporters in many months, and I believe this issue is now fixed in 5.4. I'm going to close this out as CURRENTRELEASE; if it is still a problem, please feel free to reopen the bug.

Chris Lalancette
This bug was closed during 5.5 development and it's being removed from the internal tracking bugs (which are now for 5.6).