Description of problem:
When creating many domains and doing reboots, not all domains are started.

Message from /var/log/xen/xend.log:
[2007-02-13 08:41:29 xend 2209] INFO (image:138) buildDomain os=linux dom=7 vcpus=1
[2007-02-13 08:41:35 xend 2209] INFO (image:214) configuring linux guest
[2007-02-13 08:41:37 xend 2209] INFO (image:138) buildDomain os=linux dom=8 vcpus=1
[2007-02-13 08:41:39 xend 2209] INFO (XendDomain:370) Domain vm_5 (8) unpaused.
[2007-02-13 08:41:39 xend.XendDomainInfo 2209] WARNING (XendDomainInfo:875) Domain has crashed: name=vm_4 id=7.
[2007-02-13 08:41:40 xend.XendDomainInfo 2209] ERROR (XendDomainInfo:1661) VM vm_4 restarting too fast (13.252752 seconds since the last restart). Refusing to restart to avoid loops.

Version-Release number of selected component (if applicable):
Version=5 beta 2
Hardware=ibmx306m
Memory=3GB
CPU=Intel(R) Pentium(R) 4 CPU 3.00GHz (no HT/SMP enabled)
xen-libs-3.0.3-8.el5
xen-3.0.3-8.el5
kernel-xen-2.6.18-1.2747.el5

How reproducible:
Every time some VMs are missing.

Steps to Reproduce:
- ks install server using base + @virtualisation packages
- ks install 9 guests using virt-install
- sed -ie 's/XENDOMAINS_SAVE=.*/XENDOMAINS_SAVE=/' /etc/sysconfig/xendomains # this does a shutdown instead of a suspend
- ln -s /etc/xen/MY_VMS_* /etc/xen/auto
- reboot
- after the reboot, verify that all VMs are started

Actual results:
I did 13 reboots. This is how often each VM came up automatically:
vm_1 13
vm_2 13
vm_3 12
vm_4 11
vm_5 9
vm_6 6
vm_7 5
vm_8 6
vm_9 6
So vm_7 only started 5 times in 13 tries.

Expected results:
All machines are started at every reboot.

Additional info:
fc6 with a 2.6.19 kernel has similar behaviour. The "Domain has crashed:" entries
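The "verify that all VMs are started" step above can be scripted. A minimal sketch, assuming the /etc/xen/auto symlink setup described in the reproduction steps and that `xm list` prints one running domain per line (the `check_auto` name is mine, not part of any Xen tooling):

```shell
#!/bin/sh
# check_auto CONFIG_DIR "LIST_CMD"
# Prints one "MISSING: <name>" line per auto-start config whose domain
# is not running. Defaults match the setup described in this report.
check_auto() {
    dir=${1:-/etc/xen/auto}
    list=${2:-"xm list"}
    for cfg in "$dir"/*; do
        [ -e "$cfg" ] || continue          # empty dir: nothing to check
        name=$(basename "$cfg")
        # The config file name is assumed to match the domain name.
        $list | grep -qw "$name" || echo "MISSING: $name"
    done
}
```

Running `check_auto` after each boot and counting the MISSING lines would reproduce the per-VM tally shown under "Actual results".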
Hmm, this is a little worrying - if it can't deal with multiple VMs starting in very quick succession, it sounds like there is a race condition/scalability issue hiding in either the HV or the XenD stack.

Can you reproduce this again and capture the output of 'xm dmesg' once booting has completed? This will hopefully show whether any hypervisor issues are being reported.

Also, can you attach the full /var/log/xen/xend.log, /var/log/xen/xend-debug.log, and /var/log/xen/xen-hotplug.log, and, if any of the guests are HVM, also the qemu-dm-*.log files?

Finally, can you attach the /etc/xen config file for at least one of the guests? If they all have basically the same config, one is sufficient; if every VM is different, upload a representative set.
Created attachment 151099 [details]
tgz logs and config

I am using only RHEL5 xen-guests, no HVM. (see first post)

[root@rhrc1s1 x]# crontab -l
01,31 * * * * /usr/sbin/xm list | logger -t XEN1
14,44 * * * * /usr/sbin/xm list | logger -t XEN2
15,45 * * * * /sbin/reboot

[root@rhrc1s1 x]# uname -a
Linux rhrc1s1 2.6.18-8.el5xen #1 SMP Fri Jan 26 14:42:21 EST 2007 i686 i686 i386 GNU/Linux

xm dmesg > var/log/xen/xm.dmesg.out
dmesg > var/log/xen/dmesg.out

The tgz file contains:
/var/log/xen/*
/etc/xen/*
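The two `xm list | logger` crontab entries above leave a record in syslog of which guests were running at each check. A small sketch of how that record could be turned into the per-VM boot tally from the original report (the `vm_N` naming, the /var/log/messages path, and the `tally_boots` name are assumptions, not anything in the attached logs):

```shell
#!/bin/sh
# tally_boots [LOGFILE]
# Counts how often each vm_N guest name appears in the syslog lines
# written by the XEN1/XEN2 logger crontab entries, most frequent first.
tally_boots() {
    grep -o 'vm_[0-9][0-9]*' "${1:-/var/log/messages}" | sort | uniq -c | sort -rn
}
```

A guest that failed to auto-start after some reboots would show a lower count than its siblings, matching the skew seen in "Actual results".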
We did some work in 5.1 to make this less likely to happen, but I'm not sure if it is completely fixed. Is this still a problem? Thanks, Chris Lalancette
Well, I have tried it using my SRPMS, which can be found at http://people.redhat.com/minovotn/xen, and I found no problem. I booted 9 domains total and all of them booted correctly when testing on my box. The configuration was 4 PV and 5 FV machines...
Michal, I am unable to reproduce the problem with RHEL 5.3 after setting dom0_mem=512M in grub.conf. My tests ran 20 RHEL 5.3 64-bit domUs with 256MB each. Please set the state to fixed.
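For reference, the dom0_mem setting goes on the hypervisor (xen.gz) line of grub.conf, not on the dom0 kernel's module line. A sketch of such an entry; the kernel/initrd versions and root device here are illustrative, not taken from the test box above:

```
# /boot/grub/grub.conf (versions and paths are examples only)
title Red Hat Enterprise Linux Server (2.6.18-128.el5xen)
        root (hd0,0)
        kernel /xen.gz-2.6.18-128.el5 dom0_mem=512M
        module /vmlinuz-2.6.18-128.el5xen ro root=/dev/VolGroup00/LogVol00
        module /initrd-2.6.18-128.el5xen.img
```

Capping dom0 memory this way leaves a fixed, predictable amount of RAM for the guests instead of letting dom0 balloon down as each guest starts.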
OK, thanks for the testing! Will close as FIXED. Chris Lalancette