Description of problem:

When I issue a manual fence_xvm command to test fencing a Xen VM, the instance is shut down but not restarted. If I run "fence_xvmd -fdddd" for debugging, I see the following output:

Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
test                     ad8942f2-66a7-707c-765f-abe7ad5b06a9 00001 00002
Storing test
Request to fence: test
test is running locally
Plain TCP request
ipv4_connect: Connecting to client
ipv4_connect: Success; fd = 11
Rebooting domain test...
[[ XML Domain Info ]]
<domain type='xen' id='1'>
  <name>test</name>
  <uuid>ad8942f2-66a7-707c-765f-abe7ad5b06a9</uuid>
  <os>
    <type>linux</type>
    <kernel>/boot/vmlinuz-2.6.18-92.1.22.el5xen</kernel>
    <initrd>/boot/initrd-2.6.18-92.1.22.el5xen-no-scsi.img</initrd>
    <root>/dev/sda5</root>
  </os>
  <memory>524288</memory>
  <vcpu>1</vcpu>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <interface type='bridge'>
      <source bridge='xenbr0'/>
      <target dev='vif1.0'/>
      <mac address='00:16:3E:24:D0:80'/>
      <script path='vif-bridge'/>
    </interface>
    <disk type='block' device='disk'>
      <driver name='phy'/>
      <source dev='sda5'/>
      <target dev='sda5'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='phy'/>
      <source dev='sda6'/>
      <target dev='sda6'/>
    </disk>
    <console tty='/dev/pts/2'/>
  </devices>
</domain>
[[ XML END ]]
Virtual machine is Linux
Unlinkiking os block
[[ XML Domain Info (modified) ]]
<?xml version="1.0"?>
<domain type="xen" id="1">
  <name>test</name>
  <uuid>ad8942f2-66a7-707c-765f-abe7ad5b06a9</uuid>
  <memory>524288</memory>
  <vcpu>1</vcpu>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <interface type="bridge">
      <source bridge="xenbr0"/>
      <target dev="vif1.0"/>
      <mac address="00:16:3E:24:D0:80"/>
      <script path="vif-bridge"/>
    </interface>
    <disk type="block" device="disk">
      <driver name="phy"/>
      <source dev="sda5"/>
      <target dev="sda5"/>
    </disk>
    <disk type="block" device="disk">
      <driver name="phy"/>
      <source dev="sda6"/>
      <target dev="sda6"/>
    </disk>
    <console tty="/dev/pts/2"/>
  </devices>
</domain>
[[ XML END ]]
[REBOOT] Calling virDomainDestroy(0xdbd710)
Domain has been shut off
Calling virDomainCreateLinux()...
libvir: XML error : missing operating system information for test
libvir: Xen Daemon error : XML description for domain is not well formed or invalid

Version-Release number of selected component (if applicable):

cman-2.0.84-2.el5_2.3

How reproducible:

Every time

Steps to Reproduce:

1. Create a trivial 1-node Dom0 cluster with fence_xvmd set to run
2. Create a trivial 2-node DomU cluster
3. Create your fence_xvm keys across all nodes
4. Manually run fence_xvm -H <ArbitraryDomUhostname> on Dom0

Actual results:

The DomU is destroyed but not recreated, so it has to be restarted manually.

Expected results:

The DomU should be destroyed and recreated automatically.

Additional info:
So, it looks like this was introduced with the rebase from libvirt 0.2.x to 0.3.x. The solution is to try both ways:

* First, try virDomainCreateLinux(), assuming the unmodified domain description will work.
* If that fails, remove the <os/> block, as was previously required, and attempt to create the domain that way.

This is important but, as I have found, not deemed 'critical', since the most important function of fencing is 'off'. 'On' (i.e. the other half of reboot) is not a critical action from a cluster perspective.
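
The two-step approach described above can be sketched roughly as follows. This is only an illustration in Python, not the actual patch (fence_xvmd is written in C): `create` is a hypothetical callable standing in for virDomainCreateLinux(), and `strip_os_block` and `recreate_domain` are names invented for this sketch.

```python
import xml.etree.ElementTree as ET


def strip_os_block(domain_xml):
    """Return the domain description with any top-level <os> element removed."""
    root = ET.fromstring(domain_xml)
    for os_elem in root.findall("os"):
        root.remove(os_elem)
    return ET.tostring(root, encoding="unicode")


def recreate_domain(domain_xml, create):
    """Try the unmodified description first, then retry without <os>.

    `create` is a hypothetical stand-in for virDomainCreateLinux(): it
    takes an XML string and returns a handle on success or None on failure.
    """
    handle = create(domain_xml)
    if handle is not None:
        return handle
    # Fall back to the form that older libvirt releases required.
    return create(strip_os_block(domain_xml))
```

Trying the unmodified description first means the <os> block is only dropped when the first attempt is rejected, so a libvirt that needs the operating system information never sees the stripped form.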
Created attachment 329116
Fix

Patch which implements a fix.
Created attachment 329117
Logs

Note that the fix works (the domain is still operational and was restarted). Furthermore, virDomainCreateLinux() works with the unaltered XML description. Unfortunately, it appears that virDomainCreateLinux() does not return a success code even so.
Created attachment 329119
Fixed patch.

Corrected fix; the previous patch contained a logic error.
I have been unable to reproduce on libvirt versions going back to 0.1.8 from the RHEL5 channel.
http://git.fedorahosted.org/git/?p=cluster.git;a=commit;h=c635f24c2fe4e07dd40c27dc7c44629b6ddbf045
Cause: Attempting to reboot a VM using fence_xvm.
Consequence: The VM would remain shut off instead of restarting.
Fix: An issue that prevented correct VM creation was addressed.
Result: The VM is now correctly restarted when an administrator requests a reboot of the domain.
An advisory has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1341.html