Bug 480178 - fence_xvmd Fails to Reboot VM
Summary: fence_xvmd Fails to Reboot VM
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.2
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 5.3
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-01-15 16:09 UTC by Gavin Edwards
Modified: 2009-09-02 11:06 UTC (History)

Fixed In Version: cman-2.0.100-1.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-09-02 11:06:33 UTC
Target Upstream Version:
Embargoed:


Attachments
Fix (2.40 KB, patch)
2009-01-15 18:02 UTC, Lon Hohberger
Logs (3.76 KB, text/plain)
2009-01-15 18:04 UTC, Lon Hohberger
Fixed patch. (2.61 KB, patch)
2009-01-15 18:17 UTC, Lon Hohberger


Links
System: Red Hat Product Errata
ID: RHSA-2009:1341
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Low: cman security, bug fix, and enhancement update
Last Updated: 2009-09-01 10:43:16 UTC

Description Gavin Edwards 2009-01-15 16:09:10 UTC
Description of problem:
When I issue a manual fence_xvm command to test fencing a Xen VM, the instance is shut down but not restarted.

If I run "fence_xvmd -fdddd" for debugging I see the following output:
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
test                     ad8942f2-66a7-707c-765f-abe7ad5b06a9 00001 00002
Storing test
Request to fence: test
test is running locally
Plain TCP request
ipv4_connect: Connecting to client
ipv4_connect: Success; fd = 11
Rebooting domain test...
[[ XML Domain Info ]]
<domain type='xen' id='1'>
  <name>test</name>
  <uuid>ad8942f2-66a7-707c-765f-abe7ad5b06a9</uuid>
  <os>
    <type>linux</type>
    <kernel>/boot/vmlinuz-2.6.18-92.1.22.el5xen</kernel>
    <initrd>/boot/initrd-2.6.18-92.1.22.el5xen-no-scsi.img</initrd>
    <root>/dev/sda5</root>
  </os>
  <memory>524288</memory>
  <vcpu>1</vcpu>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <interface type='bridge'>
      <source bridge='xenbr0'/>
      <target dev='vif1.0'/>
      <mac address='00:16:3E:24:D0:80'/>
      <script path='vif-bridge'/>
    </interface>
    <disk type='block' device='disk'>
      <driver name='phy'/>
      <source dev='sda5'/>
      <target dev='sda5'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='phy'/>
      <source dev='sda6'/>
      <target dev='sda6'/>
    </disk>
    <console tty='/dev/pts/2'/>
  </devices>
</domain>

[[ XML END ]]
Virtual machine is Linux
Unlinkiking os block
[[ XML Domain Info (modified) ]]
<?xml version="1.0"?>
<domain type="xen" id="1">
  <name>test</name>
  <uuid>ad8942f2-66a7-707c-765f-abe7ad5b06a9</uuid>
  <memory>524288</memory>
  <vcpu>1</vcpu>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <interface type="bridge">
      <source bridge="xenbr0"/>
      <target dev="vif1.0"/>
      <mac address="00:16:3E:24:D0:80"/>
      <script path="vif-bridge"/>
    </interface>
    <disk type="block" device="disk">
      <driver name="phy"/>
      <source dev="sda5"/>
      <target dev="sda5"/>
    </disk>
    <disk type="block" device="disk">
      <driver name="phy"/>
      <source dev="sda6"/>
      <target dev="sda6"/>
    </disk>
    <console tty="/dev/pts/2"/>
  </devices>
</domain>

[[ XML END ]]
[REBOOT] Calling virDomainDestroy(0xdbd710)
Domain has been shut off
Calling virDomainCreateLinux()...
libvir: XML error : missing operating system information for test
libvir: Xen Daemon error : XML description for domain is not well formed or invalid

Version-Release number of selected component (if applicable):
cman-2.0.84-2.el5_2.3

How reproducible:
Every time

Steps to Reproduce:
1. Create a trivial 1-node Dom0 cluster with fence_xvmd set to run
2. Create a trivial 2-node DomU cluster
3. Create your fence_xvm keys across all nodes
4. Manually run fence_xvm -H <ArbitraryDomUhostname> on Dom0
  
Actual results:
The DomU is destroyed but not recreated, so it has to be restarted manually.

Expected results:
DomU should be destroyed and recreated automatically.

Additional info:

Comment 1 Lon Hohberger 2009-01-15 17:57:00 UTC
So, it looks like this was introduced with the rebase from libvirt 0.2.x to 0.3.x.  The solution is to try both ways:

 * First, try virDomainCreateLinux() assuming the unmodified domain description will work.
 * If that fails, remove the <os/> block, as was previously required, and try again.
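
The try-both-ways approach in the list above can be sketched as follows. This is a minimal illustration in Python, not the attached patch: `create` is a hypothetical callable standing in for virDomainCreateLinux() (assumed here to return a handle on success and None on failure), and `strip_os_block` and `recreate_domain` are names invented for this example.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional


def strip_os_block(domain_xml: str) -> str:
    """Return the domain XML with any top-level <os> element removed."""
    root = ET.fromstring(domain_xml)
    for os_elem in root.findall("os"):
        root.remove(os_elem)
    return ET.tostring(root, encoding="unicode")


def recreate_domain(
    domain_xml: str,
    create: Callable[[str], Optional[object]],
) -> Optional[object]:
    """Try the unmodified description first, then retry without <os>.

    `create` is a hypothetical stand-in for virDomainCreateLinux():
    it is assumed to return a handle on success and None on failure.
    """
    domain = create(domain_xml)
    if domain is not None:
        return domain
    # Fall back to the older behaviour: drop the <os> block and retry.
    return create(strip_os_block(domain_xml))
```

With this ordering, a libvirt that accepts the full description never sees the stripped one, while a libvirt that rejects the <os> block still gets the form it previously required.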

This is important, but as I have found, not deemed 'critical' since the most important function of fencing is 'off'.  'On' (i.e. the other half of reboot) is not a critical action from a cluster perspective.

Comment 2 Lon Hohberger 2009-01-15 18:02:54 UTC
Created attachment 329116
Fix

Patch which implements a fix.

Comment 3 Lon Hohberger 2009-01-15 18:04:44 UTC
Created attachment 329117
Logs

Note that the fix works (the domain is still operational and was restarted).  Furthermore, virDomainCreateLinux() works with the unaltered XML description.  Unfortunately, it appears virDomainCreateLinux() does not return a success code even though the domain is created.
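
Since the create call may report failure even though the domain comes up, one possible way to cope is to check the domain's actual state before treating the reboot as failed. The Python sketch below only illustrates that idea; this report does not say whether the attached patch does anything like it. `create` and `is_running` are hypothetical callables standing in for the libvirt create call and a domain state lookup, and `create_and_verify` is a name invented for this example.

```python
from typing import Callable, Optional


def create_and_verify(
    domain_xml: str,
    name: str,
    create: Callable[[str], Optional[object]],
    is_running: Callable[[str], bool],
) -> bool:
    """Report success if create() succeeds or the domain is seen running.

    Both callables are hypothetical stand-ins, not real libvirt bindings.
    """
    handle = create(domain_xml)
    if handle is not None:
        return True
    # create() reported failure; check the actual state before giving up.
    return is_running(name)
```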

Comment 4 Lon Hohberger 2009-01-15 18:17:40 UTC
Created attachment 329119
Fixed patch.

Corrected fix.  Logic error.

Comment 5 Lon Hohberger 2009-01-15 18:23:42 UTC
I have been unable to reproduce on libvirt versions going back to 0.1.8 from the RHEL5 channel.

Comment 11 Lon Hohberger 2009-07-22 19:12:54 UTC
Cause: Attempting to reboot a VM using fence_xvm

Consequence: The VM would remain shut off instead of restarting.

Fix: An issue that prevented correct VM creation was addressed.

Result: The VM is now correctly restarted when an administrator requests a reboot of the domain.

Comment 13 errata-xmlrpc 2009-09-02 11:06:33 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1341.html

