Bug 472785

Summary: Add /sbin/fence_libvirt script to cman package
Product: [Fedora] Fedora Reporter: James Laska <jlaska>
Component: cman Assignee: Jan Friesse <jfriesse>
Status: CLOSED UPSTREAM QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: medium    
Version: 10 CC: agk, bpeck, ccaulfie, cfeist, crobinso, fdinitto, jfriesse, jturner, lhh, mbroz, mdehaan, mjenner, swhiteho
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-01-21 06:03:17 UTC Type: ---
Attachments:
Description                        Flags
/sbin/fence_libvirt                none
Modified version of fence agent    none

Description James Laska 2008-11-24 16:32:29 UTC
Created attachment 324501 [details]
/sbin/fence_libvirt

Description of problem:

There is no /sbin/fence_libvirt script that I can find.  I'm looking for something analogous to the fence_lpar command that I use to control power mgmt on remote PowerPC hardware management systems.

The following patch adds a /sbin/fence_libvirt script.  I've tested this on an F10 libvirtd.

Version-Release number of selected component (if applicable):
cman-2.99.12-1.fc10.x86_64


# /sbin/fence_libvirt -x -a hostname -l USER -p PASSWORD -s GUEST -o status
Status: ON

# /sbin/fence_libvirt -x -a hostname -l USER -p PASSWORD -s GUEST -o off
Success: Powered OFF

# /sbin/fence_libvirt -x -a hostname -l USER -p PASSWORD -s GUEST -o on
Success: Powered ON

Comment 1 Fabio Massimo Di Nitto 2008-11-24 20:24:13 UTC
I am not sure I understand why we need an extra script when we have fence_xvmd.

Lon, you are the expert here...

Fabio

Comment 2 Lon Hohberger 2008-11-24 21:19:03 UTC
Really, it's a case of "six of one, half a dozen of the other".  I suggested sending it to cluster-devel (for community review), but have not seen it float across the mailing list yet.  I don't see any reason it shouldn't be included, which is kind of why I was going to defer it to the community.


This one trades some bare metal configuration for some guest configuration.  It:
 * Uses ssh and the standard python fencing library
 * Doesn't need a key file distributed (if you are going to share SSH keys then you might as well use fence_xvm[d]...)
 * Has no firewall implications (apart from ssh)
 * Doesn't require cman to be installed on the host - just libvirt and ssh
 * It also doesn't require a daemon to be running (though, we're talking about a whopping 2-2.5MB of RSS... i.e. not a lot; sshd+bash+virsh chews up tons more memory when this fence agent is operating)

By design, it is limited in what it can do.  It can't:
 * Manage migratory VMs where the location of the VM is not known
 * Respond to callers if the bare metal machine hosting it is dead + fenced
 * Be managed (today) using Conga

It is also lower-performance and more memory-hungry on the client side (SSH connections, fork/exec, etc.).

I'd wager it's not something Red Hat would include in RHEL since it's redundant with fence_xvm[d], and a functional subset thereof.

Sample configuration looks like this:

  <fencedevices>
    <fencedevice agent="fence_libvirt" name="virsh_fence"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="foo" >
      <fence>
        <method name="1">
          <device name="virsh_fence" ipaddr="10.1.1.2" login="root"
                  passwd="xyzpdq" managed="domain_foo" />
        </method>
      </fence>
    </clusternode>
  </clusternodes>

Compared to the following with fence_xvm[d]:

  <fencedevices>
    <fencedevice agent="fence_xvm" name="xvm_fence"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="foo" >
      <fence>
        <method name="1">
          <device name="xvm_fence" domain="domain_foo" />
        </method>
      </fence>
    </clusternode>
  </clusternodes>

Effectively, it saves key distribution which is required with fence_xvm[d] as well as adding fence_xvmd to rc.local for running in non-cluster mode.
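
To make the fence_libvirt sample above concrete, here is a minimal Python sketch of how the <fencedevice> and per-node <device> attributes could be merged into the key=value lines that a fence agent reads on standard input. The wrapping <cluster> root, the function names, and the choice to drop the "name" cross-reference are assumptions made for illustration; this is not the actual fenced code.

```python
import xml.etree.ElementTree as ET

# The sample configuration from above, wrapped in a root element so that
# it parses as a single XML document (the wrapper is an assumption).
CLUSTER_CONF = """
<cluster>
  <fencedevices>
    <fencedevice agent="fence_libvirt" name="virsh_fence"/>
  </fencedevices>
  <clusternodes>
    <clusternode name="foo">
      <fence>
        <method name="1">
          <device name="virsh_fence" ipaddr="10.1.1.2" login="root"
                  passwd="xyzpdq" managed="domain_foo"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
</cluster>
"""


def fence_options(conf_xml, node_name):
    """Return one merged option dict per <device> configured for node_name."""
    root = ET.fromstring(conf_xml)
    # Index the shared device definitions by their name.
    devices = {d.get("name"): dict(d.attrib) for d in root.iter("fencedevice")}
    for node in root.iter("clusternode"):
        if node.get("name") != node_name:
            continue
        merged_list = []
        for dev in node.iter("device"):
            merged = dict(devices[dev.get("name")])
            merged.update(dev.attrib)   # per-node attributes win
            merged.pop("name", None)    # only used to link the two elements
            merged_list.append(merged)
        return merged_list
    raise KeyError(node_name)


def to_stdin(options):
    """Render options as key=value lines, one per line."""
    return "".join(f"{key}={value}\n" for key, value in options.items())


if __name__ == "__main__":
    for opts in fence_options(CLUSTER_CONF, "foo"):
        print(to_stdin(opts), end="")
```

Seen this way, the fence_xvm variant needs only the domain name per node, while fence_libvirt carries the host address and SSH credentials in the cluster configuration.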

My expectation is that James would maintain it if it gets accepted by the community.  I think it is likely that we currently do not have capacity to maintain two fence agents that are so close in function, especially once fence_xvm has native guest->host channel support (which would entirely obviate key distribution when using oVirt).

Comment 3 Lon Hohberger 2008-11-24 21:29:52 UTC
For completeness, sample fence_xvm command line using comment #1 as a template:

# /sbin/fence_xvm -H GUEST -o off

# /sbin/fence_xvm -H GUEST -o reboot

By design, fence_xvm does not support starting up guests because it's designed to operate in environments where another entity is responsible for starting / stopping virtual machines (oVirt, a parent cluster, Qumranet, etc.).  More specifically, there could be multiple machines upon which the guest is allowed to start, and it's not fence_xvm[d]'s job to "pick" a location.  That is, to start a guest, one would have to do it manually.

Given a set of virtual machines with static locations and no other management interaction, you can have "on" capability.

Comment 4 Michael DeHaan 2008-11-24 21:32:36 UTC
I can think of several reasons to have something libvirt-remote based instead.

(a)  users don't want to have to edit XML config files, as these suck
(b)  virsh is simpler and offers multiple transport options which are up to the user
(c)  xvm seems to imply Xen and we want something generic

Comment 5 Lon Hohberger 2008-11-24 22:56:24 UTC
(In reply to comment #4)
> I can think of several reasons to have something libvirt-remote based instead.
> 
> (a)  users don't want to have to edit XML config files, as these suck

(1) This isn't required by either agent in non-clustered environments; we provided command line examples for both.  

(2) This bugzilla is for a cluster fencing agent.  You have to edit the cluster config for the cluster to make use of fencing agents.  This happens to be XML, but soon LDAP will be supported ;).  Virsh doesn't make this problem go away.


> (b)  virsh is simpler and offers multiple transport options which are up to the
> user

You're certainly correct about virsh allowing different transports.

For the rest of your point here, this would be "sort of" correct if we were comparing command-line use of virsh directly to fence_xvm or fence_libvirt to manage non-clustered virtual machines.

If command-line use is the target use case for fence_libvirt, then one might better use virsh directly instead of either fence_xvm[d] or fence_libvirt.

I.E.:

  virsh -c qemu://hostname/system destroy GUEST

... is functionally equivalent to:

  fence_libvirt -x -a hostname -l USER -p PASSWORD -s GUEST -o off

... and:

  fence_xvm -H GUEST

... and:

  ssh USER@hostname "virsh destroy GUEST"

... etc.
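
As a rough illustration of that equivalence, the sketch below maps a fence action onto a virsh subcommand run over ssh. Only the off/destroy pairing comes from this discussion; the start and domstate pairings, the function name, and the use of key-based ssh instead of a password prompt are assumptions, and the real agent drives an interactive shell rather than building a one-shot command.

```python
import shlex

# "off" -> "destroy" is taken from the examples above; the other two
# pairings are assumptions based on the standard virsh subcommands.
VIRSH_SUBCOMMAND = {
    "off": "destroy",
    "on": "start",
    "status": "domstate",
}


def build_ssh_virsh_command(action, user, host, guest):
    """Return an argv list that runs the matching virsh command via ssh."""
    try:
        subcommand = VIRSH_SUBCOMMAND[action]
    except KeyError:
        raise ValueError(f"unsupported action: {action!r}") from None
    # Quote the guest name because the remote side passes it through a shell.
    remote = f"virsh {subcommand} {shlex.quote(guest)}"
    return ["ssh", f"{user}@{host}", remote]


if __name__ == "__main__":
    print(build_ssh_virsh_command("off", "USER", "hostname", "GUEST"))
```

The returned list can be handed to subprocess.run() as-is; ssh takes the remote command as its trailing argument.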


> (c)  xvm seems to imply Xen and we want something generic

Implications can be deceiving.  It's always called libvirt APIs directly and can work with any back-end that virsh can.

Comment 6 Lon Hohberger 2008-11-24 22:57:35 UTC
(Meant any back-end that libvirt can, since fence_xvm doesn't use virsh, but rather libvirt C APIs directly.)

Comment 7 Fabio Massimo Di Nitto 2008-11-25 05:11:27 UTC
Ok, I am happy to

Comment 8 Fabio Massimo Di Nitto 2008-11-25 05:13:36 UTC
I am absolutely happy to plug this in, given the different +1 opinions on this matter.

I added Honza to the CC list for a review of the agent. A few changes are required before it can be plugged in, but nothing too complicated.

Honza: can you please make sure the agent conforms to the required standards that Mark and you are working on?

I'll be happy to see this in the master and STABLE2 branches before the 2.99.13 release next week.

Fabio

Comment 9 Jan Friesse 2008-11-25 09:03:52 UTC
So, a few words from me.

This agent uses our new infrastructure, which is nice. What is not so nice is that it uses the wrong option for the name of the virtual machine (it should be plug, -n, not managed, -s). This is very easy to fix.

A bigger problem is handling the shell command prompt. It took me a very long time to make this work correctly and relatively bullet-proof, and I know it's still not as good as it could be (take a look at the pexpect ssh module, pxssh.py: it uses Levenshtein distance and a few newlines to identify what the command prompt is).
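
For readers unfamiliar with that trick, here is a small Python sketch of the idea: send a couple of newlines, read what the shell prints after each, and treat the output as the prompt if the two reads are nearly identical by Levenshtein distance. The function names and the 0.4 tolerance are assumptions for illustration, not a copy of pxssh.py.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def same_prompt(first, second, tolerance=0.4):
    """Guess whether two reads after a bare newline show the same prompt.

    The tolerance is a hypothetical value: the reads count as the same
    prompt when the edit distance is small relative to the first read.
    """
    if not first:
        return False
    return levenshtein(first, second) / len(first) < tolerance


if __name__ == "__main__":
    print(same_prompt("[root@host ~]# ", "[root@host ~]# "))
```

Allowing a small distance instead of requiring equality tolerates prompts that change slightly between reads, such as ones that embed the working directory or a timestamp.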

The next small problem is that this code doesn't support the list/monitor operations, but that is a really new feature.

I don't think it is realistic for this code to go into the RHEL tree, but it will be in master/STABLE2 for sure.

Comment 10 Jan Friesse 2008-11-25 17:17:27 UTC
Created attachment 324631 [details]
Modified version of fence agent

The attachment contains a modified version of the fence agent.

It contains the following changes:
- Better support for shell command line handling
- Proper name of plug option
- Fix node info. A node can be not only running but also blocked, which basically means running but waiting for I/O
- Proper handling of nonexistent guest machines
- Automatically force using -x
- Support for list/monitor operations
- Man page
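
The node info fix in the list above can be sketched as follows: a guest counts as powered on when virsh reports it as either running or blocked. The function name is hypothetical, and treating every other state as "off" is an assumption for illustration; the attached agent may classify states differently.

```python
# States that count as "powered on": "running", plus "blocked"
# (running, but waiting for I/O) as described in the change list.
ON_STATES = {"running", "blocked"}


def power_status(domstate_output):
    """Map the text printed by `virsh domstate GUEST` to "on" or "off".

    Assumption: any state other than running/blocked is reported as off.
    """
    state = domstate_output.strip().lower()
    return "on" if state in ON_STATES else "off"


if __name__ == "__main__":
    for sample in ("running\n", "blocked\n", "shut off\n"):
        print(repr(sample), "->", power_status(sample))
```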

Now it's integrated in our master/STABLE2 tree.

Comment 11 Fabio Massimo Di Nitto 2008-11-25 17:54:49 UTC
Excellent work Jan.

Thanks
Fabio

Comment 12 Bug Zapper 2008-11-26 05:52:26 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping