Bug 571557

Summary:	RHEL 5: Managing vms with clustertools and libvirt
Product:	Red Hat Enterprise Linux 5	Reporter:	Shane Bradley <sbradley>
Component:	doc-Cluster_Administration	Assignee:	Steven J. Levine <slevine>
Status:	CLOSED ERRATA	QA Contact:	ecs-bugs
Severity:	medium	Docs Contact:
Priority:	medium
Version:	5.4	CC:	abaron, iannis, mhideo, slevine, syeghiay, tao, teigland
Target Milestone:	rc	Keywords:	Documentation
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:
Clones:	746776 772324		Environment:
Last Closed:	2012-02-21 05:18:33 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	746776, 772324
Deadline:	2011-11-04

Description Shane Bradley 2010-03-08 20:29:52 UTC

Description of problem:
There is no locking that prevents a VM managed with clustertools from
being manually started on another node. Thus, if vm1 is started with
clusvcadm on nodeA and someone then starts vm1 on nodeB with
virsh (non-clustertools), corruption can and will occur.

Version-Release number of selected component (if applicable):
rgmanager-2.0.52-1.el5_4.3
libvirt-0.6.3-20.1.el5_4

How reproducible:
Every time

Steps to Reproduce:
1. Start vm as a clustered service on nodeA 
2. Start vm with virsh on nodeB
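
The two steps above can be sketched as commands. This is a dry-run illustration only: it prints the commands rather than running them, because running both against shared storage is exactly what corrupts the guest. The guest name "vm1" and the member names "nodeA" and "nodeB" are taken from the description; the "vm:" service prefix and the use of "virsh start" (rather than another libvirt or xm invocation) are assumptions about how the guest is defined.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps. It only prints the commands;
# do not run them for real against a guest whose data you care about.
# Assumed names: guest "vm1", cluster members "nodeA" and "nodeB".
VM=vm1

# Step 1, on nodeA: enable the VM as a clustered service through rgmanager.
step1="clusvcadm -e vm:$VM -m nodeA"

# Step 2, on nodeB: start the same guest directly through libvirt,
# bypassing rgmanager entirely (assumes the guest is known to libvirt there).
step2="virsh start $VM"

printf 'nodeA# %s\n' "$step1"
printf 'nodeB# %s\n' "$step2"
```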
  
Actual results:
The VM becomes corrupted because it is running on two nodes at the
same time.

Expected results:
The VM fails to start on a node if it is already running on another node.

Additional info:
Here is an updated document from engineering that explains some of
this behavior:
http://sources.redhat.com/cluster/wiki/VirtualMachineBehaviors

In RHEL 5.5 we added a way to reduce the chance of this occurring:
transient domain support, which includes support for a non-default
path for config files. This can be used as a preventive
measure. https://bugzilla.redhat.com/show_bug.cgi?id=545916

Comment 3 Lon Hohberger 2010-03-23 19:30:17 UTC

While we layer on top of libvirt with the cluster tools, the converse is not true.  We do not layer the libvirt tools on top of the cluster stack for any purpose; libvirt is neither designed nor intended to be cluster-aware.

Comment 6 Lon Hohberger 2010-03-24 19:59:20 UTC

One method to reduce the chances of "double-start" problems due to mistakes while using cluster/non-cluster tools in a clustered environment is to:

 * use rgmanager 2.0.52-1.el5_4.3 or later package release, and
 * store the virtual machine configuration files in a non-default location.

This non-default location may reside anywhere.  The advantage of using an NFS share or a shared GFS or GFS2 file system is that the administrator does not need to keep the configuration files in sync across the cluster members.  However, it is also permissible to use a local directory as long as the administrator keeps the contents synchronized somehow cluster-wide.

In the cluster configuration, virtual machines may reference this non-default location by using the "path" attribute. (Note: The 'path' attribute is a directory or set of directories separated by the colon ':' character, not a path to a specific file!)
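
As an illustration only, a cluster.conf fragment using the "path" attribute might look like the following. The guest name "vm1" and the directory /mnt/vm_configs are hypothetical; the directory stands in for whatever shared or synchronized location holds the guest's configuration file.

```xml
<rm>
  <!-- Hypothetical guest "vm1". /mnt/vm_configs is an assumed directory
       (for example on NFS, GFS, or GFS2) that holds its config file.
       The value names a directory, not the config file itself. -->
  <vm name="vm1" path="/mnt/vm_configs"/>
  <!-- Several directories may be listed, separated by colons, e.g.
       path="/mnt/vm_configs:/mnt/more_vm_configs" -->
</rm>
```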

This makes it more difficult to accidentally "start" a VM using xm or virsh, as the config file will be unknown out of the box to libvirt or the xm tool.
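
The directory-list meaning of the 'path' attribute can be sketched in shell. This is not the resource agent's actual code; it is a minimal illustration that assumes a guest's config file is named after the guest, with or without an ".xml" suffix. The function name find_vm_config is hypothetical.

```shell
#!/bin/sh
# find_vm_config NAME PATH_ATTR
# Print the first config file found for guest NAME in the colon-separated
# directory list PATH_ATTR; return 1 if none is found.
# Assumption: the config file is named NAME or NAME.xml.
find_vm_config() {
    name=$1
    old_ifs=$IFS
    IFS=:
    set -f                      # do not glob-expand directory names
    for dir in $2; do           # split the list on ':'
        for candidate in "$dir/$name" "$dir/$name.xml"; do
            if [ -f "$candidate" ]; then
                IFS=$old_ifs; set +f
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    IFS=$old_ifs; set +f
    return 1
}
```

For example, `find_vm_config vm1 /mnt/vm_configs:/mnt/more_vm_configs` (both directories hypothetical) searches each directory in turn. Passing the path of a config file itself instead of its directory finds nothing, which is the mistake the note above warns about.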

Comment 14 Steven J. Levine 2011-07-12 17:06:30 UTC

John -- I'm just looking at this for the first time, as part of a final RHEL 5.7 check -- it wasn't assigned to me so I had missed it. I'm not sure where this will go for RHEL 5 (where it goes in RHEL 6 is also uncertain, although it looks as if I can incorporate it into the new Managing Virtual Machines document that I'm working on, if it applies to that release as well). But I think there's no way we're going to address this for RHEL 5.7 this week.

Can we move this to RHEL 5.8?

Comment 19 Steven J. Levine 2011-11-01 19:42:14 UTC

I sent this note to Lon on 10/15, so I'm moving this to NEEDINFO:

Lon,

In response to BZ#571557, I am adding a small section to the RHEL 5
Cluster Administration document on configuring virtual machines in a
cluster, with the information you provided in that bug.

Could you look over what I've written?

http://documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/s1-vm-considerations-CA.html

I also added an "Important" note before Table C.21, the table of virtual
machine resource parameters:

http://documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/ap-ha-resource-params-CA.html#tb-vm-resource-CA

After you review this, I'll write something similar up for RHEL 6 -- but
I need to check on what virtual machine tools we support -- we don't
support virsh in RHEL 6, do we?

-Steven

Comment 20 Lon Hohberger 2011-12-19 22:17:52 UTC

The first section looks great.  Second section looks fine, but maybe we should indicate the corruption is specifically in the VMs themselves.

Comment 21 Steven J. Levine 2011-12-21 15:30:06 UTC

Lon:

The second section, the "Important" note, includes this sentence:

"Using virsh or libvirt tools to start the machine can result in the virtual machine running in more than one place, which can cause data corruption."

Would adding just a few words clarify this?

"Using virsh or libvirt tools to start the machine can result in the virtual machine running in more than one place, which can cause data corruption in the virtual machine."

That same sentence also appears in the first section, so I would add this in both places.

-Steven

Comment 22 Steven J. Levine 2012-01-10 17:58:46 UTC

Latest build:

Red_Hat_Enterprise_Linux-Cluster_Administration-5-web-en-US-5-49.el6eng

New sentence noted in Comment 21 is here:

http://documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/s1-vm-considerations-CA.html

And here:

http://documentation-stage.bne.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Cluster_Administration/ap-ha-resource-params-CA.html#tb-vm-resource-CA

As an "Important" note preceding table C.21.

Comment 24 Steven J. Levine 2012-01-24 20:30:06 UTC

Fixed and checked in, but I can't put this back into ON_QA until we can build on Brew.

Comment 28 errata-xmlrpc 2012-02-21 05:18:33 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0175.html