Bug 241820

Summary: Automated LVM tagging needed for HA LVM and volume ownership
Product: [Retired] Red Hat Cluster Suite
Component: rgmanager
Version: 4
Hardware: All
OS: Linux
Severity: medium
Priority: medium
Status: CLOSED WONTFIX
Reporter: Corey Marthaler <cmarthal>
Assignee: LVM and device-mapper development team <lvm-team>
QA Contact: Cluster QE <mspqa-list>
CC: cluster-maint, fbijlsma, jbrassow
Doc Type: Bug Fix
Last Closed: 2010-01-27 18:55:54 UTC

Description Corey Marthaler 2007-05-30 20:27:10 UTC
Description of problem:
This may be a user error, but in order to get the LVM devices to activate
and start on a given cluster node, I had to put tags on the devices for the
first node in the failover domain. After the services were started on that node,
relocation to other nodes failed because the device couldn't be activated on
the new node, presumably due to the manually added tag, which I thought was
supposed to be deleted and re-added at relocation time by lvm.sh. Here's the
error I see:

May 30 14:49:57 link-08 clurgmgrd[12870]: <notice> Stopping service serv2
May 30 14:49:57 link-08 clurgmgrd: [12870]: <err> stop: Could not match
/dev/corey/mirror2 with a real device
May 30 14:49:57 link-08 clurgmgrd[12870]: <notice> stop on fs:FS2 returned 2
(invalid argument(s))
May 30 14:49:57 link-08 clurgmgrd[12870]: <crit> #12: RG serv2 failed to stop;
intervention required
May 30 14:49:57 link-08 clurgmgrd[12870]: <notice> Service serv2 is failed

 
[root@link-08 ~]# lvs -a -o +tags
  LV                 VG         Attr   LSize  Origin Snap%  Move Log          Copy%  LV Tags
  LogVol00           VolGroup00 -wi-ao 72.44G
  LogVol01           VolGroup00 -wi-ao  1.94G
  mirror1            corey      mwi---  2.00G                    mirror1_mlog         link-02
  [mirror1_mimage_0] corey      iwi---  2.00G
  [mirror1_mimage_1] corey      iwi---  2.00G
  [mirror1_mimage_2] corey      iwi---  2.00G
  [mirror1_mlog]     corey      lwi---  4.00M
  mirror2            corey      mwi---  2.00G                    mirror2_mlog         link-02
  [mirror2_mimage_0] corey      iwi---  2.00G
  [mirror2_mimage_1] corey      iwi---  2.00G
  [mirror2_mimage_2] corey      iwi---  2.00G
  [mirror2_mlog]     corey      lwi---  4.00M
  mirror3            corey      mwi---  2.00G                    mirror3_mlog         link-02
  [mirror3_mimage_0] corey      iwi---  2.00G
  [mirror3_mimage_1] corey      iwi---  2.00G
  [mirror3_mimage_2] corey      iwi---  2.00G
  [mirror3_mlog]     corey      lwi---  4.00M
  mirror4            corey      mwi---  2.00G                    mirror4_mlog         link-02
  [mirror4_mimage_0] corey      iwi---  2.00G
  [mirror4_mimage_1] corey      iwi---  2.00G
  [mirror4_mimage_2] corey      iwi---  2.00G
  [mirror4_mlog]     corey      lwi---  4.00M
[root@link-08 ~]# clustat
Member Status: Quorate

  Member Name                              Status
  ------ ----                              ------
  link-02                                  Online, rgmanager
  link-04                                  Online, rgmanager
  link-07                                  Online, rgmanager
  link-08                                  Online, Local, rgmanager

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  serv1                link-02                        started
  serv2                (link-08)                      failed
  serv3                link-02                        started
  serv4                link-02                        started
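
For reference, the manual tagging step described above would have been along these lines (a sketch; the tag value link-02 and the volume names are taken from the lvs output above, and the exact commands are assumptions):

  # Tag the mirror LVs for the first node in the failover domain, then activate them there
  lvchange --addtag link-02 corey/mirror2
  lvchange -ay corey/mirror2
  # On relocation, lvm.sh is expected to remove the old owner's tag and add the
  # new owner's tag before activating, which is not happening here
  lvchange --deltag link-02 corey/mirror2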

Comment 1 Jonathan Earl Brassow 2007-06-01 15:37:14 UTC
I can't remember: did we think it was the tagging, or was it the multiple
resources using the same VG?

Might want to add the resource manager section of cluster.conf to this bug report.

Comment 2 Corey Marthaler 2007-06-01 21:16:47 UTC
After discussing this with Jon, this may be a documentation issue, but it is
nonetheless still confusing...

The problem is that, for HA LVM, the user has to add a tag in order to create a
mirror (or else the creation will fail), and this results in the mirror volume
being active. However, when starting HA services, if a volume is active with a
tag on it, the service will fail due to ownership issues. What needs to happen
is that all volumes are deactivated before the HA services are started.
Deleting those newly created tags isn't a bad idea either, but it isn't
necessary (as I've found) as long as the volume isn't currently active, because
the lvm.sh script will remove the tags itself.
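
A rough sketch of the sequence described above (the VG/LV names and the hostname-as-tag convention are assumptions carried over from the earlier output, not an exact reproduction):

  # Create the mirror with a tag matching this node so the creation/activation is permitted
  lvcreate -m 1 -L 2G -n mirror1 --addtag $(uname -n) corey
  # Deactivate the volume before starting the HA service; as noted above, removing
  # the tag is optional because lvm.sh will manage tags on an inactive volume
  lvchange -an corey/mirror1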

Comment 3 Jonathan Earl Brassow 2007-07-11 16:54:06 UTC
Changing $SUBJECT to address the underlying issue:

1) We need lvm to automatically tag volumes with a machine-unique tag.
2) We need to tighten the rules on tags so that machines cannot do things like
delete another node's LVs.

Unique tagging would act like volume group ownership if properly implemented. 
One big question remains: where can we get a unique ID from? It has to be
something that the installer can generate as well.
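
For context, tag-based "ownership" of this kind is normally expressed through the tags and activation settings in lvm.conf; a minimal sketch (the host tag and VG name here are assumptions, not values from this report):

  # /etc/lvm/lvm.conf
  tags {
      hosttags = 1   # also read lvm_<hostname>.conf, where a per-host tag can be defined
  }
  activation {
      # Only activate the root VG and volumes tagged for this host
      volume_list = [ "VolGroup00", "@link-02" ]
  }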

Comment 4 Jonathan Earl Brassow 2010-01-27 18:55:54 UTC
A completely different solution is envisioned for future HA LVM.  There is no longer a push to enhance what tagging does or make the HA steps easier.

The future version will use CLVM, where single-machine targets are used when logical volumes are activated exclusively.
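
For context, exclusive activation under CLVM is typically along these lines (a sketch, assuming clvmd is running on all cluster nodes; the VG/LV names are carried over from this report):

  # Mark the volume group as clustered
  vgchange -cy corey
  # Activate the logical volume exclusively on this node only
  lvchange -aey corey/mirror1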