Bug 729481

Summary: rgmanager should detect when clvmd is not running when using HA LVM w/ clvm locking
Product: Red Hat Enterprise Linux 6
Reporter: Corey Marthaler <cmarthal>
Component: resource-agents
Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: low
Priority: low
Version: 6.1
CC: agk, cfeist, cluster-maint, mjuricek
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Fixed In Version: resource-agents-3.9.2-11.el6
Doc Type: Bug Fix
Doc Text: No documentation needed
Last Closed: 2012-06-20 14:38:45 UTC
Bug Blocks: 756082
Attachments: Patch to fix problem and provide better error reporting (flags: none)

Description Corey Marthaler 2011-08-09 20:58:57 UTC
Description of problem:
The preferred method for HA LVM locking is now clvm rather than volume_list tags, although both are supported. When using a clvm configuration, rgmanager should therefore be able to detect when clvmd is not running. Currently, if clvmd is not running, rgmanager assumes the HA config is invalid because there is no volume_list tagging section in lvm.conf. Would it be possible to check whether the VG(s) specified in the rm section of cluster.conf have the cluster attribute and, if clvmd is not running, report that in a more helpful error message?


Aug  9 15:45:29 taft-03 rgmanager[2410]: I am node #3
Aug  9 15:45:29 taft-03 rgmanager[2410]: Resource Group Manager Starting
Aug  9 15:45:29 taft-03 rgmanager[2410]: Loading Service Data
Aug  9 15:45:35 taft-03 rgmanager[2410]: Initializing Services
Aug  9 15:45:36 taft-03 rgmanager[3331]: [fs] stop: Could not match /dev/TAFT1/ha with a real device
Aug  9 15:45:36 taft-03 rgmanager[2410]: stop on fs "fs1" returned 2 (invalid argument(s))
Aug  9 15:45:36 taft-03 rgmanager[3368]: [fs] stop: Could not match /dev/TAFT3/ha with a real device
Aug  9 15:45:36 taft-03 rgmanager[2410]: stop on fs "fs3" returned 2 (invalid argument(s))
Aug  9 15:45:36 taft-03 rgmanager[3376]: [fs] stop: Could not match /dev/TAFT2/ha with a real device
Aug  9 15:45:36 taft-03 rgmanager[2410]: stop on fs "fs2" returned 2 (invalid argument(s))
Aug  9 15:45:37 taft-03 rgmanager[3471]: [lvm] HA LVM:  Improper setup detected
Aug  9 15:45:37 taft-03 rgmanager[3486]: [lvm] HA LVM:  Improper setup detected
Aug  9 15:45:37 taft-03 rgmanager[3517]: [lvm] HA LVM:  Improper setup detected
Aug  9 15:45:37 taft-03 rgmanager[3503]: [lvm] * "volume_list" not specified in lvm.conf.
Aug  9 15:45:37 taft-03 rgmanager[3606]: [lvm] * "volume_list" not specified in lvm.conf.
Aug  9 15:45:38 taft-03 rgmanager[3664]: [lvm] WARNING: An improper setup can cause data corruption!
Aug  9 15:45:38 taft-03 rgmanager[3680]: [lvm] WARNING: An improper setup can cause data corruption!
Aug  9 15:45:38 taft-03 rgmanager[3657]: [lvm] * "volume_list" not specified in lvm.conf.
Aug  9 15:45:38 taft-03 rgmanager[3732]: [lvm] WARNING: An improper setup can cause data corruption!
Aug  9 15:45:41 taft-03 rgmanager[2410]: Services Initialized
Aug  9 15:45:41 taft-03 rgmanager[2410]: State change: Local UP
Aug  9 15:45:41 taft-03 rgmanager[2410]: State change: taft-01 UP
Aug  9 15:45:41 taft-03 rgmanager[2410]: State change: taft-02 UP
Aug  9 15:45:41 taft-03 rgmanager[2410]: State change: taft-04 UP
Aug  9 15:45:41 taft-03 rgmanager[2410]: Starting stopped service service:halvm1
Aug  9 15:45:41 taft-03 rgmanager[2410]: Starting stopped service service:halvm2
Aug  9 15:45:41 taft-03 rgmanager[2410]: Starting stopped service service:halvm3
Aug  9 15:45:42 taft-03 rgmanager[3863]: [lvm] HA LVM:  Improper setup detected
Aug  9 15:45:43 taft-03 rgmanager[3891]: [lvm] HA LVM:  Improper setup detected
Aug  9 15:45:43 taft-03 rgmanager[3885]: [lvm] * "volume_list" not specified in lvm.conf.
Aug  9 15:45:43 taft-03 rgmanager[2410]: start on lvm "lvm2" returned 1 (generic error)
Aug  9 15:45:43 taft-03 rgmanager[3902]: [lvm] HA LVM:  Improper setup detected
Aug  9 15:45:43 taft-03 rgmanager[2410]: #68: Failed to start service:halvm2; return value: 1
Aug  9 15:45:43 taft-03 rgmanager[2410]: Stopping service service:halvm2
Aug  9 15:45:43 taft-03 rgmanager[3946]: [lvm] * "volume_list" not specified in lvm.conf.
Aug  9 15:45:43 taft-03 rgmanager[2410]: start on lvm "lvm1" returned 1 (generic error)
Aug  9 15:45:43 taft-03 rgmanager[2410]: #68: Failed to start service:halvm1; return value: 1
Aug  9 15:45:43 taft-03 rgmanager[3960]: [lvm] * "volume_list" not specified in lvm.conf.
Aug  9 15:45:43 taft-03 rgmanager[2410]: Stopping service service:halvm1
Aug  9 15:45:43 taft-03 rgmanager[2410]: start on lvm "lvm3" returned 1 (generic error)
Aug  9 15:45:43 taft-03 rgmanager[2410]: #68: Failed to start service:halvm3; return value: 1
Aug  9 15:45:43 taft-03 rgmanager[2410]: Stopping service service:halvm3
Aug  9 15:45:43 taft-03 rgmanager[3996]: [fs] stop: Could not match /dev/TAFT2/ha with a real device
Aug  9 15:45:43 taft-03 rgmanager[2410]: stop on fs "fs2" returned 2 (invalid argument(s))
Aug  9 15:45:43 taft-03 rgmanager[4046]: [fs] stop: Could not match /dev/TAFT1/ha with a real device
Aug  9 15:45:43 taft-03 rgmanager[2410]: stop on fs "fs1" returned 2 (invalid argument(s))
Aug  9 15:45:44 taft-03 rgmanager[4090]: [fs] stop: Could not match /dev/TAFT3/ha with a real device
Aug  9 15:45:44 taft-03 rgmanager[2410]: stop on fs "fs3" returned 2 (invalid argument(s))
Aug  9 15:45:45 taft-03 rgmanager[4149]: [lvm] HA LVM:  Improper setup detected
Aug  9 15:45:45 taft-03 rgmanager[4165]: [lvm] HA LVM:  Improper setup detected
Aug  9 15:45:45 taft-03 rgmanager[4173]: [lvm] HA LVM:  Improper setup detected
Aug  9 15:45:45 taft-03 rgmanager[4188]: [lvm] * "volume_list" not specified in lvm.conf.
Aug  9 15:45:45 taft-03 rgmanager[4231]: [lvm] * "volume_list" not specified in lvm.conf.
Aug  9 15:45:45 taft-03 rgmanager[4237]: [lvm] * "volume_list" not specified in lvm.conf.
Aug  9 15:45:45 taft-03 rgmanager[4253]: [lvm] WARNING: An improper setup can cause data corruption!
Aug  9 15:45:46 taft-03 rgmanager[4297]: [lvm] WARNING: An improper setup can cause data corruption!
Aug  9 15:45:46 taft-03 rgmanager[4308]: [lvm] WARNING: An improper setup can cause data corruption!
Aug  9 15:45:48 taft-03 rgmanager[2410]: #12: RG service:halvm1 failed to stop; intervention required
Aug  9 15:45:48 taft-03 rgmanager[2410]: Service service:halvm1 is failed
Aug  9 15:45:48 taft-03 rgmanager[2410]: #13: Service service:halvm1 failed to stop cleanly
Aug  9 15:45:48 taft-03 rgmanager[2410]: #12: RG service:halvm3 failed to stop; intervention required
Aug  9 15:45:48 taft-03 rgmanager[2410]: Service service:halvm3 is failed
Aug  9 15:45:48 taft-03 rgmanager[2410]: #13: Service service:halvm3 failed to stop cleanly
Aug  9 15:45:49 taft-03 rgmanager[2410]: #12: RG service:halvm2 failed to stop; intervention required
Aug  9 15:45:49 taft-03 rgmanager[2410]: Service service:halvm2 is failed
Aug  9 15:45:49 taft-03 rgmanager[2410]: #13: Service service:halvm2 failed to stop cleanly



Version-Release number of selected component (if applicable):
rgmanager-3.0.12.1-2.el6.x86_64
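
The check requested in the description could be sketched roughly as follows. This is only an illustration of the decision being asked for, not the code in the resource agent or in the eventual patch: the function name ha_lvm_diagnose, its yes/no arguments, and the message texts are all hypothetical. It also assumes the caller has already worked out whether the VG is clustered, and does not address how to obtain that while clvmd is down.

```shell
#!/bin/bash
# Hypothetical helper, not taken from the resource-agents source.
# $1 = "yes" if the VG carries the clustered attribute
# $2 = "yes" if clvmd is running on this node
ha_lvm_diagnose() {
    local clustered="$1" clvmd_up="$2"

    # Clustered VG but no clvmd: report the real cause instead of
    # falling through to the volume_list complaint.
    if [[ "$clustered" == "yes" && "$clvmd_up" != "yes" ]]; then
        echo "HA LVM: VG is clustered but clvmd is not running"
        return 1
    fi

    # Non-clustered VG: the tagging variant of HA LVM applies.
    if [[ "$clustered" != "yes" ]]; then
        echo "HA LVM: VG is not clustered; volume_list tagging must be checked"
        return 0
    fi

    echo "HA LVM: clustered VG with clvmd running"
    return 0
}

# Example wiring (assumption: clvmd is detected by process name):
#   clvmd_up=no; pgrep -x clvmd >/dev/null && clvmd_up=yes
#   ha_lvm_diagnose yes "$clvmd_up"
```

Keeping the decision separate from the probing (vgs, pgrep) makes the three outcomes easy to exercise without a cluster.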

Comment 1 Jonathan Earl Brassow 2011-08-11 20:48:20 UTC
I think it does this already...  Are you sure you have a 'c' attribute on the volume group?

start)
        if ! [[ $(vgs -o attr --noheadings $OCF_RESKEY_vg_name) =~ .....c ]]; then
                ha_lvm_proper_setup_check || exit 1
        fi
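
To make the behaviour of that test easier to see, here is the same regex applied to a few sample strings outside the agent. The wrapper name attr_matches_clustered is hypothetical, and the samples are illustrative: the first is the attribute format vgs prints for a clustered VG, the second a non-clustered one, and the third an empty string standing in for the case where no attribute string is obtained at all (whether that is what happens with clvmd stopped is an assumption, not something this report confirms).

```shell
#!/bin/bash
# Same regex as the quoted start) branch, wrapped in a function.
# The function name is hypothetical.
attr_matches_clustered() {
    [[ $1 =~ .....c ]]
}

# Sample attribute strings (illustrative): clustered, non-clustered, empty.
for sample in "  wz--nc" "  wz--n-" ""; do
    if attr_matches_clustered "$sample"; then
        echo "'$sample': treated as clustered, setup check skipped"
    else
        echo "'$sample': not matched, ha_lvm_proper_setup_check would run"
    fi
done
```

Note that the regex is unanchored, so the leading padding in the vgs output does not prevent a match, and anything that does not contain five characters followed by a 'c' falls through to the volume_list check.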

Comment 2 Corey Marthaler 2011-08-12 19:06:27 UTC
Yep. If clvmd is running, the services work; if it isn't, rgmanager complains that there's an invalid configuration.

[root@taft-01 ~]# vgs -a -o +devices
  VG        #PV #LV #SN Attr   VSize   VFree   Devices
  TAFT1       3   1   0 wz--nc 203.48g 187.48g ha_mimage_0(0),ha_mimage_1(0)
  TAFT1       3   1   0 wz--nc 203.48g 187.48g /dev/sdh2(0)
  TAFT1       3   1   0 wz--nc 203.48g 187.48g /dev/sdh1(0)
  TAFT1       3   1   0 wz--nc 203.48g 187.48g /dev/sdg2(0)
  TAFT2       3   1   0 wz--nc 203.48g 187.48g ha_mimage_0(0),ha_mimage_1(0)
  TAFT2       3   1   0 wz--nc 203.48g 187.48g /dev/sdg1(0)
  TAFT2       3   1   0 wz--nc 203.48g 187.48g /dev/sdf2(0)
  TAFT2       3   1   0 wz--nc 203.48g 187.48g /dev/sdf1(0)
  TAFT3       3   1   0 wz--nc 203.48g 187.48g ha_mimage_0(0),ha_mimage_1(0)
  TAFT3       3   1   0 wz--nc 203.48g 187.48g /dev/sde2(0)
  TAFT3       3   1   0 wz--nc 203.48g 187.48g /dev/sde1(0)
  TAFT3       3   1   0 wz--nc 203.48g 187.48g /dev/sdd2(0)
  TAFT4       3   1   0 wz--nc 203.48g 187.48g ha_mimage_0(0),ha_mimage_1(0)
  TAFT4       3   1   0 wz--nc 203.48g 187.48g /dev/sdd1(0)
  TAFT4       3   1   0 wz--nc 203.48g 187.48g /dev/sdc2(0)
  TAFT4       3   1   0 wz--nc 203.48g 187.48g /dev/sdc1(0)
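
For reference, a listing in the layout above can be reduced to the set of clustered VG names with a small filter. This is a sketch only: the function name clustered_vgs is hypothetical, and it assumes the Attr column is the fifth whitespace-separated field, as in the output shown here.

```shell
#!/bin/bash
# Print each VG name once if its Attr column has 'c' in position 6.
# Reads vgs text in the column layout shown above on stdin.
# Hypothetical helper; assumes Attr is the fifth field.
clustered_vgs() {
    awk 'NR > 1 && substr($5, 6, 1) == "c" && !seen[$1]++ { print $1 }'
}

# Example use (assumption: run on a node where vgs can read the VGs):
#   vgs -a -o +devices | clustered_vgs
```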

[root@taft-01 ~]# service rgmanager stop
Stopping Cluster Service Manager:                          [  OK  ]

[root@taft-01 ~]# service clvmd stop
Deactivating clustered VG(s):   0 logical volume(s) in volume group "TAFT1" now active
  0 logical volume(s) in volume group "TAFT2" now active
  0 logical volume(s) in volume group "TAFT3" now active
  clvmd not running on node taft-02
  0 logical volume(s) in volume group "TAFT4" now active
  clvmd not running on node taft-02
                                                           [  OK  ]
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]

[root@taft-01 ~]# service cman status
cluster is running.
[root@taft-01 ~]# service rgmanager start
Starting Cluster Service Manager:                          [  OK  ]

Aug 12 13:52:24 taft-01 rgmanager[10551]: I am node #1
Aug 12 13:52:24 taft-01 rgmanager[10551]: Resource Group Manager Starting
Aug 12 13:52:24 taft-01 rgmanager[10551]: Loading Service Data
Aug 12 13:52:30 taft-01 rgmanager[10551]: Initializing Services
Aug 12 13:52:31 taft-01 rgmanager[11535]: [fs] stop: Could not match /dev/TAFT1/ha with a real device
Aug 12 13:52:31 taft-01 rgmanager[10551]: stop on fs "fs1" returned 2 (invalid argument(s))
Aug 12 13:52:31 taft-01 rgmanager[11559]: [fs] stop: Could not match /dev/TAFT2/ha with a real device
Aug 12 13:52:31 taft-01 rgmanager[10551]: stop on fs "fs2" returned 2 (invalid argument(s))
Aug 12 13:52:31 taft-01 rgmanager[11601]: [fs] stop: Could not match /dev/TAFT4/ha with a real device
Aug 12 13:52:31 taft-01 rgmanager[10551]: stop on fs "fs4" returned 2 (invalid argument(s))
Aug 12 13:52:31 taft-01 rgmanager[11611]: [fs] stop: Could not match /dev/TAFT3/ha with a real device
Aug 12 13:52:31 taft-01 rgmanager[10551]: stop on fs "fs3" returned 2 (invalid argument(s))
Aug 12 13:52:32 taft-01 rgmanager[11726]: [lvm] HA LVM:  Improper setup detected
Aug 12 13:52:33 taft-01 rgmanager[11748]: [lvm] * "volume_list" not specified in lvm.conf.
Aug 12 13:52:33 taft-01 rgmanager[11762]: [lvm] HA LVM:  Improper setup detected
Aug 12 13:52:33 taft-01 rgmanager[11787]: [lvm] HA LVM:  Improper setup detected
Aug 12 13:52:33 taft-01 rgmanager[11790]: [lvm] HA LVM:  Improper setup detected
Aug 12 13:52:33 taft-01 rgmanager[11784]: [lvm] WARNING: An improper setup can cause data corruption!
Aug 12 13:52:33 taft-01 rgmanager[11829]: [lvm] * "volume_list" not specified in lvm.conf.
Aug 12 13:52:34 taft-01 rgmanager[11863]: [lvm] * "volume_list" not specified in lvm.conf.
Aug 12 13:52:34 taft-01 rgmanager[11881]: [lvm] * "volume_list" not specified in lvm.conf.
Aug 12 13:52:34 taft-01 rgmanager[11902]: [lvm] WARNING: An improper setup can cause data corruption!
Aug 12 13:52:34 taft-01 rgmanager[11941]: [lvm] WARNING: An improper setup can cause data corruption!
Aug 12 13:52:34 taft-01 rgmanager[11955]: [lvm] WARNING: An improper setup can cause data corruption!
Aug 12 13:52:39 taft-01 rgmanager[10551]: Services Initialized
Aug 12 13:52:39 taft-01 rgmanager[10551]: State change: Local UP
Aug 12 13:52:39 taft-01 rgmanager[10551]: State change: taft-02 UP
Aug 12 13:52:39 taft-01 rgmanager[10551]: State change: taft-03 UP
Aug 12 13:52:39 taft-01 rgmanager[10551]: State change: taft-04 UP

Comment 4 Jonathan Earl Brassow 2012-03-05 19:56:10 UTC
Created attachment 567742 [details]
Patch to fix problem and provide better error reporting

Comment 8 Chris Feist 2012-04-30 22:01:11 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
No documentation needed

Comment 10 errata-xmlrpc 2012-06-20 14:38:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0947.html