Bug 693855 - mirror device failure during HA lvm service relocation may cause service failure
Summary: mirror device failure during HA lvm service relocation may cause service failure
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: rgmanager
Version: 5.6
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Jonathan Earl Brassow
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On: 683213
Blocks: 807971
Reported: 2011-04-05 18:42 UTC by Jonathan Earl Brassow
Modified: 2013-01-08 07:05 UTC

Fixed In Version: rgmanager-2.0.52-32.el5
Doc Type: Bug Fix
Doc Text:
Clone Of: 683213
Environment:
Last Closed: 2013-01-08 07:05:00 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2013:0026 | 0 | normal | SHIPPED_LIVE | rgmanager bug fix update | 2013-01-07 15:28:53 UTC

Description Jonathan Earl Brassow 2011-04-05 18:42:02 UTC
+++ This bug was initially created as a clone of Bug #683213 +++

Description of problem:
I killed a mirror leg device on the service owner. This appeared to cause the service to be relocated, but the relocation failed.

This is with the "old way" of using tags.

Old Owner:
Mar  8 14:30:04 taft-01 rgmanager[3335]: I am node #1
Mar  8 14:30:05 taft-01 rgmanager[3335]: Resource Group Manager Starting
Mar  8 14:30:05 taft-01 rgmanager[3335]: Loading Service Data
Mar  8 14:30:07 taft-01 rgmanager[3335]: Initializing Services
Mar  8 14:30:07 taft-01 rgmanager[3906]: stop: Could not match /dev/TAFT/ha1 with a real device
Mar  8 14:30:07 taft-01 rgmanager[3335]: stop on fs "fs1" returned 2 (invalid argument(s))
Mar  8 14:30:09 taft-01 rgmanager[3961]: Deactivating TAFT/ha1
Mar  8 14:30:09 taft-01 rgmanager[3983]: Making resilient : lvchange -an TAFT/ha1
Mar  8 14:30:09 taft-01 rgmanager[4008]: Resilient command: lvchange -an TAFT/ha1 --config devices{filter=["a|/dev/sda2|","a|/dev/sdb1|","a|/dev/sdc1|","a|/dev/sdd1|",
Mar  8 14:30:10 taft-01 rgmanager[4032]: Removing ownership tag (taft-01) from TAFT/ha1
Mar  8 14:30:11 taft-01 rgmanager[4156]: Unable to delete tag from TAFT/ha1
Mar  8 14:30:11 taft-01 rgmanager[4178]: Failed to stop TAFT/ha1
Mar  8 14:30:11 taft-01 rgmanager[4200]: Failed to stop TAFT/ha1
Mar  8 14:30:11 taft-01 rgmanager[3335]: stop on lvm "lvm" returned 1 (generic error)
Mar  8 14:30:11 taft-01 rgmanager[3335]: Services Initialized
Mar  8 14:30:11 taft-01 rgmanager[3335]: State change: Local UP
Mar  8 14:30:11 taft-01 rgmanager[3335]: State change: taft-02 UP
Mar  8 14:30:11 taft-01 rgmanager[3335]: State change: taft-03 UP
Mar  8 14:30:11 taft-01 rgmanager[3335]: State change: taft-04 UP

Attempted New Owner:
Mar  8 14:30:04 taft-03 rgmanager[3897]: I am node #3
Mar  8 14:30:04 taft-03 rgmanager[3897]: Resource Group Manager Starting
Mar  8 14:30:04 taft-03 rgmanager[3897]: Loading Service Data
Mar  8 14:30:06 taft-03 rgmanager[3897]: Initializing Services
Mar  8 14:30:06 taft-03 rgmanager[4468]: stop: Could not match /dev/TAFT/ha1 with a real device
Mar  8 14:30:06 taft-03 rgmanager[3897]: stop on fs "fs1" returned 2 (invalid argument(s))
Mar  8 14:30:08 taft-03 rgmanager[4523]: Deactivating TAFT/ha1
Mar  8 14:30:08 taft-03 rgmanager[4545]: Making resilient : lvchange -an TAFT/ha1
Mar  8 14:30:09 taft-03 rgmanager[4570]: Resilient command: lvchange -an TAFT/ha1 --config devices{filter=["a|/dev/sda2|","a|/dev/sdb1|","a|/dev/sdc1|","a|/dev/sdd1|",
Mar  8 14:30:09 taft-03 rgmanager[4594]: Removing ownership tag (taft-03) from TAFT/ha1
Mar  8 14:30:10 taft-03 rgmanager[4723]: Unable to delete tag from TAFT/ha1
Mar  8 14:30:10 taft-03 rgmanager[4745]: Failed to stop TAFT/ha1
Mar  8 14:30:10 taft-03 rgmanager[4767]: Failed to stop TAFT/ha1
Mar  8 14:30:10 taft-03 rgmanager[3897]: stop on lvm "lvm" returned 1 (generic error)
Mar  8 14:30:10 taft-03 rgmanager[3897]: Services Initialized
Mar  8 14:30:10 taft-03 rgmanager[3897]: State change: Local UP
Mar  8 14:30:11 taft-03 rgmanager[3897]: State change: taft-01 UP
Mar  8 14:30:11 taft-03 rgmanager[3897]: State change: taft-04 UP
Mar  8 14:30:11 taft-03 rgmanager[3897]: State change: taft-02 UP
Mar  8 14:30:11 taft-03 rgmanager[3897]: Starting stopped service service:halvm
Mar  8 14:30:13 taft-03 rgmanager[4839]: Activating TAFT/ha1
Mar  8 14:30:13 taft-03 rgmanager[4951]: Unable to add tag to TAFT/ha1
Mar  8 14:30:14 taft-03 rgmanager[4973]: Failed to start TAFT/ha1
Mar  8 14:30:14 taft-03 rgmanager[4995]: Attempting cleanup of TAFT
Mar  8 14:30:14 taft-03 rgmanager[5155]: Failed to make TAFT consistent
Mar  8 14:30:14 taft-03 rgmanager[3897]: start on lvm "lvm" returned 1 (generic error)
Mar  8 14:30:14 taft-03 rgmanager[3897]: #68: Failed to start service:halvm; return value: 1
Mar  8 14:30:14 taft-03 rgmanager[3897]: Stopping service service:halvm
Mar  8 14:30:15 taft-03 rgmanager[5192]: stop: Could not match /dev/TAFT/ha1 with a real device
Mar  8 14:30:15 taft-03 rgmanager[3897]: stop on fs "fs1" returned 2 (invalid argument(s))
Mar  8 14:30:16 taft-03 rgmanager[5247]: Deactivating TAFT/ha1
Mar  8 14:30:17 taft-03 rgmanager[5269]: Making resilient : lvchange -an TAFT/ha1
Mar  8 14:30:17 taft-03 rgmanager[5294]: Resilient command: lvchange -an TAFT/ha1 --config devices{filter=["a|/dev/sda2|","a|/dev/sdb1|","a|/dev/sdc1|","a|/dev/sdd1|",
Mar  8 14:30:18 taft-03 rgmanager[5318]: Removing ownership tag (taft-03) from TAFT/ha1
Mar  8 14:30:18 taft-03 rgmanager[5430]: Unable to delete tag from TAFT/ha1
Mar  8 14:30:18 taft-03 rgmanager[5452]: Failed to stop TAFT/ha1
Mar  8 14:30:19 taft-03 rgmanager[5474]: Failed to stop TAFT/ha1
Mar  8 14:30:19 taft-03 rgmanager[3897]: stop on lvm "lvm" returned 1 (generic error)
Mar  8 14:30:19 taft-03 rgmanager[3897]: #12: RG service:halvm failed to stop; intervention required
Mar  8 14:30:19 taft-03 rgmanager[3897]: Service service:halvm is failed
Mar  8 14:30:19 taft-03 rgmanager[3897]: #13: Service service:halvm failed to stop cleanly
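
For triage, the failure messages in the two logs above can be pulled out mechanically. The following is a minimal Python sketch, not part of rgmanager. The names LINE_RE, FAILURE_MARKERS, parse_line, and failures are hypothetical, and the marker strings are simply copied from messages quoted in this report.

```python
import re

# Matches syslog-style rgmanager lines as they appear in this report, e.g.
# "Mar  8 14:30:13 taft-03 rgmanager[4951]: Unable to add tag to TAFT/ha1"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+rgmanager\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)

# Assumption: these substrings are enough to flag the failures seen here.
FAILURE_MARKERS = ("Unable to", "Failed to", "returned 1", "failed to stop")


def parse_line(line):
    """Split one log line into stamp, host, pid and msg; None if no match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def failures(lines):
    """Return (host, message) pairs for lines that contain a failure marker."""
    found = []
    for line in lines:
        record = parse_line(line)
        if record and any(marker in record["msg"] for marker in FAILURE_MARKERS):
            found.append((record["host"], record["msg"]))
    return found


# A few lines copied from the "Attempted New Owner" log above.
SAMPLE = [
    "Mar  8 14:30:13 taft-03 rgmanager[4839]: Activating TAFT/ha1",
    "Mar  8 14:30:13 taft-03 rgmanager[4951]: Unable to add tag to TAFT/ha1",
    "Mar  8 14:30:14 taft-03 rgmanager[4973]: Failed to start TAFT/ha1",
    'Mar  8 14:30:14 taft-03 rgmanager[3897]: start on lvm "lvm" returned 1 (generic error)',
]

if __name__ == "__main__":
    for host, msg in failures(SAMPLE):
        print(host, msg)
```

Run against the sample, this skips the "Activating" line and reports the tag failure, the start failure, and the non-zero return from the lvm resource, which is the order in which the start attempt on taft-03 broke down.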


[root@taft-03 ~]# lvs -a -o +devices
  LV             VG        Attr   LSize  Origin Snap%  Move Log      Copy%  Convert Devices
  ha1            TAFT      mwi---  3.00g                    ha1_mlog                ha1_mimage_0(0),ha1_mimage_1(0),ha1_mimage_2(0)
  [ha1_mimage_0] TAFT      Iwi---  3.00g                                            /dev/sde1(0)
  [ha1_mimage_1] TAFT      Iwi---  3.00g                                            /dev/sdg1(0)
  [ha1_mimage_2] TAFT      Iwi---  3.00g                                            /dev/sdb1(0)
  [ha1_mlog]     TAFT      lwi---  4.00m                                            /dev/sdd1(0)
  lv_home        vg_taft03 -wi-ao 25.64g                                            /dev/sda2(8269)
  lv_root        vg_taft03 -wi-ao 32.30g                                            /dev/sda2(0)
  lv_swap        vg_taft03 -wi-ao  9.81g                                            /dev/sda2(14832)
[root@taft-03 ~]# lvchange -an TAFT
[root@taft-03 ~]# lvchange -ay TAFT
  Not activating TAFT/ha1 since it does not pass activation filter.
[root@taft-03 ~]# vgchange --addtag taft-03 TAFT
  Cannot change VG TAFT while PVs are missing.
  Consider vgreduce --removemissing.

[root@taft-03 ~]# vgreduce --removemissing TAFT
  WARNING: Partial LV ha1 needs to be repaired or removed.
  WARNING: Partial LV ha1_mimage_0 needs to be repaired or removed.
  WARNING: There are still partial LVs in VG TAFT.
  To remove them unconditionally use: vgreduce --removemissing --force.
  Proceeding to remove empty missing PVs.
  Command failed with status code 5.
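
The "does not pass activation filter" message above is consistent with tag-based HA LVM, where the volume_list setting in lvm.conf limits activation to named VGs/LVs and to volumes carrying a matching tag. The following Python sketch is a simplified model of that check, not LVM's implementation. The function name passes_activation_filter is hypothetical, the volume_list contents are an assumption (this report does not show lvm.conf), and the special "@*" entry is not modelled.

```python
def passes_activation_filter(vg, lv, tags, volume_list):
    """Simplified model of an lvm.conf volume_list check.

    vg, lv      -- volume group and logical volume names
    tags        -- set of tags currently on the VG or LV
    volume_list -- entries of the form "vg", "vg/lv" or "@tag"
    """
    for entry in volume_list:
        if entry == "@*":
            # "@*" depends on host tags, which this sketch does not model.
            continue
        if entry.startswith("@"):
            if entry[1:] in tags:
                return True
        elif entry == vg or entry == f"{vg}/{lv}":
            return True
    return False


# Assumed configuration for taft-03: the local root VG plus the host's own tag.
ASSUMED_VOLUME_LIST = ["vg_taft03", "@taft-03"]

if __name__ == "__main__":
    # No ownership tag could be added, so TAFT/ha1 has no matching tag.
    print(passes_activation_filter("TAFT", "ha1", set(), ASSUMED_VOLUME_LIST))
    # Had the tag been added, the same LV would pass.
    print(passes_activation_filter("TAFT", "ha1", {"taft-03"}, ASSUMED_VOLUME_LIST))
```

Under these assumptions the session above is a deadlock: activation needs the taft-03 tag, adding the tag needs a VG metadata change, and the metadata change is refused while PVs are missing.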


Version-Release number of selected component (if applicable):
2.6.32-94.el6.x86_64

lvm2-2.02.83-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
lvm2-libs-2.02.83-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
lvm2-cluster-2.02.83-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
udev-147-2.31.el6    BUILT: Wed Jan 26 05:39:15 CST 2011
device-mapper-1.02.62-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
device-mapper-libs-1.02.62-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
device-mapper-event-1.02.62-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
device-mapper-event-libs-1.02.62-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
cmirror-2.02.83-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011

--- Additional comment from jbrassow on 2011-04-05 14:35:20 EDT ---

Created attachment 490060
rhel6 patch for this issue

The problem may also be present in RHEL 5. This patch would fix the issue there as well.

Comment 1 RHEL Program Management 2012-04-02 10:46:41 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux release.  Product Management has
requested further review of this request by Red Hat Engineering, for
potential inclusion in a Red Hat Enterprise Linux release for currently
deployed products.  This request is not yet committed for inclusion in
a release.

Comment 7 errata-xmlrpc 2013-01-08 07:05:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0026.html

