Bug 547842 - unable to restore log device after successful failure and core conversion
Summary: unable to restore log device after successful failure and core conversion
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: lvm2-cluster
Version: 5.4
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assignee: Milan Broz
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-12-15 19:38 UTC by Corey Marthaler
Modified: 2013-03-01 04:07 UTC (History)
9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 09:02:19 UTC
Target Upstream Version:
Embargoed:
cmarthal: needinfo+




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2010:0299 0 normal SHIPPED_LIVE lvm2-cluster bug fix and enhancement update 2010-03-29 14:26:30 UTC

Description Corey Marthaler 2009-12-15 19:38:23 UTC
Description of problem:
This no longer works like it did before the allocate policy changes. In this case, the log device is failed, successfully converted to a core log, and then that failed device is re-enabled, pvcreated, and vgextended, but the convert back to a disk log fails. Is there a new step that I'm missing?
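For reference, the recovery sequence described above looks roughly like the following. The device, VG, and LV names are taken from the test log below; this is only an illustrative sketch, since it assumes a live cluster where /dev/sdh1 has already failed and dmeventd has down-converted the mirror log to core.

```shell
# 1. After re-enabling the failed device, rebuild its PV label:
pvcreate /dev/sdh1

# 2. Extend the recreated PV back into the volume group:
vgextend helter_skelter /dev/sdh1

# 3. Up-convert the mirror again to restore the disk log --
#    this is the step that now fails:
lvconvert -m 3 -b helter_skelter/syncd_log_4legs_1 \
    /dev/sde1:0-1000 /dev/sdf1:0-1000 /dev/sdd1:0-1000 /dev/sdb1:0-1000 \
    /dev/sdh1:0-150
```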

Scenario: Kill disk log of synced 4 leg mirror(s)                                                                               

********* Mirror hash info for this scenario *********
* names:              syncd_log_4legs_1               
* sync:               1                               
* disklog:            /dev/sdh1                       
* failpv(s):          /dev/sdh1                       
* failnode(s):        taft-01 taft-02 taft-03 taft-04 
* leg devices:        /dev/sde1 /dev/sdf1 /dev/sdd1 /dev/sdb1
* leg fault policy:   remove                                 
* log fault policy:   remove                                 
******************************************************       

Creating mirror(s) on taft-03...
taft-03: lvcreate -m 3 -n syncd_log_4legs_1 -L 600M helter_skelter /dev/sde1:0-1000 /dev/sdf1:0-1000 /dev/sdd1:0-1000 /dev/sdb1:0-1000 /dev/sdh1:0-150                                                                                                          

Waiting until all mirrors become fully syncd...
   0/1 mirror(s) are fully synced: ( 25.92% )  
   0/1 mirror(s) are fully synced: ( 35.33% )  
   0/1 mirror(s) are fully synced: ( 61.33% )  
   0/1 mirror(s) are fully synced: ( 89.58% )  
   1/1 mirror(s) are fully synced: ( 100.00% ) 

Creating gfs on top of mirror(s) on taft-01...
Mounting mirrored gfs filesystems on taft-01...
Mounting mirrored gfs filesystems on taft-02...
Mounting mirrored gfs filesystems on taft-03...
Mounting mirrored gfs filesystems on taft-04...

Writing verification files (checkit) to mirror(s) on...
        ---- taft-01 ----                              
        ---- taft-02 ----                              
        ---- taft-03 ----                              
        ---- taft-04 ----                              

<start name="taft-01_syncd_log_4legs_1" pid="17144" time="Tue Dec 15 11:58:20 2009" type="cmd" />
<start name="taft-02_syncd_log_4legs_1" pid="17146" time="Tue Dec 15 11:58:20 2009" type="cmd" />
<start name="taft-03_syncd_log_4legs_1" pid="17148" time="Tue Dec 15 11:58:20 2009" type="cmd" />
<start name="taft-04_syncd_log_4legs_1" pid="17150" time="Tue Dec 15 11:58:20 2009" type="cmd" />
Sleeping 10 seconds to get some outstanding GFS I/O locks before the failure
Verifying files (checkit) on mirror(s) on...                                                     
        ---- taft-01 ----
        ---- taft-02 ----
        ---- taft-03 ----
        ---- taft-04 ----

Disabling device sdh on taft-01
Disabling device sdh on taft-02
Disabling device sdh on taft-03
Disabling device sdh on taft-04

Attempting I/O to cause mirror down conversion(s) on taft-01
10+0 records in                                             
10+0 records out                                            
41943040 bytes (42 MB) copied, 15.8842 seconds, 2.6 MB/s    
Verifying current sanity of lvm after the failure           
  /dev/sdh1: open failed: No such device or address         
  Couldn't find device with uuid 'F8bOUx-sNnJ-FavG-hEXW-Aeg8-nVfK-2g7ztk'.
  Couldn't find device with uuid 'F8bOUx-sNnJ-FavG-hEXW-Aeg8-nVfK-2g7ztk'.
  Couldn't find device with uuid 'F8bOUx-sNnJ-FavG-hEXW-Aeg8-nVfK-2g7ztk'.
  Couldn't find device with uuid 'F8bOUx-sNnJ-FavG-hEXW-Aeg8-nVfK-2g7ztk'.
  Couldn't find device with uuid 'F8bOUx-sNnJ-FavG-hEXW-Aeg8-nVfK-2g7ztk'.
  Couldn't find device with uuid 'F8bOUx-sNnJ-FavG-hEXW-Aeg8-nVfK-2g7ztk'.
Verifying FAILED device /dev/sdh1 is *NOT* in the volume(s)               
  /dev/sdh1: open failed: No such device or address                       
Verifying LOG device /dev/sdh1 is *NOT* in the linear(s)                  
  /dev/sdh1: open failed: No such device or address                       
Verifying LEG device /dev/sde1 *IS* in the volume(s)                      
  /dev/sdh1: open failed: No such device or address                       
Verifying LEG device /dev/sdf1 *IS* in the volume(s)                      
  /dev/sdh1: open failed: No such device or address                       
Verifying LEG device /dev/sdd1 *IS* in the volume(s)                      
  /dev/sdh1: open failed: No such device or address                       
Verifying LEG device /dev/sdb1 *IS* in the volume(s)                      
  /dev/sdh1: open failed: No such device or address                       
Verify the dm devices associated with /dev/sdh1 are in proper states      
Verify that the mirror image order remains the same after the down conversion
  /dev/sdh1: open failed: No such device or address
  /dev/sdh1: open failed: No such device or address
  /dev/sdh1: open failed: No such device or address
  /dev/sdh1: open failed: No such device or address
  /dev/sdh1: open failed: No such device or address

Verifying files (checkit) on mirror(s) on...
        ---- taft-01 ----
        ---- taft-02 ----
        ---- taft-03 ----
        ---- taft-04 ----

Enabling device sdh on taft-01
Enabling device sdh on taft-02
Enabling device sdh on taft-03
Enabling device sdh on taft-04

Recreating PVs /dev/sdh1
  WARNING: Volume group helter_skelter is not consistent
  WARNING: Volume Group helter_skelter is not consistent
  WARNING: Volume group helter_skelter is not consistent
Extending the recreated PVs back into VG helter_skelter
Up converting linear(s) back to mirror(s) on taft-03...
taft-03: lvconvert -m 3 -b helter_skelter/syncd_log_4legs_1 /dev/sde1:0-1000 /dev/sdf1:0-1000 /dev/sdd1:0-1000 /dev/sdb1:0-1000 /dev/sdh1:0-150
  Error locking on node taft-04-bond: Refusing activation of partial LV syncd_log_4legs_1_mlog. Use --partial to override.
  Error locking on node taft-03-bond: Refusing activation of partial LV syncd_log_4legs_1_mlog. Use --partial to override.
  Error locking on node taft-02-bond: Refusing activation of partial LV syncd_log_4legs_1_mlog. Use --partial to override.
  Aborting. Failed to activate mirror log.
  Failed to initialise mirror log.
couldn't up convert mirror syncd_log_4legs_1 on taft-03

# retried again:
[root@taft-01 ~]# lvconvert -m 3 -b helter_skelter/syncd_log_4legs_1 /dev/sde1:0-1000 /dev/sdf1:0-1000 /dev/sdd1:0-1000 /dev/sdb1:0-1000 /dev/sdh1:0-150
  Aborting. Unable to deactivate mirror log.
  Failed to initialise mirror log.

[root@taft-01 ~]# dmsetup ls
helter_skelter-syncd_log_4legs_1        (253, 7)
helter_skelter-syncd_log_4legs_1_mimage_3       (253, 6)
helter_skelter-syncd_log_4legs_1_mimage_2       (253, 5)
helter_skelter-syncd_log_4legs_1_mimage_1       (253, 4)
helter_skelter-syncd_log_4legs_1_mimage_0       (253, 3)
VolGroup00-LogVol01     (253, 1)
VolGroup00-LogVol00     (253, 0)
helter_skelter-syncd_log_4legs_1_mlog   (253, 2)


Version-Release number of selected component (if applicable):
2.6.18-160.el5

lvm2-2.02.56-2.el5    BUILT: Thu Dec 10 09:38:13 CST 2009
lvm2-cluster-2.02.56-2.el5    BUILT: Thu Dec 10 09:38:41 CST 2009
device-mapper-1.02.39-1.el5    BUILT: Wed Nov 11 12:31:44 CST 2009
cmirror-1.1.39-2.el5    BUILT: Mon Jul 27 15:39:05 CDT 2009
kmod-cmirror-0.1.22-1.el5    BUILT: Mon Jul 27 15:28:46 CDT 2009

Comment 1 Petr Rockai 2009-12-17 11:26:47 UTC
I would probably need the output of dmsetup table and lvs -a -o +devices, ideally both before and after the lvconvert is attempted, as well as the syslog output from around the time of the failure, to see what dmeventd was doing in the background. Thanks.
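The diagnostics requested above could be collected with something like the following (the syslog path is an assumption for RHEL 5; adjust per node):

```shell
dmsetup table            # current device-mapper tables for all devices
lvs -a -o +devices       # LV layout, including hidden mirror sub-LVs (_mimage_N, _mlog)
# dmeventd activity around the time of the failure:
grep -i -e dmeventd -e lvm /var/log/messages
```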

Comment 2 Corey Marthaler 2009-12-17 17:36:16 UTC
I'll check this out to verify whether this also happens with single-machine LVM mirroring.

Comment 3 Corey Marthaler 2009-12-17 23:08:53 UTC
Adding a 'clvmd -R' right after the PV recreate and vgextend causes this bug to go away.
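A sketch of that workaround, using the same names as the original test (clvmd -R tells every clvmd daemon in the cluster to refresh its cached metadata):

```shell
pvcreate /dev/sdh1
vgextend helter_skelter /dev/sdh1
clvmd -R    # refresh cached LVM metadata on all cluster nodes
lvconvert -m 3 -b helter_skelter/syncd_log_4legs_1 \
    /dev/sde1:0-1000 /dev/sdf1:0-1000 /dev/sdd1:0-1000 /dev/sdb1:0-1000 \
    /dev/sdh1:0-150
```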

Comment 4 Milan Broz 2010-01-06 19:24:10 UTC
I believe the problem is in the cached metadata; the fix is in lvm2-cluster-2.02.56-4.

Comment 7 Corey Marthaler 2010-01-28 22:27:09 UTC
Fix verified in lvm2-2.02.56-6.el5/lvm2-cluster-2.02.56-6.el5.

Comment 9 errata-xmlrpc 2010-03-30 09:02:19 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0299.html

