Description of problem:
This may be related to bz 446107. I noticed that after the log device was
failed and the node with the lowest id was killed, the log device wasn't
properly removed. The gfs filesystem appears to still be intact, however.

Looks like the log is a phantom volume w/o a device attached:

[root@taft-02 cluster]# lvs -a -o +devices
  /dev/sdg1: read failed after 0 of 2048 at 0: Input/output error
  LV                           VG             Attr   LSize   Origin Snap% Move Log Copy% Convert Devices
  LogVol00                     VolGroup00     -wi-ao  66.19G                                      /dev/sda2(0)
  LogVol01                     VolGroup00     -wi-ao   1.94G                                      /dev/sda2(2118)
  syncd_log_3legs_1            helter_skelter mwi-ao 600.00M                        100.00        syncd_log_3legs_1_mimage_0(0),syncd_log_3legs_1_mimage_1(0),syncd_log_3legs_1_mimage_2(0)
  [syncd_log_3legs_1_mimage_0] helter_skelter iwi-ao 600.00M                                      /dev/sde1(0)
  [syncd_log_3legs_1_mimage_1] helter_skelter iwi-ao 600.00M                                      /dev/sdh1(0)
  [syncd_log_3legs_1_mimage_2] helter_skelter iwi-ao 600.00M                                      /dev/sdf1(0)
  syncd_log_3legs_1_mlog       helter_skelter vwi-a-   4.00M

Helter_skelter: Scenario: Kill disk log of synced 3 leg mirror(s)

****** Mirror hash info for this scenario ******
* name:        syncd_log_3legs
* sync:        1
* num mirrors: 1
* disklog:     /dev/sdg1
* failpv:      /dev/sdg1
* leg devices: /dev/sde1 /dev/sdh1 /dev/sdf1
************************************************

Creating mirror(s) on taft-02...
taft-02: lvcreate -m 2 -n syncd_log_3legs_1 -L 600M helter_skelter /dev/sde1:0-1000 /dev/sdh1:0-1000 /dev/sdf1:0-1000 /dev/sdg1:0-150

Waiting until all mirrors become fully syncd...
   0/1 mirror(s) are fully synced: ( 1=0.83% )
   0/1 mirror(s) are fully synced: ( 1=49.25% )
   0/1 mirror(s) are fully synced: ( 1=97.42% )
   1/1 mirror(s) are fully synced: ( 1=100.00% )

Creating gfs on top of mirror(s) on taft-01...
Mounting mirrored gfs filesystems on taft-01...
Mounting mirrored gfs filesystems on taft-02...
Mounting mirrored gfs filesystems on taft-03...
Mounting mirrored gfs filesystems on taft-04...

Writing verification files (checkit) to mirror(s) on...
---- taft-01 ----
---- taft-02 ----
---- taft-03 ----
---- taft-04 ----

<start name="taft-01_1" pid="11072" time="Thu Nov 13 10:21:40 2008" type="cmd" />
<start name="taft-02_1" pid="11073" time="Thu Nov 13 10:21:40 2008" type="cmd" />
<start name="taft-03_1" pid="11075" time="Thu Nov 13 10:21:40 2008" type="cmd" />
<start name="taft-04_1" pid="11078" time="Thu Nov 13 10:21:40 2008" type="cmd" />

Sleeping 10 seconds to get some outstanding GFS I/O locks before the failure

Verifying files (checkit) on mirror(s) on...
---- taft-01 ----
---- taft-02 ----
---- taft-03 ----
---- taft-04 ----

Disabling device sdg on taft-01
Disabling device sdg on taft-02
Disabling device sdg on taft-03
Disabling device sdg on taft-04

Attempting I/O to cause mirror down conversion(s) on taft-02
10+0 records in
10+0 records out

[ HERE IS WHERE TAFT-01 WAS KILLED ]

41943040 bytes (42 MB) copied, 24.6422 seconds, 1.7 MB/s

Verifying the down conversion of the failed mirror(s)
  /dev/sdg1: read failed after 0 of 2048 at 0: Input/output error
Verifying FAILED device /dev/sdg1 is *NOT* in the volume(s)
  /dev/sdg1: read failed after 0 of 2048 at 0: Input/output error
Verifying LEG device /dev/sde1 *IS* in the volume(s)
  /dev/sdg1: read failed after 0 of 2048 at 0: Input/output error
Verifying LEG device /dev/sdh1 *IS* in the volume(s)
  /dev/sdg1: read failed after 0 of 2048 at 0: Input/output error
Verifying LEG device /dev/sdf1 *IS* in the volume(s)
  /dev/sdg1: read failed after 0 of 2048 at 0: Input/output error
Verify the dm devices associated with /dev/sdg1 are no longer present

<fail name="taft-01_1" pid="11072" time="Thu Nov 13 10:24:22 2008" type="cmd" duration="162" ec="127" />
ALL STOP!
<killed name="taft-02_1" pid="11073" time="Thu Nov 13 10:24:28 2008" type="cmd" duration="168" signal="2" />
<killed name="taft-03_1" pid="11075" time="Thu Nov 13 10:24:28 2008" type="cmd" duration="168" signal="2" />
<killed name="taft-04_1" pid="11078" time="Thu Nov 13 10:24:28 2008" type="cmd" duration="168" signal="2" />
Could not connect to remote host

syncd_log_3legs_1_mlog on taft-02 should no longer be there

Version-Release number of selected component (if applicable):
2.6.18-117.el5
lvm2-2.02.40-6.el5            BUILT: Fri Oct 24 07:37:33 CDT 2008
lvm2-cluster-2.02.40-6.el5    BUILT: Fri Oct 24 07:38:44 CDT 2008
device-mapper-1.02.28-2.el5   BUILT: Fri Sep 19 02:50:32 CDT 2008
cmirror-1.1.34-5.el5          BUILT: Thu Nov  6 15:10:44 CST 2008
kmod-cmirror-0.1.21-2.el5     BUILT: Thu Nov  6 14:12:07 CST 2008
Editing subject to reflect the fact that this is not a cluster mirror specific problem.
In contrast to comment 2, I don't think the metadata needs to be edited by hand. The user would simply need to remove the detached (and now visible) log LV, which isn't that hard. While it would be nice to commit a multi-step transaction and have all of its steps execute even in the event of a failure, that is not going to happen in LVM for RHEL5. So, I'm closing this bug WONTFIX.
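For the record, the manual cleanup would look roughly like the following sketch. It assumes the VG/LV names from this report (helter_skelter, syncd_log_3legs_1_mlog) and a failed /dev/sdg1; adjust for the actual system, and note this is illustrative, not a tested recovery procedure:

```shell
# Confirm the detached log is now visible as a stand-alone LV
# (it appears without brackets, i.e. no longer an internal mirror LV).
lvs -a -o +devices helter_skelter

# Remove the leftover mirror log LV by hand.
lvremove helter_skelter/syncd_log_3legs_1_mlog

# If the VG metadata still references the failed PV (/dev/sdg1 here),
# drop the missing PV from the VG as well.
vgreduce --removemissing helter_skelter
```

`vgreduce --removemissing` is only needed if the failed device is still listed in the VG; if the down-conversion already cleaned up the PV references, the `lvremove` alone suffices.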