Description of problem:
This may be related to bz 446107. I noticed that after the log device was
failed and the node with the lowest id was killed, the log device wasn't
properly removed. The gfs filesystem appears to still be intact, however.

Looks like the log is a phantom volume w/o a device attached:

[root@taft-02 cluster]# lvs -a -o +devices
  /dev/sdg1: read failed after 0 of 2048 at 0: Input/output error
  LV                           VG             Attr   LSize   Origin Snap% Move Log Copy% Convert Devices
  LogVol00                     VolGroup00     -wi-ao  66.19G                                      /dev/sda2(0)
  LogVol01                     VolGroup00     -wi-ao   1.94G                                      /dev/sda2(2118)
  syncd_log_3legs_1            helter_skelter mwi-ao 600.00M                        100.00        syncd_log_3legs_1_mimage_0(0),syncd_log_3legs_1_mimage_1(0),syncd_log_3legs_1_mimage_2(0)
  [syncd_log_3legs_1_mimage_0] helter_skelter iwi-ao 600.00M                                      /dev/sde1(0)
  [syncd_log_3legs_1_mimage_1] helter_skelter iwi-ao 600.00M                                      /dev/sdh1(0)
  [syncd_log_3legs_1_mimage_2] helter_skelter iwi-ao 600.00M                                      /dev/sdf1(0)
  syncd_log_3legs_1_mlog       helter_skelter vwi-a-   4.00M

Helter_skelter: Scenario: Kill disk log of synced 3 leg mirror(s)

****** Mirror hash info for this scenario ******
* name:        syncd_log_3legs
* sync:        1
* num mirrors: 1
* disklog:     /dev/sdg1
* failpv:      /dev/sdg1
* leg devices: /dev/sde1 /dev/sdh1 /dev/sdf1
************************************************

Creating mirror(s) on taft-02...
taft-02: lvcreate -m 2 -n syncd_log_3legs_1 -L 600M helter_skelter /dev/sde1:0-1000 /dev/sdh1:0-1000 /dev/sdf1:0-1000 /dev/sdg1:0-150

Waiting until all mirrors become fully syncd...
   0/1 mirror(s) are fully synced: ( 1=0.83% )
   0/1 mirror(s) are fully synced: ( 1=49.25% )
   0/1 mirror(s) are fully synced: ( 1=97.42% )
   1/1 mirror(s) are fully synced: ( 1=100.00% )

Creating gfs on top of mirror(s) on taft-01...
Mounting mirrored gfs filesystems on taft-01...
Mounting mirrored gfs filesystems on taft-02...
Mounting mirrored gfs filesystems on taft-03...
Mounting mirrored gfs filesystems on taft-04...

Writing verification files (checkit) to mirror(s) on...
---- taft-01 ----
---- taft-02 ----
---- taft-03 ----
---- taft-04 ----

<start name="taft-01_1" pid="11072" time="Thu Nov 13 10:21:40 2008" type="cmd" />
<start name="taft-02_1" pid="11073" time="Thu Nov 13 10:21:40 2008" type="cmd" />
<start name="taft-03_1" pid="11075" time="Thu Nov 13 10:21:40 2008" type="cmd" />
<start name="taft-04_1" pid="11078" time="Thu Nov 13 10:21:40 2008" type="cmd" />

Sleeping 10 seconds to get some outstanding GFS I/O locks before the failure

Verifying files (checkit) on mirror(s) on...
---- taft-01 ----
---- taft-02 ----
---- taft-03 ----
---- taft-04 ----

Disabling device sdg on taft-01
Disabling device sdg on taft-02
Disabling device sdg on taft-03
Disabling device sdg on taft-04

Attempting I/O to cause mirror down conversion(s) on taft-02
10+0 records in
10+0 records out

[ HERE IS WHERE TAFT-01 WAS KILLED ]

41943040 bytes (42 MB) copied, 24.6422 seconds, 1.7 MB/s

Verifying the down conversion of the failed mirror(s)
  /dev/sdg1: read failed after 0 of 2048 at 0: Input/output error
Verifying FAILED device /dev/sdg1 is *NOT* in the volume(s)
  /dev/sdg1: read failed after 0 of 2048 at 0: Input/output error
Verifying LEG device /dev/sde1 *IS* in the volume(s)
  /dev/sdg1: read failed after 0 of 2048 at 0: Input/output error
Verifying LEG device /dev/sdh1 *IS* in the volume(s)
  /dev/sdg1: read failed after 0 of 2048 at 0: Input/output error
Verifying LEG device /dev/sdf1 *IS* in the volume(s)
  /dev/sdg1: read failed after 0 of 2048 at 0: Input/output error
Verify the dm devices associated with /dev/sdg1 are no longer present

<fail name="taft-01_1" pid="11072" time="Thu Nov 13 10:24:22 2008" type="cmd" duration="162" ec="127" />
ALL STOP!
<killed name="taft-02_1" pid="11073" time="Thu Nov 13 10:24:28 2008" type="cmd" duration="168" signal="2" />
<killed name="taft-03_1" pid="11075" time="Thu Nov 13 10:24:28 2008" type="cmd" duration="168" signal="2" />
<killed name="taft-04_1" pid="11078" time="Thu Nov 13 10:24:28 2008" type="cmd" duration="168" signal="2" />
Could not connect to remote host

syncd_log_3legs_1_mlog on taft-02 should no longer be there

Version-Release number of selected component (if applicable):
2.6.18-117.el5
lvm2-2.02.40-6.el5            BUILT: Fri Oct 24 07:37:33 CDT 2008
lvm2-cluster-2.02.40-6.el5    BUILT: Fri Oct 24 07:38:44 CDT 2008
device-mapper-1.02.28-2.el5   BUILT: Fri Sep 19 02:50:32 CDT 2008
cmirror-1.1.34-5.el5          BUILT: Thu Nov  6 15:10:44 CST 2008
kmod-cmirror-0.1.21-2.el5     BUILT: Thu Nov  6 14:12:07 CST 2008
Editing subject to reflect the fact that this is not a cluster mirror specific problem.
In contrast to comment 2, I don't think the metadata needs to be edited by hand. The user would simply need to remove the detached (and now visible) log LV, which isn't that hard. While it would be nice to commit a multi-step transaction and have all of its steps execute even in the event of a failure, that is not going to happen in LVM for RHEL5. So, I'm closing this bug WONTFIX.
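For the record, the manual cleanup would look roughly like the following sketch. It assumes the VG/LV names from this report (helter_skelter, syncd_log_3legs_1_mlog) and a failed /dev/sdg1; adjust for the actual system, and note this is illustrative, not a tested recovery procedure:

```shell
# Confirm the detached log is now visible as a stand-alone LV
# (it appears without brackets, i.e. no longer an internal mirror LV).
lvs -a -o +devices helter_skelter

# Remove the leftover mirror log LV by hand.
lvremove helter_skelter/syncd_log_3legs_1_mlog

# If the VG metadata still references the failed PV (/dev/sdg1 here),
# drop the missing PV from the VG as well.
vgreduce --removemissing helter_skelter
```

`vgreduce --removemissing` is only needed if the failed device is still listed in the VG; if the down-conversion already cleaned up the PV references, the `lvremove` alone suffices.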