Description of problem: I just added a check to helter_skelter (the mirror device failure suite) to verify that the dm-devices are actually being removed as well (instead of just checking that the failed device isn't in the lvm output). So it's possible that this bz is already filed, but caught earlier in the process. ================================================================================ Iteration 0.12 started at Mon May 12 11:57:26 CDT 2008 ================================================================================ Senario: Kill disk log of non synced 2 leg mirror(s) ****** Mirror hash info for this scenario ****** * name: nonsyncd_log_2legs * sync: 0 * mirrors: 1 * disklog: 1 * failpv: /dev/sdg1 * legs: 2 * pvs: /dev/sdh1 /dev/sde1 /dev/sdg1 ************************************************ Creating mirror(s) on taft-02... taft-02: lvcreate -m 1 -n nonsyncd_log_2legs_1 -L 800M helter_skelter /dev/sdh1:0-1000 /dev/sde1:0-1000 /dev/sdg1:0-150 Continuing on without fully syncd mirrors, currently at... ( 1=1.00% ) Creating gfs on top of mirror(s) on taft-02... Mounting mirrored gfs filesystems on taft-01... Mounting mirrored gfs filesystems on taft-02... Mounting mirrored gfs filesystems on taft-03... Mounting mirrored gfs filesystems on taft-04... Writing verification files (checkit) to mirror(s) on... ---- taft-01 ---- checkit starting with: CREATE Num files: 100 Random Seed: 12484 Verify XIOR Stream: /tmp/checkit_nonsyncd_log_2legs_1 Working dir: /mnt/nonsyncd_log_2legs_1/checkit ---- taft-02 ---- checkit starting with: CREATE Num files: 100 Random Seed: 11784 Verify XIOR Stream: /tmp/checkit_nonsyncd_log_2legs_1 Working dir: /mnt/nonsyncd_log_2legs_1/checkit ---- taft-03 ---- checkit starting with: CREATE Num files: 100 Random Seed: 12308 Verify XIOR Stream: /tmp/checkit_nonsyncd_log_2legs_1 Working dir: /mnt/nonsyncd_log_2legs_1/checkit ---- taft-04 ---- checkit starting with: CREATE Num files: 100 Random Seed: 12676 Verify XIOR Stream: /tmp/checkit_nonsyncd_log_2legs_1 Working dir: /mnt/nonsyncd_log_2legs_1/checkit <start name="taft-01_1" pid="706" time="Mon May 12 11:58:02 2008" type="cmd" /> <start name="taft-02_1" pid="708" time="Mon May 12 11:58:02 2008" type="cmd" /> <start name="taft-03_1" pid="710" time="Mon May 12 11:58:02 2008" type="cmd" /> <start name="taft-04_1" pid="712" time="Mon May 12 11:58:02 2008" type="cmd" /> Sleeping 12 seconds to get some outsanding I/O locks before the failure Disabling device sdg on taft-01 Disabling device sdg on taft-02 Disabling device sdg on taft-03 Disabling device sdg on taft-04 Attempting I/O to cause mirror down conversion(s) on taft-02 10+0 records in 10+0 records out 41943040 bytes (42 MB) copied, 0.093717 seconds, 448 MB/s Verifying the down conversion of the failed mirror(s) /dev/sdg1: open failed: No such device or address Verifying LEG device /dev/sdh1 *IS* in the volume(s) /dev/sdg1: open failed: No such device or address Verifying LEG device /dev/sde1 *IS* in the volume(s) /dev/sdg1: open failed: No such device or address Verifying FAILED device /dev/sdg1 is *NOT* in the volume(s) /dev/sdg1: open failed: No such device or address Verify the dm devices associated with /dev/sdg1 are no longer present nonsyncd_log_2legs_1_mlog on taft-03 should no longer be there The log device should be gone by this stage, but it's not on all nodes. TAFT-01: nonsyncd_log_2legs_1 helter_skelter mwi-ao 800.00M 100.00 nonsyncd_log_2legs_1_mimage_0(0),nonsyncd_log_2legs_1_mimage_1(0) [nonsyncd_log_2legs_1_mimage_0] helter_skelter iwi-ao 800.00M /dev/sdh1(0) [nonsyncd_log_2legs_1_mimage_1] helter_skelter iwi-ao 800.00M /dev/sde1(0) nonsyncd_log_2legs_1_mlog helter_skelter vwi--- 4.00M helter_skelter-nonsyncd_log_2legs_1 (253, 5) helter_skelter-nonsyncd_log_2legs_1_mimage_1 (253, 4) helter_skelter-nonsyncd_log_2legs_1_mimage_0 (253, 3) TAFT-02: nonsyncd_log_2legs_1 helter_skelter mwi-ao 800.00M 100.00 nonsyncd_log_2legs_1_mimage_0(0),nonsyncd_log_2legs_1_mimage_1(0) [nonsyncd_log_2legs_1_mimage_0] helter_skelter iwi-ao 800.00M /dev/sdh1(0) [nonsyncd_log_2legs_1_mimage_1] helter_skelter iwi-ao 800.00M /dev/sde1(0) nonsyncd_log_2legs_1_mlog helter_skelter vwi--- 4.00M helter_skelter-nonsyncd_log_2legs_1 (253, 5) helter_skelter-nonsyncd_log_2legs_1_mimage_1 (253, 4) helter_skelter-nonsyncd_log_2legs_1_mimage_0 (253, 3) TAFT-03: nonsyncd_log_2legs_1 helter_skelter mwi-ao 800.00M 100.00 nonsyncd_log_2legs_1_mimage_0(0),nonsyncd_log_2legs_1_mimage_1(0) [nonsyncd_log_2legs_1_mimage_0] helter_skelter iwi-ao 800.00M /dev/sdh1(0) [nonsyncd_log_2legs_1_mimage_1] helter_skelter iwi-ao 800.00M /dev/sde1(0) nonsyncd_log_2legs_1_mlog helter_skelter vwi-a- 4.00M helter_skelter-nonsyncd_log_2legs_1_mlog (253, 2) helter_skelter-nonsyncd_log_2legs_1 (253, 5) helter_skelter-nonsyncd_log_2legs_1_mimage_1 (253, 4) helter_skelter-nonsyncd_log_2legs_1_mimage_0 (253, 3) TAFT-04: nonsyncd_log_2legs_1 helter_skelter mwi-ao 800.00M 100.00 nonsyncd_log_2legs_1_mimage_0(0),nonsyncd_log_2legs_1_mimage_1(0) [nonsyncd_log_2legs_1_mimage_0] helter_skelter iwi-ao 800.00M /dev/sdh1(0) [nonsyncd_log_2legs_1_mimage_1] helter_skelter iwi-ao 800.00M /dev/sde1(0) nonsyncd_log_2legs_1_mlog helter_skelter vwi-a- 4.00M helter_skelter-nonsyncd_log_2legs_1_mlog (253, 2) helter_skelter-nonsyncd_log_2legs_1 (253, 5) helter_skelter-nonsyncd_log_2legs_1_mimage_1 (253, 4) helter_skelter-nonsyncd_log_2legs_1_mimage_0 (253, 3) Version-Release number of selected component (if applicable): [root@taft-04 ~]# rpm -qi lvm2 Name : lvm2 Relocations: (not relocatable) Version : 2.02.32 Vendor: Red Hat, Inc. Release : 4.el5 Build Date: Fri 04 Apr 2008 06:15:19 AM CDT Install Date: Fri 02 May 2008 03:43:27 PM CDT Build Host: hs20-bc2-3.build.redhat.com [root@taft-04 ~]# rpm -qi lvm2-cluster Name : lvm2-cluster Relocations: (not relocatable) Version : 2.02.32 Vendor: Red Hat, Inc. Release : 4.el5 Build Date: Wed 02 Apr 2008 03:56:50 AM CDT Install Date: Fri 02 May 2008 03:45:12 PM CDT Build Host: hs20-bc2-3.build.redhat.com Extensions to LVM2 to support clusters. [root@taft-04 ~]# rpm -qi cmirror Name : cmirror Relocations: (not relocatable) Version : 1.1.17 Vendor: Red Hat, Inc. Release : 1.el5 Build Date: Fri 09 May 2008 11:33:43 AM CDT Install Date: Fri 09 May 2008 03:57:13 PM CDT Build Host: hs20-bc1-7.build.redhat.com [root@taft-04 ~]# rpm -qi kmod-cmirror Name : kmod-cmirror Relocations: (not relocatable) Version : 0.1.9 Vendor: Red Hat, Inc. Release : 1.el5 Build Date: Thu 08 May 2008 02:28:27 PM CDT Install Date: Thu 08 May 2008 05:10:39 PM CDT Build Host: hs20-bc2-4.build.redhat.com [root@taft-04 ~]# rpm -qi device-mapper Name : device-mapper Relocations: (not relocatable) Version : 1.02.24 Vendor: Red Hat, Inc. Release : 1.el5 Build Date: Thu 17 Jan 2008 04:46:05 PM CST Install Date: Fri 02 May 2008 03:42:58 PM CDT Build Host: hs20-bc2-3.build.redhat.com How reproducible: I believe I've seen this before
[root@taft-01 ~]# dmsetup ls --tree helter_skelter-nonsyncd_log_2legs_1 (253:5) ├─helter_skelter-nonsyncd_log_2legs_1_mimage_1 (253:4) │ └─ (8:65) └─helter_skelter-nonsyncd_log_2legs_1_mimage_0 (253:3) └─ (8:113) [root@taft-01 ~]# dmsetup status helter_skelter-nonsyncd_log_2legs_1: 0 1638400 mirror 2 253:3 253:4 1600/1600 1 AA 1 clustered_core helter_skelter-nonsyncd_log_2legs_1_mimage_1: 0 1638400 linear helter_skelter-nonsyncd_log_2legs_1_mimage_0: 0 1638400 linear [root@taft-02 ~]# dmsetup ls --tree helter_skelter-nonsyncd_log_2legs_1 (253:5) ├─helter_skelter-nonsyncd_log_2legs_1_mimage_1 (253:4) │ └─ (8:65) └─helter_skelter-nonsyncd_log_2legs_1_mimage_0 (253:3) └─ (8:113) [root@taft-02 ~]# dmsetup status helter_skelter-nonsyncd_log_2legs_1: 0 1638400 mirror 2 253:3 253:4 1600/1600 1 AA 1 clustered_core helter_skelter-nonsyncd_log_2legs_1_mimage_1: 0 1638400 linear helter_skelter-nonsyncd_log_2legs_1_mimage_0: 0 1638400 linear [root@taft-03 ~]# dmsetup ls --tree helter_skelter-nonsyncd_log_2legs_1_mlog (253:2) └─ (8:97) helter_skelter-nonsyncd_log_2legs_1 (253:5) ├─helter_skelter-nonsyncd_log_2legs_1_mimage_1 (253:4) │ └─ (8:65) └─helter_skelter-nonsyncd_log_2legs_1_mimage_0 (253:3) └─ (8:113) [root@taft-03 ~]# dmsetup status helter_skelter-nonsyncd_log_2legs_1_mlog: 0 8192 linear helter_skelter-nonsyncd_log_2legs_1: 0 1638400 mirror 2 253:3 253:4 1600/1600 1 AA 1 clustered_core helter_skelter-nonsyncd_log_2legs_1_mimage_1: 0 1638400 linear helter_skelter-nonsyncd_log_2legs_1_mimage_0: 0 1638400 linear [root@taft-04 ~]# dmsetup ls --tree helter_skelter-nonsyncd_log_2legs_1_mlog (253:2) └─ (8:97) helter_skelter-nonsyncd_log_2legs_1 (253:5) ├─helter_skelter-nonsyncd_log_2legs_1_mimage_1 (253:4) │ └─ (8:65) └─helter_skelter-nonsyncd_log_2legs_1_mimage_0 (253:3) └─ (8:113) [root@taft-04 ~]# dmsetup status helter_skelter-nonsyncd_log_2legs_1_mlog: 0 8192 linear helter_skelter-nonsyncd_log_2legs_1: 0 1638400 mirror 2 253:3 253:4 1600/1600 1 AA 1 clustered_core helter_skelter-nonsyncd_log_2legs_1_mimage_1: 0 1638400 linear helter_skelter-nonsyncd_log_2legs_1_mimage_0: 0 1638400 linear
Looks like lvm knows that it failed to remove the device: May 12 11:58:20 taft-01 lvm[7333]: WARNING: dev_open(/dev/sdh1) called while suspended May 12 11:58:20 taft-01 clogd[6720]: [hbeCYldT] rw_log: write failure: Input/output error May 12 11:58:20 taft-01 clogd[6720]: [hbeCYldT] Error writing to disk log May 12 11:58:20 taft-01 clogd[6720]: [hbeCYldT] rw_log: write failure: Input/output error May 12 11:58:20 taft-01 clogd[6720]: [hbeCYldT] Error writing to disk log May 12 11:58:20 taft-01 kernel: device-mapper: dm-log-clustered: [hbeCYldT] Server error while processing request [DM_CLOG_FLUSH]: -5 May 12 11:58:20 taft-01 lvm[7333]: Monitoring mirror device helter_skelter-nonsyncd_log_2legs_1 for events May 12 11:58:20 taft-01 lvm[7333]: Another thread is handling an event. Waiting... May 12 11:58:20 taft-01 lvm[7333]: WARNING: dev_open(/dev/sde1) called while suspended May 12 11:58:20 taft-01 lvm[7333]: WARNING: dev_open(/dev/sdf1) called while suspended May 12 11:58:20 taft-01 lvm[7333]: WARNING: dev_open(/dev/sdh1) called while suspended May 12 11:58:20 taft-01 lvm[7333]: WARNING: dev_open(/etc/lvm/lvm.conf) called while suspended May 12 11:58:20 taft-01 lvm[7333]: Failed to remove faulty devices in helter_skelter-nonsyncd_log_2legs_1 May 12 11:58:22 taft-01 lvm[7333]: No longer monitoring mirror device helter_skelter-nonsyncd_log_2legs_1 for events
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
I reproduced this last night. Senario: Kill disk log of synced 2 leg mirror(s) ****** Mirror hash info for this scenario ****** * name: syncd_log_2legs * sync: 1 * mirrors: 1 * disklog: 1 * failpv: /dev/sde1 * legs: 2 * pvs: /dev/sdg1 /dev/sdf1 /dev/sde1 ************************************************ Creating mirror(s) on taft-04... taft-04: lvcreate -m 1 -n syncd_log_2legs_1 -L 800M helter_skelter /dev/sdg1:0-1000 /dev/sdf1:0-1000 /dev/sde1:0-150 Waiting until all mirrors become fully syncd... 0/1 mirror(s) are fully synced: ( 1=0.50% ) 0/1 mirror(s) are fully synced: ( 1=31.00% ) 0/1 mirror(s) are fully synced: ( 1=61.50% ) 0/1 mirror(s) are fully synced: ( 1=92.00% ) 1/1 mirror(s) are fully synced: ( 1=100.00% ) Creating gfs on top of mirror(s) on taft-04... Mounting mirrored gfs filesystems on taft-01... Mounting mirrored gfs filesystems on taft-02... Mounting mirrored gfs filesystems on taft-03... Mounting mirrored gfs filesystems on taft-04... Writing verification files (checkit) to mirror(s) on... Sleeping 12 seconds to get some outsanding I/O locks before the failure Verifying files (checkit) on mirror(s) on... Disabling device sde on taft-01 Disabling device sde on taft-02 Disabling device sde on taft-03 Disabling device sde on taft-04 Attempting I/O to cause mirror down conversion(s) on taft-04 10+0 records in 10+0 records out 41943040 bytes (42 MB) copied, 0.108968 seconds, 385 MB/s Verifying the down conversion of the failed mirror(s) /dev/sde1: open failed: No such device or address Verifying LEG device /dev/sdg1 *IS* in the volume(s) /dev/sde1: open failed: No such device or address Verifying LEG device /dev/sdf1 *IS* in the volume(s) /dev/sde1: open failed: No such device or address Verifying FAILED device /dev/sde1 is *NOT* in the volume(s) /dev/sde1: open failed: No such device or address Verify the dm devices associated with /dev/sde1 are no longer present syncd_log_2legs_1_mlog on taft-03 should no longer be there [root@taft-04 tmp]# lvs -a -o +devices /dev/sde1: open failed: No such device or address LV VG Attr LSize Origin Snap% Move Log Copy% Convert Devices LogVol00 VolGroup00 -wi-ao 66.19G /dev/sda2(0) LogVol01 VolGroup00 -wi-ao 1.94G /dev/sda2(2118) syncd_log_2legs_1 helter_skelter mwi-ao 800.00M 100.00 syncd_log_2legs_1_mimage_0(0),syncd_log_2legs_1_mimage_1(0) [syncd_log_2legs_1_mimage_0] helter_skelter iwi-ao 800.00M /dev/sdg1(0) [syncd_log_2legs_1_mimage_1] helter_skelter iwi-ao 800.00M /dev/sdf1(0) syncd_log_2legs_1_mlog helter_skelter vwi--- 4.00M [root@taft-04 tmp]# dmsetup ls helter_skelter-syncd_log_2legs_1_mimage_1 (253, 4) helter_skelter-syncd_log_2legs_1_mimage_0 (253, 3) VolGroup00-LogVol01 (253, 1) helter_skelter-syncd_log_2legs_1 (253, 5) VolGroup00-LogVol00 (253, 0) [root@taft-03 tmp]# lvs -a -o +devices /dev/sde1: read failed after 0 of 2048 at 0: Input/output error LV VG Attr LSize Origin Snap% Move Log Copy% Convert Devices LogVol00 VolGroup00 -wi-ao 66.19G /dev/sda2(0) LogVol01 VolGroup00 -wi-ao 1.94G /dev/sda2(2118) syncd_log_2legs_1 helter_skelter mwi-ao 800.00M 100.00 syncd_log_2legs_1_mimage_0(0),syncd_log_2legs_1_mimage_1(0) [syncd_log_2legs_1_mimage_0] helter_skelter iwi-ao 800.00M /dev/sdg1(0) [syncd_log_2legs_1_mimage_1] helter_skelter iwi-ao 800.00M /dev/sdf1(0) syncd_log_2legs_1_mlog helter_skelter vwi-a- 4.00M [root@taft-03 tmp]# dmsetup ls helter_skelter-syncd_log_2legs_1_mlog (253, 2) helter_skelter-syncd_log_2legs_1_mimage_1 (253, 4) helter_skelter-syncd_log_2legs_1_mimage_0 (253, 3) VolGroup00-LogVol01 (253, 1) helter_skelter-syncd_log_2legs_1 (253, 5) VolGroup00-LogVol00 (253, 0)
Having trouble reproducing this issue with the latest build...
Taking this off the list 5.3 list until it can be more reliably reproduced.
The last time this was seen was 3 months ago. Closing for now and will reopen if seen again.