Bug 453478 - server recovery after leg failure can cause down convert to stall
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: cmirror
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Corey Marthaler
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-06-30 21:43 UTC by Corey Marthaler
Modified: 2010-12-10 00:12 UTC (History)
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-12-10 00:12:43 UTC
Embargoed:


Attachments
log from taft-01 (141.05 KB, text/plain)
2008-07-01 14:14 UTC, Corey Marthaler
log from taft-02 (1.36 MB, text/plain)
2008-07-01 14:15 UTC, Corey Marthaler
log from taft-03 (203.39 KB, text/plain)
2008-07-01 14:16 UTC, Corey Marthaler
log from taft-04 (206.91 KB, text/plain)
2008-07-01 14:17 UTC, Corey Marthaler

Description Corey Marthaler 2008-06-30 21:43:36 UTC
Description of problem:
I believe taft-01 was the server (though I'm not positive), and I rebooted
it during the down convert of these two mirrors.


Senario: Kill primary leg of synced 2 leg mirror(s)

****** Mirror hash info for this scenario ******
* name:      syncd_primary_2legs
* sync:      1
* mirrors:   2
* disklog:   1
* failpv:    /dev/sdh1
* legs:      2
* pvs:       /dev/sdh1 /dev/sde1 /dev/sdf1
************************************************

Creating mirror(s) on taft-02...
taft-02: lvcreate -m 1 -n syncd_primary_2legs_1 -L 800M helter_skelter
/dev/sdh1:0-1000 /dev/sde1:0-1000 /dev/sdf1:0-150
taft-02: lvcreate -m 1 -n syncd_primary_2legs_2 -L 800M helter_skelter
/dev/sdh1:0-1000 /dev/sde1:0-1000 /dev/sdf1:0-150

Waiting until all mirrors become fully syncd...
        0/2 mirror(s) are fully synced: ( 1=11.50% 2=0.00% )
        0/2 mirror(s) are fully synced: ( 1=28.50% 2=16.00% )
        0/2 mirror(s) are fully synced: ( 1=46.50% 2=33.00% )
        0/2 mirror(s) are fully synced: ( 1=64.00% 2=48.50% )
        0/2 mirror(s) are fully synced: ( 1=81.00% 2=64.50% )
        0/2 mirror(s) are fully synced: ( 1=98.00% 2=81.00% )
        2/2 mirror(s) are fully synced: ( 1=100.00% 2=100.00% )

Creating gfs on top of mirror(s) on taft-02...
Mounting mirrored gfs filesystems on taft-01...
Mounting mirrored gfs filesystems on taft-02...
Mounting mirrored gfs filesystems on taft-03...
Mounting mirrored gfs filesystems on taft-04...

Writing verification files (checkit) to mirror(s) on...
        ---- taft-01 ----
        ---- taft-02 ----
        ---- taft-03 ----
        ---- taft-04 ----

Sleeping 12 seconds to get some outsanding I/O locks before the failure

Disabling device sdh on taft-01
Disabling device sdh on taft-02
Disabling device sdh on taft-03
Disabling device sdh on taft-04

Attempting I/O to cause mirror down conversion(s) on taft-02
10+0 records in
10+0 records out

### FAILED TAFT-01 HERE ###

Verifying the down conversion of the failed mirror(s)
  /dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
Verifying FAILED device /dev/sdh1 is *NOT* in the volume(s)
  /dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
Verifying LEG device /dev/sde1 *IS* in the volume(s)
  /dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
Verifying LOG device /dev/sdf1 is *NOT* in the linear(s)
  /dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
log device /dev/sdf1 should no longer be present on taft-02


[root@taft-02 ~]# lvs -a -o +devices
  /dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
  LV                             VG             Attr   LSize   Origin Snap% Move Log Copy%  Convert Devices
  LogVol00                       VolGroup00     -wi-ao  58.34G                                      /dev/sda2(0)
  LogVol01                       VolGroup00     -wi-ao   9.75G                                      /dev/sda2(1867)
  syncd_primary_2legs_1          helter_skelter -wi-ao 800.00M                                      /dev/sde1(0)
  syncd_primary_2legs_2          helter_skelter -wi-ao 800.00M                                      /dev/sde1(200)
  syncd_primary_2legs_2_mimage_0 helter_skelter vwi-a- 800.00M
  syncd_primary_2legs_2_mimage_1 helter_skelter vwi-a- 800.00M
  syncd_primary_2legs_2_mlog     helter_skelter -wi-a-   4.00M                                      /dev/sdf1(1)
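
In the lvs output above, syncd_primary_2legs_2 already shows as a linear volume
on /dev/sde1, yet its _mimage_0, _mimage_1 and _mlog sub-LVs are still present,
which is what the stalled down convert looks like from the LVM side. A minimal
sketch of how that state could be spotted automatically follows. The function
name leftover_sub_lvs and the two-column input (LV name, attr string) are
assumptions made for illustration, not part of the test harness.

```shell
# Sketch: print mirror sub-LVs (_mimage_N / _mlog) whose parent LV exists
# but is no longer a mirror (attr does not start with "m" or "M").
# Reads "lv_name lv_attr" pairs on stdin, one LV per line.
# Hypothetical usage on a live node:
#   lvs -a --noheadings -o lv_name,lv_attr helter_skelter | leftover_sub_lvs
leftover_sub_lvs() {
    awk '
        {
            name = $1
            gsub(/\[/, "", name)    # hidden sub-LVs are printed as [name]
            gsub(/\]/, "", name)
            attr[name] = $2
        }
        END {
            for (n in attr) {
                parent = n
                # strip the sub-LV suffix; sub() returns 0 for ordinary LVs
                if (sub(/_(mimage_[0-9]+|mlog)$/, "", parent) &&
                    (parent in attr) && attr[parent] !~ /^[mM]/)
                    print n
            }
        }
    ' | sort
}
```

Fed the five helter_skelter rows shown above, this would list the three
syncd_primary_2legs_2 sub-LVs; for a healthy mirror it prints nothing.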

[root@taft-02 ~]# dmsetup ls
helter_skelter-syncd_primary_2legs_2    (253, 9)
helter_skelter-syncd_primary_2legs_1    (253, 5)
helter_skelter-syncd_primary_2legs_2_mlog       (253, 6)
VolGroup00-LogVol01     (253, 1)
helter_skelter-syncd_primary_2legs_2_mimage_1   (253, 8)
VolGroup00-LogVol00     (253, 0)
helter_skelter-syncd_primary_2legs_2_mimage_0   (253, 7)

[root@taft-02 ~]# dmsetup ls --tree
helter_skelter-syncd_primary_2legs_2 (253:9)
 └─ (8:65)
helter_skelter-syncd_primary_2legs_1 (253:5)
 └─ (8:65)
helter_skelter-syncd_primary_2legs_2_mlog (253:6)
 └─ (8:81)
VolGroup00-LogVol01 (253:1)
 └─ (8:2)
helter_skelter-syncd_primary_2legs_2_mimage_1 (253:8)
 └─ (8:65)
VolGroup00-LogVol00 (253:0)
 └─ (8:2)
helter_skelter-syncd_primary_2legs_2_mimage_0 (253:7)
 └─ (8:113)
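
The tree only shows major:minor pairs, so it helps to translate them back to
device names: (8:65) is sde1, (8:81) is sdf1, and (8:113) is sdh1, the failed
PV that _mimage_0 still sits on. The sketch below does that arithmetic. It
assumes the classic sd numbering (block major 8, 16 minors per disk), and
sd_name is a hypothetical helper, not an existing tool.

```shell
# Sketch: map "major:minor" of an sd device to its kernel name.
# Assumes block major 8 with 16 minors per disk (minor % 16 == 0 is the
# whole disk), which covers sda through sdp only.
sd_name() {
    major=${1%%:*}
    minor=${1##*:}
    [ "$major" = 8 ] || { echo "not an sd device on major 8: $1" >&2; return 1; }
    disk=$((minor / 16))    # 0 = sda, 1 = sdb, ...
    part=$((minor % 16))    # 0 = whole disk, 1..15 = partition number
    # turn the disk index into a letter via an octal printf escape
    letter=$(printf "\\$(printf '%03o' $((97 + disk)))")
    if [ "$part" -eq 0 ]; then
        echo "sd$letter"
    else
        echo "sd$letter$part"
    fi
}
```

For example, `sd_name 8:113` gives sdh1 and `sd_name 8:65` gives sde1.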

[root@taft-02 ~]# dmsetup info
Name:              helter_skelter-syncd_primary_2legs_2
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      133
Major, minor:      253, 9
Number of targets: 1
UUID: LVM-ZxFBvXqHXiY1v7hjkxualLvb1MfLx28I3gqP8dYb0T6NkpzVTBlx116u0n7rg1Ny

Name:              helter_skelter-syncd_primary_2legs_1
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      135
Major, minor:      253, 5
Number of targets: 1
UUID: LVM-ZxFBvXqHXiY1v7hjkxualLvb1MfLx28ICegFKfeKac72bXanEfnjJ1rqOZI0cRq7

Name:              helter_skelter-syncd_primary_2legs_2_mlog
State:             ACTIVE
Read Ahead:        0
Tables present:    LIVE
Open count:        0
Event number:      0
Major, minor:      253, 6
Number of targets: 1
UUID: LVM-ZxFBvXqHXiY1v7hjkxualLvb1MfLx28IpDhvVdetz0kvAExmsKP1yFII7ZNwRQrX

Name:              VolGroup00-LogVol01
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 1
Number of targets: 1
UUID: LVM-0aFTiqoLYX7dWJU63sScCNgaO7boq16XlvpcVPHdnYWO8lwcHAKZEeJjxI49e75R

Name:              helter_skelter-syncd_primary_2legs_2_mimage_1
State:             ACTIVE
Read Ahead:        0
Tables present:    LIVE
Open count:        0
Event number:      0
Major, minor:      253, 8
Number of targets: 1
UUID: LVM-ZxFBvXqHXiY1v7hjkxualLvb1MfLx28IhBOahJQlvV2vRARwnSVbeirMvcAugNxP

Name:              VolGroup00-LogVol00
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      0
Major, minor:      253, 0
Number of targets: 1
UUID: LVM-0aFTiqoLYX7dWJU63sScCNgaO7boq16XDS8P1Q22JxhHkAfgPaQUhfFwbeuN3QFA

Name:              helter_skelter-syncd_primary_2legs_2_mimage_0
State:             ACTIVE
Read Ahead:        0
Tables present:    LIVE
Open count:        0
Event number:      0
Major, minor:      253, 7
Number of targets: 1
UUID: LVM-ZxFBvXqHXiY1v7hjkxualLvb1MfLx28IbNy3koZCprUZl3VWqJ2adZssQcIoCWrZ


Version-Release number of selected component (if applicable):
2.6.9-71.ELsmp

lvm2-2.02.37-3.el4    BUILT: Thu Jun 12 10:09:19 CDT 2008
lvm2-cluster-2.02.37-3.el4    BUILT: Thu Jun 12 10:22:07 CDT 2008
device-mapper-1.02.25-2.el4    BUILT: Mon Jun  9 09:28:41 CDT 2008
cmirror-1.0.1-1    BUILT: Tue Jan 30 17:28:02 CST 2007
cmirror-kernel-2.6.9-41.4    BUILT: Tue Jun  3 13:54:29 CDT 2008

Comment 1 Corey Marthaler 2008-07-01 14:14:42 UTC
Created attachment 310672
log from taft-01

Comment 2 Corey Marthaler 2008-07-01 14:15:30 UTC
Created attachment 310673
log from taft-02

Comment 3 Corey Marthaler 2008-07-01 14:16:30 UTC
Created attachment 310675
log from taft-03

Comment 4 Corey Marthaler 2008-07-01 14:17:17 UTC
Created attachment 310676
log from taft-04

Comment 7 Corey Marthaler 2010-12-10 00:12:43 UTC
This bug no longer exists in the latest 5.6 rpms.

I tried rebooting the server twice (2-way mirror, 1 log), once with the leg fault policy set to 'allocate' and once with it set to 'remove', and didn't see an issue either time.

Marking this bug CLOSED.
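
For anyone repeating the check above, it can be useful to record which fault
policy was active for each run. The sketch below pulls a setting out of
lvm.conf-style text. It assumes the "leg fault policy" corresponds to the
mirror_image_fault_policy setting in the activation section (older lvm2
releases call it mirror_device_fault_policy) and that the file uses the usual
`name = "value"` spacing. lvm_conf_value is a hypothetical helper.

```shell
# Sketch: print the value(s) assigned to a setting in lvm.conf-style text
# read from stdin. Commented-out lines are skipped; quotes are stripped.
# Hypothetical usage:
#   lvm_conf_value mirror_image_fault_policy < /etc/lvm/lvm.conf
lvm_conf_value() {
    awk -v key="$1" '
        /^[ \t]*#/ { next }             # ignore comment lines
        $1 == key && $2 == "=" {
            val = $3
            gsub(/"/, "", val)          # "remove" -> remove
            print val
        }
    '
}
```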

