Bug 228104
Summary: | greater than 2 legged cluster mirrors do not down convert when a leg fails | ||
---|---|---|---|
Product: | [Retired] Red Hat Cluster Suite | Reporter: | Corey Marthaler <cmarthal> |
Component: | cmirror | Assignee: | Jonathan Earl Brassow <jbrassow> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Cluster QE <mspqa-list> |
Severity: | medium | Docs Contact: | |
Priority: | high | ||
Version: | 4 | CC: | agk, cfeist, dwysocha, jbrassow, mbroz, prockai |
Hardware: | All | ||
OS: | Linux | ||
Doc Type: | Bug Fix | Last Closed: | 2008-08-05 21:42:50 UTC |
Description
Corey Marthaler
2007-02-09 23:36:03 UTC
Hmmm, this appears to work just fine in single node mirroring.

[root@link-07 ~]# lvs -a -o +devices
  LV                VG   Attr   LSize  Origin Snap%  Move Log         Copy%  Devices
  mirror            vg   mwi-ao 10.00G                    mirror_mlog 10.70  mirror_mimage_0(0),mirror_mimage_1(0),mirror_mimage_2(0),mirror_mimage_3(0)
  [mirror_mimage_0] vg   iwi-ao 10.00G                                       /dev/sdh1(0)
  [mirror_mimage_1] vg   iwi-ao 10.00G                                       /dev/sda1(0)
  [mirror_mimage_2] vg   iwi-ao 10.00G                                       /dev/sdb1(0)
  [mirror_mimage_3] vg   iwi-ao 10.00G                                       /dev/sdc1(0)
  [mirror_mlog]     vg   lwi-ao 4.00M                                        /dev/sdd1(0)

# FAIL /dev/sdh and wait...

[root@link-07 ~]# lvs -a -o +devices
  /dev/sdh: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdh2: read failed after 0 of 2048 at 0: Input/output error
  LV                VG   Attr   LSize  Origin Snap%  Move Log         Copy%  Devices
  mirror            vg   mwi-ao 10.00G                    mirror_mlog 13.12  mirror_mimage_3(0),mirror_mimage_1(0),mirror_mimage_2(0)
  [mirror_mimage_1] vg   iwi-ao 10.00G                                       /dev/sda1(0)
  [mirror_mimage_2] vg   iwi-ao 10.00G                                       /dev/sdb1(0)
  [mirror_mimage_3] vg   iwi-ao 10.00G                                       /dev/sdc1(0)
  [mirror_mlog]     vg   lwi-ao 4.00M                                        /dev/sdd1(0)

Here's what is actually happening from the user's viewpoint...
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/vg-cmirror1  9.5G   20K  9.5G   1% /mnt/gfs1

[root@link-07 ~]# lvs -a -o +devices
  LV                  VG   Attr   LSize  Origin Snap%  Move Log           Copy%  Devices
  cmirror1            vg   mwi-ao 10.00G                    cmirror1_mlog 71.33  cmirror1_mimage_0(0),cmirror1_mimage_1(0),cmirror1_mimage_2(0)
  [cmirror1_mimage_0] vg   iwi-ao 10.00G                                         /dev/sdh2(0)
  [cmirror1_mimage_1] vg   iwi-ao 10.00G                                         /dev/sde1(0)
  [cmirror1_mimage_2] vg   iwi-ao 10.00G                                         /dev/sdf1(0)
  [cmirror1_mlog]     vg   lwi-ao 4.00M                                          /dev/sdg2(0)

[root@link-07 ~]# ls -lrt /mnt/gfs1
total 3936
-rw-rw-rw- 1 root root 1000000 Feb 12 08:33 link-02
-rw-rw-rw- 1 root root 1000000 Feb 12 09:00 link-08
-rw-rw-rw- 1 root root 1000000 Feb 12 09:00 link-04
-rw-rw-rw- 1 root root 1000000 Feb 12 14:07 link-07

[FAIL /dev/sdh]

[root@link-07 ~]# ls -lrt /mnt/gfs1
ls: /mnt/gfs1: Input/output error
[root@link-07 ~]# touch /mnt/gfs1/foo
touch: cannot touch `/mnt/gfs1/foo': Input/output error

Filesystem               Size  Used Avail Use% Mounted on
df: `/mnt/gfs1': Input/output error

# The leg "remains" in the mirror

[root@link-08 ~]# lvs -a -o +devices
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-6: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdh: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdh2: read failed after 0 of 2048 at 0: Input/output error
  LV                  VG   Attr   LSize  Origin Snap%  Move Log           Copy%  Devices
  cmirror1            vg   mwi-ao 10.00G                    cmirror1_mlog 86.17  cmirror1_mimage_0(0),cmirror1_mimage_1(0),cmirror1_mimage_2(0)
  [cmirror1_mimage_0] vg   iwi-ao 10.00G
  [cmirror1_mimage_1] vg   iwi-ao 10.00G                                         /dev/sde1(0)
  [cmirror1_mimage_2] vg   iwi-ao 10.00G                                         /dev/sdf1(0)
  [cmirror1_mlog]     vg   lwi-ao 4.00M                                          /dev/sdg2(0)

All the nodes running I/O to GFS on cmirror lose their connection to the machine:

[...]
<xior magic="0xfeed10"><read syscall="readv"><path>/mnt/gfs1/link-07</path><oflags>O_RDONLY</oflags><offset>0</offset><count>163676</count></read></xior>
<xior magic="0xfeed10"><write syscall="write"><path>/mnt/gfs1/link-07</path><oflags>O_RDWR</oflags><offset>0</offset><count>974966</count><pattern>D</pattern></write></xior>
Connection to link-07 closed.
[...]
<xior magic="0xfeed10"><read syscall="read"><path>/mnt/gfs1/link-04</path><oflags>O_RDONLY</oflags><offset>0</offset><count>814950</count></read></xior>
<xior magic="0xfeed10"><write syscall="writev"><path>/mnt/gfs1/link-04</path><oflags>O_RDWR</oflags><offset>0</offset><count>703955</count><pattern>P</pattern></write></xior>
Connection to link-04 closed.

# sync % is stuck

[root@link-08 ~]# lvs -a -o +devices
  /dev/dm-3: read failed after 0 of 4096 at 0: Input/output error
  /dev/dm-6: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdh: read failed after 0 of 4096 at 0: Input/output error
  /dev/sdh2: read failed after 0 of 2048 at 0: Input/output error
  LV                  VG   Attr   LSize  Origin Snap%  Move Log           Copy%  Devices
  cmirror1            vg   mwi-ao 10.00G                    cmirror1_mlog 86.17  cmirror1_mimage_0(0),cmirror1_mimage_1(0),cmirror1_mimage_2(0)
  [cmirror1_mimage_0] vg   iwi-ao 10.00G
  [cmirror1_mimage_1] vg   iwi-ao 10.00G                                         /dev/sde1(0)
  [cmirror1_mimage_2] vg   iwi-ao 10.00G                                         /dev/sdf1(0)
  [cmirror1_mlog]     vg   lwi-ao 4.00M                                          /dev/sdg2(0)

I propose setting a restriction that mirrors are limited to 2 sides for 4.5. This would defuse this bug. Once we agree on that, I'll open an RFE for 4.6 and make this bug dependent on that.

Greater than 2 legged mirrors have been verified to down convert during leg failures.

Fixed in current release (4.7).
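
The two symptoms reported in this bug (a mirror image that is still listed in the mirror but has no backing device, and a Copy% value that does not change between samples) can be checked mechanically from captured `lvs -a -o +devices` text. The following is only a rough Python sketch, not part of cmirror or LVM: the names `parse_lvs`, `dead_legs`, and `sync_stuck` are hypothetical, and the parser assumes the column layout seen in this report, where the Origin, Snap%, and Move columns are empty.

```python
from typing import Dict, List, NamedTuple, Optional


class MirrorState(NamedTuple):
    copy_percent: float
    legs: List[str]                     # image names listed in the mirror's Devices column
    backing: Dict[str, Optional[str]]   # image name -> backing device, None if blank


def parse_lvs(text: str, mirror: str) -> MirrorState:
    """Heuristic parse of `lvs -a -o +devices` text for one mirror LV.

    Assumption: Origin, Snap% and Move are empty, so the mirror row splits
    into exactly 7 whitespace-separated tokens, as in this bug report.
    """
    copy_percent = None
    legs: List[str] = []
    backing: Dict[str, Optional[str]] = {}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        name = tokens[0].strip("[]")
        if tokens[0] == mirror and len(tokens) >= 7:
            copy_percent = float(tokens[5])
            # "cmirror1_mimage_0(0)" -> "cmirror1_mimage_0"
            legs = [dev.split("(")[0] for dev in tokens[6].split(",")]
        elif name.startswith(mirror + "_mimage_"):
            # An image row with no fifth token has lost its backing device.
            backing[name] = tokens[4] if len(tokens) > 4 else None
    if copy_percent is None:
        raise ValueError("mirror %r not found in lvs output" % mirror)
    return MirrorState(copy_percent, legs, backing)


def dead_legs(state: MirrorState) -> List[str]:
    """Legs still listed in the mirror that have no backing device."""
    return [leg for leg in state.legs if state.backing.get(leg) is None]


def sync_stuck(before: MirrorState, after: MirrorState) -> bool:
    """True if Copy% did not move between two samples and is below 100."""
    return after.copy_percent == before.copy_percent and after.copy_percent < 100.0


# Sample taken from the link-08 output in this report (after /dev/sdh failed).
SAMPLE = """\
  /dev/sdh: read failed after 0 of 4096 at 0: Input/output error
  LV                  VG   Attr   LSize  Origin Snap%  Move Log           Copy%  Devices
  cmirror1            vg   mwi-ao 10.00G                    cmirror1_mlog 86.17  cmirror1_mimage_0(0),cmirror1_mimage_1(0),cmirror1_mimage_2(0)
  [cmirror1_mimage_0] vg   iwi-ao 10.00G
  [cmirror1_mimage_1] vg   iwi-ao 10.00G                                         /dev/sde1(0)
  [cmirror1_mimage_2] vg   iwi-ao 10.00G                                         /dev/sdf1(0)
  [cmirror1_mlog]     vg   lwi-ao 4.00M                                          /dev/sdg2(0)
"""

if __name__ == "__main__":
    state = parse_lvs(SAMPLE, "cmirror1")
    print("dead legs:", dead_legs(state))
    print("sync stuck:", sync_stuck(state, state))
```

Splitting on whitespace is fragile when optional columns are populated; on a live system, requesting only the needed fields with a fixed separator (for example via the `--noheadings` and `--separator` reporting options) would probably be a sturdier input for a check like this.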