Bug 796348 - non raid1 volumes are not able to be repaired after device failure
Summary: non raid1 volumes are not able to be repaired after device failure
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assignee: Jonathan Earl Brassow
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-02-22 17:43 UTC by Corey Marthaler
Modified: 2012-06-20 15:01 UTC
CC List: 9 users

Fixed In Version: lvm2-2.02.95-1.el6
Doc Type: Bug Fix
Doc Text:
New Feature to 6.3. No documentation required. Bug 732458 is the bug that requires a release note for the RAID features. Other documentation is found in the LVM manual. Operational bugs need no documentation because they are being fixed before their initial release.
Clone Of:
Environment:
Last Closed: 2012-06-20 15:01:37 UTC
Target Upstream Version:




Links
System: Red Hat Product Errata RHBA-2012:0962 (Private: 0, Priority: normal)
Status: SHIPPED_LIVE
Summary: lvm2 bug fix and enhancement update
Last Updated: 2012-06-19 21:12:11 UTC

Description Corey Marthaler 2012-02-22 17:43:35 UTC
Description of problem:
When the RAID fault policy is set to "warn" and a device in a RAID volume fails, it should be possible to repair the failed volume with 'lvconvert --repair'. Currently, however, that works only for raid1 volumes.
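For context, the repair behavior described above is controlled by the raid_fault_policy setting in the activation section of lvm.conf; "warn" leaves repair to a manual 'lvconvert --repair', while "allocate" has dmeventd replace the failed image automatically. A minimal fragment, for illustration only (comments are mine, not from the bug report):

```
activation {
    # "warn": log the device failure and leave repair to a manual
    # 'lvconvert --repair <vg>/<lv>' run by the administrator.
    # "allocate": let dmeventd replace the failed image automatically
    # using free extents elsewhere in the volume group.
    raid_fault_policy = "warn"
}
```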


# RAID 4 volume
[root@taft-02 bin]# lvs -a -o +devices
  /dev/sdh1: read failed after 0 of 512 at 145669554176: Input/output error
  /dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid n9wkln-uB6W-zKOy-AfIS-GnwJ-0en7-TRYaT8.
  LV                                      VG         Attr     LSize   Copy%  Devices
  synced_primary_raid4_2legs_1            black_bird rwi-aor- 504.00m        synced_primary_raid4_2legs_1_rimage_0(0),synced_primary_raid4_2legs_1_rimage_1(0),synced_primary_raid4_2legs_1_rimage_2(0)
  [synced_primary_raid4_2legs_1_rimage_0] black_bird iwi-aor- 252.00m        unknown device(1)
  [synced_primary_raid4_2legs_1_rimage_1] black_bird iwi-aor- 252.00m        /dev/sdd1(1)
  [synced_primary_raid4_2legs_1_rimage_2] black_bird iwi-aor- 252.00m        /dev/sdf1(1)
  [synced_primary_raid4_2legs_1_rmeta_0]  black_bird ewi-aor-   4.00m        unknown device(0)
  [synced_primary_raid4_2legs_1_rmeta_1]  black_bird ewi-aor-   4.00m        /dev/sdd1(0)
  [synced_primary_raid4_2legs_1_rmeta_2]  black_bird ewi-aor-   4.00m        /dev/sdf1(0)

[root@taft-02 bin]# lvconvert --repair black_bird/synced_primary_raid4_2legs_1
  /dev/sdh1: read failed after 0 of 512 at 145669554176: Input/output error
  /dev/sdh1: read failed after 0 of 512 at 145669664768: Input/output error
  /dev/sdh1: read failed after 0 of 512 at 0: Input/output error
  /dev/sdh1: read failed after 0 of 512 at 4096: Input/output error
  /dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid n9wkln-uB6W-zKOy-AfIS-GnwJ-0en7-TRYaT8.
  Can't repair non-mirrored LV "synced_primary_raid4_2legs_1".



Version-Release number of selected component (if applicable):
2.6.32-236.el6.x86_64

lvm2-2.02.92-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
lvm2-libs-2.02.92-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
lvm2-cluster-2.02.92-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
udev-147-2.40.el6    BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.71-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
device-mapper-libs-1.02.71-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
device-mapper-event-1.02.71-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
device-mapper-event-libs-1.02.71-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012
cmirror-2.02.92-0.40.el6    BUILT: Thu Feb 16 18:12:38 CST 2012


How reproducible:
Every time

Comment 1 Jonathan Earl Brassow 2012-02-23 04:09:53 UTC
Fix checked in upstream in 2.02.93.

I've confirmed the solution works with 'raid1' and 'raid5' for the "warn" and "allocate" policies.

Comment 4 Corey Marthaler 2012-03-26 23:42:21 UTC
Fix verified in the latest rpms.

2.6.32-251.el6.x86_64
lvm2-2.02.95-2.el6    BUILT: Fri Mar 16 08:39:54 CDT 2012
lvm2-libs-2.02.95-2.el6    BUILT: Fri Mar 16 08:39:54 CDT 2012
lvm2-cluster-2.02.95-2.el6    BUILT: Fri Mar 16 08:39:54 CDT 2012
udev-147-2.40.el6    BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.74-2.el6    BUILT: Fri Mar 16 08:39:54 CDT 2012
device-mapper-libs-1.02.74-2.el6    BUILT: Fri Mar 16 08:39:54 CDT 2012
device-mapper-event-1.02.74-2.el6    BUILT: Fri Mar 16 08:39:54 CDT 2012
device-mapper-event-libs-1.02.74-2.el6    BUILT: Fri Mar 16 08:39:54 CDT 2012
cmirror-2.02.95-2.el6    BUILT: Fri Mar 16 08:39:54 CDT 2012


[...]
Fault policy is warn, manually repairing failed raid volumes
taft-01: 'echo y | lvconvert --repair black_bird/synced_random_raid6_4legs_1'
  /dev/sde1: read failed after 0 of 512 at 145669554176: Input/output error
  /dev/sde1: read failed after 0 of 512 at 145669664768: Input/output error
  /dev/sde1: read failed after 0 of 512 at 0: Input/output error
  /dev/sde1: read failed after 0 of 512 at 4096: Input/output error
  /dev/sde1: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid DVXjHM-T5RK-34Uj-g103-OHy7-0Bg2-SKZd7g.
Attempt to replace failed RAID images (requires full device resync)? [y/n]: 
Waiting until all mirror|raid volumes become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )
[...]

Comment 5 Jonathan Earl Brassow 2012-04-23 18:29:35 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
New Feature to 6.3.  No documentation required.

Bug 732458 is the bug that requires a release note for the RAID features.  Other documentation is found in the LVM manual.

Operational bugs need no documentation because they are being fixed before their initial release.

Comment 7 errata-xmlrpc 2012-06-20 15:01:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0962.html

