Bug 801613

Summary: Failed to replace faulty raid devices: Reshaping arrays not yet supported
Product: Red Hat Enterprise Linux 6
Reporter: Corey Marthaler <cmarthal>
Component: lvm2
Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED WORKSFORME
QA Contact: Cluster QE <mspqa-list>
Severity: high
Priority: high
Version: 6.3
CC: agk, dwysocha, heinzm, jbrassow, mbroz, msnitzer, prajnoha, prockai, thornber, zkabelac
Target Milestone: rc
Keywords: TestBlocker
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2012-03-15 14:55:47 UTC

Description Corey Marthaler 2012-03-09 00:23:37 UTC
Description of problem:
I'm seeing this issue while running RAID image failure testing with the raid fault policy set to "allocate". This may be related to bug 801571.

Scenario kill_random_synced_raid5_3legs: Kill random leg of synced 3 leg raid5 volume(s)

********* RAID hash info for this scenario *********
* names:              synced_random_raid5_3legs_1
* sync:               1
* type:               raid5
* -m |-i value:       3
* leg devices:        /dev/sdg1 /dev/sdf1 /dev/sde1 /dev/sdc1
* failpv(s):          /dev/sdc1
* failnode(s):        taft-01
* raid fault policy:   allocate
******************************************************

Creating raids(s) on taft-01...
taft-01: lvcreate --type raid5 -i 3 -n synced_random_raid5_3legs_1 -L 500M black_bird /dev/sdg1:0-1000 /dev/sdf1:0-1000 /dev/sde1:0-1000 /dev/sdc1:0-1000

RAID Structure(s):
  LV                                     Attr     LSize   Copy%  Devices
  synced_random_raid5_3legs_1            rwi-a-r- 504.00m        synced_random_raid5_3legs_1_rimage_0(0),synced_random_raid5_3legs_1_rimage_1(0),synced_random_raid5_3legs_1_rimage_2(0),synced_random_raid5_3legs_1_rimage_3(0)
  [synced_random_raid5_3legs_1_rimage_0] Iwi-aor- 168.00m        /dev/sdg1(1)
  [synced_random_raid5_3legs_1_rimage_1] Iwi-aor- 168.00m        /dev/sdf1(1)
  [synced_random_raid5_3legs_1_rimage_2] Iwi-aor- 168.00m        /dev/sde1(1)
  [synced_random_raid5_3legs_1_rimage_3] Iwi-aor- 168.00m        /dev/sdc1(1)
  [synced_random_raid5_3legs_1_rmeta_0]  ewi-aor-   4.00m        /dev/sdg1(0)
  [synced_random_raid5_3legs_1_rmeta_1]  ewi-aor-   4.00m        /dev/sdf1(0)
  [synced_random_raid5_3legs_1_rmeta_2]  ewi-aor-   4.00m        /dev/sde1(0)
  [synced_random_raid5_3legs_1_rmeta_3]  ewi-aor-   4.00m        /dev/sdc1(0)
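
The 504.00m LV size reported for a 500M request follows from extent rounding. A minimal sketch of the arithmetic, assuming a 4 MiB extent size (consistent with the 4.00m rmeta sub-LVs above) and that the extent count is rounded up to a multiple of the stripe count; the function name raid5_layout is hypothetical and is not part of LVM:

```python
import math

def raid5_layout(request_mib, stripes, extent_mib=4):
    """Estimate LV and per-image sizes for a striped raid5 LV.

    Assumptions (not taken from LVM source): 4 MiB extents and
    rounding the extent count up to a multiple of the stripe count.
    """
    # Round the requested size up to whole extents.
    extents = math.ceil(request_mib / extent_mib)
    # Round up again so the data divides evenly across the stripes.
    extents = math.ceil(extents / stripes) * stripes
    per_image = extents // stripes
    return {
        "lv_size_mib": extents * extent_mib,
        "image_size_mib": per_image * extent_mib,
        # raid5 adds one image's worth of parity to the data stripes.
        "images": stripes + 1,
    }

layout = raid5_layout(500, 3)
print(layout)
```

Under these assumptions, "-L 500M -i 3" gives 126 extents (504 MiB) split into four 168 MiB images, which matches the lvs output above.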

PVS IN VG: /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
PV=/dev/sdc1
     synced_random_raid5_3legs_1_rimage_3: 1
     synced_random_raid5_3legs_1_rmeta_3: 1

Creating ext on top of mirror(s) on taft-01...
mke2fs 1.41.12 (17-May-2010)
Mounting mirrored ext filesystems on taft-01...

Writing verification files (checkit) to mirror(s) on...
     ---- taft-01 ----

Sleeping 10 seconds to get some outsanding EXT I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
     ---- taft-01 ----

Disabling device sdc on taft-01

Attempting I/O to cause mirror down conversion(s) on taft-01
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.536352 s, 78.2 MB/s

Verifying current sanity of lvm after the failure

RAID Structure(s):
  /dev/sdc1: read failed after 0 of 512 at 145669554176: Input/output error
  /dev/sdc1: read failed after 0 of 512 at 145669664768: Input/output error
  /dev/sdc1: read failed after 0 of 512 at 0: Input/output error
  /dev/sdc1: read failed after 0 of 512 at 4096: Input/output error
  /dev/sdc1: read failed after 0 of 2048 at 0: Input/output error
  Couldn't find device with uuid LOdwxl-cgjB-XCHH-e4j0-FNry-aZlG-D3Edvk.
  LV                                     Attr     LSize   Copy%  Devices
  synced_random_raid5_3legs_1            rwi-aor- 504.00m        synced_random_raid5_3legs_1_rimage_0(0),synced_random_raid5_3legs_1_rimage_1(0),synced_random_raid5_3legs_1_rimage_2(0),synced_random_raid5_3legs_1_rimage_3(0)
  [synced_random_raid5_3legs_1_rimage_0] iwi-aor- 168.00m        /dev/sdg1(1)
  [synced_random_raid5_3legs_1_rimage_1] iwi-aor- 168.00m        /dev/sdf1(1)
  [synced_random_raid5_3legs_1_rimage_2] iwi-aor- 168.00m        /dev/sde1(1)
  [synced_random_raid5_3legs_1_rimage_3] iwi-aor- 168.00m        unknown device(1)
  [synced_random_raid5_3legs_1_rmeta_0]  ewi-aor-   4.00m        /dev/sdg1(0)
  [synced_random_raid5_3legs_1_rmeta_1]  ewi-aor-   4.00m        /dev/sdf1(0)
  [synced_random_raid5_3legs_1_rmeta_2]  ewi-aor-   4.00m        /dev/sde1(0)
  [synced_random_raid5_3legs_1_rmeta_3]  ewi-aor-   4.00m        unknown device(0)

Verifying FAILED device /dev/sdc1 is *NOT* in the volume(s)
Verifying IMAGE device /dev/sdg1 *IS* in the volume(s)
Verifying IMAGE device /dev/sdf1 *IS* in the volume(s)
Verifying IMAGE device /dev/sde1 *IS* in the volume(s)
verify the rimage/rmeta dm devices remain after the failures
Checking EXISTENCE and STATE of synced_random_raid5_3legs_1_rimage_3 on:  taft-01
there should not be an 'unknown' device associated with synced_random_raid5_3legs_1_rimage_3 on taft-01
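
The failing check above amounts to looking for sub-LVs whose Devices column still reads "unknown device" after the repair attempt. A rough sketch of that kind of check, not the test harness's actual code; the function name sublvs_on_unknown_device is hypothetical and the sample rows are copied from the lvs output in this report:

```python
def sublvs_on_unknown_device(lvs_output):
    """Return names of hidden sub-LVs still mapped to an 'unknown device'."""
    missing = []
    for line in lvs_output.splitlines():
        fields = line.split()
        # Hidden sub-LVs are shown in square brackets by lvs -a.
        if len(fields) < 4 or not fields[0].startswith("["):
            continue
        if "unknown device" in line:
            missing.append(fields[0].strip("[]"))
    return missing

sample = """
  [synced_random_raid5_3legs_1_rimage_2] iwi-aor- 168.00m        /dev/sde1(1)
  [synced_random_raid5_3legs_1_rimage_3] iwi-aor- 168.00m        unknown device(1)
  [synced_random_raid5_3legs_1_rmeta_2]  ewi-aor-   4.00m        /dev/sde1(0)
  [synced_random_raid5_3legs_1_rmeta_3]  ewi-aor-   4.00m        unknown device(0)
"""

print(sublvs_on_unknown_device(sample))
```

With the "allocate" policy, a successful repair would be expected to leave this list empty, because the failed image would have been moved to a spare PV.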



Mar  8 17:51:20 taft-01 kernel: device-mapper: raid: Reshaping arrays not yet supported.
Mar  8 17:51:20 taft-01 kernel: device-mapper: table: 253:11: raid: Unable to assemble array: Invalid superblocks
Mar  8 17:51:20 taft-01 kernel: device-mapper: ioctl: error adding target to table
Mar  8 17:51:20 taft-01 lvm[3262]: device-mapper: reload ioctl on  failed: Invalid argument
Mar  8 17:51:20 taft-01 lvm[3262]: Failed to suspend black_bird/synced_random_raid5_3legs_1 before committing changes
Mar  8 17:51:20 taft-01 lvm[3262]: Failed to replace faulty devices in black_bird/synced_random_raid5_3legs_1.
Mar  8 17:51:20 taft-01 lvm[3262]: Repair of RAID device black_bird-synced_random_raid5_3legs_1 failed.
Mar  8 17:51:20 taft-01 lvm[3262]: Failed to process event for black_bird-synced_random_raid5_3legs_1
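
The signature of this failure is a kernel dm-raid rejection followed by an lvm "Failed to replace faulty devices" message. A minimal sketch for spotting that pairing in a syslog excerpt, assuming the classic "Mon DD HH:MM:SS host source: message" line format seen above; find_repair_failures and the marker constants are hypothetical names:

```python
import re

# Matches lines such as:
#   "Mar  8 17:51:20 taft-01 kernel: device-mapper: raid: ..."
SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<src>[^:\s]+): (?P<msg>.*)$"
)

KERNEL_MARK = "Reshaping arrays not yet supported"
LVM_MARK = "Failed to replace faulty devices"

def find_repair_failures(log_text):
    """Return vg/lv names whose repair failed after the kernel rejection."""
    kernel_rejected = False
    failed = []
    for line in log_text.splitlines():
        m = SYSLOG_RE.match(line)
        if not m:
            continue
        src, msg = m.group("src"), m.group("msg")
        if src == "kernel" and KERNEL_MARK in msg:
            kernel_rejected = True
        elif src.startswith("lvm[") and LVM_MARK in msg and kernel_rejected:
            # Message form: "Failed to replace faulty devices in <vg>/<lv>."
            failed.append(msg.partition(" in ")[2].rstrip("."))
    return failed

sample_log = """\
Mar  8 17:51:20 taft-01 kernel: device-mapper: raid: Reshaping arrays not yet supported.
Mar  8 17:51:20 taft-01 kernel: device-mapper: table: 253:11: raid: Unable to assemble array: Invalid superblocks
Mar  8 17:51:20 taft-01 lvm[3262]: Failed to replace faulty devices in black_bird/synced_random_raid5_3legs_1.
"""

print(find_repair_failures(sample_log))
```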


Version-Release number of selected component (if applicable):
2.6.32-220.4.2.el6.x86_64

lvm2-2.02.95-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
lvm2-libs-2.02.95-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
lvm2-cluster-2.02.95-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
udev-147-2.40.el6    BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.74-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
device-mapper-libs-1.02.74-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
device-mapper-event-1.02.74-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
device-mapper-event-libs-1.02.74-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
cmirror-2.02.95-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012


How reproducible:
Often

Comment 1 Corey Marthaler 2012-03-15 14:55:47 UTC
I have not been able to reproduce this since updating the kernel to the
version below. Closing; will reopen if it is seen again.

2.6.32-251.el6.x86_64
lvm2-2.02.95-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
lvm2-libs-2.02.95-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
lvm2-cluster-2.02.95-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
udev-147-2.40.el6    BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.74-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
device-mapper-libs-1.02.74-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
device-mapper-event-1.02.74-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
device-mapper-event-libs-1.02.74-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
cmirror-2.02.95-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
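
Comparing the version listing in the description with the one in comment 1 shows that only the kernel differs between the failing and non-failing runs. A small sketch of that comparison, using abridged copies of the two listings; the helper name version_lines is hypothetical:

```python
def version_lines(listing):
    """Drop the 'BUILT: ...' suffix and return the non-empty entries as a set."""
    return {
        line.split("BUILT:")[0].strip()
        for line in listing.splitlines()
        if line.strip()
    }

# Abridged from the description (failing run).
before = """
2.6.32-220.4.2.el6.x86_64
lvm2-2.02.95-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
udev-147-2.40.el6    BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.74-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
"""

# Abridged from comment 1 (run that no longer reproduces).
after = """
2.6.32-251.el6.x86_64
lvm2-2.02.95-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
udev-147-2.40.el6    BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.74-1.el6    BUILT: Tue Mar  6 10:00:33 CST 2012
"""

# Symmetric difference: entries present in exactly one listing.
changed = sorted(version_lines(before) ^ version_lines(after))
print(changed)
```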