Bug 1397589 - Raid 1/4/5/6 device failure repair regression (Unable to extract RAID image while RAID array is not in-sync)
Summary: Raid 1/4/5/6 device failure repair regression (Unable to extract RAID image while RAID array is not in-sync)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.8
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Heinz Mauelshagen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On: 1311765
Blocks:
 
Reported: 2016-11-22 21:56 UTC by Corey Marthaler
Modified: 2017-03-21 12:04 UTC
CC List: 7 users

Fixed In Version: lvm2-2.02.143-10.el6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-21 12:04:06 UTC
Target Upstream Version:
Embargoed:




Links
Red Hat Product Errata RHBA-2017:0798 (normal, SHIPPED_LIVE): lvm2 bug fix update, last updated 2017-03-21 12:51:51 UTC

Description Corey Marthaler 2016-11-22 21:56:08 UTC
Description of problem:
This may be related to the fix for bug 1311765. All raid 4/5/6 device failure repair attempts, whether automatic with an "allocate" fault policy or manual with a "warn" policy, fail with an "is not in-sync" error even though the raid volumes *are* in sync.
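For reference, a minimal sketch of the failure path, modeled on the test scenarios below (the VG name black_bird, the device names, and the way the leg is failed are all illustrative):

  # create a small raid5 LV and wait until it is fully synced
  lvcreate --type raid5 -i 2 -n repro_raid5 -L 500M black_bird
  lvs -a -o name,copy_percent black_bird        # wait for 100.00

  # fail one leg, e.g. by offlining its SCSI device (test setups only)
  echo offline > /sys/block/sdd/device/state

  # attempt the repair; on lvm2-2.02.143-9.el6 this is where
  # "Unable to extract RAID image while RAID array is not in-sync" appears
  lvconvert --yes --repair black_bird/repro_raid5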



# 6.8 - Works as expected

2.6.32-642.11.1.el6.x86_64

lvm2-2.02.143-7.el6_8.1    BUILT: Mon Aug 15 02:23:52 CDT 2016
lvm2-libs-2.02.143-7.el6_8.1    BUILT: Mon Aug 15 02:23:52 CDT 2016
lvm2-cluster-2.02.143-7.el6_8.1    BUILT: Mon Aug 15 02:23:52 CDT 2016
udev-147-2.73.el6_8.2    BUILT: Tue Aug 30 08:17:19 CDT 2016
device-mapper-1.02.117-7.el6_8.1    BUILT: Mon Aug 15 02:23:52 CDT 2016
device-mapper-libs-1.02.117-7.el6_8.1    BUILT: Mon Aug 15 02:23:52 CDT 2016
device-mapper-event-1.02.117-7.el6_8.1    BUILT: Mon Aug 15 02:23:52 CDT 2016
device-mapper-event-libs-1.02.117-7.el6_8.1    BUILT: Mon Aug 15 02:23:52 CDT 2016
device-mapper-persistent-data-0.6.2-0.1.rc7.el6    BUILT: Tue Mar 22 08:58:09 CDT 2016

[root@host-091 ~]# lvs -a -o +devices                                                                                                                                                                                                                               
  WARNING: Device for PV N0rLEq-khqm-yoiR-c2qo-4fRD-Dt5o-10RHoe not found or rejected by a filter.
  WARNING: Device for PV iCFGId-RcRi-3WEe-jwUN-5MyJ-lJeQ-xYxGgF not found or rejected by a filter.
  Couldn't find device for segment belonging to black_bird/synced_multiple_raid6_3legs_1_rimage_0 while checking used and assumed devices.
  LV                                       VG         Attr       LSize   Cpy%Sync Devices
  synced_multiple_raid6_3legs_1            black_bird rwi-aor-p- 504.00m 100.00   synced_multiple_raid6_3legs_1_rimage_0(0),synced_multiple_raid6_3legs_1_rimage_1(0),synced_multiple_raid6_3legs_1_rimage_2(0),synced_multiple_raid6_3legs_1_rimage_3(0),synced_multiple_raid6_3legs_1_rimage_4(0)
  [synced_multiple_raid6_3legs_1_rimage_0] black_bird iwi-aor-p- 168.00m          unknown device(1)
  [synced_multiple_raid6_3legs_1_rimage_1] black_bird iwi-aor--- 168.00m          /dev/sdh1(1)
  [synced_multiple_raid6_3legs_1_rimage_2] black_bird iwi-aor--- 168.00m          /dev/sdd1(1)
  [synced_multiple_raid6_3legs_1_rimage_3] black_bird iwi-aor-p- 168.00m          unknown device(1)
  [synced_multiple_raid6_3legs_1_rimage_4] black_bird iwi-aor--- 168.00m          /dev/sde1(1)
  [synced_multiple_raid6_3legs_1_rmeta_0]  black_bird ewi-aor-p-   4.00m          unknown device(0)
  [synced_multiple_raid6_3legs_1_rmeta_1]  black_bird ewi-aor---   4.00m          /dev/sdh1(0)
  [synced_multiple_raid6_3legs_1_rmeta_2]  black_bird ewi-aor---   4.00m          /dev/sdd1(0)
  [synced_multiple_raid6_3legs_1_rmeta_3]  black_bird ewi-aor-p-   4.00m          unknown device(0)
  [synced_multiple_raid6_3legs_1_rmeta_4]  black_bird ewi-aor---   4.00m          /dev/sde1(0)

[root@host-091 ~]# lvconvert --yes --repair black_bird/synced_multiple_raid6_3legs_1
  WARNING: Device for PV N0rLEq-khqm-yoiR-c2qo-4fRD-Dt5o-10RHoe not found or rejected by a filter.
  WARNING: Device for PV iCFGId-RcRi-3WEe-jwUN-5MyJ-lJeQ-xYxGgF not found or rejected by a filter.
  Couldn't find device for segment belonging to black_bird/synced_multiple_raid6_3legs_1_rimage_0 while checking used and assumed devices.
  Faulty devices in black_bird/synced_multiple_raid6_3legs_1 successfully replaced.

[root@host-091 ~]# lvs -a -o +devices
  WARNING: Device for PV N0rLEq-khqm-yoiR-c2qo-4fRD-Dt5o-10RHoe not found or rejected by a filter.
  WARNING: Device for PV iCFGId-RcRi-3WEe-jwUN-5MyJ-lJeQ-xYxGgF not found or rejected by a filter.
  LV                                       VG         Attr       LSize   Cpy%Sync Devices
  synced_multiple_raid6_3legs_1            black_bird rwi-aor--- 504.00m 100.00   synced_multiple_raid6_3legs_1_rimage_0(0),synced_multiple_raid6_3legs_1_rimage_1(0),synced_multiple_raid6_3legs_1_rimage_2(0),synced_multiple_raid6_3legs_1_rimage_3(0),synced_multiple_raid6_3legs_1_rimage_4(0)
  [synced_multiple_raid6_3legs_1_rimage_0] black_bird iwi-aor--- 168.00m          /dev/sdg1(1)
  [synced_multiple_raid6_3legs_1_rimage_1] black_bird iwi-aor--- 168.00m          /dev/sdh1(1)
  [synced_multiple_raid6_3legs_1_rimage_2] black_bird iwi-aor--- 168.00m          /dev/sdd1(1)
  [synced_multiple_raid6_3legs_1_rimage_3] black_bird iwi-aor--- 168.00m          /dev/sdf1(1)
  [synced_multiple_raid6_3legs_1_rimage_4] black_bird iwi-aor--- 168.00m          /dev/sde1(1)
  [synced_multiple_raid6_3legs_1_rmeta_0]  black_bird ewi-aor---   4.00m          /dev/sdg1(0)
  [synced_multiple_raid6_3legs_1_rmeta_1]  black_bird ewi-aor---   4.00m          /dev/sdh1(0)
  [synced_multiple_raid6_3legs_1_rmeta_2]  black_bird ewi-aor---   4.00m          /dev/sdd1(0)
  [synced_multiple_raid6_3legs_1_rmeta_3]  black_bird ewi-aor---   4.00m          /dev/sdf1(0)
  [synced_multiple_raid6_3legs_1_rmeta_4]  black_bird ewi-aor---   4.00m          /dev/sde1(0)





# 6.9 - No longer works

2.6.32-671.el6.x86_64

lvm2-2.02.143-9.el6    BUILT: Thu Nov 10 03:21:10 CST 2016
lvm2-libs-2.02.143-9.el6    BUILT: Thu Nov 10 03:21:10 CST 2016
lvm2-cluster-2.02.143-9.el6    BUILT: Thu Nov 10 03:21:10 CST 2016
udev-147-2.73.el6_8.2    BUILT: Tue Aug 30 08:17:19 CDT 2016
device-mapper-1.02.117-9.el6    BUILT: Thu Nov 10 03:21:10 CST 2016
device-mapper-libs-1.02.117-9.el6    BUILT: Thu Nov 10 03:21:10 CST 2016
device-mapper-event-1.02.117-9.el6    BUILT: Thu Nov 10 03:21:10 CST 2016
device-mapper-event-libs-1.02.117-9.el6    BUILT: Thu Nov 10 03:21:10 CST 2016
device-mapper-persistent-data-0.6.2-0.1.rc7.el6    BUILT: Tue Mar 22 08:58:09 CDT 2016


[root@host-078 ~]# lvs -a -o +devices
  WARNING: Device for PV gxCZNr-ihos-8uS8-72qS-tTPj-eTz3-BEZaYB not found or rejected by a filter.
  WARNING: Device for PV BAZkg2-VrsH-bSec-wTFV-jLD3-dXOu-gbd0Pi not found or rejected by a filter.
  Couldn't find device for segment belonging to black_bird/synced_multiple_raid6_3legs_1_rimage_1 while checking used and assumed devices.
  LV                                       VG         Attr       LSize   Cpy%Sync Devices
  synced_multiple_raid6_3legs_1            black_bird rwi-aor-p- 504.00m 100.00   synced_multiple_raid6_3legs_1_rimage_0(0),synced_multiple_raid6_3legs_1_rimage_1(0),synced_multiple_raid6_3legs_1_rimage_2(0),synced_multiple_raid6_3legs_1_rimage_3(0),synced_multiple_raid6_3legs_1_rimage_4(0)
  [synced_multiple_raid6_3legs_1_rimage_0] black_bird iwi-aor--- 168.00m          /dev/sdg1(1)
  [synced_multiple_raid6_3legs_1_rimage_1] black_bird iwi-aor-p- 168.00m          unknown device(1)
  [synced_multiple_raid6_3legs_1_rimage_2] black_bird iwi-aor-p- 168.00m          unknown device(1)
  [synced_multiple_raid6_3legs_1_rimage_3] black_bird iwi-aor--- 168.00m          /dev/sdb1(1)
  [synced_multiple_raid6_3legs_1_rimage_4] black_bird iwi-aor--- 168.00m          /dev/sdf1(1)
  [synced_multiple_raid6_3legs_1_rmeta_0]  black_bird ewi-aor---   4.00m          /dev/sdg1(0)
  [synced_multiple_raid6_3legs_1_rmeta_1]  black_bird ewi-aor-p-   4.00m          unknown device(0)
  [synced_multiple_raid6_3legs_1_rmeta_2]  black_bird ewi-aor-p-   4.00m          unknown device(0)
  [synced_multiple_raid6_3legs_1_rmeta_3]  black_bird ewi-aor---   4.00m          /dev/sdb1(0)
  [synced_multiple_raid6_3legs_1_rmeta_4]  black_bird ewi-aor---   4.00m          /dev/sdf1(0)

[root@host-078 ~]# lvconvert --yes --repair black_bird/synced_multiple_raid6_3legs_1
  WARNING: Device for PV gxCZNr-ihos-8uS8-72qS-tTPj-eTz3-BEZaYB not found or rejected by a filter.
  WARNING: Device for PV BAZkg2-VrsH-bSec-wTFV-jLD3-dXOu-gbd0Pi not found or rejected by a filter.
  Couldn't find device for segment belonging to black_bird/synced_multiple_raid6_3legs_1_rimage_1 while checking used and assumed devices.
  Unable to extract RAID image while RAID array is not in-sync
  Failed to remove the specified images from black_bird/synced_multiple_raid6_3legs_1
  Failed to replace faulty devices in black_bird/synced_multiple_raid6_3legs_1.

[root@host-078 ~]# lvs -a -o +devices
  WARNING: Device for PV gxCZNr-ihos-8uS8-72qS-tTPj-eTz3-BEZaYB not found or rejected by a filter.
  WARNING: Device for PV BAZkg2-VrsH-bSec-wTFV-jLD3-dXOu-gbd0Pi not found or rejected by a filter.
  Couldn't find device for segment belonging to black_bird/synced_multiple_raid6_3legs_1_rimage_1 while checking used and assumed devices.
  LV                                       VG         Attr       LSize   Cpy%Sync Devices
  synced_multiple_raid6_3legs_1            black_bird rwi-aor-p- 504.00m 100.00   synced_multiple_raid6_3legs_1_rimage_0(0),synced_multiple_raid6_3legs_1_rimage_1(0),synced_multiple_raid6_3legs_1_rimage_2(0),synced_multiple_raid6_3legs_1_rimage_3(0),synced_multiple_raid6_3legs_1_rimage_4(0)
  [synced_multiple_raid6_3legs_1_rimage_0] black_bird iwi-aor--- 168.00m          /dev/sdg1(1)
  [synced_multiple_raid6_3legs_1_rimage_1] black_bird iwi-aor-p- 168.00m          unknown device(1)
  [synced_multiple_raid6_3legs_1_rimage_2] black_bird iwi-aor-p- 168.00m          unknown device(1)
  [synced_multiple_raid6_3legs_1_rimage_3] black_bird iwi-aor--- 168.00m          /dev/sdb1(1)
  [synced_multiple_raid6_3legs_1_rimage_4] black_bird iwi-aor--- 168.00m          /dev/sdf1(1)
  [synced_multiple_raid6_3legs_1_rmeta_0]  black_bird ewi-aor---   4.00m          /dev/sdg1(0)
  [synced_multiple_raid6_3legs_1_rmeta_1]  black_bird ewi-aor-p-   4.00m          unknown device(0)
  [synced_multiple_raid6_3legs_1_rmeta_2]  black_bird ewi-aor-p-   4.00m          unknown device(0)
  [synced_multiple_raid6_3legs_1_rmeta_3]  black_bird ewi-aor---   4.00m          /dev/sdb1(0)
  [synced_multiple_raid6_3legs_1_rmeta_4]  black_bird ewi-aor---   4.00m          /dev/sdf1(0)

Comment 1 Corey Marthaler 2016-11-22 22:10:24 UTC
# 6.8 raid5 attempt

[root@host-091 ~]# lvs -a -o +devices
  WARNING: Device for PV h9L4KG-Rclw-ZDAd-nHxG-iLFN-YQkL-WmyADY not found or rejected by a filter.
  Couldn't find device for segment belonging to black_bird/synced_random_raid5_2legs_1_rimage_1 while checking used and assumed devices.
  LV                                     VG         Attr       LSize   Cpy%Sync Devices
  synced_random_raid5_2legs_1            black_bird rwi-aor-p- 504.00m 100.00   synced_random_raid5_2legs_1_rimage_0(0),synced_random_raid5_2legs_1_rimage_1(0),synced_random_raid5_2legs_1_rimage_2(0)
  [synced_random_raid5_2legs_1_rimage_0] black_bird iwi-aor--- 252.00m          /dev/sdc1(1)
  [synced_random_raid5_2legs_1_rimage_1] black_bird iwi-aor-p- 252.00m          unknown device(1)
  [synced_random_raid5_2legs_1_rimage_2] black_bird iwi-aor--- 252.00m          /dev/sdf1(1)
  [synced_random_raid5_2legs_1_rmeta_0]  black_bird ewi-aor---   4.00m          /dev/sdc1(0)
  [synced_random_raid5_2legs_1_rmeta_1]  black_bird ewi-aor-p-   4.00m          unknown device(0)
  [synced_random_raid5_2legs_1_rmeta_2]  black_bird ewi-aor---   4.00m          /dev/sdf1(0)

[root@host-091 ~]# lvconvert --yes --repair black_bird/synced_random_raid5_2legs_1
  WARNING: Device for PV h9L4KG-Rclw-ZDAd-nHxG-iLFN-YQkL-WmyADY not found or rejected by a filter.
  Couldn't find device for segment belonging to black_bird/synced_random_raid5_2legs_1_rimage_1 while checking used and assumed devices.
  Faulty devices in black_bird/synced_random_raid5_2legs_1 successfully replaced.



# 6.9 raid5 attempt

[root@host-078 ~]# lvs -a -o +devices
  WARNING: Device for PV zKjceH-1t0W-6r4f-Vi22-ssDp-a2B3-vxoBux not found or rejected by a filter.
  Couldn't find device for segment belonging to black_bird/synced_random_raid5_2legs_1_rimage_2 while checking used and assumed devices.
  LV                                     VG         Attr       LSize   Cpy%Sync Devices
  synced_random_raid5_2legs_1            black_bird rwi-aor-p- 504.00m 100.00   synced_random_raid5_2legs_1_rimage_0(0),synced_random_raid5_2legs_1_rimage_1(0),synced_random_raid5_2legs_1_rimage_2(0)
  [synced_random_raid5_2legs_1_rimage_0] black_bird iwi-aor--- 252.00m          /dev/sdc1(1)
  [synced_random_raid5_2legs_1_rimage_1] black_bird iwi-aor--- 252.00m          /dev/sdd1(1)
  [synced_random_raid5_2legs_1_rimage_2] black_bird iwi-aor-p- 252.00m          unknown device(1)
  [synced_random_raid5_2legs_1_rmeta_0]  black_bird ewi-aor---   4.00m          /dev/sdc1(0)
  [synced_random_raid5_2legs_1_rmeta_1]  black_bird ewi-aor---   4.00m          /dev/sdd1(0)
  [synced_random_raid5_2legs_1_rmeta_2]  black_bird ewi-aor-p-   4.00m          unknown device(0)

[root@host-078 ~]# lvconvert --yes --repair black_bird/synced_random_raid5_2legs_1
  WARNING: Device for PV zKjceH-1t0W-6r4f-Vi22-ssDp-a2B3-vxoBux not found or rejected by a filter.
  Couldn't find device for segment belonging to black_bird/synced_random_raid5_2legs_1_rimage_2 while checking used and assumed devices.
  Unable to extract RAID image while RAID array is not in-sync
  Failed to remove the specified images from black_bird/synced_random_raid5_2legs_1
  Failed to replace faulty devices in black_bird/synced_random_raid5_2legs_1.

Comment 2 Corey Marthaler 2016-11-23 00:06:27 UTC
Looks like this affects raid1 as well. This was a fully synced raid1 when the failure took place.

# allocation policy's automatic repair failed

Nov 22 17:52:37 host-078 lvm[1997]: Device #0 of raid1 array, black_bird-synced_primary_raid1_2legs_1, has failed.
Nov 22 17:52:37 host-078 lvm[1997]: WARNING: Device for PV 7QzEuP-y6sd-X0Nk-eqiI-uWco-R52P-l7NcAK not found or rejected by a filter.
Nov 22 17:52:37 host-078 lvm[1997]: Couldn't find device for segment belonging to black_bird/synced_primary_raid1_2legs_1_rimage_0 while checking used and assumed devices.
Nov 22 17:52:37 host-078 lvm[1997]: WARNING: Device for PV 7QzEuP-y6sd-X0Nk-eqiI-uWco-R52P-l7NcAK already missing, skipping.
Nov 22 17:52:37 host-078 lvm[1997]: WARNING: Device for PV 7QzEuP-y6sd-X0Nk-eqiI-uWco-R52P-l7NcAK not found or rejected by a filter.
Nov 22 17:52:37 host-078 lvm[1997]: Couldn't find device for segment belonging to black_bird/synced_primary_raid1_2legs_1_rimage_0 while checking used and assumed devices.
Nov 22 17:52:37 host-078 lvm[1997]: Unable to extract primary RAID image while RAID array is not in-sync (use --force option to replace)
Nov 22 17:52:37 host-078 lvm[1997]: Failed to remove the specified images from black_bird/synced_primary_raid1_2legs_1
Nov 22 17:52:37 host-078 lvm[1997]: Failed to replace faulty devices in black_bird/synced_primary_raid1_2legs_1.
Nov 22 17:52:37 host-078 lvm[1997]: Failed to process event for black_bird-synced_primary_raid1_2legs_1.


# the raid is in-sync
[root@host-078 ~]# lvs -a -o +devices
  WARNING: Device for PV 7QzEuP-y6sd-X0Nk-eqiI-uWco-R52P-l7NcAK not found or rejected by a filter.
  Couldn't find device for segment belonging to black_bird/synced_primary_raid1_2legs_1_rimage_0 while checking used and assumed devices.
  LV                                      VG         Attr       LSize   Cpy%Sync Devices
  synced_primary_raid1_2legs_1            black_bird rwi-aor-p- 500.00m 100.00   synced_primary_raid1_2legs_1_rimage_0(0),synced_primary_raid1_2legs_1_rimage_1(0),synced_primary_raid1_2legs_1_rimage_2(0)
  [synced_primary_raid1_2legs_1_rimage_0] black_bird iwi-aor-p- 500.00m          unknown device(1)
  [synced_primary_raid1_2legs_1_rimage_1] black_bird iwi-aor--- 500.00m          /dev/sdc1(1)
  [synced_primary_raid1_2legs_1_rimage_2] black_bird iwi-aor--- 500.00m          /dev/sdd1(1)
  [synced_primary_raid1_2legs_1_rmeta_0]  black_bird ewi-aor-p-   4.00m          unknown device(0)
  [synced_primary_raid1_2legs_1_rmeta_1]  black_bird ewi-aor---   4.00m          /dev/sdc1(0)
  [synced_primary_raid1_2legs_1_rmeta_2]  black_bird ewi-aor---   4.00m          /dev/sdd1(0)

[root@host-078 ~]# lvconvert --yes --repair black_bird/synced_primary_raid1_2legs_1
  WARNING: Device for PV 7QzEuP-y6sd-X0Nk-eqiI-uWco-R52P-l7NcAK not found or rejected by a filter.
  Couldn't find device for segment belonging to black_bird/synced_primary_raid1_2legs_1_rimage_0 while checking used and assumed devices.
  Unable to extract primary RAID image while RAID array is not in-sync (use --force option to replace)
  Failed to remove the specified images from black_bird/synced_primary_raid1_2legs_1
  Failed to replace faulty devices in black_bird/synced_primary_raid1_2legs_1.

# it now requires --force for all repairs?
[root@host-078 ~]# lvconvert --yes --force --repair black_bird/synced_primary_raid1_2legs_1
  WARNING: Device for PV 7QzEuP-y6sd-X0Nk-eqiI-uWco-R52P-l7NcAK not found or rejected by a filter.
  Couldn't find device for segment belonging to black_bird/synced_primary_raid1_2legs_1_rimage_0 while checking used and assumed devices.
  Faulty devices in black_bird/synced_primary_raid1_2legs_1 successfully replaced.

Comment 4 Zdenek Kabelac 2016-11-23 15:17:50 UTC
The issue here with raid1 is that we do not allow repair of a failed primary leg.

The reason is that with mdraid we cannot tell the difference between a failure before and a failure after the initial synchronization; the md raid kernel code does not provide this information, and lvm2 does not yet store it in the lvm2 metadata.

So, unfortunately, the user needs to use the '--force' option to repair a raid with a failed primary leg.

There is no way for dmeventd to apply the 'allocate' policy, since it does not use the --force option.

For now it is up to the user to decide (by looking at the kernel message log) whether the initial sync completed and the 'other legs' are safe to use, or whether they may still contain junk; that is why '--force' with 'lvconvert --repair' is needed ATM.
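In practice that manual decision looks something like the sketch below (LV names taken from the transcripts above; the kernel-log check is the judgment call the admin has to make):

  # confirm the array reports 100% synced and inspect the kernel log
  # for the device failure and the state of the initial resync
  lvs -a -o name,attr,copy_percent black_bird
  dmesg | grep -i raid

  # if the initial synchronization had completed before the leg failed,
  # force the repair; this is the step dmeventd cannot perform itself
  lvconvert --yes --force --repair black_bird/synced_primary_raid1_2legs_1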

Comment 5 Corey Marthaler 2016-11-23 18:08:29 UTC
This is quite the change in behavior. This is something that "just worked" throughout lvm raid history, and now it no longer will in 7.4 and 6.9 going forward?

I understand your argument that it may have never "just worked" properly, but this will require documentation to let users know that the allocate fault policies will no longer automatically repair raids when an in-sync primary leg fails, even though they did in every prior release.
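For reference, the fault policy under discussion is configured in lvm.conf; a typical stanza looks like this (the comments are mine, summarizing the behavior described in this bug):

  # /etc/lvm/lvm.conf
  activation {
      # "warn" only logs the failure and leaves repair to the admin;
      # "allocate" lets dmeventd replace the failed device automatically,
      # which for a failed raid1 primary leg now stops at the --force
      # requirement described in comment #4
      raid_fault_policy = "allocate"
  }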

Comment 6 Heinz Mauelshagen 2016-11-23 22:23:43 UTC
Only raid1 should be restricted to reject repair unless --force is provided, for the reasons given in comment #4; raid4/5/6/10 should not be.

Comment 7 Heinz Mauelshagen 2016-11-23 22:44:30 UTC
Upstream commit e611f82a11fb wasn't included in the lvm2-2.02.143-9.el6 build, so the check was not restricted to raid1.

Comment 9 Corey Marthaler 2016-11-30 23:31:08 UTC
The three scenarios listed in comments #0, #1, #2 now work as expected. That is, raid5 and raid6 are back to their normal behavior, and raid1 in 6.9 now requires a --force in order to repair a primary leg failure regardless of sync status.

2.6.32-671.el6.x86_64
lvm2-2.02.143-10.el6    BUILT: Thu Nov 24 03:58:43 CST 2016
lvm2-libs-2.02.143-10.el6    BUILT: Thu Nov 24 03:58:43 CST 2016
lvm2-cluster-2.02.143-10.el6    BUILT: Thu Nov 24 03:58:43 CST 2016
udev-147-2.73.el6_8.2    BUILT: Tue Aug 30 08:17:19 CDT 2016
device-mapper-1.02.117-10.el6    BUILT: Thu Nov 24 03:58:43 CST 2016
device-mapper-libs-1.02.117-10.el6    BUILT: Thu Nov 24 03:58:43 CST 2016
device-mapper-event-1.02.117-10.el6    BUILT: Thu Nov 24 03:58:43 CST 2016
device-mapper-event-libs-1.02.117-10.el6    BUILT: Thu Nov 24 03:58:43 CST 2016



# raid6  

Scenario kill_multiple_synced_raid6_3legs: Kill multiple legs of synced 3 leg raid6 volume(s)

********* RAID hash info for this scenario *********
* names:              synced_multiple_raid6_3legs_1
* sync:               1
* type:               raid6
* -m |-i value:       3
* leg devices:        /dev/mapper/mpathap1 /dev/mapper/mpathfp1 /dev/mapper/mpathhp1 /dev/mapper/mpathdp1 /dev/mapper/mpathep1
* spanned legs:       0
* manual repair:      0
* failpv(s):          /dev/mapper/mpathdp1 /dev/mapper/mpathep1
* failnode(s):        taft-04
* lvmetad:            0
* raid fault policy:  warn
******************************************************

Creating raids(s) on taft-04...
taft-04: lvcreate --type raid6 -i 3 -n synced_multiple_raid6_3legs_1 -L 500M black_bird /dev/mapper/mpathap1:0-2400 /dev/mapper/mpathfp1:0-2400 /dev/mapper/mpathhp1:0-2400 /dev/mapper/mpathdp1:0-2400 /dev/mapper/mpathep1:0-2400
[...]
Fault policy is warn... Manually repairing failed raid volumes
taft-04: 'lvconvert --yes --repair black_bird/synced_multiple_raid6_3legs_1'
  Couldn't find device with uuid 2xpwZb-6teA-p2q1-Ghnh-j62n-Y6Xz-swD5WS.
  Couldn't find device with uuid aRC94g-llmD-ceN2-5GOS-IsGB-Ecaf-k6cLJc.
  Couldn't find device for segment belonging to black_bird/synced_multiple_raid6_3legs_1_rimage_3 while checking used and assumed devices.
Waiting until all mirror|raid volumes become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )



# raid5

Scenario kill_random_synced_raid5_2legs: Kill random leg of synced 2 leg raid5 volume(s)

********* RAID hash info for this scenario *********
* names:              synced_random_raid5_2legs_1
* sync:               1
* type:               raid5
* -m |-i value:       2
* leg devices:        /dev/sde1 /dev/sdh1 /dev/sda1
* spanned legs:       0
* manual repair:      0
* failpv(s):          /dev/sdh1
* failnode(s):        host-076
* lvmetad:            0
* raid fault policy:  warn
******************************************************

Creating raids(s) on host-076...
host-076: lvcreate --type raid5 -i 2 -n synced_random_raid5_2legs_1 -L 500M black_bird /dev/sde1:0-2400 /dev/sdh1:0-2400 /dev/sda1:0-2400
[...]
Fault policy is warn... Manually repairing failed raid volumes
host-076: 'lvconvert --yes --repair black_bird/synced_random_raid5_2legs_1'
  /dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdh1: read failed after 0 of 512 at 21467824128: Input/output error
  /dev/sdh1: read failed after 0 of 512 at 21467938816: Input/output error
  /dev/sdh1: read failed after 0 of 512 at 0: Input/output error
  /dev/sdh1: read failed after 0 of 512 at 4096: Input/output error
  Couldn't find device with uuid GlIYa2-2lz1-9R9M-hQls-B0Hd-Os1z-2kSXQy.
  Couldn't find device for segment belonging to black_bird/synced_random_raid5_2legs_1_rimage_1 while checking used and assumed devices.
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 84.97% )
   1/1 mirror(s) are fully synced: ( 100.00% )



# raid1

Scenario kill_primary_synced_raid1_2legs: Kill primary leg of synced 2 leg raid1 volume(s)

********* RAID hash info for this scenario *********
* names:              synced_primary_raid1_2legs_1
* sync:               1
* type:               raid1
* -m |-i value:       2
* leg devices:        /dev/mapper/mpathhp1 /dev/mapper/mpathcp1 /dev/mapper/mpathfp1
* spanned legs:       0
* manual repair:      1
* failpv(s):          /dev/mapper/mpathhp1
* additional snap:    /dev/mapper/mpathcp1
* failnode(s):        taft-04
* lvmetad:            0
* raid fault policy:  allocate
******************************************************

Creating raids(s) on taft-04...
taft-04: lvcreate --type raid1 -m 2 -n synced_primary_raid1_2legs_1 -L 500M black_bird /dev/mapper/mpathhp1:0-2400 /dev/mapper/mpathcp1:0-2400 /dev/mapper/mpathfp1:0-2400
[...]
Manually repairing failed raid volumes
(but first, verify that a non-force repair attempt fails, check for bug 1311765)
  Couldn't find device for segment belonging to black_bird/synced_primary_raid1_2legs_1_rimage_0 while checking used and assumed devices.
  Unable to extract primary RAID image while RAID array is not in-sync (use --force option to replace)
  Failed to remove the specified images from black_bird/synced_primary_raid1_2legs_1
  Failed to replace faulty devices in black_bird/synced_primary_raid1_2legs_1.
taft-04: 'lvconvert --force --yes --repair black_bird/synced_primary_raid1_2legs_1'
  Couldn't find device for segment belonging to black_bird/synced_primary_raid1_2legs_1_rimage_0 while checking used and assumed devices.
Waiting until all mirror|raid volumes become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )

Comment 11 errata-xmlrpc 2017-03-21 12:04:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0798.html

