Bug 1461562 - RAID RESHAPE: Reshape request failed on exclusive raid on clustered VG (md: pers->run() failed)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Heinz Mauelshagen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On: 1448116
Blocks:
Reported: 2017-06-14 19:22 UTC by Corey Marthaler
Modified: 2021-09-03 12:38 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-28 16:56:18 UTC
Target Upstream Version:
Embargoed:


Attachments
verbose lvconvert attempt (287.19 KB, text/plain), 2017-06-14 20:01 UTC, Corey Marthaler

Description Corey Marthaler 2017-06-14 19:22:41 UTC
Description of problem:
This appears very similar to bug 1448116. Unlike bug 1448116, however, the "takeover" operation passed and it was the reshape image addition that failed, without causing a deadlock. I'll attempt to reproduce and provide verbose output from the lvconvert command. Feel free to mark this as a duplicate of bug 1448116 if that appears to be the case.


3.10.0-681.el7.bz1443999a.x86_64

lvm2-2.02.171-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
lvm2-libs-2.02.171-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
lvm2-cluster-2.02.171-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-1.02.140-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-libs-1.02.140-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-event-1.02.140-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-event-libs-1.02.140-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017
cmirror-2.02.171-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017



================================================================================
Iteration 0.2 started at Wed Jun 14 13:44:43 CDT 2017
================================================================================
Scenario raid5_ra: Convert Striped raid5_ra volume
********* Take over hash info for this scenario *********
* from type:    raid5_ra
* to type:      raid6_ra_6
* from legs:    4
* to legs:      5
* from region:  8192.00k
* to region:    1024.00k
* contiguous:   0
* snapshot:     0
******************************************************

Creating original volume on harding-03...
harding-03: lvcreate -aye --type raid5_ra -R 8192.00k -i 4 -n takeover -L 4G centipede2
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 44.89% )
   0/1 mirror(s) are fully synced: ( 82.78% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec

Placing a spacer on all raid image PVs so that expansion will have to be placed beyond
Extending raid beyond spacer
        lvextend -L +50M centipede2/takeover

Current volume device structure:
  LV                  Attr       LSize    Cpy%Sync Devices
  lvol0               -wi-a-----   20.00m          /dev/mapper/mpatha1(257)
  lvol1               -wi-a-----   20.00m          /dev/mapper/mpatha1(262)
  lvol2               -wi-a-----   20.00m          /dev/mapper/mpathb1(257)
  lvol3               -wi-a-----   20.00m          /dev/mapper/mpathb1(262)
  lvol4               -wi-a-----   20.00m          /dev/mapper/mpathc1(257)
  lvol5               -wi-a-----   20.00m          /dev/mapper/mpathc1(262)
  lvol6               -wi-a-----   20.00m          /dev/mapper/mpathd1(257)
  lvol7               -wi-a-----   20.00m          /dev/mapper/mpathd1(262)
  lvol8               -wi-a-----   20.00m          /dev/mapper/mpathe1(257)
  lvol9               -wi-a-----   20.00m          /dev/mapper/mpathe1(262)
  takeover            rwi-a-r---    4.06g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0)
  [takeover_rimage_0] iwi-aor---   <1.02g          /dev/mapper/mpatha1(1)
  [takeover_rimage_0] iwi-aor---   <1.02g          /dev/mapper/mpatha1(267)
  [takeover_rimage_1] iwi-aor---   <1.02g          /dev/mapper/mpathb1(1)
  [takeover_rimage_1] iwi-aor---   <1.02g          /dev/mapper/mpathb1(267)
  [takeover_rimage_2] iwi-aor---   <1.02g          /dev/mapper/mpathc1(1)
  [takeover_rimage_2] iwi-aor---   <1.02g          /dev/mapper/mpathc1(267)
  [takeover_rimage_3] iwi-aor---   <1.02g          /dev/mapper/mpathd1(1)
  [takeover_rimage_3] iwi-aor---   <1.02g          /dev/mapper/mpathd1(267)
  [takeover_rimage_4] iwi-aor---   <1.02g          /dev/mapper/mpathe1(1)
  [takeover_rimage_4] iwi-aor---   <1.02g          /dev/mapper/mpathe1(267)
  [takeover_rmeta_0]  ewi-aor---    4.00m          /dev/mapper/mpatha1(0)
  [takeover_rmeta_1]  ewi-aor---    4.00m          /dev/mapper/mpathb1(0)
  [takeover_rmeta_2]  ewi-aor---    4.00m          /dev/mapper/mpathc1(0)
  [takeover_rmeta_3]  ewi-aor---    4.00m          /dev/mapper/mpathd1(0)
  [takeover_rmeta_4]  ewi-aor---    4.00m          /dev/mapper/mpathe1(0)

Creating xfs on top of mirror(s) on harding-03...
Mounting mirrored xfs filesystems on harding-03...

Writing verification files (checkit) to mirror(s) on...
        ---- harding-03 ----

Sleeping 15 seconds to get some outsanding I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- harding-03 ----

TAKEOVER: lvconvert --yes -R 1024.00k  --type raid6_ra_6 centipede2/takeover
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 27.27% )
   0/1 mirror(s) are fully synced: ( 54.59% )
   0/1 mirror(s) are fully synced: ( 80.40% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec

Current volume device structure:
  LV                  Attr       LSize    Cpy%Sync Devices
  lvol0               -wi-a-----   20.00m          /dev/mapper/mpatha1(257)
  lvol1               -wi-a-----   20.00m          /dev/mapper/mpatha1(262)
  lvol2               -wi-a-----   20.00m          /dev/mapper/mpathb1(257)
  lvol3               -wi-a-----   20.00m          /dev/mapper/mpathb1(262)
  lvol4               -wi-a-----   20.00m          /dev/mapper/mpathc1(257)
  lvol5               -wi-a-----   20.00m          /dev/mapper/mpathc1(262)
  lvol6               -wi-a-----   20.00m          /dev/mapper/mpathd1(257)
  lvol7               -wi-a-----   20.00m          /dev/mapper/mpathd1(262)
  lvol8               -wi-a-----   20.00m          /dev/mapper/mpathe1(257)
  lvol9               -wi-a-----   20.00m          /dev/mapper/mpathe1(262)
  takeover            rwi-aor---    4.06g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0),takeover_rimage_5(0)
  [takeover_rimage_0] iwi-aor---   <1.02g          /dev/mapper/mpatha1(1)
  [takeover_rimage_0] iwi-aor---   <1.02g          /dev/mapper/mpatha1(267)
  [takeover_rimage_1] iwi-aor---   <1.02g          /dev/mapper/mpathb1(1)
  [takeover_rimage_1] iwi-aor---   <1.02g          /dev/mapper/mpathb1(267)
  [takeover_rimage_2] iwi-aor---   <1.02g          /dev/mapper/mpathc1(1)
  [takeover_rimage_2] iwi-aor---   <1.02g          /dev/mapper/mpathc1(267)
  [takeover_rimage_3] iwi-aor---   <1.02g          /dev/mapper/mpathd1(1)
  [takeover_rimage_3] iwi-aor---   <1.02g          /dev/mapper/mpathd1(267)
  [takeover_rimage_4] iwi-aor---   <1.02g          /dev/mapper/mpathe1(1)
  [takeover_rimage_4] iwi-aor---   <1.02g          /dev/mapper/mpathe1(267)
  [takeover_rimage_5] iwi-aor---   <1.02g          /dev/mapper/mpathf1(1)
  [takeover_rmeta_0]  ewi-aor---    4.00m          /dev/mapper/mpatha1(0)
  [takeover_rmeta_1]  ewi-aor---    4.00m          /dev/mapper/mpathb1(0)
  [takeover_rmeta_2]  ewi-aor---    4.00m          /dev/mapper/mpathc1(0)
  [takeover_rmeta_3]  ewi-aor---    4.00m          /dev/mapper/mpathd1(0)
  [takeover_rmeta_4]  ewi-aor---    4.00m          /dev/mapper/mpathe1(0)
  [takeover_rmeta_5]  ewi-aor---    4.00m          /dev/mapper/mpathf1(0)

Verifying files (checkit) on mirror(s) on...
        ---- harding-03 ----

RESHAPE: lvconvert --yes  --stripes 5 centipede2/takeover
  WARNING: Adding stripes to active and open logical volume centipede2/takeover will grow it from 1040 to 1300 extents!
  Error locking on node 2: device-mapper: reload ioctl on  (253:29) failed: Invalid argument
  Failed to lock logical volume centipede2/takeover.
  Internal error: Update of LV centipede2/takeover failed.
  Reshape request failed on LV centipede2/takeover.
couldn't reshape volume



Jun 14 13:47:18 harding-03 qarshd[8448]: Running cmdline: lvconvert --yes --stripes 5 centipede2/takeover
Jun 14 13:47:19 harding-03 multipathd: dm-42: remove map (uevent)
Jun 14 13:47:19 harding-03 multipathd: dm-42: devmap not registered, can't remove
Jun 14 13:47:19 harding-03 multipathd: dm-42: remove map (uevent)
Jun 14 13:47:19 harding-03 kernel: md/raid:mdX: device dm-20 operational as raid disk 0
Jun 14 13:47:19 harding-03 kernel: md/raid:mdX: device dm-22 operational as raid disk 1
Jun 14 13:47:19 harding-03 kernel: md/raid:mdX: device dm-24 operational as raid disk 2
Jun 14 13:47:19 harding-03 kernel: md/raid:mdX: device dm-26 operational as raid disk 3
Jun 14 13:47:19 harding-03 kernel: md/raid:mdX: device dm-28 operational as raid disk 4
Jun 14 13:47:19 harding-03 kernel: md/raid:mdX: device dm-41 operational as raid disk 5
Jun 14 13:47:19 harding-03 kernel: md/raid:mdX: raid level 6 active with 6 out of 6 devices, algorithm 17
Jun 14 13:47:19 harding-03 dmeventd[7295]: No longer monitoring RAID device centipede2-takeover for events.
Jun 14 13:47:19 harding-03 kernel: dm-29: detected capacity change from 5452595200 to 4362076160
Jun 14 13:47:19 harding-03 kernel: VFS: busy inodes on changed media or resized disk dm-29
Jun 14 13:47:19 harding-03 lvm[7295]: Monitoring RAID device centipede2-takeover for events.
Jun 14 13:47:19 harding-03 kernel: md/raid:mdX: reshape_position too early for auto-recovery - aborting.
Jun 14 13:47:19 harding-03 kernel: md: pers->run() failed ...
Jun 14 13:47:19 harding-03 kernel: device-mapper: table: 253:29: raid: Failed to run raid array
Jun 14 13:47:19 harding-03 kernel: device-mapper: ioctl: error adding target to table
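
The extent counts in the lvconvert warning and the byte counts in the dm-29 "capacity change" line can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming 4 MiB extents (the LVM default; the report does not state the extent size) and assuming the raid6 LV with 6 images has 4 data stripes before the reshape and 5 after. The helper names are made up for illustration.

```python
# Assumption: 4 MiB extents (LVM default); not stated in the report.
EXTENT_BYTES = 4 * 1024 * 1024


def extents_to_bytes(extents: int) -> int:
    """Convert a logical extent count to a size in bytes."""
    return extents * EXTENT_BYTES


def reshaped_extents(current: int, old_stripes: int, new_stripes: int) -> int:
    """Extent count after changing the number of data stripes,
    keeping the per-stripe size unchanged."""
    per_stripe, remainder = divmod(current, old_stripes)
    if remainder:
        raise ValueError("extent count is not a multiple of the stripe count")
    return per_stripe * new_stripes


before = 1040                              # extents, from the lvconvert warning
after = reshaped_extents(before, 4, 5)     # 4 -> 5 data stripes (assumed)

print(after)                     # 1300
print(extents_to_bytes(after))   # 5452595200
print(extents_to_bytes(before))  # 4362076160
```

Under these assumptions, the two sizes in the kernel message (5452595200 and 4362076160 bytes) correspond to the 1300-extent and 1040-extent sizes named in the lvconvert warning.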



Comment 2 Corey Marthaler 2017-06-14 20:01:30 UTC
Created attachment 1287808
verbose lvconvert attempt

Comment 4 Heinz Mauelshagen 2017-06-19 16:27:06 UTC
This is another effect of growing the rimages and reordering their address space in one step rather than two (bz1447812 is another one). In the clustered VG case, the grown size of the rimage LVs is not propagated properly, causing the raid personality function to fail the respective validation.

We have to restrict reshaping on clustered LVs for the time being, until a fix is properly designed, implemented and tested.

Comment 5 Jonathan Earl Brassow 2017-06-19 17:49:56 UTC
Disallowing reshape/takeover while LV is in cluster VG until future release
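
For scripts that drive lvconvert, a similar guard can be applied on the caller's side. The sketch below is not the lvm2 change itself (lvm2 is written in C and its actual check is not shown in this report); it only illustrates the idea, assuming the VG attribute string is obtained separately, for example from `vgs -o vg_attr`, where the sixth character is "c" for a clustered VG. The function names are hypothetical.

```python
def vg_is_clustered(vg_attr: str) -> bool:
    """Return True if the vg_attr string marks the VG as clustered.

    Assumes the vgs attribute layout in which the sixth
    character is 'c' for a clustered VG.
    """
    return len(vg_attr) >= 6 and vg_attr[5] == "c"


def reshape_allowed(vg_attr: str) -> bool:
    """Hypothetical caller-side policy: skip reshape/takeover on clustered VGs."""
    return not vg_is_clustered(vg_attr)


# Example attribute strings (illustrative values, not taken from this report).
print(reshape_allowed("wz--nc"))  # False: clustered VG
print(reshape_allowed("wz--n-"))  # True: non-clustered VG
```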

Comment 6 Heinz Mauelshagen 2017-11-09 13:44:57 UTC
(In reply to Jonathan Earl Brassow from comment #5)
> Disallowing reshape/takeover while LV is in cluster VG until future release

Fixed/reenabled as of bz1448116

Comment 7 Heinz Mauelshagen 2017-11-28 16:56:18 UTC
Verified in the context of https://bugzilla.redhat.com/show_bug.cgi?id=1448116#c18

3.10.0-794.el7.x86_64

lvm2-2.02.176-4.el7    BUILT: Wed Nov 15 04:21:19 CST 2017
lvm2-libs-2.02.176-4.el7    BUILT: Wed Nov 15 04:21:19 CST 2017
lvm2-cluster-2.02.176-4.el7    BUILT: Wed Nov 15 04:21:19 CST 2017
lvm2-lockd-2.02.176-4.el7    BUILT: Wed Nov 15 04:21:19 CST 2017
lvm2-python-boom-0.8-4.el7    BUILT: Wed Nov 15 04:23:09 CST 2017
cmirror-2.02.176-4.el7    BUILT: Wed Nov 15 04:21:19 CST 2017
device-mapper-1.02.145-4.el7    BUILT: Wed Nov 15 04:21:19 CST 2017
device-mapper-libs-1.02.145-4.el7    BUILT: Wed Nov 15 04:21:19 CST 2017
device-mapper-event-1.02.145-4.el7    BUILT: Wed Nov 15 04:21:19 CST 2017
device-mapper-event-libs-1.02.145-4.el7    BUILT: Wed Nov 15 04:21:19 CST 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017

