Bug 1549272 - "Failed to lock logical volume" during raid scrub check after partial activation
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.5
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Heinz Mauelshagen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1886597
 
Reported: 2018-02-26 21:00 UTC by Corey Marthaler
Modified: 2021-09-03 12:37 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Cloned to: 1886597
Environment:
Last Closed: 2021-02-15 07:35:28 UTC
Target Upstream Version:
Embargoed:


Attachments
verbose lvchange attempt (52.34 KB, text/plain), 2018-02-26 21:10 UTC, Corey Marthaler

Description Corey Marthaler 2018-02-26 21:00:05 UTC
Description of problem:

Scenario leading up to the scrub attempt:

SCENARIO (raid1) - [vgcfgrestore_raid_with_missing_pv]

Create a raid, force remove a leg, and then restore its VG
host-083: lvcreate  --nosync --type raid1 -m 1 -n missing_pv_raid -L 100M raid_sanity
  WARNING: New raid1 won't be synchronised. Don't read what you didn't write!
Deactivating missing_pv_raid raid

Backup the VG config
host-083 vgcfgbackup -f /tmp/raid_sanity.bkup.10114 raid_sanity

Force removing PV /dev/sda1 (used in this raid)
host-083: 'pvremove -ff --yes /dev/sda1'
  WARNING: PV /dev/sda1 is used by VG raid_sanity.
  WARNING: Wiping physical volume label from /dev/sda1 of volume group "raid_sanity".

Verifying that this VG is now corrupt
  WARNING: Device for PV 5PJqhe-7OLo-1iiv-3VDw-btRs-JmFx-GswaZn not found or rejected by a filter.
  Failed to find physical volume "/dev/sda1".

Attempt to restore the VG back to its original state (should not segfault; BZ 1348327)
host-083 vgcfgrestore -f /tmp/raid_sanity.bkup.10114 raid_sanity
  Couldn't find device with uuid 5PJqhe-7OLo-1iiv-3VDw-btRs-JmFx-GswaZn.
  Cannot restore Volume Group raid_sanity with 1 PVs marked as missing.
  Restore failed.
Checking syslog to see if vgcfgrestore segfaulted

Activating VG in partial readonly mode
host-083 vgchange -ay --partial raid_sanity
  PARTIAL MODE. Incomplete logical volumes will be processed.
  WARNING: Device for PV 5PJqhe-7OLo-1iiv-3VDw-btRs-JmFx-GswaZn not found or rejected by a filter.

Recreating PV using its old uuid
host-083 pvcreate --norestorefile --uuid "5PJqhe-7OLo-1iiv-3VDw-btRs-JmFx-GswaZn" /dev/sda1
  WARNING: Device for PV 5PJqhe-7OLo-1iiv-3VDw-btRs-JmFx-GswaZn not found or rejected by a filter.
Restoring the VG back to its original state
host-083 vgcfgrestore -f /tmp/raid_sanity.bkup.10114 raid_sanity
Reactivating VG

[root@host-083 ~]# lvs -a -o +devices
  LV                         VG          Attr       LSize    Cpy%Sync Devices
  missing_pv_raid            raid_sanity Rwi-a-r-r- 100.00m  100.00   missing_pv_raid_rimage_0(0),missing_pv_raid_rimage_1(0)
  [missing_pv_raid_rimage_0] raid_sanity Iwi-aor-r- 100.00m           /dev/sda1(1)
  [missing_pv_raid_rimage_1] raid_sanity iwi-aor--- 100.00m           /dev/sdb1(1)
  [missing_pv_raid_rmeta_0]  raid_sanity ewi-aor-r-   4.00m           /dev/sda1(0)
  [missing_pv_raid_rmeta_1]  raid_sanity ewi-aor---   4.00m           /dev/sdb1(0)
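
Note the trailing "r" in the ninth attr character (volume health, "refresh needed") on the top-level raid LV and on the sub-LVs that sit on the recreated /dev/sda1. A sketch of how to query this directly, using standard lvm2 report fields:

# The health character of lv_attr ("r" = refresh needed) is also
# exposed as a named report field:
lvs -a -o lv_name,lv_attr,lv_health_status raid_sanity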

[root@host-083 ~]# lvchange --syncaction repair raid_sanity/missing_pv_raid
[root@host-083 ~]# echo $?
0
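
Note that the exit status is 0 even though the repair fails: the command only submits the sync action, and the actual repair attempt is made asynchronously by dmeventd (see the log excerpt below). A sketch of how to check the real outcome afterwards, assuming standard lvm2 report fields and the "lvm" syslog tag seen in the messages below:

# The scrub runs in the kernel and any repair is driven by dmeventd,
# so the exit status above only reflects submitting the request.
lvs -a -o lv_name,raid_sync_action,raid_mismatch_count,copy_percent raid_sanity
# dmeventd logs the failure via syslog (tag "lvm"):
journalctl -t lvm --since "10 minutes ago"    # or: grep 'lvm\[' /var/log/messages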

perform raid scrubbing (lvchange --syncaction repair) on raid raid_sanity/missing_pv_raid

[422860.991394] md: requested-resync of RAID array mdX
Feb 26 14:27:12 host-083 kernel: md: requested-resync of RAID array mdX
[422861.004645] md: mdX: requested-resync done.
Feb 26 14:27:12 host-083 kernel: md: mdX: requested-resync done.
Feb 26 14:27:12 host-083 lvm[22228]: WARNING: Device #0 of raid1 array, raid_sanity-missing_pv_raid, has failed.
Feb 26 14:27:12 host-083 lvm[22228]: WARNING: Disabling lvmetad cache for repair command.
Feb 26 14:27:12 host-083 lvm[22228]: WARNING: Not using lvmetad because of repair.
[422861.239234] device-mapper: raid: Failed to read superblock of device at position 0
[422861.247114] device-mapper: raid: Device 1 specified for rebuild; clearing superblock
Feb 26 14:27:13 host-083 kernel: device-mapper: raid: Failed to read superblock of device at position 0
Feb 26 14:27:13 host-083 kernel: device-mapper: raid: Device 1 specified for rebuild; clearing superblock
[422861.259156] md/raid1:mdX: active with 0 out of 2 mirrors
[422861.260594] mdX: failed to create bitmap (-5)
[422861.262601] device-mapper: table: 253:12: raid: Failed to run raid array
[422861.264326] device-mapper: ioctl: error adding target to table
Feb 26 14:27:13 host-083 kernel: md/raid1:mdX: active with 0 out of 2 mirrors
Feb 26 14:27:13 host-083 kernel: mdX: failed to create bitmap (-5)
Feb 26 14:27:13 host-083 kernel: device-mapper: table: 253:12: raid: Failed to run raid array
Feb 26 14:27:13 host-083 kernel: device-mapper: ioctl: error adding target to table
Feb 26 14:27:13 host-083 lvm[22228]: device-mapper: reload ioctl on  (253:12) failed: Input/output error
Feb 26 14:27:13 host-083 lvm[22228]: Failed to lock logical volume raid_sanity/missing_pv_raid.
Feb 26 14:27:13 host-083 lvm[22228]: Failed to replace faulty devices in raid_sanity/missing_pv_raid.
Feb 26 14:27:13 host-083 lvm[22228]: Repair of RAID device raid_sanity-missing_pv_raid failed.
Feb 26 14:27:13 host-083 lvm[22228]: Failed to process event for raid_sanity-missing_pv_raid.
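
For reference, the repair dmeventd attempts in the log above corresponds to lvconvert's repair path, so the failure can also be reproduced interactively. A sketch, reusing the VG/LV names above; treating this as the equivalent manual step is an assumption, not something stated in the original report:

# Trigger the faulty-device replacement by hand; while the rimage_0
# superblock is unreadable, this should hit the same table reload error.
lvconvert --yes --repair raid_sanity/missing_pv_raid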


Version-Release number of selected component (if applicable):
3.10.0-854.el7.x86_64

lvm2-2.02.177-4.el7    BUILT: Fri Feb 16 06:22:31 CST 2018
lvm2-libs-2.02.177-4.el7    BUILT: Fri Feb 16 06:22:31 CST 2018
lvm2-cluster-2.02.177-4.el7    BUILT: Fri Feb 16 06:22:31 CST 2018
lvm2-lockd-2.02.177-4.el7    BUILT: Fri Feb 16 06:22:31 CST 2018
cmirror-2.02.177-4.el7    BUILT: Fri Feb 16 06:22:31 CST 2018
device-mapper-1.02.146-4.el7    BUILT: Fri Feb 16 06:22:31 CST 2018
device-mapper-libs-1.02.146-4.el7    BUILT: Fri Feb 16 06:22:31 CST 2018
device-mapper-event-1.02.146-4.el7    BUILT: Fri Feb 16 06:22:31 CST 2018
device-mapper-event-libs-1.02.146-4.el7    BUILT: Fri Feb 16 06:22:31 CST 2018
device-mapper-persistent-data-0.7.3-3.el7    BUILT: Tue Nov 14 05:07:18 CST 2017


How reproducible:
Most times
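
For convenience, the scenario above can be collected into a single command sequence. This is a sketch, not the original test script: it assumes raid_sanity already exists on /dev/sda1 and /dev/sdb1, and the deactivation and reactivation steps (only described as "Deactivating" and "Reactivating" above) are written here as explicit lvchange/vgchange calls.

# 1. Create a two-leg raid1 LV without initial sync, then deactivate it
lvcreate --nosync --type raid1 -m 1 -n missing_pv_raid -L 100M raid_sanity
lvchange -an raid_sanity/missing_pv_raid

# 2. Back up the VG metadata, then wipe one leg's PV label
vgcfgbackup -f /tmp/raid_sanity.bkup raid_sanity
pvremove -ff --yes /dev/sda1

# 3. Restore fails while the PV is missing; activate the VG partially instead
vgcfgrestore -f /tmp/raid_sanity.bkup raid_sanity   # expected to fail
vgchange -ay --partial raid_sanity

# 4. Recreate the PV with its old UUID (taken from the backup file) and restore
pvcreate --norestorefile --uuid "<old PV uuid>" /dev/sda1
vgcfgrestore -f /tmp/raid_sanity.bkup raid_sanity
vgchange -ay raid_sanity

# 5. Scrub: the command exits 0, but dmeventd's repair attempt then fails
lvchange --syncaction repair raid_sanity/missing_pv_raid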

Comment 2 Corey Marthaler 2018-02-26 21:10:16 UTC
Created attachment 1401029 [details]
verbose lvchange attempt

Comment 3 Corey Marthaler 2020-10-08 17:17:25 UTC
This is an issue in the current rhel8.3 as well (this test has been turned off since 2018). We can open a new bug to track and fix it in rhel8 (if warranted), and close this issue.

kernel-4.18.0-234.el8.x86_64

[root@hayes-03 ~]# lvs -a -o +devices
  LV                         VG          Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices                                                
  missing_pv_raid            raid_sanity Rwi-a-r-r- 100.00m                                    100.00           missing_pv_raid_rimage_0(0),missing_pv_raid_rimage_1(0)
  [missing_pv_raid_rimage_0] raid_sanity Iwi-aor-r- 100.00m                                                     /dev/sdb1(1)                                           
  [missing_pv_raid_rimage_1] raid_sanity iwi-aor--- 100.00m                                                     /dev/sdc1(1)                                           
  [missing_pv_raid_rmeta_0]  raid_sanity ewi-aor-r-   4.00m                                                     /dev/sdb1(0)                                           
  [missing_pv_raid_rmeta_1]  raid_sanity ewi-aor---   4.00m                                                     /dev/sdc1(0)                                           
[root@hayes-03 ~]# lvchange --syncaction check raid_sanity/missing_pv_raid
[root@hayes-03 ~]# echo $?
0

Oct  8 12:11:00 hayes-03 kernel: md: mdX: data-check done.
Oct  8 12:11:00 hayes-03 lvm[546077]: WARNING: Device #0 of raid1 array, raid_sanity-missing_pv_raid, has failed.
Oct  8 12:11:00 hayes-03 kernel: device-mapper: raid: Failed to read superblock of device at position 0
Oct  8 12:11:00 hayes-03 kernel: device-mapper: raid: Device 1 specified for rebuild; clearing superblock
Oct  8 12:11:00 hayes-03 kernel: md: pers->run() failed ...
Oct  8 12:11:00 hayes-03 kernel: device-mapper: table: 253:6: raid: Failed to run raid array
Oct  8 12:11:00 hayes-03 kernel: device-mapper: ioctl: error adding target to table
Oct  8 12:11:00 hayes-03 lvm[546077]: device-mapper: reload ioctl on  (253:6) failed: Invalid argument
Oct  8 12:11:00 hayes-03 lvm[546077]: Failed to suspend logical volume raid_sanity/missing_pv_raid.
Oct  8 12:11:00 hayes-03 lvm[546077]: Failed to replace faulty devices in raid_sanity/missing_pv_raid.
Oct  8 12:11:00 hayes-03 lvm[546077]: Repair of RAID device raid_sanity-missing_pv_raid failed.
Oct  8 12:11:00 hayes-03 lvm[546077]: Failed to process event for raid_sanity-missing_pv_raid.

Comment 6 RHEL Program Management 2021-02-15 07:35:28 UTC
After evaluating this issue, we have no plans to address it further or fix it in an upcoming release, so it is being closed. If plans change such that this issue will be fixed in an upcoming release, the bug can be reopened.

