Bug 1034460 - [lvmetad] VG mda corruption is not handled when using lvmetad
Summary: [lvmetad] VG mda corruption is not handled when using lvmetad
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: LVM and device-mapper development team
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On: 894136
Blocks:
 
Reported: 2013-11-25 21:46 UTC by Corey Marthaler
Modified: 2023-03-08 07:26 UTC
CC List: 12 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 894136
Environment:
Last Closed: 2014-01-09 00:16:36 UTC
Target Upstream Version:
Embargoed:



Comment 3 Petr Rockai 2013-12-02 14:35:45 UTC
Not sure, but the upstream test is still passing, so I don't think anyone broke this upstream. The patch referred to in comment 1 should indeed fix the problem.

Comment 4 Corey Marthaler 2013-12-18 23:31:50 UTC
Any word on this? Should the tests just not be using pvs to reliably determine corruption? The pvs behavior differs between RHEL 6.5 (using lvmetad) and RHEL 7.0 (using lvmetad), as well as between 7.0 with lvmetad and 7.0 without lvmetad.

## RHEL6.5

[root@taft-01 ~]# lvs -a -o +devices
  LV                VG       Attr       LSize   Origin Data%  Devices
  corrupt_meta_snap snapper  swi-a-s--- 100.00m origin   0.00 /dev/sdb1(75)
  origin            snapper  owi-a-s--- 300.00m               /dev/sdb1(0)

[root@taft-01 ~]# dd if=/dev/zero of=/dev/sdb1 count=1000
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.0038089 s, 134 MB/s

[root@taft-01 ~]# vgck
  Couldn't find device with uuid rWf7wc-nwnD-x3Ml-eFvV-xYIX-E2Td-btUZMF.
  The volume group is missing 1 physical volumes.

[root@taft-01 ~]# pvs /dev/sdb1
  No physical volume found in lvmetad cache for /dev/sdb1
  Failed to read physical volume "/dev/sdb1"
[root@taft-01 ~]# echo $?
5

## RHEL7.0

[root@harding-02 ~]# lvs -a -o +devices
  LV                VG       Attr       LSize   Origin Data%  Devices
  corrupt_meta_snap snapper  swi-a-s--- 100.00m origin   0.00 /dev/sdc1(75)
  origin            snapper  owi-a-s--- 300.00m               /dev/sdc1(0)

[root@harding-02 ~]# dd if=/dev/zero of=/dev/sdc1 count=1000
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.0968152 s, 5.3 MB/s

[root@harding-02 ~]# vgck
  Couldn't find device with uuid OOeorC-2aY2-j177-rbju-QN4M-zN91-pzbNWg.
  The volume group is missing 1 physical volumes.

[root@harding-02 ~]# pvs /dev/sdc1
  PV         VG      Fmt  Attr PSize  PFree 
  /dev/sdc1  snapper lvm2 a--  93.16g 92.77g
[root@harding-02 ~]# echo $?
0



3.10.0-54.0.1.el7.x86_64
lvm2-2.02.103-6.el7    BUILT: Wed Nov 27 02:28:25 CST 2013
lvm2-libs-2.02.103-6.el7    BUILT: Wed Nov 27 02:28:25 CST 2013
lvm2-cluster-2.02.103-6.el7    BUILT: Wed Nov 27 02:28:25 CST 2013
device-mapper-1.02.82-6.el7    BUILT: Wed Nov 27 02:28:25 CST 2013
device-mapper-libs-1.02.82-6.el7    BUILT: Wed Nov 27 02:28:25 CST 2013
device-mapper-event-1.02.82-6.el7    BUILT: Wed Nov 27 02:28:25 CST 2013
device-mapper-event-libs-1.02.82-6.el7    BUILT: Wed Nov 27 02:28:25 CST 2013
device-mapper-persistent-data-0.2.8-2.el7    BUILT: Wed Oct 30 10:20:48 CDT 2013
cmirror-2.02.103-6.el7    BUILT: Wed Nov 27 02:28:25 CST 2013

Comment 5 Petr Rockai 2013-12-31 16:42:28 UTC
Oh, pvs! Yes, pvs no longer detects corruption: in 6.5 it goes to disk, but in 7.0 it relies on lvmetad. To check for metadata corruption, vgck should always be used. The original resolution for this bug was:

> Should be implemented upstream (vgck will not rely on lvmetad but check
> metadata stored on disk) in 0da72743ca46ae9f8185cd12d5c78b3c2b801872.

Please change the tests to use vgck instead of pvs. If that still fails, then we have a bug on our hands. I am still not sure though, because this bug report states:

> This fix must not have made it into rhel7 yet? A 'pvscan --cache $pv' will
> detect a corrupted pv, but vgck will not (unlike it does in rhel6.5).

Summary: pvs used to report corrupt metadata, but no longer does if lvmetad is in use. We have extended vgck instead to always check metadata on disk. Only vgck is guaranteed to check metadata on disk for corruption; if other commands currently do such checking, this behaviour may be removed in future.
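
A minimal sketch (not taken from the bug) of how a test could switch from pvs to vgck, reusing the "snapper" VG and the overwritten device from the transcripts above; the exact messages and exit codes are assumptions, not captured output:

# Overwrite the on-disk metadata, as in the transcripts above.
dd if=/dev/zero of=/dev/sdc1 count=1000

# vgck reads metadata from disk rather than the lvmetad cache (per the
# upstream change quoted above), so a non-zero exit status should
# indicate that the corruption was detected.
if ! vgck snapper; then
    echo "vgck detected the metadata corruption"
else
    echo "ERROR: vgck did not report the corrupted PV"
fi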

Comment 6 Corey Marthaler 2014-01-09 00:16:36 UTC
Ok thanks. I'll remove the 'pvs' checks and stick with vgck, which does detect the corruption.

Closing...
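
For reference, a hedged sketch of the alternative quoted in comment 5, where 'pvscan --cache $pv' is said to detect a corrupted PV by re-reading the device and refreshing lvmetad; the behavior and exit codes below are assumptions, not captured from these machines:

# Re-read /dev/sdc1 from disk and refresh the lvmetad cache;
# per the quote in comment 5, this should notice the corrupted PV.
pvscan --cache /dev/sdc1

# After the refresh, pvs should no longer answer from the stale,
# pre-corruption cache entry for /dev/sdc1.
pvs /dev/sdc1
echo $?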

