Bug 1029170 - request for clarification: does lvm auto repair its corrupted pool tmeta devices
Summary: request for clarification: does lvm auto repair its corrupted pool tmeta devices
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.5
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Zdenek Kabelac
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-11-11 20:17 UTC by Corey Marthaler
Modified: 2014-01-28 13:26 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-28 13:26:41 UTC
Target Upstream Version:
Embargoed:



Description Corey Marthaler 2013-11-11 20:17:43 UTC
Description of problem:
I'm attempting to create more test cases for thin_check and thin_repair/thin_restore. However, I'm having a hard time finding a way to reliably corrupt the tmeta device. It appears that lvm is either automatically repairing itself or the device is not being truly corrupted in the first place. I've tried with lvmetad both on and off, and both with and without a spare pool metadata device. I've also tried corrupting fewer than 512 bytes, but when I do that it never seems to show up as corrupt.
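One way to tell those two cases apart (a hedged suggestion, not something done in this run) is to checksum the start of the _tmeta device with direct reads around each thin_check call, so the comparison reflects what is actually on disk rather than what is in the page cache; the device path below is the one used in the scenario that follows:

# diagnostic sketch only, not part of the original test run
dd if=/dev/mapper/snapper_thinp-POOL_tmeta bs=4096 count=1 iflag=direct 2>/dev/null | md5sum
thin_check /dev/mapper/snapper_thinp-POOL_tmeta
dd if=/dev/mapper/snapper_thinp-POOL_tmeta bs=4096 count=1 iflag=direct 2>/dev/null | md5sum
# If the two checksums differ, something rewrote the superblock between the checks;
# if they match but thin_check's verdict changes, the earlier result was likely read
# from a stale cache rather than from the device.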


SCENARIO - [recover_corrupt_pool_tmeta_device]
Create a snapshot, corrupt its pool metadata (_tmeta) device, and then restore it using thin_restore
Making origin volume
lvcreate --thinpool POOL --zero n --poolmetadataspare n -L 1G snapper_thinp
  WARNING: recovery of pools without pool metadata spare LV is not automated.

Sanity checking pool device metadata
(thin_check /dev/mapper/snapper_thinp-POOL_tmeta)
examining superblock
examining devices tree
examining mapping tree

lvcreate --virtualsize 1G -T snapper_thinp/POOL -n origin
lvcreate -V 1G -T snapper_thinp/POOL -n other1
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n other2
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n other3
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n other4
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n other5
Making snapshot of origin volume
lvcreate -K -s /dev/snapper_thinp/origin -n restore

Dumping current pool metadata to /tmp/snapper_thinp_dump.2180.30275
thin_dump /dev/mapper/snapper_thinp-POOL_tmeta > /tmp/snapper_thinp_dump.2180.30275
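# For reference, the restore step this scenario is building toward would look roughly
# like the line below; it is a hedged sketch, was never reached in this run, and
# thin_restore should only be pointed at metadata that is not in use by an active pool:
# thin_restore -i /tmp/snapper_thinp_dump.2180.30275 -o /dev/mapper/snapper_thinp-POOL_tmeta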

Corrupting pool meta device (/dev/mapper/snapper_thinp-POOL_tmeta)
dd if=/dev/urandom of=/dev/mapper/snapper_thinp-POOL_tmeta count=512 bs=1
512+0 records in
512+0 records out
512 bytes (512 B) copied, 0.00308854 s, 166 kB/s
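# Hedged suggestion (not part of the original run): if the random bytes are being
# absorbed by the page cache before they reach the device, the same corruption could
# be written with direct, synced I/O to rule that out:
# dd if=/dev/urandom of=/dev/mapper/snapper_thinp-POOL_tmeta bs=512 count=1 oflag=direct conv=fsync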

[root@harding-03 ~]# lvs -a -o +devices
  LV           VG            Attr       LSize  Pool Origin Data%  Devices
  POOL         snapper_thinp twi-a-t---  1.00g               0.04 POOL_tdata(0)
  [POOL_tdata] snapper_thinp Twi-ao----  1.00g                    /dev/sdb3(0)
  [POOL_tmeta] snapper_thinp ewi-ao----  4.00m                    /dev/sdc2(0)
  origin       snapper_thinp Vwi-a-t---  1.00g POOL          0.01
  other1       snapper_thinp Vwi-a-t---  1.00g POOL          0.01
  other2       snapper_thinp Vwi-a-t---  1.00g POOL          0.01
  other3       snapper_thinp Vwi-a-t---  1.00g POOL          0.01
  other4       snapper_thinp Vwi-a-t---  1.00g POOL          0.01
  other5       snapper_thinp Vwi-a-t---  1.00g POOL          0.01
  restore      snapper_thinp Vwi-a-t--k  1.00g POOL origin   0.01

[root@harding-03 ~]# thin_check /dev/mapper/snapper_thinp-POOL_tmeta
examining superblock
  superblock is corrupt
    bad checksum in superblock

# And now it's automatically fixed?
[root@harding-03 ~]# thin_check /dev/mapper/snapper_thinp-POOL_tmeta
examining superblock
examining devices tree
examining mapping tree


# Corrupt it again
[root@harding-03 ~]# dd if=/dev/urandom of=/dev/mapper/snapper_thinp-POOL_tmeta count=512 bs=1
512+0 records in
512+0 records out
512 bytes (512 B) copied, 0.00279119 s, 183 kB/s

# Corrupt
[root@harding-03 ~]# thin_check /dev/mapper/snapper_thinp-POOL_tmeta
examining superblock
  superblock is corrupt
    bad checksum in superblock

# Corrupt
[root@harding-03 ~]# thin_check /dev/mapper/snapper_thinp-POOL_tmeta
examining superblock
  superblock is corrupt
    bad checksum in superblock

# Not Corrupt (and all I've done is run this cmd)
[root@harding-03 ~]# thin_check /dev/mapper/snapper_thinp-POOL_tmeta
examining superblock
examining devices tree
examining mapping tree


# Corrupt it again
[root@harding-03 ~]# dd if=/dev/urandom of=/dev/mapper/snapper_thinp-POOL_tmeta count=512 bs=1
512+0 records in
512+0 records out
512 bytes (512 B) copied, 0.00279241 s, 183 kB/s

# Run a sync
[root@harding-03 ~]# sync

# Not Corrupt
[root@harding-03 ~]# thin_check /dev/mapper/snapper_thinp-POOL_tmeta
examining superblock
examining devices tree
examining mapping tree
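# Hedged suggestion (not part of the original run): before re-running thin_check at
# this point, flushing the cached buffers for the tmeta device would make sure the
# check reads from the device itself rather than from the buffer cache:
# blockdev --flushbufs /dev/mapper/snapper_thinp-POOL_tmeta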

[root@harding-03 ~]# lvs -a -o +devices
  LV           VG            Attr       LSize  Pool Origin Data%  Devices
  POOL         snapper_thinp twi-a-t---  1.00g               0.04 POOL_tdata(0)
  [POOL_tdata] snapper_thinp Twi-ao----  1.00g                    /dev/sdb3(0)
  [POOL_tmeta] snapper_thinp ewi-ao----  4.00m                    /dev/sdc2(0)
  origin       snapper_thinp Vwi-a-t---  1.00g POOL          0.01
  other1       snapper_thinp Vwi-a-t---  1.00g POOL          0.01
  other2       snapper_thinp Vwi-a-t---  1.00g POOL          0.01
  other3       snapper_thinp Vwi-a-t---  1.00g POOL          0.01
  other4       snapper_thinp Vwi-a-t---  1.00g POOL          0.01
  other5       snapper_thinp Vwi-a-t---  1.00g POOL          0.01
  restore      snapper_thinp Vwi-a-t--k  1.00g POOL origin   0.01



Version-Release number of selected component (if applicable):
2.6.32-424.el6.x86_64
lvm2-2.02.100-8.el6    BUILT: Wed Oct 30 03:10:56 CDT 2013
lvm2-libs-2.02.100-8.el6    BUILT: Wed Oct 30 03:10:56 CDT 2013
lvm2-cluster-2.02.100-8.el6    BUILT: Wed Oct 30 03:10:56 CDT 2013
udev-147-2.50.el6    BUILT: Fri Oct 11 05:58:10 CDT 2013
device-mapper-1.02.79-8.el6    BUILT: Wed Oct 30 03:10:56 CDT 2013
device-mapper-libs-1.02.79-8.el6    BUILT: Wed Oct 30 03:10:56 CDT 2013
device-mapper-event-1.02.79-8.el6    BUILT: Wed Oct 30 03:10:56 CDT 2013
device-mapper-event-libs-1.02.79-8.el6    BUILT: Wed Oct 30 03:10:56 CDT 2013
cmirror-2.02.100-8.el6    BUILT: Wed Oct 30 03:10:56 CDT 2013


How reproducible:
Every time

Comment 2 RHEL Program Management 2013-11-14 21:06:23 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 3 Zdenek Kabelac 2014-01-28 13:26:41 UTC
This test is not correct.

thin_check is not supposed to be executed on a live (active thin pool) metadata volume. We now have Bug 1023828 (and Bug 1038387) to make thin_check warn before such use. In the current state of the tools, lvm2 by default only detects errors during activation (and deactivation), via the thin_check tool.

To actively repair the metadata, the user should run: lvconvert --repair vg/pool
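A rough sketch of that flow for this VG (hedged: the exact behaviour depends on the lvm2 version, and this scenario was created without a pool metadata spare, so lvconvert may warn or refuse without one):

vgchange -an snapper_thinp                # deactivate the thin volumes and the pool first
lvconvert --repair snapper_thinp/POOL     # rebuilds the metadata via thin_repair; the old metadata may be kept as a separate LV for inspection
vgchange -ay snapper_thinp                # reactivate and verify with lvs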

