Bug 1021710 - thin_restore/repair fails when attempted directly to inactive corrupted meta volume
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: device-mapper-persistent-data
Version: 6.5
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Joe Thornber
QA Contact: Red Hat Kernel QE team
 
Reported: 2013-10-21 21:10 UTC by Corey Marthaler
Modified: 2015-10-16 16:07 UTC (History)

Doc Type: Bug Fix
Last Closed: 2015-10-16 16:07:22 UTC

Description Corey Marthaler 2013-10-21 21:10:33 UTC
Description of problem:
This is one of the lingering issues from bug 1007074. Directly attempting to restore the metadata of a corrupted and deactivated pool volume fails.

This is still to be supported eventually, right? Even if the "correct" way to do this currently is to copy the metadata to a new volume and then swap devices.
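
For reference, the copy-and-swap workaround mentioned above could look roughly like the sketch below. It only prints a plan and executes nothing. The function name, the "_meta_restored" LV suffix and the size argument are hypothetical and not taken from this report. The pool is assumed to be inactive, and whether the new LV has to be deactivated before the swap may depend on the lvm2 version.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for a copy-and-swap metadata restore.
# Nothing is executed here; review the plan, then run the lines by hand as root.
# The "_meta_restored" suffix and the size argument are hypothetical choices.
plan_restore_via_swap() {
    vg=$1      # volume group, e.g. snapper_thinp
    pool=$2    # thin pool LV, e.g. POOL
    dump=$3    # XML file written earlier by thin_dump
    size=$4    # size of the new metadata LV (assumed at least as large as the old one)
    new="${pool}_meta_restored"

    # 1. Create a fresh LV to hold the restored metadata.
    echo "lvcreate -L $size -n $new $vg"
    # 2. Restore the dump into the fresh LV instead of the damaged one.
    echo "thin_restore -i $dump -o /dev/$vg/$new"
    # 3. Deactivate the fresh LV (assumed to be required before the swap).
    echo "lvchange -an $vg/$new"
    # 4. Swap the fresh LV in as the pool's metadata device.
    echo "lvconvert --thinpool $vg/$pool --poolmetadata $vg/$new"
}
```

Calling `plan_restore_via_swap snapper_thinp POOL /tmp/snapper_thinp_dump_1.2019.3523 8M` prints the four commands for the volumes in this report; the 8M size is only a placeholder.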


SCENARIO - [verify_io_between_offline_mda_corruptions]
Create a snapshot and then verify it's io contents in between OFFLINE pool mda corruptions and restorations
Making origin volume
lvcreate --thinpool POOL --zero y -L 5G snapper_thinp
  device-mapper: remove ioctl on  failed: Device or resource busy
Sanity checking pool device metadata
(thin_check /dev/mapper/snapper_thinp-POOL_tmeta)
examining superblock
examining devices tree
examining mapping tree

lvcreate --virtualsize 1G -T snapper_thinp/POOL -n origin
syncing before snap creation...
Creating thin snapshot of origin volume
lvcreate -K -s /dev/snapper_thinp/origin -n snap1

Dumping current pool metadata to /tmp/snapper_thinp_dump_1.2019.3523
thin_dump /dev/mapper/snapper_thinp-POOL_tmeta > /tmp/snapper_thinp_dump_1.2019.3523

DEACTIVATING POOL VOLUME
lvchange -an snapper_thinp
Corrupting pool meta device (/dev/mapper/snapper_thinp-POOL_tmeta)
dd if=/dev/zero of=/dev/mapper/snapper_thinp-POOL_tmeta count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000349381 s, 1.5 MB/s
Verifying that pool meta device is now corrupt
thin_check /dev/mapper/snapper_thinp-POOL_tmeta
examining superblock
  superblock is corrupt
    bad checksum in superblock

Restoring /dev/mapper/snapper_thinp-POOL_tmeta using dumped file
thin_restore -i /tmp/snapper_thinp_dump_1.2019.3523 -o /dev/mapper/snapper_thinp-POOL_tmeta
transaction_manager::new_block() couldn't allocate new block

[root@harding-02 ~]# thin_restore -i /tmp/snapper_thinp_dump_1.2019.3523 -o /dev/mapper/snapper_thinp-POOL_tmeta
transaction_manager::new_block() couldn't allocate new block

[root@harding-02 ~]# thin_repair -i /tmp/snapper_thinp_dump_1.2019.3523 -o /dev/mapper/snapper_thinp-POOL_tmeta
transaction_manager::new_block() couldn't allocate new block
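
The report does not say why the allocation fails. One thing that may be worth ruling out, as an assumption and not a diagnosis, is that the output path is no longer a block device once the pool has been deactivated. A minimal guard, with a hypothetical function name, could be:

```shell
#!/bin/sh
# Return success only when the given path is a block device.
# Intended as a pre-check before pointing dd, thin_restore or thin_repair at it.
require_block_device() {
    if [ -b "$1" ]; then
        return 0
    fi
    echo "not a block device: $1" >&2
    return 1
}

# Example use (commented out so that sourcing this file changes nothing):
# require_block_device /dev/mapper/snapper_thinp-POOL_tmeta &&
#     thin_restore -i /tmp/snapper_thinp_dump_1.2019.3523 \
#                  -o /dev/mapper/snapper_thinp-POOL_tmeta
```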


Version-Release number of selected component (if applicable):
2.6.32-410.el6.x86_64

lvm2-2.02.100-6.el6    BUILT: Wed Oct 16 07:26:00 CDT 2013
lvm2-libs-2.02.100-6.el6    BUILT: Wed Oct 16 07:26:00 CDT 2013
lvm2-cluster-2.02.100-6.el6    BUILT: Wed Oct 16 07:26:00 CDT 2013
udev-147-2.50.el6    BUILT: Fri Oct 11 05:58:10 CDT 2013
device-mapper-1.02.79-6.el6    BUILT: Wed Oct 16 07:26:00 CDT 2013
device-mapper-libs-1.02.79-6.el6    BUILT: Wed Oct 16 07:26:00 CDT 2013
device-mapper-event-1.02.79-6.el6    BUILT: Wed Oct 16 07:26:00 CDT 2013
device-mapper-event-libs-1.02.79-6.el6    BUILT: Wed Oct 16 07:26:00 CDT 2013
cmirror-2.02.100-6.el6    BUILT: Wed Oct 16 07:26:00 CDT 2013


How reproducible:
Every time

Comment 2 Zdenek Kabelac 2013-10-22 07:37:07 UTC
Please also attach the version of:

device-mapper-persistent-data


As seen in the log:

examining superblock
  superblock is corrupt
    bad checksum in superblock


I believe the current repair capabilities of the thin_repair utility are still somewhat limited. At least in my tests, many corruptions left the pool irreparable for now.

For testing corruptions for now, it is better to damage single bytes than to wipe whole metadata headers.
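
A minimal sketch of single-byte damage with dd follows, in contrast to the `dd if=/dev/zero ... count=1` in the description, which zeroes the first 512 bytes. The function name and the choice of 0xFF as the written value are assumptions for illustration; which offsets make useful test cases is not stated in this report.

```shell
#!/bin/sh
# Overwrite exactly one byte at the given offset and leave everything else intact.
# The byte written is 0xFF, so a byte that already holds 0xFF is left unchanged.
corrupt_one_byte() {
    target=$1   # file or block device to damage
    offset=$2   # byte offset of the single byte to overwrite
    # bs=1 with seek=N positions the write at byte N; conv=notrunc keeps a
    # regular file from being truncated after the written byte.
    printf '\377' | dd of="$target" bs=1 seek="$offset" count=1 conv=notrunc 2>/dev/null
}
```

Trying it on a scratch copy first, for example `corrupt_one_byte /tmp/meta_copy.bin 100` followed by `cmp -l` against the original, shows exactly which byte changed before any real metadata device is touched.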

