Bug 1477232

Summary: Unable to restore metadata from metadata snapshot (thin_dump -m -> thin_restore)
Product: Red Hat Enterprise Linux 7
Component: device-mapper-persistent-data
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Status: CLOSED NOTABUG
Reporter: Jakub Krysl <jkrysl>
Assignee: Joe Thornber <thornber>
QA Contact: Jakub Krysl <jkrysl>
CC: agk, heinzm, lvm-team, msnitzer, thornber
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Target Release: ---
Type: Bug
Last Closed: 2017-09-18 09:11:55 UTC

Description Jakub Krysl 2017-08-01 14:13:04 UTC
Description of problem:
Swapped-out pool metadata dumped by thin_dump from a metadata snapshot (created with a dmsetup message) cannot be restored with thin_restore.

[root@storageqe-21 thin]# thin_check /dev/mapper/vgtest-swapvol 
examining superblock
examining devices tree
examining mapping tree
checking space map counts
[root@storageqe-21 thin]# thin_dump /dev/mapper/vgtest-swapvol -o metadata
[root@storageqe-21 thin]# thin_dump /dev/mapper/vgtest-swapvol -o metadata_snap -m
[root@storageqe-21 thin]# thin_restore -i metadata -o /dev/mapper/vgtest-swapvol 
Restoring: [==================================================]   100%
[root@storageqe-21 thin]# thin_restore -i metadata_snap -o /dev/mapper/vgtest-swapvol 
Restoring: [>                                                 ] | 0%
mapping beyond end of data device (0 >= 0)

[root@storageqe-21 thin]# diff metadata metadata_snap 
1c1
< <superblock uuid="" time="0" transaction="10" flags="0" version="2" data_block_size="128" nr_data_blocks="1600">
---
> <superblock uuid="" time="0" transaction="10" flags="0" version="2" data_block_size="128" nr_data_blocks="0">

The only difference between the two dumps is the "nr_data_blocks" attribute: 1600 in the regular dump versus 0 in the metadata snapshot dump.
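
As a possible workaround (a sketch only, assuming the nr_data_blocks value from the regular dump, 1600 here, is still correct for the pool), the snapshot dump can be hand-edited before restoring:

# copy the real nr_data_blocks into the snapshot dump; both files are plain XML
sed 's/nr_data_blocks="0"/nr_data_blocks="1600"/' metadata_snap > metadata_snap.fixed
thin_restore -i metadata_snap.fixed -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol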


Version-Release number of selected component (if applicable):
device-mapper-persistent-data-0.7.0-0.1.rc6.el7

How reproducible:
100%

Steps to Reproduce:
1. get swapped-out metadata with a metadata snapshot reserved
vgcreate  vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
mkfs.ext4 /dev/vgtest/thinvol
lvchange -an vgtest/thinvol
lvcreate -T -V 10 vgtest/thinpool -n thinvol1
mkfs.ext4 /dev/vgtest/thinvol1
lvchange -an vgtest/thinvol1
dmsetup suspend /dev/mapper/vgtest-thinpool-tpool
dmsetup message /dev/mapper/vgtest-thinpool-tpool 0 reserve_metadata_snap
dmsetup resume /dev/mapper/vgtest-thinpool-tpool
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata  vgtest/swapvol
lvchange -ay vgtest/swapvol

2. dump the metadata snapshot
thin_dump /dev/mapper/vgtest-swapvol -m -o metadata_snap

3. try to restore the metadata snapshot dump
thin_restore -i metadata_snap -o /dev/mapper/vgtest-swapvol

Actual results:
Restoring: [>                                                 ] | 0%
mapping beyond end of data device (0 >= 0)

Expected results:
Restoring: [==================================================]   100%

Additional info:
Even though the thin_dump manpage says a metadata snapshot dump is not processable by thin_restore, this is quite an important feature: the live metadata might be corrupted while the metadata snapshot is still intact. For this reason both the metadata dump and the metadata snapshot dump should be restorable, even though restoring either one loses the metadata snapshot.
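
If the snapshot is needed again after a restore, it has to be re-reserved; a sketch, reusing the dmsetup messages from the reproducer above (the pool must be active):

dmsetup suspend /dev/mapper/vgtest-thinpool-tpool
dmsetup message /dev/mapper/vgtest-thinpool-tpool 0 reserve_metadata_snap
dmsetup resume /dev/mapper/vgtest-thinpool-tpool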

Comment 2 Joe Thornber 2017-09-18 09:10:07 UTC
Restoring from a metadata snapshot is a very risky operation that should only be attempted as a last resort.

During normal operation there are 2 situations that can cause data blocks for a thin volume to move around.

i) a snapshot is taken of the thin volume, and then a write is sent to the thin volume (very common).

ii) a REQ_DISCARD I/O is sent to the thin volume, which unmaps the block (it may of course be reprovisioned by a subsequent write).

Because of this it's very likely that the metadata snapshot is incorrect.
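
For instance (a sketch only, assuming the pool and volume names from the reproducer, with the pool and thinvol active), case (i) can be demonstrated like this:

# reserve a metadata snap of the current mappings
dmsetup suspend /dev/mapper/vgtest-thinpool-tpool
dmsetup message /dev/mapper/vgtest-thinpool-tpool 0 reserve_metadata_snap
dmsetup resume /dev/mapper/vgtest-thinpool-tpool
# take a thin snapshot, then write to the origin: the write breaks sharing and
# allocates fresh data blocks, so the mappings recorded in the metadata snap
# no longer match where the origin's data actually lives
lvcreate -s vgtest/thinvol -n thinsnap -kn
dd if=/dev/urandom of=/dev/vgtest/thinvol bs=64k count=16 oflag=direct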

The metadata snaps do not snapshot the space maps, which track free space on the metadata and data devices. The space maps also hold the total block counts, which is why nr_data_blocks is unknown in a dump of a metadata snap.

Even if the metadata snap held these sizes separately, the data device could have been resized since the snap was taken.

So the options are:

i) Set nr_data_blocks to 0, which also prevents anyone from restoring it without hand-editing the XML first.

ii) Set it to the largest data block mentioned in the metadata snap. Doing this would require two passes through the metadata and so significantly increase thin_dump's execution time (a rough external equivalent is sketched after this list).

iii) Set it to whatever nr_data_blocks is in the real superblock. We can only do this if the pool is offline, since we can't normally access the superblock on a live system: the kernel will be actively changing it.
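
For reference, a rough sketch of the scan option (ii) implies, done externally over the XML dump rather than inside thin_dump (GNU awk assumed; data_block, data_begin and length are the attribute names thin_dump emits for single and range mappings):

# print the smallest usable nr_data_blocks: highest data block referenced, plus 1
gawk '
  match($0, /data_block="([0-9]+)"/, m) { if (m[1]+0 > max) max = m[1]+0 }
  match($0, /data_begin="([0-9]+)"/, b) {
      match($0, /length="([0-9]+)"/, l)
      end = b[1] + l[1] - 1
      if (end > max) max = end
  }
  END { print max + 1 }
' metadata_snap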

I'm happy with option (i), so I'm afraid I'm going to WONTFIX this bug.