Bug 1255552 - thin_restore/cache_restore shouldn't corrupt a meta device if given a non existent file to restore from
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: device-mapper-persistent-data
Version: 7.2
Hardware: x86_64 Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Joe Thornber
QA Contact: Jakub Krysl
Docs Contact:
Depends On:
Blocks: 1469559
Reported: 2015-08-20 19:00 EDT by Corey Marthaler
Modified: 2018-04-10 09:16 EDT

See Also:
Fixed In Version: device-mapper-persistent-data-0.7.3-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-10 09:15:56 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers:
Red Hat Product Errata RHEA-2018:0776 (Priority: None, Status: None, Summary: None, Last Updated: 2018-04-10 09:16 EDT)

Description Corey Marthaler 2015-08-20 19:00:55 EDT
Description of problem:
[root@host-110 ~]# lvcreate  --thinpool POOL  --zero y -L 1G --poolmetadatasize 4M snapper_thinp
  Logical volume "POOL" created.
[root@host-110 ~]# lvcreate  --virtualsize 1G -T snapper_thinp/POOL -n origin
  Logical volume "origin" created.
[root@host-110 ~]# lvs -a -o +devices
  LV              Attr       LSize   Pool Origin Data%  Meta% Devices
  POOL            twi-aotz--   1.00g             0.00   1.07  POOL_tdata(0)
  [POOL_tdata]    Twi-ao----   1.00g                          /dev/sda1(1)
  [POOL_tmeta]    ewi-ao----   4.00m                          /dev/sdh1(0)
  [lvol0_pmspare] ewi-------   4.00m                          /dev/sda1(0)
  origin          Vwi-a-tz--   1.00g POOL        0.00


[root@host-110 ~]# lvchange -an --yes --select 'lv_name=POOL || pool_lv=POOL'
[root@host-110 ~]# lvcreate -L 4M -n meta_swap snapper_thinp
  Logical volume "meta_swap" created.
[root@host-110 ~]# lvconvert --yes --thinpool snapper_thinp/POOL --poolmetadata snapper_thinp/meta_swap
  Converted snapper_thinp/POOL to thin pool.
[root@host-110 ~]# lvchange -ay snapper_thinp/meta_swap
[root@host-110 ~]# lvs -a -o +devices
  LV              Attr       LSize   Pool Origin Data%  Meta% Devices
  root            -wi-ao----   6.67g                          /dev/vda2(205)
  swap            -wi-ao---- 820.00m                          /dev/vda2(0)
  POOL            twi---tz--   1.00g                          POOL_tdata(0)
  [POOL_tdata]    Twi-------   1.00g                          /dev/sda1(1)
  [POOL_tmeta]    ewi-------   4.00m                          /dev/sda1(257)
  [lvol0_pmspare] ewi-------   4.00m                          /dev/sda1(0)
  meta_swap       -wi-a-----   4.00m                          /dev/sdh1(0)
  origin          Vwi---tz--   1.00g POOL

[root@host-110 ~]# thin_check /dev/mapper/snapper_thinp-meta_swap
examining superblock
examining devices tree
examining mapping tree
checking space map counts

[root@host-110 ~]# thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/snapper_thinp-meta_swap
Couldn't stat file

[root@host-110 ~]# thin_check /dev/mapper/snapper_thinp-meta_swap
examining superblock
examining devices tree
examining mapping tree
  missing all mappings for devices: [0, -]
    value size mismatch: expected 8, but got 16.  This is not the btree you are looking for.


Version-Release number of selected component (if applicable):
3.10.0-306.el7.x86_64

lvm2-2.02.128-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
lvm2-libs-2.02.128-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
lvm2-cluster-2.02.128-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
device-mapper-1.02.105-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
device-mapper-libs-1.02.105-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
device-mapper-event-1.02.105-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
device-mapper-event-libs-1.02.105-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
device-mapper-persistent-data-0.5.5-1.el7    BUILT: Thu Aug 13 09:58:10 CDT 2015
cmirror-2.02.128-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
sanlock-3.2.4-1.el7    BUILT: Fri Jun 19 12:48:49 CDT 2015
sanlock-lib-3.2.4-1.el7    BUILT: Fri Jun 19 12:48:49 CDT 2015
lvm2-lockd-2.02.128-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
Comment 2 Jakub Krysl 2017-07-27 05:03:35 EDT
I did some digging and this is what I came up with:

#this LEADS to error
vgcreate  vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
lvchange -an vgtest/thinvol
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata  vgtest/swapvol
lvchange -ay vgtest/swapvol
thin_check /dev/mapper/vgtest-swapvol
thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol
#
examining superblock
examining devices tree
examining mapping tree
  missing all mappings for devices: [0, -]
    value size mismatch: expected 8, but got 16. This is not the btree you are looking for. (block 4)


#this LEADS to error, but different
vgcreate  vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
mkfs.ext4 /dev/vgtest/thinvol
lvchange -an vgtest/thinvol
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata  vgtest/swapvol
lvchange -ay vgtest/swapvol
thin_check /dev/mapper/vgtest-swapvol
thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol
#
examining superblock
examining devices tree
examining mapping tree
checking space map counts
bad checksum in space map bitmap


#this does NOT lead to error
vgcreate  vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
mkfs.ext4 /dev/vgtest/thinvol
lvchange -an vgtest/thinvol
lvcreate -T -V 10 vgtest/thinpool -n thinvol1
mkfs.ext4 /dev/vgtest/thinvol1
lvchange -an vgtest/thinvol1
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata  vgtest/swapvol
lvchange -ay vgtest/swapvol
thin_check /dev/mapper/vgtest-swapvol
thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol
#
examining superblock
examining devices tree
examining mapping tree
checking space map counts


#this LEADS to error
vgcreate  vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
lvchange -an vgtest/thinvol
lvcreate -T -V 10 vgtest/thinpool -n thinvol1
lvchange -an vgtest/thinvol1
mkfs.ext4 /dev/vgtest/thinvol1
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata  vgtest/swapvol
lvchange -ay vgtest/swapvol
thin_check /dev/mapper/vgtest-swapvol
thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol
#
examining superblock
examining devices tree
examining mapping tree
  thin device 1 is missing mappings [0, -]
    bad checksum in btree node (block 1)
  thin device 2 is missing mappings [0, -]
    value size mismatch: expected 8, but got 24. This is not the btree you are looking for. (block 6)


#this LEADS to error
vgcreate  vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
mkfs.ext4 /dev/vgtest/thinvol
lvchange -an vgtest/thinvol
lvcreate -T -V 10 vgtest/thinpool -n thinvol1
lvchange -an vgtest/thinvol1
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata  vgtest/swapvol
lvchange -ay vgtest/swapvol
thin_check /dev/mapper/vgtest-swapvol
thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol
#
examining superblock
examining devices tree
  missing devices: [0, -]
    value size mismatch: expected 24, but got 8. This is not the btree you are looking for. (block 8)
examining mapping tree
  missing all mappings for devices: [0, -]
    value size mismatch: expected 8, but got 16. This is not the btree you are looking for. (block 4)


#this LEADS to error
vgcreate  vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
lvchange -an vgtest/thinvol
lvcreate -T -V 10 vgtest/thinpool -n thinvol1
lvchange -an vgtest/thinvol1
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata  vgtest/swapvol
lvchange -ay vgtest/swapvol
thin_check /dev/mapper/vgtest-swapvol
thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol
#
examining superblock
examining devices tree
examining mapping tree
  thin device 1 is missing mappings [0, -]
    bad checksum in btree node (block 1)
  thin device 2 is missing mappings [0, -]
    value size mismatch: expected 8, but got 24. This is not the btree you are looking for. (block 6)


So basically, if every thin LV has mapped blocks (from creating a filesystem on it), this error does not happen. But there have to be at least 2 thin LVs; with a single thin LV it leads to "bad checksum in space map bitmap" instead. And if there are several LVs and at least one of them does not have a filesystem, it leads to this corruption.
Hope this helps a bit.
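
The cases above differ mainly in how many thin LVs exist and which of them get a filesystem (the fourth case is the exception, since it runs mkfs.ext4 after the LV has been deactivated). As a rough sketch, a helper like the one below could print the command sequence for a given combination. The function name print_reproducer and the "fs"/"nofs" flags are made up for illustration; the printed commands are the ones used above, and nothing is executed.

```shell
# print_reproducer FLAG...
# Hypothetical helper: prints (does not run) the reproducer for one scenario.
# Each FLAG is "fs" or "nofs" and describes one thin LV, in creation order.
print_reproducer() {
    pr_i=0
    echo "vgcreate vgtest /dev/loop1"
    echo "lvcreate -T -L 100 vgtest -n thinpool"
    for pr_flag in "$@"; do
        # Follow the naming used above: thinvol, thinvol1, ...
        if [ "$pr_i" -eq 0 ]; then
            pr_name=thinvol
        else
            pr_name="thinvol$pr_i"
        fi
        echo "lvcreate -T -V 10 vgtest/thinpool -n $pr_name"
        if [ "$pr_flag" = fs ]; then
            # mkfs maps blocks in the thin LV; this is the variable under test
            echo "mkfs.ext4 /dev/vgtest/$pr_name"
        fi
        echo "lvchange -an vgtest/$pr_name"
        pr_i=$((pr_i + 1))
    done
    echo "lvchange -an vgtest/thinpool"
    echo "lvcreate -L 100 vgtest -n swapvol"
    echo "lvchange -an vgtest/swapvol"
    echo "lvconvert -y --thinpool vgtest/thinpool --poolmetadata vgtest/swapvol"
    echo "lvchange -ay vgtest/swapvol"
    echo "thin_check /dev/mapper/vgtest-swapvol"
    echo "thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol"
    echo "thin_check /dev/mapper/vgtest-swapvol"
}
```

For example, "print_reproducer nofs" prints the first case and "print_reproducer fs fs" prints the case that did not lead to an error.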
Comment 3 Jakub Krysl 2017-08-04 10:30:55 EDT
cache_restore does the same:

[root@storageqe-21 cache]# cache_check /dev/mapper/vgtest-swapvol 
examining superblock
examining mapping array
examining hint array
examining discard bitset
[root@storageqe-21 cache]# cache_restore -i wrong -o /dev/mapper/vgtest-swapvol 
Couldn't stat file
[root@storageqe-21 cache]# cache_check /dev/mapper/vgtest-swapvol 
examining superblock
examining mapping array
  missing mappings [1, 16384]:
    missing blocks
examining hint array
examining discard bitset
Comment 4 Joe Thornber 2017-09-28 09:55:19 EDT
The restore tools are destructive in that they overwrite the metadata device/file that you specify for output (much as 'dd' does).  They will not magically restore data that they have overwritten in the event of an error (for instance, badly formatted XML).  Also, if there is an error the tools just exit; they do not try to wrap up any partially written metadata.

I have made 2 changes that I hope will reduce any confusion:

i) if the input file is missing, the metadata dev will be untouched.

ii) if there is an error after the metadata has started being written, the tool will zero the superblock.  This makes it clear that the partially restored metadata is not to be used, and is not expected to pass any *_check tool.

Changes in this patch:

https://github.com/jthornber/thin-provisioning-tools/commit/5b92f410eca3121079418659eb81ca2a4e372807

Tested with this patch:

https://github.com/jthornber/thin-provisioning-tools/commit/f018e6ecf705e8c498437e15ec3b09bd84b2e3c4
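
For builds that predate this change, the same ordering (validate the input before the output device is handed to the tool) can be approximated from the shell. This is only a sketch of a wrapper, not the upstream patch: guarded_restore is a hypothetical name, and the only tool options it relies on are the -i and -o flags shown in this report.

```shell
# guarded_restore TOOL INPUT OUTPUT
# Hypothetical wrapper (not part of device-mapper-persistent-data): refuse to
# call a destructive restore tool unless INPUT is a readable regular file.
guarded_restore() {
    gr_tool=$1
    gr_input=$2
    gr_output=$3

    if [ ! -f "$gr_input" ] || [ ! -r "$gr_input" ]; then
        echo "guarded_restore: input '$gr_input' is missing or unreadable;" \
             "'$gr_output' left untouched" >&2
        return 1
    fi

    # Only now hand the output device to the tool, which may overwrite it.
    "$gr_tool" -i "$gr_input" -o "$gr_output"
}
```

It would be called as "guarded_restore thin_restore INPUT OUTPUT" or "guarded_restore cache_restore INPUT OUTPUT". The check only covers a missing or unreadable input; it does nothing about errors that occur once the tool has started writing, and it rejects inputs that are not regular files.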
Comment 6 Jakub Krysl 2017-10-09 07:25:33 EDT
Fixed in device-mapper-persistent-data-0.7.3-1.el7.

# thin_restore -i adsfgas -o /dev/mapper/vgtest-swapvol 
Couldn't stat file
# thin_check /dev/mapper/vgtest-swapvol 
examining superblock
examining devices tree
examining mapping tree
checking space map counts

# cache_restore -i adsfgas -o /dev/mapper/vgtest-swapvol 
Couldn't stat file
# cache_check /dev/mapper/vgtest-swapvol
examining superblock
examining mapping array
examining hint array
examining discard bitset
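
A clean *_check run shows the metadata is still valid, but it does not by itself show that the device was left byte-for-byte untouched. One possible way to check that more strictly is to hash the device before and after the failed restore. The sketch below assumes sha256sum from coreutils is available; unchanged_after is a hypothetical helper name.

```shell
# unchanged_after FILE CMD [ARGS...]
# Hypothetical helper: run CMD and report whether FILE's contents changed.
# Returns 0 if the SHA-256 of FILE is identical before and after CMD,
# 1 if it differs, 2 if FILE does not exist.
unchanged_after() {
    ua_file=$1
    shift
    [ -e "$ua_file" ] || return 2

    ua_before=$(sha256sum < "$ua_file" | cut -d' ' -f1)
    # The command is expected to fail here, so its exit status is ignored.
    "$@" >/dev/null 2>&1 || true
    ua_after=$(sha256sum < "$ua_file" | cut -d' ' -f1)

    [ "$ua_before" = "$ua_after" ]
}
```

For example, "unchanged_after /dev/mapper/vgtest-swapvol thin_restore -i adsfgas -o /dev/mapper/vgtest-swapvol" should return 0 on a fixed build. Hashing reads the whole device, which is cheap for a metadata LV of the sizes used here.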
Comment 7 Jakub Krysl 2017-10-09 07:48:33 EDT
Joe,
Please check BZ1499781. During this testing I found out that the same happens with thin_repair/cache_repair, so I created this new bug.
Thanks
Comment 10 errata-xmlrpc 2018-04-10 09:15:56 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0776
