Red Hat Bugzilla – Bug 1255552
thin_restore/cache_restore shouldn't corrupt a meta device if given a non existent file to restore from
Last modified: 2018-04-10 09:16:51 EDT
Description of problem:

[root@host-110 ~]# lvcreate --thinpool POOL --zero y -L 1G --poolmetadatasize 4M snapper_thinp
  Logical volume "POOL" created.
[root@host-110 ~]# lvcreate --virtualsize 1G -T snapper_thinp/POOL -n origin
  Logical volume "origin" created.
[root@host-110 ~]# lvs -a -o +devices
  LV              Attr       LSize Pool Origin Data%  Meta%  Devices
  POOL            twi-aotz-- 1.00g             0.00   1.07   POOL_tdata(0)
  [POOL_tdata]    Twi-ao---- 1.00g                           /dev/sda1(1)
  [POOL_tmeta]    ewi-ao---- 4.00m                           /dev/sdh1(0)
  [lvol0_pmspare] ewi------- 4.00m                           /dev/sda1(0)
  origin          Vwi-a-tz-- 1.00g POOL        0.00

[root@host-110 ~]# lvchange -an --yes --select 'lv_name=POOL || pool_lv=POOL'
[root@host-110 ~]# lvcreate -L 4M -n meta_swap snapper_thinp
  Logical volume "meta_swap" created.
[root@host-110 ~]# lvconvert --yes --thinpool snapper_thinp/POOL --poolmetadata snapper_thinp/meta_swap
  Converted snapper_thinp/POOL to thin pool.
[root@host-110 ~]# lvchange -ay snapper_thinp/meta_swap
[root@host-110 ~]# lvs -a -o +devices
  LV              Attr       LSize   Pool Origin Data%  Meta%  Devices
  root            -wi-ao----   6.67g                           /dev/vda2(205)
  swap            -wi-ao---- 820.00m                           /dev/vda2(0)
  POOL            twi---tz--   1.00g                           POOL_tdata(0)
  [POOL_tdata]    Twi-------   1.00g                           /dev/sda1(1)
  [POOL_tmeta]    ewi-------   4.00m                           /dev/sda1(257)
  [lvol0_pmspare] ewi-------   4.00m                           /dev/sda1(0)
  meta_swap       -wi-a-----   4.00m                           /dev/sdh1(0)
  origin          Vwi---tz--   1.00g POOL

[root@host-110 ~]# thin_check /dev/mapper/snapper_thinp-meta_swap
examining superblock
examining devices tree
examining mapping tree
checking space map counts

[root@host-110 ~]# thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/snapper_thinp-meta_swap
Couldn't stat file

[root@host-110 ~]# thin_check /dev/mapper/snapper_thinp-meta_swap
examining superblock
examining devices tree
examining mapping tree
  missing all mappings for devices: [0, -]
    value size mismatch: expected 8, but got 16. This is not the btree you are looking for.
Version-Release number of selected component (if applicable):
3.10.0-306.el7.x86_64
lvm2-2.02.128-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
lvm2-libs-2.02.128-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
lvm2-cluster-2.02.128-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
device-mapper-1.02.105-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
device-mapper-libs-1.02.105-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
device-mapper-event-1.02.105-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
device-mapper-event-libs-1.02.105-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
device-mapper-persistent-data-0.5.5-1.el7    BUILT: Thu Aug 13 09:58:10 CDT 2015
cmirror-2.02.128-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
sanlock-3.2.4-1.el7    BUILT: Fri Jun 19 12:48:49 CDT 2015
sanlock-lib-3.2.4-1.el7    BUILT: Fri Jun 19 12:48:49 CDT 2015
lvm2-lockd-2.02.128-1.el7    BUILT: Tue Aug 18 03:45:17 CDT 2015
I did some digging and this is what I came up with:

#this LEADS to error
vgcreate vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
lvchange -an vgtest/thinvol
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata vgtest/swapvol
lvchange -ay vgtest/swapvol
thin_check /dev/mapper/vgtest-swapvol
thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol
# examining superblock
# examining devices tree
# examining mapping tree
#   missing all mappings for devices: [0, -]
#     value size mismatch: expected 8, but got 16. This is not the btree you are looking for. (block 4)

#this LEADS to error, but different
vgcreate vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
mkfs.ext4 /dev/vgtest/thinvol
lvchange -an vgtest/thinvol
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata vgtest/swapvol
lvchange -ay vgtest/swapvol
thin_check /dev/mapper/vgtest-swapvol
thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol
# examining superblock
# examining devices tree
# examining mapping tree
# checking space map counts
#   bad checksum in space map bitmap

#this does NOT lead to error
vgcreate vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
mkfs.ext4 /dev/vgtest/thinvol
lvchange -an vgtest/thinvol
lvcreate -T -V 10 vgtest/thinpool -n thinvol1
mkfs.ext4 /dev/vgtest/thinvol1
lvchange -an vgtest/thinvol1
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata vgtest/swapvol
lvchange -ay vgtest/swapvol
thin_check /dev/mapper/vgtest-swapvol
thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol
# examining superblock
# examining devices tree
# examining mapping tree
# checking space map counts

#this LEADS to error
vgcreate vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
lvchange -an vgtest/thinvol
lvcreate -T -V 10 vgtest/thinpool -n thinvol1
lvchange -an vgtest/thinvol1
mkfs.ext4 /dev/vgtest/thinvol1
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata vgtest/swapvol
lvchange -ay vgtest/swapvol
thin_check /dev/mapper/vgtest-swapvol
thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol
# examining superblock
# examining devices tree
# examining mapping tree
#   thin device 1 is missing mappings [0, -]
#     bad checksum in btree node (block 1)
#   thin device 2 is missing mappings [0, -]
#     value size mismatch: expected 8, but got 24. This is not the btree you are looking for. (block 6)

#this LEADS to error
vgcreate vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
mkfs.ext4 /dev/vgtest/thinvol
lvchange -an vgtest/thinvol
lvcreate -T -V 10 vgtest/thinpool -n thinvol1
lvchange -an vgtest/thinvol1
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata vgtest/swapvol
lvchange -ay vgtest/swapvol
thin_check /dev/mapper/vgtest-swapvol
thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol
# examining superblock
# examining devices tree
#   missing devices: [0, -]
#     value size mismatch: expected 24, but got 8. This is not the btree you are looking for. (block 8)
# examining mapping tree
#   missing all mappings for devices: [0, -]
#     value size mismatch: expected 8, but got 16. This is not the btree you are looking for. (block 4)

#this LEADS to error
vgcreate vgtest /dev/loop1
lvcreate -T -L 100 vgtest -n thinpool
lvcreate -T -V 10 vgtest/thinpool -n thinvol
lvchange -an vgtest/thinvol
lvcreate -T -V 10 vgtest/thinpool -n thinvol1
lvchange -an vgtest/thinvol1
lvchange -an vgtest/thinpool
lvcreate -L 100 vgtest -n swapvol
lvchange -an vgtest/swapvol
lvconvert -y --thinpool vgtest/thinpool --poolmetadata vgtest/swapvol
lvchange -ay vgtest/swapvol
thin_check /dev/mapper/vgtest-swapvol
thin_restore -i /tmp/snapper_thinp_non_exist.14770 -o /dev/mapper/vgtest-swapvol
thin_check /dev/mapper/vgtest-swapvol
# examining superblock
# examining devices tree
# examining mapping tree
#   thin device 1 is missing mappings [0, -]
#     bad checksum in btree node (block 1)
#   thin device 2 is missing mappings [0, -]
#     value size mismatch: expected 8, but got 24. This is not the btree you are looking for. (block 6)

So basically, if the thin LVs have mapped blocks (created by making a filesystem on them), this error does not happen. There have to be at least 2 thin LVs, though; otherwise it leads to "bad checksum in space map bitmap". And if there are several LVs and one of them does not have a filesystem, it leads to this corruption. Hope this helps a bit.
cache_restore does the same:

[root@storageqe-21 cache]# cache_check /dev/mapper/vgtest-swapvol
examining superblock
examining mapping array
examining hint array
examining discard bitset

[root@storageqe-21 cache]# cache_restore -i wrong -o /dev/mapper/vgtest-swapvol
Couldn't stat file

[root@storageqe-21 cache]# cache_check /dev/mapper/vgtest-swapvol
examining superblock
examining mapping array
  missing mappings [1, 16384]:
    missing blocks
examining hint array
examining discard bitset
The restore tools are destructive in that they overwrite the metadata device/file that you specify for output (much as 'dd' does). They will not magically restore data that they have overwritten in the event of an error (for instance, badly formatted XML). Also, if there is an error the tools just exit; they do not try to wrap up any partially written metadata.

I have made 2 changes that I hope will reduce any confusion:

i) If the input file is missing, the metadata dev will be untouched.
ii) If there is an error after the metadata has started being written, the tool will zero the superblock. This makes it clear that the partially restored metadata is not to be used, and is not expected to pass any *_check tool.

Changes in this patch:
https://github.com/jthornber/thin-provisioning-tools/commit/5b92f410eca3121079418659eb81ca2a4e372807

Tested with this patch:
https://github.com/jthornber/thin-provisioning-tools/commit/f018e6ecf705e8c498437e15ec3b09bd84b2e3c4
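
For anyone stuck on an older build, the two behaviours described in this comment can be approximated with a small shell wrapper. This is only a sketch under stated assumptions, not the code from the linked commits: `guarded_restore` is a hypothetical name, the restore command is passed in as the first argument, and the superblock is assumed to live in the first 4096-byte block of the output.

```shell
# Hypothetical wrapper, not part of thin-provisioning-tools.
# Usage: guarded_restore <restore-cmd> <input-xml> <output-dev>
guarded_restore() {
    local tool input output
    tool=$1
    input=$2
    output=$3

    # (i) Refuse to start when the input is missing or unreadable,
    #     so the restore command never gets to open the output.
    if [ ! -r "$input" ]; then
        echo "input '$input' is not readable; '$output' left untouched" >&2
        return 1
    fi

    # (ii) If the restore command fails, zero the first 4 KiB of the output
    #      (ASSUMED superblock location) so that partially written metadata
    #      cannot be mistaken for valid metadata.
    if ! "$tool" -i "$input" -o "$output"; then
        dd if=/dev/zero of="$output" bs=4096 count=1 conv=notrunc 2>/dev/null
        return 2
    fi
}
```

Example call, using the device name from the earlier comments: `guarded_restore thin_restore /tmp/metadata.xml /dev/mapper/vgtest-swapvol` (the XML path is a placeholder). The wrapper returns 1 when nothing was written and 2 when the output was deliberately invalidated, so a caller can tell the two cases apart.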
Fixed in device-mapper-persistent-data-0.7.3-1.el7.

# thin_restore -i adsfgas -o /dev/mapper/vgtest-swapvol
Couldn't stat file
# thin_check /dev/mapper/vgtest-swapvol
examining superblock
examining devices tree
examining mapping tree
checking space map counts

# cache_restore -i adsfgas -o /dev/mapper/vgtest-swapvol
Couldn't stat file
# cache_check /dev/mapper/vgtest-swapvol
examining superblock
examining mapping array
examining hint array
examining discard bitset
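
A stricter variant of this verification compares the device contents before and after the failed restore, rather than relying on the *_check tools alone. The helper below is a hypothetical sketch (the name `unchanged_after` is made up for illustration); it only assumes that the target can be read with `cksum`.

```shell
# Hypothetical helper: succeeds only if <target> is byte-identical
# before and after running the given command.
# Usage: unchanged_after <target> <command> [args...]
unchanged_after() {
    local target before after
    target=$1
    shift

    before=$(cksum < "$target")
    # The command's own exit status is deliberately ignored here;
    # only the effect on the target matters.
    "$@" >/dev/null 2>&1
    after=$(cksum < "$target")

    [ "$before" = "$after" ]
}
```

With the names used above, a call might look like `unchanged_after /dev/mapper/vgtest-swapvol thin_restore -i adsfgas -o /dev/mapper/vgtest-swapvol`. A non-zero result would mean the metadata device was modified even though the input file did not exist.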
Joe, please check BZ1499781. During this testing I found that the same happens with thin_repair/cache_repair, so I created that new bug. Thanks.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:0776