| Summary: | thin pool meta device can only be corrupted and repaired once | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | device-mapper-persistent-data | Assignee: | Peter Rajnoha <prajnoha> |
| Status: | CLOSED ERRATA | QA Contact: | Bruno Goncalves <bgoncalv> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.8 | CC: | agk, bgoncalv, cmarthal, heinzm, jbrassow, msnitzer, prajnoha, prockai, rbednar, thornber, tlavigne, zkabelac |
| Target Milestone: | rc | Flags: | cmarthal: needinfo? (thornber) |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | device-mapper-persistent-data-0.6.2-0.1.rc5.el6 | Doc Type: | Bug Fix |
| Doc Text: | Intra-release bug, no documentation needed. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-05-11 01:12:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Corey Marthaler
2016-01-27 23:19:01 UTC
This was a regression in the v0.6.x series.

Test case in dmtest: https://github.com/jthornber/device-mapper-test-suite/commit/a4cee3b2844441cbd6f9b3f331242105f3b94299
Fix to thinp-tools: https://github.com/jthornber/thin-provisioning-tools/commit/2815aeace9df510814c2e5d78b3a2ef398440501

With v0.6.2-rc2 onwards, the corruption doesn't appear to be detected by thin_check the second time. Let me know if I'm doing something wrong here, but the test still fails for me, just not in the same way. The corruption isn't detected, and (I assume as a result of that) the repair to a new device doesn't happen.

```
# Second iteration of this test...

*** Swap corrupt pool metadata iteration 2 ***
Current tmeta device: /dev/sdb1
Corrupting pool meta device (/dev/mapper/snapper_thinp-POOL_tmeta)
dd if=/dev/urandom of=/dev/mapper/snapper_thinp-POOL_tmeta count=512 seek=4096 bs=1
512+0 records in
512+0 records out
512 bytes (512 B) copied, 0.00182352 s, 281 kB/s

Sanity checking pool device (POOL) metadata
lvcreate -L 4M -n meta_swap snapper_thinp
  WARNING: Sum of all thin volume sizes (7.25 GiB) exceeds the size of thin pools (1.00 GiB)!
lvconvert --yes --thinpool snapper_thinp/POOL --poolmetadata snapper_thinp/meta_swap
lvchange -ay snapper_thinp/meta_swap

### NOT FOUND TO BE CORRUPT
thin_check /dev/mapper/snapper_thinp-meta_swap
examining superblock
examining devices tree
examining mapping tree
checking space map counts
meta data NOT corrupt

lvchange -an snapper_thinp/meta_swap
lvconvert --yes --thinpool snapper_thinp/POOL --poolmetadata snapper_thinp/meta_swap
lvremove snapper_thinp/meta_swap
lvchange -ay --yes --select 'lv_name=POOL || pool_lv=POOL'

Swap in new _tmeta device using lvconvert --repair
lvconvert --yes --repair snapper_thinp/POOL /dev/sda1
  Only inactive pool can be repaired.

[root@host-118 ~]# vgchange -an snapper_thinp
  0 logical volume(s) in volume group "snapper_thinp" now active

[root@host-118 ~]# lvs -a -o +devices
  LV              VG            Attr       LSize Pool Origin Devices
  POOL            snapper_thinp twi---tz-- 1.00g             POOL_tdata(0)
  [POOL_tdata]    snapper_thinp Twi------- 1.00g             /dev/sdb1(1)
  [POOL_tmeta]    snapper_thinp ewi------- 4.00m             /dev/sdb1(0)
  [lvol0_pmspare] snapper_thinp ewi------- 4.00m             /dev/sdb1(258)
  origin          snapper_thinp Vwi---tz-- 1.00g POOL
  other1          snapper_thinp Vwi---tz-- 1.00g POOL
  other2          snapper_thinp Vwi---tz-- 1.00g POOL
  other3          snapper_thinp Vwi---tz-- 1.00g POOL
  other4          snapper_thinp Vwi---tz-- 1.00g POOL
  other5          snapper_thinp Vwi---tz-- 1.00g POOL
  snap            snapper_thinp Vwi---tz-- 1.25g POOL origin

[root@host-118 ~]# lvconvert --yes --repair snapper_thinp/POOL /dev/sda1
  WARNING: recovery of pools without pool metadata spare LV is not automated.
  WARNING: If everything works, remove "snapper_thinp/POOL_meta0".
  WARNING: Use pvmove command to move "snapper_thinp/POOL_tmeta" on the best fitting PV.

### Still on the same device sdb1
[root@host-118 ~]# lvs -a -o +devices
  LV           VG            Attr       LSize Pool Origin Devices
  POOL         snapper_thinp twi---tz-- 1.00g             POOL_tdata(0)
  POOL_meta0   snapper_thinp -wi------- 4.00m             /dev/sdb1(0)
  [POOL_tdata] snapper_thinp Twi------- 1.00g             /dev/sdb1(1)
  [POOL_tmeta] snapper_thinp ewi------- 4.00m             /dev/sdb1(258)
  origin       snapper_thinp Vwi---tz-- 1.00g POOL
  other1       snapper_thinp Vwi---tz-- 1.00g POOL
  other2       snapper_thinp Vwi---tz-- 1.00g POOL
  other3       snapper_thinp Vwi---tz-- 1.00g POOL
  other4       snapper_thinp Vwi---tz-- 1.00g POOL
  other5       snapper_thinp Vwi---tz-- 1.00g POOL
  snap         snapper_thinp Vwi---tz-- 1.25g POOL origin
```
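For reference, here is the swap-and-check cycle from the transcript above consolidated into one sketch. All commands and names (snapper_thinp, POOL, meta_swap) are taken verbatim from this reproducer; error handling is omitted.

```bash
# Offline check of a thin pool's metadata via a temporary swap LV.
# The pool must be inactive while its metadata is swapped out.

# 1. Create a temporary LV and swap the pool's tmeta into it.
lvcreate -L 4M -n meta_swap snapper_thinp
lvconvert --yes --thinpool snapper_thinp/POOL --poolmetadata snapper_thinp/meta_swap

# 2. Activate the swapped-out metadata LV and check it offline.
lvchange -ay snapper_thinp/meta_swap
thin_check /dev/mapper/snapper_thinp-meta_swap

# 3. Swap the metadata back and remove the temporary LV.
lvchange -an snapper_thinp/meta_swap
lvconvert --yes --thinpool snapper_thinp/POOL --poolmetadata snapper_thinp/meta_swap
lvremove snapper_thinp/meta_swap
```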
I think the test is not very deterministic. The metadata has a layout, so it depends on runtime behaviour where the various blocks of the stored btree land in sectors. If thin_repair fixes the initial corruption onto a completely new device, that device has a new layout of btree blocks, so it is then a different data set. Applying exactly the 'same' corruption might or might not hit a btree block that is actually in use.

You may also have noticed that 'lvconvert --repair' does not do cross-validation between kernel and lvm2 metadata. So if the kernel loses a device, this is not reflected in the lvm2 metadata, and lvm2 may reference a device ID that has been 'erased' from the kernel metadata. It will take some time until this all works automatically; for now it is a task for a human operator to figure out.

This could explain why your second corruption attempt ends with '### NOT FOUND TO BE CORRUPT': the data set after the first repair (and possibly after removing something) is different, so your 'dd' may be erasing something that is unused.

To apply 'exactly' the same corruption you would need a tool that exposes the metadata layout - in effect a second level of thin_dump, a thin_dump of the metadata layout - so that you would know, for example, where all btree blocks related to device ID X are located and could deliberately 'hit' something (a rough stop-gap using only the existing tools is sketched after this comment). Someone on the linux-lvm list has created an extension to the existing thin tools in that direction, so we will see how that could be integrated. As of now, thin_repair has limited repair capabilities, especially if you damage core pool blocks (there are no backups as in the extX filesystems).

It did pass on our tests with:

```
device-mapper-persistent-data-0.6.2-0.1.rc5.el6
kernel 2.6.32-621.el6
device-mapper-multipath-0.4.9-92.el6
lvm2-2.02.143-1.el6

# dmtest run --profile mytest --suite thin-provisioning -t /ToolsTests/
Loaded suite thin-provisioning
ToolsTests
  metadata_snap_stress1...PASS
  metadata_snap_stress2...iteration 0 iteration 1 iteration 2 iteration 3 iteration 4
    iteration 5 iteration 6 iteration 7 iteration 8 iteration 9 PASS
  thin_dump_a_metadata_snap_of_an_active_pool...PASS
  thin_ls...PASS
  thin_repair_repeatable...PASS
  you_can_dump_a_metadata_snapshot...PASS
  you_cannot_dump_live_metadata...PASS
  you_cannot_run_thin_check_on_live_metadata...PASS
  you_cannot_run_thin_restore_on_a_live_metadata...PASS
```

I am starting to think - as Joe is writing the new tool 'thin_generate_metadata' - that maybe such a tool could be enhanced with 'intelligent' placement of errors into the generated metadata. Then it would be possible to create repairable metadata in a repeatable way.
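Until such a layout-aware tool exists, a rough way to tell whether a given dd actually landed on live structures is to compare thin_dump output before and after corrupting the swapped-out metadata LV. This is only a sketch: it uses the names from this reproducer, it requires the metadata to be swapped out first (thin_dump cannot read live metadata), and it only notices damage in blocks that thin_dump visits.

```bash
# Dump the (swapped-out, activated) metadata before and after the dd.
thin_dump /dev/mapper/snapper_thinp-meta_swap > before.xml

dd if=/dev/urandom of=/dev/mapper/snapper_thinp-meta_swap count=512 seek=4096 bs=1

# A corrupted in-use block breaks its checksum, so the dump should fail;
# identical dumps suggest the dd hit unused space.
if ! thin_dump /dev/mapper/snapper_thinp-meta_swap > after.xml; then
    echo "dump failed: corruption hit live metadata"
elif ! cmp -s before.xml after.xml; then
    echo "mappings changed: corruption hit live metadata"
else
    echo "dumps identical: corruption probably landed in unused space"
fi
```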
> dd if=/dev/urandom of=/dev/mapper/snapper_thinp-POOL_tmeta count=512 seek=4096 bs=1

Is this the 'corruption' every time?
You need to check that the area you are changing (count=512 seek=4096 bs=1) holds actual metadata and is not just unused space. If it is unused, then it does not count as corruption. If it is in use then you would hope the checksum covering that area changes and gets noticed.
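A minimal sketch of such a check, assuming the device path and offsets from the reproducer: read the target region first and treat an all-zero region as probably unused. (Non-zero data is only a rough proxy for "in use"; stale blocks from an earlier layout also contain data, so this can give false positives.)

```bash
TMETA=/dev/mapper/snapper_thinp-POOL_tmeta

# Count the non-zero bytes in the region about to be overwritten.
NONZERO=$(dd if="$TMETA" bs=1 skip=4096 count=512 2>/dev/null | tr -d '\0' | wc -c)

if [ "$NONZERO" -eq 0 ]; then
    echo "target region is all zeros; likely unused, so thin_check may rightly pass"
else
    # The region holds data, so the overwrite has a chance of being a real corruption.
    dd if=/dev/urandom of="$TMETA" bs=1 seek=4096 count=512
fi
```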
Let's mark this bug verified for the case where an mda backup is taken prior to the corruption; in that case, repair appears to work over and over. For the case where a backup isn't taken and 'lvconvert --repair' will be used (the original intent of this bz), there are still two remaining issues:

1. a reliable way to find and corrupt, on each iteration, the exact area where the mda resides.
2. a bug to be filed about the fact that without lvmetad running, and with the mda verified to be corrupted, the repair will still fail on the first iteration (see comment #11).

Filed bug 1319937 for issue 2 listed above.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0960.html