Bug 2214253

Summary: [Regression] thin_repair doesn't choose the most recent record while rebuilding superblock after updating DMPD to 1.0.4
Product: Red Hat Enterprise Linux 9
Reporter: Filip Suba <fsuba>
Component: device-mapper-persistent-data
Assignee: Ming-Hung Tsai <mtsai>
Status: CLOSED ERRATA
QA Contact: Filip Suba <fsuba>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 9.3
CC: agk, heinzm, lvm-team, mcsontos, msnitzer, mtsai, thornber
Target Milestone: rc
Keywords: Triaged
Target Release: ---
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: device-mapper-persistent-data-1.0.6-1.el9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-11-07 08:56:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Filip Suba 2023-06-12 10:59:48 UTC
Description of problem:
When running the reproducer from BZ#2020662, the snapshot is not repaired. This issue appears to have been reintroduced in device-mapper-persistent-data-1.0.4, which contains the tools rewritten in Rust.

Version-Release number of selected component (if applicable):
device-mapper-persistent-data-1.0.4-1.el9

How reproducible:
always


Steps to Reproduce:
+ lvcreate vg1 --type thin-pool --thinpool tp2 --size 64m --poolmetadatasize 4m -Zn --poolmetadataspare=n
  Thin pool volume with chunk size 64.00 KiB can address at most <15.88 TiB of data.
  WARNING: recovery of pools without pool metadata spare LV is not automated.
  Logical volume "tp2" created.
+ lvcreate vg1 --type thin --thinpool tp2 --name lv2 --virtualsize 16m
  Logical volume "lv2" created.
+ dd if=/dev/zero of=/dev/mapper/vg1-lv2 bs=1M count=4
4+0 records in
4+0 records out
4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.0247404 s, 170 MB/s
+ lvcreate vg1/lv2 --snapshot --name snap201
  Logical volume "snap201" created.
+ lvchange -an vg1/lv2
+ lvchange -an vg1/tp2
+ lvchange -ay vg1/tp2_tmeta -y
  Allowing activation of component LV.
+ dd if=/dev/mapper/vg1-tp2_tmeta of=src.bin
8192+0 records in
8192+0 records out
4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.0279756 s, 150 MB/s
+ thin_dump src.bin
<superblock uuid="" time="1" transaction="2" version="2" data_block_size="128" nr_data_blocks="1024">
  <def name="17">
    <range_mapping origin_begin="0" data_begin="0" length="64" time="0"/>
  </def>
  <device dev_id="1" mapped_blocks="64" transaction="0" creation_time="0" snap_time="1">
    <ref name="17"/>
  </device>
  <device dev_id="2" mapped_blocks="64" transaction="1" creation_time="1" snap_time="1">
    <ref name="17"/>
  </device>
</superblock>
+ lvchange -an vg1/tp2_tmeta -y
+ dd if=/dev/zero of=src.bin bs=4K count=1 conv=notrunc
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.00010606 s, 38.6 MB/s
+ thin_dump src.bin --repair --transaction-id 2 --data-block-size 128 --nr-data-blocks 1024
<superblock uuid="" time="0" transaction="2" version="2" data_block_size="128" nr_data_blocks="1024">
  <device dev_id="1" mapped_blocks="0" transaction="0" creation_time="0" snap_time="0">
  </device>
</superblock>

Actual results:
The snapshot is not repaired. The rebuilt output contains only an empty dev_id 1; the snapshot device (dev_id 2) and its mappings are missing:

<superblock uuid="" time="0" transaction="2" version="2" data_block_size="128" nr_data_blocks="1024">
  <device dev_id="1" mapped_blocks="0" transaction="0" creation_time="0" snap_time="0">
  </device>
</superblock>


Expected results:
The output should be identical to that of the source metadata:

<superblock uuid="" time="1" transaction="2" version="2" data_block_size="128" nr_data_blocks="1024">
  <def name="17">
    <range_mapping origin_begin="0" data_begin="0" length="64" time="0"/>
  </def>
  <device dev_id="1" mapped_blocks="64" transaction="0" creation_time="0" snap_time="1">
    <ref name="17"/>
  </device>
  <device dev_id="2" mapped_blocks="64" transaction="1" creation_time="1" snap_time="1">
    <ref name="17"/>
  </device>
</superblock>

Additional info:
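For reference, the same rebuild can be written out with thin_repair instead of thin_dump --repair, so the result can be copied back to the metadata LV. A minimal sketch, assuming the src.bin image and override values from the transcript above; repaired.bin is only an illustrative name and the override options are assumed to behave as they do for thin_dump:

```
# thin_repair writes into an existing image, so preallocate one as large as the metadata LV.
dd if=/dev/zero of=repaired.bin bs=1M count=4
# Rebuild the metadata, supplying the same superblock overrides used above.
thin_repair -i src.bin -o repaired.bin \
    --transaction-id 2 --data-block-size 128 --nr-data-blocks 1024
# Sanity-check and inspect the result; both dev_id 1 and dev_id 2 should be present.
thin_check repaired.bin
thin_dump repaired.bin
```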

Comment 2 Ming-Hung Tsai 2023-07-11 12:21:06 UTC
Fixed upstream. Will be available in the next release.

https://github.com/jthornber/thin-provisioning-tools/commit/aa5a10af
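For anyone wanting to review the change locally, a sketch of pulling the upstream repository and viewing the commit (the module name is inferred from the unit tests in the next comment):

```
git clone https://github.com/jthornber/thin-provisioning-tools
cd thin-provisioning-tools
# Show the fix; the root-selection logic is expected to live in the thin::metadata_repair module.
git show aa5a10af
```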

Comment 3 Ming-Hung Tsai 2023-07-24 17:25:31 UTC
Shipped in upstream v1.0.5. Also added automated unit and functional tests (run via cargo test [--release]).

Unit tests:
```
test thin::metadata_repair::sorting_roots_tests::test_with_empty_lhs ... ok
test thin::metadata_repair::sorting_roots_tests::test_with_empty_rhs ... ok
test thin::metadata_repair::sorting_roots_tests::test_with_greater_counts_at_lhs ... ok
test thin::metadata_repair::sorting_roots_tests::test_with_greater_counts_at_rhs ... ok
test thin::metadata_repair::sorting_roots_tests::test_with_greater_time_at_lhs ... ok
test thin::metadata_repair::sorting_roots_tests::test_with_greater_time_at_rhs ... ok
test thin::metadata_repair::sorting_roots_tests::test_with_two_empty_sets ... ok
```

Functional tests for thin_dump:
```
test repair_metadata_with_empty_roots ... ok
```
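A sketch of how these tests could be re-run from an upstream checkout; the filter strings are simply the test names listed above:

```
# Run the new unit tests for the root-sorting logic.
cargo test --release thin::metadata_repair::sorting_roots_tests
# Run the functional test covering thin_dump --repair with empty roots.
cargo test --release repair_metadata_with_empty_roots
```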

Comment 6 Filip Suba 2023-08-24 14:08:52 UTC
Verified with device-mapper-persistent-data-1.0.6-1.el9.
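A quick way to confirm the fix, assuming the reproducer from the description is re-run with the fixed build so that src.bin again has a zeroed superblock:

```
# With 1.0.6 the repaired dump should once more contain both devices.
thin_dump src.bin --repair --transaction-id 2 --data-block-size 128 --nr-data-blocks 1024 \
    | grep -c '<device '
# Expected output: 2 (the origin dev_id 1 and the snapshot dev_id 2).
```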

Comment 8 errata-xmlrpc 2023-11-07 08:56:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (device-mapper-persistent-data bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6701