Bug 970798 - the pool *_tmeta device can only be corrupted and then restored once
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.0
Hardware: x86_64 Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Assigned To: Zdenek Kabelac
QA Contact: Cluster QE
Blocks: 1007074
Reported: 2013-06-04 18:47 EDT by Corey Marthaler
Modified: 2014-07-11 15:10 EDT
CC: 10 users

Fixed In Version: lvm2-2.02.103-4.el7
Doc Type: Bug Fix
Cloned To: 1007074
Last Closed: 2014-06-13 08:14:57 EDT
Type: Bug


Attachments: None
Description Corey Marthaler 2013-06-04 18:47:04 EDT
Description of problem:
I was looping through the following sequence: thin_dump, corrupt _tmeta, thin_check, thin_restore, then thin_check again, and found that these operations only work once or twice before problems occur.
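Each iteration ran roughly the following (a minimal sketch; the dump-file name is shortened, the device paths match the transcript below):

thin_dump /dev/mapper/snapper_thinp-POOL_tmeta > /tmp/dump.xml    # save the metadata
dd if=/dev/zero of=/dev/mapper/snapper_thinp-POOL_tmeta count=1   # corrupt the superblock
thin_check /dev/mapper/snapper_thinp-POOL_tmeta                   # expect "bad checksum in superblock"
thin_restore -i /tmp/dump.xml -o /dev/mapper/snapper_thinp-POOL_tmeta   # restore the saved metadata
thin_check /dev/mapper/snapper_thinp-POOL_tmeta                   # expect a clean check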

[root@qalvm-01 ~]# lvs -a -o +devices
  LV           VG            Attr      LSize  Pool Origin Data%  Devices
  POOL         snapper_thinp twi-a-tz-  5.00g               5.86 POOL_tdata(0)
  [POOL_tdata] snapper_thinp Twi-aot--  5.00g                    /dev/vdh1(0)
  [POOL_tmeta] snapper_thinp ewi-aot--  8.00m                    /dev/vdd1(0)
  origin       snapper_thinp Vwi-aotz-  1.00g POOL         29.05
  other1       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
  other2       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
  other3       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
  other4       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
  other5       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
  snap1        snapper_thinp Vwi-aotz-  1.00g POOL origin  14.66

[root@qalvm-01 ~]# df -h
Filesystem                        Size  Used Avail Use% Mounted on
/dev/mapper/snapper_thinp-origin 1014M  320M  695M  32% /mnt/origin
/dev/mapper/snapper_thinp-snap1  1014M  172M  843M  17% /mnt/snap1

[root@qalvm-01 ~]# thin_dump /dev/mapper/snapper_thinp-POOL_tmeta > /tmp/snapper_thinp_dump_1.61.9264
[root@qalvm-01 ~]# dd if=/dev/zero of=/dev/mapper/snapper_thinp-POOL_tmeta count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 2.6939e-05 s, 19.0 MB/s
[root@qalvm-01 ~]# thin_check /dev/mapper/snapper_thinp-POOL_tmeta
bad checksum in superblock
[root@qalvm-01 ~]# thin_restore -i /tmp/snapper_thinp_dump_1.61.9264 -o /dev/mapper/snapper_thinp-POOL_tmeta
superblock 0
flags 0
blocknr 0
transaction id 0
data mapping root 7
details root 6
data block size 128
metadata block size 4096
metadata nr blocks 2048
[root@qalvm-01 ~]# thin_check /dev/mapper/snapper_thinp-POOL_tmeta
[root@qalvm-01 ~]# sync


[root@qalvm-01 ~]# thin_dump /dev/mapper/snapper_thinp-POOL_tmeta > /tmp/snapper_thinp_dump_1.61.9265
[root@qalvm-01 ~]# dd if=/dev/zero of=/dev/mapper/snapper_thinp-POOL_tmeta count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 2.6454e-05 s, 19.4 MB/s
[root@qalvm-01 ~]# thin_check /dev/mapper/snapper_thinp-POOL_tmeta
bad checksum in superblock
[root@qalvm-01 ~]# thin_restore -i /tmp/snapper_thinp_dump_1.61.9264 -o /dev/mapper/snapper_thinp-POOL_tmeta
superblock 0
flags 0
blocknr 0
transaction id 0
data mapping root 7
details root 6
data block size 128
metadata block size 4096
metadata nr blocks 2048
[root@qalvm-01 ~]# thin_check /dev/mapper/snapper_thinp-POOL_tmeta
[root@qalvm-01 ~]# 
[root@qalvm-01 ~]# umount /mnt/*
[root@qalvm-01 ~]# vgchange -an snapper_thinp
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  Failed to get state of mapped device
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  Failed to get state of mapped device
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  Failed to get state of mapped device
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  Failed to get state of mapped device
[...]
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  0 logical volume(s) in volume group "snapper_thinp" now active


Jun  4 17:19:27 qalvm-01 kernel: [11658.571259] device-mapper: block manager: btree_node validator check failed for block 3
Jun  4 17:19:27 qalvm-01 kernel: [11658.572468] device-mapper: btree spine: node_check failed: csum 2848471990 != wanted 2848607790
Jun  4 17:19:27 qalvm-01 kernel: [11658.573669] device-mapper: block manager: btree_node validator check failed for block 3
Jun  4 17:19:27 qalvm-01 kernel: [11658.574887] device-mapper: btree spine: node_check failed: csum 2848471990 != wanted 2848607790
Jun  4 17:19:27 qalvm-01 kernel: [11658.576042] device-mapper: block manager: btree_node validator check failed for block 3
Jun  4 17:19:27 qalvm-01 kernel: [11658.577291] device-mapper: btree spine: node_check failed: csum 2848471990 != wanted 2848607790
Jun  4 17:19:27 qalvm-01 kernel: [11658.578528] device-mapper: block manager: btree_node validator check failed for block 3
Jun  4 17:19:27 qalvm-01 kernel: [11658.579898] device-mapper: btree spine: node_check failed: csum 2848471990 != wanted 2848607790


[root@qalvm-01 ~]# lvremove snapper_thinp
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  Failed to get state of mapped device
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  Failed to get state of mapped device
[...]
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  Huge memory allocation (size 67108864) rejected - metadata corruption?
  Couldn't create ioctl argument.
  Unable to deactivate logical volume "snap1"


Version-Release number of selected component (if applicable):
3.8.0-0.40.el7.x86_64

lvm2-2.02.99-0.39.el7    BUILT: Wed May 29 08:12:36 CDT 2013
lvm2-libs-2.02.99-0.39.el7    BUILT: Wed May 29 08:12:36 CDT 2013
lvm2-cluster-2.02.99-0.39.el7    BUILT: Wed May 29 08:12:36 CDT 2013
device-mapper-1.02.78-0.39.el7    BUILT: Wed May 29 08:12:36 CDT 2013
device-mapper-libs-1.02.78-0.39.el7    BUILT: Wed May 29 08:12:36 CDT 2013
device-mapper-event-1.02.78-0.39.el7    BUILT: Wed May 29 08:12:36 CDT 2013
device-mapper-event-libs-1.02.78-0.39.el7    BUILT: Wed May 29 08:12:36 CDT 2013
cmirror-2.02.99-0.39.el7    BUILT: Wed May 29 08:12:36 CDT 2013

How reproducible:
Often
Comment 1 Zdenek Kabelac 2013-06-04 19:02:05 EDT
I assume you've tried to use the tools on a live thin pool device - which is not how they are meant to work.

If you are attempting to 'repair' a metadata volume, you restore it into a separate new LV (which should be at least the size of the original _tmeta device).
(2nd note - this way you can also offline-resize a _tmeta device that is running out of free space.)

So this is not a bug in the thin tools - but rather a lack of clear documentation on how to deal with this.

In general, it is prohibited to write anything to the _tmeta device while the dm thin target is using it - that leads to the cruel death of the data locations stored on the device.

The proper way for recovery goes along this path (see the command sketch after these steps):

Create a new empty LV with at least the size of _tmeta.
Preferably keep the thin pool device unused (when thin_check detects an error you are left with an active _tmeta but an inactive pool - this is the best-case scenario for recovery).
The other way around is to combine/play with 'swapping' - explained later.

Now, when you are recovering _tmeta, you can pipe  thin_dump --repair | thin_restore  from one LV to the other.

Once finished, you can swap this 'recovered' _tmeta LV into the thin pool via:

lvconvert --poolmetadata NewLV  --thinpool ExistingThinPoolLV  (see man lvconvert(8)).

(You can swap in any LV - activation of a thin pool with incompatible metadata will fail or cause horrible damage - so be careful here...)
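As commands, the path above looks roughly like this (a sketch only - vg, pool and tmeta_new are placeholder names, and it assumes the best case above: _tmeta still active, pool not active):

lvcreate -n tmeta_new -L 8M vg                       # new LV, at least the size of the pool's _tmeta
thin_dump --repair /dev/mapper/vg-pool_tmeta > /tmp/meta.xml   # or pipe straight into thin_restore
thin_restore -i /tmp/meta.xml -o /dev/vg/tmeta_new   # write the repaired metadata to the new LV
lvconvert --poolmetadata vg/tmeta_new --thinpool vg/pool       # swap the repaired LV into the pool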

Can you confirm this is the issue? I guess this bug can be closed NOTABUG.
Comment 2 Corey Marthaler 2013-06-04 19:12:14 EDT
I'll rewrite the test case and see what happens. Thanks for the info!
Comment 3 Corey Marthaler 2013-06-05 14:58:06 EDT
I can't seem to make this work. What am I doing wrong here, and why is lvm letting me do it wrong? :)

[root@qalvm-01 ~]# lvs -a -o +devices
  LV           VG            Attr      LSize  Pool Origin Data%  Devices
  POOL         snapper_thinp twi-a-tz-  5.00g               6.10 POOL_tdata(0)
  [POOL_tdata] snapper_thinp Twi-aot--  5.00g                    /dev/vdh1(0)
  [POOL_tmeta] snapper_thinp ewi-aot--  8.00m                    /dev/vdd1(0)
  origin       snapper_thinp Vwi-aotz-  1.00g POOL         30.27
  other1       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
  other2       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
  other3       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
  other4       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
  other5       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
  snap1        snapper_thinp Vwi-aotz-  1.00g POOL origin  15.52

[root@qalvm-01 ~]# df -h 
Filesystem                        Size  Used Avail Use% Mounted on
[...]
/dev/mapper/snapper_thinp-origin 1014M  332M  683M  33% /mnt/origin
/dev/mapper/snapper_thinp-snap1  1014M  181M  834M  18% /mnt/snap1

[root@qalvm-01 ~]# dmsetup ls
snapper_thinp-origin    (253:6)
snapper_thinp-POOL      (253:5)
snapper_thinp-snap1     (253:12)
snapper_thinp-other5    (253:11)
snapper_thinp-other4    (253:10)
snapper_thinp-other3    (253:9)
snapper_thinp-POOL-tpool        (253:4)
snapper_thinp-POOL_tdata        (253:3)
snapper_thinp-other2    (253:8)
snapper_thinp-POOL_tmeta        (253:2)
snapper_thinp-other1    (253:7)

[root@qalvm-01 ~]# thin_dump /dev/mapper/snapper_thinp-POOL_tmeta > /tmp/snapper_thinp_dump_1.408.19872

[root@qalvm-01 ~]# umount /mnt/*

[root@qalvm-01 ~]# vgchange -an snapper_thinp
  0 logical volume(s) in volume group "snapper_thinp" now active

[root@qalvm-01 ~]# lvs -a -o +devices
  LV           VG            Attr      LSize  Pool Origin Devices
  POOL         snapper_thinp twi---tz-  5.00g             POOL_tdata(0)
  [POOL_tdata] snapper_thinp Twi---t--  5.00g             /dev/vdh1(0)
  [POOL_tmeta] snapper_thinp ewi---t--  8.00m             /dev/vdd1(0)
  origin       snapper_thinp Vwi---tz-  1.00g POOL
  other1       snapper_thinp Vwi---tz-  1.00g POOL
  other2       snapper_thinp Vwi---tz-  1.00g POOL
  other3       snapper_thinp Vwi---tz-  1.00g POOL
  other4       snapper_thinp Vwi---tz-  1.00g POOL
  other5       snapper_thinp Vwi---tz-  1.00g POOL
  snap1        snapper_thinp Vwi---tz-  1.00g POOL origin

[root@qalvm-01 ~]# dmsetup ls

[root@qalvm-01 ~]# lvcreate -n meta -L 8M snapper_thinp
  Logical volume "meta" created

[root@qalvm-01 ~]# lvs -a -o +devices
  LV           VG            Attr      LSize  Pool Origin Devices
  POOL         snapper_thinp twi---tz-  5.00g             POOL_tdata(0)
  [POOL_tdata] snapper_thinp Twi---t--  5.00g             /dev/vdh1(0)
  [POOL_tmeta] snapper_thinp ewi---t--  8.00m             /dev/vdd1(0)
  meta         snapper_thinp -wi-a----  8.00m             /dev/vdh1(1280)
  origin       snapper_thinp Vwi---tz-  1.00g POOL
  other1       snapper_thinp Vwi---tz-  1.00g POOL
  other2       snapper_thinp Vwi---tz-  1.00g POOL
  other3       snapper_thinp Vwi---tz-  1.00g POOL
  other4       snapper_thinp Vwi---tz-  1.00g POOL
  other5       snapper_thinp Vwi---tz-  1.00g POOL
  snap1        snapper_thinp Vwi---tz-  1.00g POOL origin

[root@qalvm-01 ~]#  thin_restore -i /tmp/snapper_thinp_dump_1.408.19872 -o /dev/snapper_thinp/meta
superblock 0
flags 0
blocknr 0
transaction id 0
data mapping root 7
details root 6
data block size 128
metadata block size 4096
metadata nr blocks 2048

[root@qalvm-01 ~]# lvconvert --poolmetadata /dev/snapper_thinp/meta --thinpool /dev/snapper_thinp/POOL
Do you want to swap metadata of snapper_thinp/POOL pool with volume snapper_thinp/meta? [y/n]: y
  Thin pool transaction_id=0, while expected: 6.
  Unable to deactivate open snapper_thinp-POOL_tmeta (253:2)
  Unable to deactivate open snapper_thinp-POOL_tdata (253:3)
  Failed to deactivate snapper_thinp-POOL-tpool
  Failed to activate pool logical volume snapper_thinp/POOL.
  Device snapper_thinp-POOL_tdata (253:3) is used by another device.
  Failed to deactivate pool data logical volume.
  Device snapper_thinp-POOL_tmeta (253:2) is used by another device.
  Failed to deactivate pool metadata logical volume.

[root@qalvm-01 ~]# dmsetup ls
snapper_thinp-POOL      (253:5)
snapper_thinp-POOL-tpool        (253:4)
snapper_thinp-POOL_tdata        (253:3)
snapper_thinp-POOL_tmeta        (253:2)

[root@qalvm-01 ~]# lvchange -an snapper_thinp
[root@qalvm-01 ~]# lvchange -ay snapper_thinp
  Thin pool transaction_id=0, while expected: 6.
  Unable to deactivate open snapper_thinp-POOL_tmeta (253:2)
  Unable to deactivate open snapper_thinp-POOL_tdata (253:3)
  Failed to deactivate snapper_thinp-POOL-tpool
  device-mapper: reload ioctl on  failed: No data available
  device-mapper: reload ioctl on  failed: No data available
  device-mapper: reload ioctl on  failed: No data available
  device-mapper: reload ioctl on  failed: No data available
  device-mapper: reload ioctl on  failed: No data available
  device-mapper: reload ioctl on  failed: No data available
  device-mapper: reload ioctl on  failed: No data available

[root@qalvm-01 ~]#  lvconvert --poolmetadata /dev/snapper_thinp/meta --thinpool /dev/snapper_thinp/POOL
Do you want to swap metadata of snapper_thinp/POOL pool with volume snapper_thinp/meta? [y/n]: y
  Converted snapper_thinp/POOL to thin pool.

# the convert above supposedly worked, but the tmeta device is still on /dev/vdd1, and not on /dev/vdh1?
[root@qalvm-01 ~]# lvs -a -o +devices
  dm_report_object: report function failed for field data_percent
  dm_report_object: report function failed for field data_percent
  dm_report_object: report function failed for field data_percent
  dm_report_object: report function failed for field data_percent
  dm_report_object: report function failed for field data_percent
  dm_report_object: report function failed for field data_percent
  dm_report_object: report function failed for field data_percent
  One or more specified logical volume(s) not found.
  LV           VG            Attr      LSize  Pool Origin Data%  Devices
  POOL         snapper_thinp twi-a-tz-  5.00g               6.10 POOL_tdata(0)
  [POOL_tdata] snapper_thinp Twi-aot--  5.00g                    /dev/vdh1(0)
  [POOL_tmeta] snapper_thinp ewi-aot--  8.00m                    /dev/vdd1(0)
  meta         snapper_thinp -wi------  8.00m                    /dev/vdh1(1280)
  origin       snapper_thinp Vwi-d-tz-  1.00g POOL
  other1       snapper_thinp Vwi-d-tz-  1.00g POOL
  other2       snapper_thinp Vwi-d-tz-  1.00g POOL
  other3       snapper_thinp Vwi-d-tz-  1.00g POOL
  other4       snapper_thinp Vwi-d-tz-  1.00g POOL
  other5       snapper_thinp Vwi-d-tz-  1.00g POOL
  snap1        snapper_thinp Vwi-d-tz-  1.00g POOL origin
Comment 4 Zdenek Kabelac 2013-06-05 16:16:57 EDT
(In reply to Corey Marthaler from comment #3)
> I can't seem to make this work. What am I doing wrong here, and why is lvm
> letting me do it wrong? :)

In some cases I don't know ;) but in general, the 'intelligent' recovery is yet to be written - so far we only have helper tools allowing 'assisted' recovery, and these expose some dangerous-to-modify internals of the thin pool.

> 
> [root@qalvm-01 ~]# lvs -a -o +devices
>   LV           VG            Attr      LSize  Pool Origin Data%  Devices
>   POOL         snapper_thinp twi-a-tz-  5.00g               6.10
> POOL_tdata(0)
>   [POOL_tdata] snapper_thinp Twi-aot--  5.00g                    /dev/vdh1(0)
>   [POOL_tmeta] snapper_thinp ewi-aot--  8.00m                    /dev/vdd1(0)
>   origin       snapper_thinp Vwi-aotz-  1.00g POOL         30.27
>   other1       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
>   other2       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
>   other3       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
>   other4       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
>   other5       snapper_thinp Vwi-a-tz-  1.00g POOL          0.00
>   snap1        snapper_thinp Vwi-aotz-  1.00g POOL origin  15.52
> 
> [root@qalvm-01 ~]# thin_dump /dev/mapper/snapper_thinp-POOL_tmeta >
> /tmp/snapper_thinp_dump_1.408.19872

Running thin_dump on a 'live' thin pool isn't really a good idea,
especially if something is causing changes to the content
of the metadata.

The easiest way to access the metadata is to deactivate the thin pool,
swap _tmeta with some temporary volume, then activate this swapped
LV as a normal LV and read the data from there (sketched below).
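Roughly like this (a sketch - names are placeholders; it relies on the swap semantics of lvconvert --poolmetadata, where the old _tmeta content ends up in the named LV):

vgchange -an vg                                      # pool and all thin volumes inactive
lvcreate -n tmp -L 8M vg                             # temporary LV, same size as _tmeta
lvconvert --poolmetadata vg/tmp --thinpool vg/pool   # swap: old metadata now lives in vg/tmp
lvchange -ay vg/tmp
thin_dump /dev/vg/tmp > /tmp/meta.xml                # read the metadata as from a normal LV
lvchange -an vg/tmp
lvconvert --poolmetadata vg/tmp --thinpool vg/pool   # swap the original metadata back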


> [root@qalvm-01 ~]# lvconvert --poolmetadata /dev/snapper_thinp/meta
> --thinpool /dev/snapper_thinp/POOL
> Do you want to swap metadata of snapper_thinp/POOL pool with volume
> snapper_thinp/meta? [y/n]: y
>   Thin pool transaction_id=0, while expected: 6.
>   Unable to deactivate open snapper_thinp-POOL_tmeta (253:2)
>   Unable to deactivate open snapper_thinp-POOL_tdata (253:3)
>   Failed to deactivate snapper_thinp-POOL-tpool
>   Failed to activate pool logical volume snapper_thinp/POOL.
>   Device snapper_thinp-POOL_tdata (253:3) is used by another device.
>   Failed to deactivate pool data logical volume.
>   Device snapper_thinp-POOL_tmeta (253:2) is used by another device.
>   Failed to deactivate pool metadata logical volume.

hmmm - and this is interesting - the recovered data on your meta volume
is not in 'sync' with the lvm metadata.
The kernel dm-thin data has transaction_id 0 - but the lvm2 metadata expects 6.
(An intelligent recovery here would have a number of ways to resolve this issue.)
You must select which version gets fixed to match -
either the kernel data or the lvm data.
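(For the lvm2 side, one possible path is editing the VG metadata backup - a sketch only, not something done in this report, and risky; newer lvm2 may require --force for vgcfgrestore:

vgcfgbackup -f /tmp/vg.txt snapper_thinp    # dump the current lvm2 metadata to a text file
# edit /tmp/vg.txt: change the pool's transaction_id = 6 to match the kernel metadata
vgcfgrestore -f /tmp/vg.txt snapper_thinp   # write the edited metadata back

The kernel-side alternative is editing the transaction="..." attribute in the thin_dump XML and restoring that again.)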

But specifically in this case it looks like your thin_restore simply restored 'empty' data (judging by its output),
so I guess that is related to your thin_dump of the live thin pool device.

If you look into /tmp/snapper_thinp_dump_1.408.19872
you should see at least some devices listed there, i.e.:

<superblock uuid="" time="1" transaction="3" data_block_size="256" nr_data_blocks="80">
  <device dev_id="1" mapped_blocks="0" transaction="0" creation_time="0" snap_time="1">
  </device>
  <device dev_id="2" mapped_blocks="0" transaction="1" creation_time="1" snap_time="1">
  </device>
  <device dev_id="3" mapped_blocks="0" transaction="2" creation_time="1" snap_time="1">
  </device>
</superblock>

But I think in your case it will be empty.


> [root@qalvm-01 ~]# lvchange -an snapper_thinp
> [root@qalvm-01 ~]# lvchange -ay snapper_thinp
>   Thin pool transaction_id=0, while expected: 6.
>   Unable to deactivate open snapper_thinp-POOL_tmeta (253:2)
>   Unable to deactivate open snapper_thinp-POOL_tdata (253:3)
>   Failed to deactivate snapper_thinp-POOL-tpool
>   device-mapper: reload ioctl on  failed: No data available
>   device-mapper: reload ioctl on  failed: No data available
>   device-mapper: reload ioctl on  failed: No data available
>   device-mapper: reload ioctl on  failed: No data available
>   device-mapper: reload ioctl on  failed: No data available
>   device-mapper: reload ioctl on  failed: No data available
>   device-mapper: reload ioctl on  failed: No data available

Yes - in this case the 'dmsetup' commands now need to step in.
There is a serious incompatibility between the 'kernel' and 'lvm'
data, and it needs clever recovery.
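In practice that likely means removing the stuck dm devices by hand, top-down - a guess at what such a step could look like, with device names taken from the dmsetup ls output quoted above:

dmsetup remove snapper_thinp-POOL
dmsetup remove snapper_thinp-POOL-tpool
dmsetup remove snapper_thinp-POOL_tdata
dmsetup remove snapper_thinp-POOL_tmeta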

> 
> [root@qalvm-01 ~]#  lvconvert --poolmetadata /dev/snapper_thinp/meta
> --thinpool /dev/snapper_thinp/POOL
> Do you want to swap metadata of snapper_thinp/POOL pool with volume
> snapper_thinp/meta? [y/n]: y
>   Converted snapper_thinp/POOL to thin pool.

Yes - you've put valid old metadata back.

> [root@qalvm-01 ~]# lvs -a -o +devices
>   dm_report_object: report function failed for field data_percent
>   dm_report_object: report function failed for field data_percent
>   dm_report_object: report function failed for field data_percent
>   dm_report_object: report function failed for field data_percent
>   dm_report_object: report function failed for field data_percent
>   dm_report_object: report function failed for field data_percent
>   dm_report_object: report function failed for field data_percent
>   One or more specified logical volume(s) not found.
>   LV           VG            Attr      LSize  Pool Origin Data%  Devices

I believe this has already been fixed upstream - the percent reporting was missing a check that the queried device has an active thin LV.
Comment 5 Corey Marthaler 2013-06-11 18:04:30 EDT
> The easiest way to access metadata - is to deactivate thin pool,
> swap _tmeta with some temporary volume - and active this swapped
> LV as normal LV and read data from this place.

Swapping the meta device doesn't currently work (See bug 973419 - thin pool mda device swapping doesn't work).
Comment 6 Zdenek Kabelac 2013-10-16 07:04:04 EDT
Swapping has been addressed by patches:

https://www.redhat.com/archives/lvm-devel/2013-October/msg00050.html
https://www.redhat.com/archives/lvm-devel/2013-October/msg00053.html

The tool will now refuse to swap metadata while the pool LV is kept active by any thin volume.
If only the pool itself is active, it is deactivated, swapped, and reactivated.

Thus if the newly swapped-in metadata is incompatible with the transaction_id of the thin pool in the lvm2 metadata, the user gets many errors and warnings. This is currently expected - but since this is really a 'low-level' weapon for pool repair, users need to know what they are doing.

The general rule here is: keep the thin pool and all related thin volumes inactive before doing such risky operations.

The user-facing way of repairing pools is

lvconvert --repair  vgname/poolname

which currently should at least try to repair broken kernel thin pool metadata (example below).
Later versions of lvm2 will do more consistency checks/validations/updates between the kernel and lvm2 metadata.
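With the pool from this report that would be, for example (a usage sketch, not output from this report; run it with the pool inactive):

lvconvert --repair snapper_thinp/POOL

If it succeeds, the previous metadata is typically kept aside in a separate backup LV, so nothing is lost outright.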
Comment 8 Corey Marthaler 2014-03-20 17:22:47 EDT
Marking this verified only for the following "proper swap" test case. Any other dump/corruption/restore attempt will probably not work.


SCENARIO - [verify_io_between_offline_mda_corruptions]
Making origin volume
lvcreate --thinpool POOL --zero y -L 5G snapper_thinp
Sanity checking pool device metadata
(thin_check /dev/mapper/snapper_thinp-POOL_tmeta)
examining superblock
examining devices tree
examining mapping tree
lvcreate --virtualsize 1G -T snapper_thinp/POOL -n origin
lvcreate -V 1G -T snapper_thinp/POOL -n other1
lvcreate -V 1G -T snapper_thinp/POOL -n other2
lvcreate -V 1G -T snapper_thinp/POOL -n other3
lvcreate -V 1G -T snapper_thinp/POOL -n other4
lvcreate -V 1G -T snapper_thinp/POOL -n other5
Placing an XFS filesystem on origin volume

*** Pool MDA Corrupt/Restore iteration 1/5 ***
syncing before snap creation...
Creating thin snapshot of origin volume
lvcreate -K -s /dev/snapper_thinp/origin -n snap1
Create new device to swap in as the new _tmeta device
lvcreate -n newtmeta -L 8M snapper_thinp

Dumping current pool metadata to /tmp/snapper_thinp_dump_1.8504.28273
thin_dump /dev/mapper/snapper_thinp-POOL_tmeta > /tmp/snapper_thinp_dump_1.8504.28273

Current tmeta device: /dev/sdb2
Restoring valid mda to new device
thin_restore -i /tmp/snapper_thinp_dump_1.8504.28273 -o /dev/snapper_thinp/newtmeta
Corrupting pool meta device (/dev/mapper/snapper_thinp-POOL_tmeta)
dd if=/dev/zero of=/dev/mapper/snapper_thinp-POOL_tmeta count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.00163608 s, 313 kB/s
Verifying that pool meta device is now corrupt
thin_check /dev/mapper/snapper_thinp-POOL_tmeta
examining superblock
  superblock is corrupt
    bad checksum in superblock

vgchange -an snapper_thinp
Swap in new _tmeta device
lvconvert --yes --poolmetadata snapper_thinp/newtmeta --thinpool snapper_thinp/POOL
New swapped tmeta device: /dev/sdc3

vgchange -ay snapper_thinp
Verifying that pool meta device is no longer corrupt
thin_check /dev/mapper/snapper_thinp-POOL_tmeta
examining superblock
examining devices tree
examining mapping tree
Remove old swap device
lvremove -f snapper_thinp/newtmeta


*** Pool MDA Corrupt/Restore iteration 2/5 ***
[....]



3.10.0-110.el7.x86_64
lvm2-2.02.105-13.el7    BUILT: Wed Mar 19 05:38:19 CDT 2014
lvm2-libs-2.02.105-13.el7    BUILT: Wed Mar 19 05:38:19 CDT 2014
lvm2-cluster-2.02.105-13.el7    BUILT: Wed Mar 19 05:38:19 CDT 2014
device-mapper-1.02.84-13.el7    BUILT: Wed Mar 19 05:38:19 CDT 2014
device-mapper-libs-1.02.84-13.el7    BUILT: Wed Mar 19 05:38:19 CDT 2014
device-mapper-event-1.02.84-13.el7    BUILT: Wed Mar 19 05:38:19 CDT 2014
device-mapper-event-libs-1.02.84-13.el7    BUILT: Wed Mar 19 05:38:19 CDT 2014
device-mapper-persistent-data-0.2.8-4.el7    BUILT: Fri Jan 24 14:28:55 CST 2014
cmirror-2.02.105-13.el7    BUILT: Wed Mar 19 05:38:19 CDT 2014
Comment 9 Ludek Smid 2014-06-13 08:14:57 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.
