Bug 1745204 - repair pool fails with "bad checksum in superblock"
Keywords:
Status: NEW
Alias: None
Product: LVM and device-mapper
Classification: Community
Component: lvm2
Version: 2.02.183
Hardware: x86_64
OS: Linux
Priority: high
Severity: unspecified
Target Milestone: ---
Assignee: Joe Thornber
QA Contact: cluster-qe
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-08-23 21:31 UTC by redhat
Modified: 2023-08-10 15:40 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:
pm-rhel: lvm-technical-solution?
pm-rhel: lvm-test-coverage?


Attachments:
metadata from vgcfgbackup (105.34 KB, text/plain)
2019-08-23 22:06 UTC, redhat

Description redhat 2019-08-23 21:31:00 UTC
Description of problem:

I am using QubesOS, which creates a LUKS-encrypted LVM pool.  The laptop failed to shut down properly, so I performed a hard reset.  Upon reboot, the system did not start up properly and instead dropped to the dracut emergency shell.  The disk appears to have a physical problem.

I used `ddrescue` to create a copy of the partition with luks/LVM.  This ticket is about retrieving the data from the rescue copy (on a working drive).

I get these errors:

$ sudo lvconvert --repair qubes_dom0/pool00
  WARNING: Not using lvmetad because of repair.
  WARNING: Disabling lvmetad cache for repair command.
bad checksum in superblock, wanted 823063976
  Repair of thin metadata volume of thin pool qubes_dom0/pool00 failed (status:1). Manual repair required!

$ sudo thin_check /dev/mapper/encrypted_rescue
examining superblock
  superblock is corrupt
    bad checksum in superblock, wanted 636045691


(Note that /dev/mapper/encrypted_rescue was created by running `cryptsetup luksOpen ...` on the ddrescued partition.)
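
For readers following along, the rescue setup described above might look roughly like the sketch below. This is not a record of the exact commands I ran: the source partition `/dev/sdX2` and the file names `rescue.img` and `rescue.map` are placeholders, and the helper functions `run` and `rescue_and_open` are made up for illustration. The script only prints the commands unless `DRY_RUN=0` is set, in which case it needs root.

```shell
#!/bin/sh
# Sketch only: prints each command by default; set DRY_RUN=0 to execute.
# Every device and file name used here is a placeholder (assumption).

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: rescue_and_open SRC_PARTITION IMAGE MAPFILE MAPPER_NAME
rescue_and_open() {
    src=$1
    img=$2
    map=$3
    name=$4
    # Copy the failing partition; the mapfile lets ddrescue resume and retry.
    run ddrescue "$src" "$img" "$map"
    # Unlock the LUKS container found in the copy as /dev/mapper/$name.
    run cryptsetup luksOpen "$img" "$name"
    # Rescan so that LVM notices the volume group inside the unlocked device.
    run vgscan
}

rescue_and_open /dev/sdX2 rescue.img rescue.map encrypted_rescue
```

If the copy lives on a partition of the working drive rather than in an image file, the second argument would be that partition instead.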



Version-Release number of selected component (if applicable):

Using Fedora 30:

$ lvm version
  LVM version:     2.02.183(2) (2018-12-07)
  Library version: 1.02.154 (2018-12-07)

I get the errors shown above with thin-provisioning-tools versions "0.7.6-4.fc30" (which ships with Fedora) and also "0.8.5" (which I compiled myself at the suggestion of Zdenek Kabelac).


How reproducible:

I'm not sure how to recreate the problem.  In my case, bad luck with a faulty drive.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

https://www.redhat.com/archives/linux-lvm/2019-August/msg00019.html

Zdenek Kabelac suggested attaching metadata to this issue...

> Upload 'dd' compressed copy of your ORIGINAL  _tmeta content (which now could 
> likely already be in volume  _meta0 - if you had one successful run of --repair 
> command).

I don't understand how to use `dd` to get this.  I'm happy to try, I'm just not sure what command to run.  I can't activate any part of the pool, if that is where the metadata is.
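
My best guess at what is being asked for is something like the sketch below; I have not confirmed that it works on this pool. It assumes the installed lvm2 allows activating the hidden metadata LV directly with `lvchange -ay`, that the device then appears under `/dev/mapper` as `VG-LV` (names containing hyphens would be escaped differently), and that `xz` is an acceptable compressor. The output file name `tmeta.bin` and the helpers `run` and `dump_pool_metadata` are made up for illustration. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: prints each command by default; set DRY_RUN=0 to execute.
# Assumes the metadata component LV can be activated on its own.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: dump_pool_metadata VG POOL OUTFILE [SUFFIX]
# SUFFIX defaults to _tmeta; pass _meta0 if a repair run left such a volume.
dump_pool_metadata() {
    vg=$1
    pool=$2
    out=$3
    suffix=${4:-_tmeta}
    lv="${pool}${suffix}"
    # Activate only the metadata LV, not the pool itself.
    run lvchange -ay "$vg/$lv"
    # Take a raw copy of the metadata device.
    run dd if="/dev/mapper/$vg-$lv" of="$out" bs=1M
    # Compress the copy for upload (produces OUTFILE.xz).
    run xz "$out"
    # Deactivate the metadata LV again.
    run lvchange -an "$vg/$lv"
}

dump_pool_metadata qubes_dom0 pool00 tmeta.bin
```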

> To get lvm2 metadata backup just use  'vgcfgbackup -f output.txt  VGNAME'

This command runs successfully.  I'm attaching the output (but I've changed some of the volume names just to avoid posting sensitive data here).

Comment 1 redhat 2019-08-23 22:06:21 UTC
Created attachment 1607548
metadata from vgcfgbackup

