Bug 1745204

Summary: repair pool fails with "bad checksum in superblock"
Product: [Community] LVM and device-mapper
Component: lvm2 (sub component: Thin Provisioning)
Version: 2.02.183
Status: NEW
Severity: unspecified
Priority: high
Reporter: redhat
Assignee: LVM Team <lvm-team>
QA Contact: cluster-qe <cluster-qe>
CC: agk, heinzm, jbrassow, prajnoha, zkabelac
Flags: pm-rhel: lvm-technical-solution?, pm-rhel: lvm-test-coverage?
Hardware: x86_64
OS: Linux
Attachments:
metadata from vgcfgbackup

Description redhat 2019-08-23 21:31:00 UTC
Description of problem:

I am using QubesOS, which creates a LUKS-encrypted LVM thin pool.  The laptop failed to shut down properly, so I performed a hard reset.  On reboot the system did not start up properly and instead dropped into a dracut emergency shell.  The disk appears to have a physical problem.

I used `ddrescue` to create a copy of the partition holding the LUKS/LVM data.  This ticket is about retrieving the data from that rescue copy (now on a working drive).
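For completeness, the copy was made with a command of roughly this shape (the source device and output paths here are placeholders, not my exact ones):

$ sudo ddrescue -d /dev/sdXn rescue.img rescue.map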

I get these errors:

$ sudo lvconvert --repair qubes_dom0/pool00
  WARNING: Not using lvmetad because of repair.
  WARNING: Disabling lvmetad cache for repair command.
bad checksum in superblock, wanted 823063976
  Repair of thin metadata volume of thin pool qubes_dom0/pool00 failed (status:1). Manual repair required!

$ sudo thin_check /dev/mapper/encrypted_rescue
examining superblock
  superblock is corrupt
    bad checksum in superblock, wanted 636045691


(Note that /dev/mapper/encrypted_rescue is created by running `cryptsetup luksOpen ...` on the ddrescue'd partition.)
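Roughly, the mapping is opened like this (assuming the rescue copy is an image file attached via a loop device; the device names are placeholders):

$ sudo losetup -f --show rescue.img
/dev/loop0
$ sudo cryptsetup luksOpen /dev/loop0 encrypted_rescue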



Version-Release number of selected component (if applicable):

Using Fedora 30:

$ lvm version
  LVM version:     2.02.183(2) (2018-12-07)
  Library version: 1.02.154 (2018-12-07)

I get the errors shown above with thin-provisioning-tools version 0.7.6-4.fc30 (which ships with Fedora) and also with version 0.8.5 (which I compiled, at Zdenek Kabelac's suggestion).


How reproducible:

I'm not sure how to recreate the problem.  In my case, it was bad luck with a faulty drive.

Additional info:

https://www.redhat.com/archives/linux-lvm/2019-August/msg00019.html

Zdenek Kabelac suggested attaching metadata to this issue...

> Upload a 'dd' compressed copy of your ORIGINAL  _tmeta content (which now could 
> likely already be in volume  _meta0 - if you had one successful run of the --repair 
> command).

I don't understand how to use `dd` to get this.  I'm happy to try; I'm just not sure what command to run.  I can't activate any part of the pool, if that is where the metadata lives.
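Is something like the following what you mean?  This is only my sketch of the metadata-swap procedure described in lvmthin(7); the scratch LV name (dump_meta) and its size are made up, and I have not run it yet:

$ sudo lvcreate -an -Zn -L 128M -n dump_meta qubes_dom0        # scratch LV, at least as big as the pool's _tmeta
$ sudo lvconvert --thinpool qubes_dom0/pool00 --poolmetadata qubes_dom0/dump_meta   # swap the old metadata out
$ sudo lvchange -ay qubes_dom0/dump_meta                       # old _tmeta should now be visible as dump_meta
$ sudo dd if=/dev/qubes_dom0/dump_meta of=tmeta.bin bs=1M
$ xz tmeta.bin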

> To get lvm2 metadata backup just use  'vgcfgbackup -f output.txt  VGNAME'

This command runs successfully.  I'm attaching the output (but I've changed some of the volume names just to avoid posting sensitive data here).
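In my case that was:

$ sudo vgcfgbackup -f output.txt qubes_dom0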

Comment 1 redhat 2019-08-23 22:06:21 UTC
Created attachment 1607548
metadata from vgcfgbackup