Bug 1596313
| Field | Value | Field | Value |
|---|---|---|---|
| Summary | [GSS] Corrupted LVM logical volume | | |
| Product | Red Hat Enterprise Linux 7 | Reporter | Cal Calhoun <ccalhoun> |
| Component | device-mapper-persistent-data | Assignee | Joe Thornber <thornber> |
| Status | CLOSED ERRATA | QA Contact | Jakub Krysl <jkrysl> |
| Severity | urgent | Docs Contact | |
| Priority | urgent | | |
| Version | 7.5 | CC | agk, akaiser, bkunal, bmarzins, cmarthal, heinzm, jbrassow, jpittman, loberman, lvm-team, mcsontos, msnitzer, nravinas, prajnoha, rhandlin, thornber, zkabelac |
| Target Milestone | rc | Keywords | Rebase |
| Target Release | --- | | |
| Hardware | x86_64 | | |
| OS | Linux | | |
| Whiteboard | | | |
| Fixed In Version | device-mapper-persistent-data-0.8.1-1.el7 | Doc Type | If docs needed, set a value |
| Doc Text | | Story Points | --- |
| Clone Of | | Environment | |
| Last Closed | 2019-08-06 13:17:59 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
| Bug Depends On | 1614151 | | |
| Bug Blocks | 1577173 | | |
| Attachments | | | |

Comment 6
Joe Thornber
2018-06-29 14:35:33 UTC
Joe, they ran 'lvconvert --repair' against the volume, then tried to activate the volumes afterwards, but could not. The meta0 volume was created, so we've attached that here in hopes manual repair is possible. Or are you saying they should have run thin_repair in addition to the 'lvconvert --repair'?

// EMT //
Since this case has been escalated for a while and the customer is pushing us, I am setting "Customer Escalation = Yes" on this BZ. For now I can see that we have a NEEDINFO set.
--
Anderson Kaiser
Escalation Manager

Looking at the data here, the superblock appears to be fine. There is something wrong with the device details root node pointed to by the superblock. Its header doesn't look corrupted

    csum: 2423521636
    flags: 2
    blocknr: 2121
    nr_entries: 206
    max_entries: 252
    value_size: 8
    padding: 0

however it doesn't look like it's holding device detail entries. Those are 24 bytes in size instead of 8. The first 8 keys look like this:

    keys: 301910, 301911, 301912, 301913, 301914, 301915, 301916, 301917

And the corresponding values look like this:

    values:
    0x0000006063040000
    0x0000006163040000
    0x0000006363040000
    0x0000006563040000
    0x0000006663040000
    0x0000006263040000
    0x0000006463040000
    0x0000006763040000

These look a lot more like block_time values. There are two blocks in the metadata, #2265 and #2272, that do look like device_details root nodes. I'm not sure yet if these are old nodes. The data mapping root block pointed to by the superblock is, on the other hand, completely corrupted. Right now, I'm not sure if this is simply a bad superblock, or if there really is corruption across multiple data structures in the metadata device.

I was mistaken in my last comment. The block that is listed as the data mapping root node in the superblock is not corrupted. It just isn't a btree node. It's a valid bitmap block. In fact, all of the blocks with a non-zero block number/checksum have a valid checksum for some type of block.
However, I don't believe that there should be any way for the blocks being pointed to by the superblock to change type without the superblock being updated. First they would have to be freed, and that shouldn't happen until after the superblock is updated to no longer point to them.

Our upstream code now seems to be able to repair the metadata given in this case. We feel confident that this issue is resolved with the latest changes we have made.

Created attachment 1559932: repaired metadata
0.8 release of thin_dump produces this output.
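
For readers following the node analysis earlier in this report, here is a minimal Python sketch of the two checks involved: decoding a btree node header and unpacking a block_time value. It is not taken from the thin tools. The header layout, the flag values, the 4096-byte metadata block size, the rounding of max_entries down to a multiple of 3, and the 24-bit time field are all assumptions based on a reading of the kernel's dm persistent-data and dm-thin metadata code, and the helper names are made up for illustration. Whether the hex values quoted in the report were printed in on-disk byte order is also an assumption, so both interpretations are computed.

```python
import struct

# Assumed on-disk btree node header, little-endian, 32 bytes in total:
# csum, flags, blocknr, nr_entries, max_entries, value_size, padding
NODE_HEADER = struct.Struct("<IIQIIII")
FIELDS = ("csum", "flags", "blocknr", "nr_entries",
          "max_entries", "value_size", "padding")

INTERNAL_NODE = 1          # assumed flag bit for internal nodes
LEAF_NODE = 2              # assumed flag bit for leaf nodes
BLOCK_SIZE = 4096          # assumed thin metadata block size in bytes
KEY_SIZE = 8               # btree keys are assumed to be 64-bit
BLOCK_TIME_SIZE = 8        # value size seen in the suspicious node
DEVICE_DETAILS_SIZE = 24   # value size the report expects for device details


def parse_node_header(block):
    """Decode the first 32 bytes of a metadata block into a dict."""
    return dict(zip(FIELDS, NODE_HEADER.unpack_from(block, 0)))


def max_entries_for(value_size, block_size=BLOCK_SIZE):
    """Entries that fit in one node, rounded down to a multiple of 3 (assumed rule)."""
    n = (block_size - NODE_HEADER.size) // (KEY_SIZE + value_size)
    return 3 * (n // 3)


def unpack_block_time(value):
    """Split a 64-bit value into (data block, time), assuming time is the low 24 bits."""
    return value >> 24, value & ((1 << 24) - 1)


def byteswap64(value):
    """Reverse the byte order of a 64-bit integer."""
    return int.from_bytes(value.to_bytes(8, "big"), "little")


if __name__ == "__main__":
    # Rebuild the header from the numbers quoted in the report, then decode it.
    raw = NODE_HEADER.pack(2423521636, 2, 2121, 206, 252, 8, 0)
    hdr = parse_node_header(raw)
    print(hdr)
    print("leaf node:", bool(hdr["flags"] & LEAF_NODE))
    print("max_entries for 8-byte values:", max_entries_for(BLOCK_TIME_SIZE))
    print("max_entries for 24-byte values:", max_entries_for(DEVICE_DETAILS_SIZE))

    # First value from the report, read as printed and with bytes reversed.
    first = 0x0000006063040000
    print("as printed:", unpack_block_time(first))
    print("byte-swapped:", unpack_block_time(byteswap64(first)))
```

Under these assumptions, a node holding 8-byte values has room for 252 entries while one holding 24-byte device details has room for 126, so the reported max_entries of 252 fits a node of 8-byte values rather than a device details node.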
Given there were many improvements in the thin tools, and backporting is as likely to break things as to improve them, this will be a rebase.

Testing found no issues with:
device-mapper-1.02.158-2.el7.x86_64
device-mapper-persistent-data-0.8.5-1.el7.x86_64
kernel-3.10.0-1058.el7.x86_64

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2320