Red Hat Bugzilla – Bug 231874
vgcfgbackup does not work with partial lvm1 vg despite the fact that lvs exist with all pvs accounted for
Last modified: 2009-07-14 11:19:13 EDT
Description of problem:
The main use case is someone with old drives holding a partial VG (maybe one of
the disks failed) who is running the latest LVM2 code and wants to get data
back from those old drives.
Not exactly sure of the correct fix but wanted to write this down before I
forgot about it.
The problem seems to be that vg_read() does not honor the 'partial' flag (there
may be a good reason for this; some paths try to automatically correct
metadata, so perhaps that is why) and fails in check_pv_segments(). If I
understand it correctly, the failures in check_pv_segments() that produce the
"PV segment VG free_count mismatch" and "PV segment VG extent_count mismatch"
errors come from the fact that for LVM1 the PE count is stored separately on
each PV, so with a PV missing the summation comes up short. For LVM2 this isn't
the case: all the PE counts are stored in the metadata on each PV, so summing
the PEs to get a consistent count for the VG is not a problem.
One simple corrective idea for LVM1 VGs in this state: just update the VG free
and extent counts to the totals obtained by summing over the PVs that are
present. Conceptually this would solve the case this bug is filed for and allow
LVs contained entirely on existing PVs/VGs to be re-activated.
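The corrective action can be simulated in a few lines (a Python sketch, not
LVM2 code; the field names pe_count/pe_free/extent_count/free_count are modeled
loosely on LVM2's structures, the totals are taken from the error output in
this report, and the per-PV split is illustrative):

```python
# Python sketch (not LVM2 code) of the proposed corrective action for a
# partial LVM1 VG; field names loosely modeled on LVM2's structures.
from dataclasses import dataclass
from typing import List

@dataclass
class PV:
    pe_count: int   # extents on this PV (stored per-PV in LVM1)
    pe_free: int    # free extents on this PV

@dataclass
class VG:
    pvs: List[PV]
    extent_count: int   # total recorded for the whole VG
    free_count: int

def check_pv_segments(vg: VG, partial: bool) -> bool:
    """Sum the per-PV counts and compare with the VG totals.  With an
    LVM1 PV missing, the sums come up short; in partial mode, adopt the
    sums instead of failing."""
    ext_sum = sum(pv.pe_count for pv in vg.pvs)
    free_sum = sum(pv.pe_free for pv in vg.pvs)
    if ext_sum != vg.extent_count or free_sum != vg.free_count:
        if not partial:
            return False   # "PV segment VG extent_count mismatch"
        vg.extent_count = ext_sum   # corrective action: trust the
        vg.free_count = free_sum    # PVs actually present
    return True

# Five of six PVs present, using the totals from the report: the PVs on
# hand sum to 1417 extents (1381 free), but the VG expects 2608 (2572).
vg = VG(pvs=[PV(300, 291), PV(300, 291), PV(300, 291), PV(300, 291),
             PV(217, 217)],
        extent_count=2608, free_count=2572)
assert not check_pv_segments(vg, partial=False)   # today's failure mode
assert check_pv_segments(vg, partial=True)        # proposed: adopt the sums
assert (vg.extent_count, vg.free_count) == (1417, 1381)
```

With the counts corrected to what is actually present, the later bookkeeping
has nothing left to trip over, which matches the proof-of-concept result
described further down.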
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Configure a system with LVM1 PVs (or loopback-based files holding the first
1MB or so of each PV) that make up a partial volume group
2. Run vgcfgbackup -P vg_system (or whatever the VG name is)
** You can also see the problem with just "vgscan -P"; the result is the same,
since partial VGs fail because the PE counts are incorrect
Actual results:
Unable to "find" the VG because there are PVs missing, so no VG operations can
proceed:
# tools/lvm vgcfgbackup -P vg_system
File descriptor 4 left open
WARNING: Activation disabled. No device-mapper interaction will be attempted.
Partial mode. Incomplete volume groups will be activated read-only.
5 PV(s) found for VG vg_system: expected 6
Logical volume (lvol4) contains an incomplete mapping table.
PV segment VG free_count mismatch: 1381 != 2572
PV segment VG extent_count mismatch: 1417 != 2608
Internal error: PV segments corrupted in vg_system.
Volume group "vg_system" not found
Expected results:
vg_read() does not fail and the partial VG can be backed up, as is the case
with LVM2.
It might not be reasonable/possible to fix this, and it is perhaps a
lower-priority item. However, this does play into people's trust (or lack
thereof) in LVM2 and "losing data", so we should not just dismiss it, IMO.
I wonder if making check_pv_segments a method of struct metadata_area_ops would
be a good part of the solution. That way we could isolate the change to lvm1
and reduce risk. It seems to make sense, too, since those summation checks
should never fail for LVM2 unless something is truly corrupted, whereas on
LVM1 they will fail whenever a PV is missing.
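The shape of that ops-table idea can be sketched like this (a Python sketch
only; the method name follows check_pv_segments, but the class layout and the
boiled-down sums_match parameter are hypothetical):

```python
# Sketch of isolating the check per metadata format, analogous to adding
# a check_pv_segments method to struct metadata_area_ops in LVM2.
class FormatOps:
    """Default (LVM2) behaviour: a summation mismatch always means real
    corruption, partial mode or not."""
    def check_pv_segments(self, sums_match: bool, partial: bool) -> bool:
        return sums_match

class Lvm1Ops(FormatOps):
    """LVM1: the totals are summed from per-PV data, so a missing PV
    makes them come up short; tolerate that in partial mode."""
    def check_pv_segments(self, sums_match: bool, partial: bool) -> bool:
        return sums_match or partial

# A partial LVM1 VG with a missing PV passes the check; an LVM2 VG with
# the same mismatch still fails, keeping the risk confined to lvm1.
assert Lvm1Ops().check_pv_segments(sums_match=False, partial=True)
assert not FormatOps().check_pv_segments(sums_match=False, partial=True)
```

Dispatching through the format's own ops table means the relaxed behaviour
cannot leak into the LVM2 path, which is the risk-isolation argument above.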
Created attachment 149855 [details]
Major hack - just skip failure on PE checks in check_pv_segments and check_lv_segments if partial flag is set
Proof of concept patch. Amazingly, this is all I needed to get past the
failure and write the backup file. Not sure if the backup file is usable but
it looks pretty good. I have 'lvol1' - 'lvol3' that look valid. 'lvol4' and
'lvol5' look like they were on the missing PV(s) - the backup file has a long
list of segments like this for 'lvol4' and 'lvol5':
start_extent = 196
extent_count = 1 # 32 Megabytes
type = "striped"
stripe_count = 1 # linear
stripes = [
There's clearly something wrong with the lvm1 format code. See also the stripe
bug posted to lvm-devel. The code that reads in the stripes is broken: it
misses a factor of #stripes when setting the total segment length, and the
algorithm for calculating the area length within the segment looks bogus.
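The missing factor is plain arithmetic (a Python sketch with illustrative
numbers; the real code operates on segment structures in the lvm1 format
reader):

```python
# Sketch of the stripe-length arithmetic (illustrative numbers only).
# For a striped segment, each of the stripe_count areas holds an equal
# share of the data, so the total segment length in extents is the
# per-stripe area length times the stripe count.
def segment_len(area_len: int, stripe_count: int) -> int:
    return area_len * stripe_count

def area_len(seg_len: int, stripe_count: int) -> int:
    # Inverse direction, as needed when reading the metadata back in;
    # the segment length must divide evenly across the stripes.
    assert seg_len % stripe_count == 0
    return seg_len // stripe_count

# Dropping the factor of stripe_count (i.e. taking the per-stripe area
# length as the segment length) undercounts a 4-stripe segment by 4x.
assert segment_len(area_len=8, stripe_count=4) == 32
assert area_len(seg_len=32, stripe_count=4) == 8
```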
We also wanted it to report the PV IDs instead of 'Missing', and not to create
a separate segment for each extent.
Not sure this is worth doing.
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '9'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 9's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 9 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
The process we are following is described here:
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.