Description of problem:
The main use case is someone who has old drives holding partial VG(s) (maybe
one of the disks failed), is running the latest LVM2 code, and wants to get
data back from those old drives.
Not exactly sure of the correct fix but wanted to write this down before I
forgot about it.
The problem seems to be that vg_read() does not honor the 'partial' flag (there
may be a good reason for this, I am just not sure yet; some paths try to
correct metadata automatically, so maybe that is why) and fails in
check_pv_segments(). If I understand it correctly, the "PV segment VG
free_count mismatch" and "PV segment VG extent_count mismatch" failures in
check_pv_segments() happen because, for LVM1, the PE count is stored on each
PV, so summing over only the PVs that are present comes up short when a PV is
missing. For LVM2 this isn't the case: all the PE counts (the numbers) are
stored in the metadata on each PV, so summing the PEs to get a consistent count
for the VG is not a problem.
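To make the failure mode concrete, here is a minimal Python sketch of the summation check as described above. It is not LVM2's C code: the `PV` and `VG` classes, the name `check_pv_totals`, and the per-PV numbers are all hypothetical. Only the two VG totals and the two sums come from the command output further down in this report, on the assumption that the left-hand number in each mismatch message is the sum over the PVs that were found.

```python
from dataclasses import dataclass


@dataclass
class PV:
    pe_count: int  # total extents on this PV (LVM1 stores this per PV)
    pe_free: int   # unallocated extents on this PV


@dataclass
class VG:
    extent_count: int  # total the VG metadata claims
    free_count: int    # free total the VG metadata claims
    pvs: list          # only the PVs that were actually found


def check_pv_totals(vg):
    """Return mismatch messages; an empty list means the totals agree."""
    errors = []
    free_sum = sum(pv.pe_free for pv in vg.pvs)
    extent_sum = sum(pv.pe_count for pv in vg.pvs)
    if free_sum != vg.free_count:
        errors.append(
            f"PV segment VG free_count mismatch: {free_sum} != {vg.free_count}")
    if extent_sum != vg.extent_count:
        errors.append(
            f"PV segment VG extent_count mismatch: {extent_sum} != {vg.extent_count}")
    return errors


# Hypothetical split of the 5 PVs that were found (6 expected).
found = [PV(283, 271), PV(283, 271), PV(283, 283), PV(284, 278), PV(284, 278)]
vg = VG(extent_count=2608, free_count=2572, pvs=found)

for msg in check_pv_totals(vg):
    print(msg)
# PV segment VG free_count mismatch: 1381 != 2572
# PV segment VG extent_count mismatch: 1417 != 2608
```

With every PV present the two sums would match the VG totals and the list would be empty, which is why the check is only a problem in partial mode.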
One simple idea for a corrective action on LVM1 VGs like this is to just
update the VG free and extent counts to the values you come up with after
summing. Conceptually this would solve the case this bug is filed for and allow
LVs that are contained in existing PVs/VGs to be re-activated.
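A rough Python sketch of that corrective action, under stated assumptions: the `VG` class, the `reconcile_totals` name, and the example numbers are hypothetical and do not correspond to anything in the LVM2 tree. The idea is only that, when the partial flag is set, a mismatch replaces the in-memory VG totals with the summed values instead of failing.

```python
from dataclasses import dataclass, field


@dataclass
class VG:
    extent_count: int
    free_count: int
    # (pe_count, pe_free) for each PV that was found; hypothetical layout
    pv_counts: list = field(default_factory=list)


def reconcile_totals(vg, partial):
    """Return True if the VG is usable, adjusting totals in partial mode."""
    extent_sum = sum(count for count, _ in vg.pv_counts)
    free_sum = sum(free for _, free in vg.pv_counts)
    if (extent_sum, free_sum) == (vg.extent_count, vg.free_count):
        return True  # nothing to correct
    if not partial:
        return False  # without the partial flag a mismatch is still an error
    # Partial mode: trust the PVs we can see and correct the in-memory totals.
    vg.extent_count = extent_sum
    vg.free_count = free_sum
    return True


vg = VG(extent_count=2608, free_count=2572,
        pv_counts=[(1000, 990), (417, 391)])
print(reconcile_totals(vg, partial=False))  # False
print(reconcile_totals(vg, partial=True))   # True
print(vg.extent_count, vg.free_count)       # 1417 1381
```

The sketch only touches the in-memory copy; whether corrected totals should ever be written back to disk is a separate question.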
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Configure a system with LVM1 PVs (or loopback-based files holding the first
1MB or so of each PV) that contain a partial volume group
2. vgcfgbackup -P vg_system (or whatever the VG name is)
** You can also see the problem if you just try "vgscan -P". The result is the
same: partial VGs fail because the PE counts are incorrect.
Actual results:
Unable to "find" the VG because there are PVs missing, so no vg operations can
# tools/lvm vgcfgbackup -P vg_system
File descriptor 4 left open
WARNING: Activation disabled. No device-mapper interaction will be attempted.
Partial mode. Incomplete volume groups will be activated read-only.
5 PV(s) found for VG vg_system: expected 6
Logical volume (lvol4) contains an incomplete mapping table.
PV segment VG free_count mismatch: 1381 != 2572
PV segment VG extent_count mismatch: 1417 != 2608
Internal error: PV segments corrupted in vg_system.
Volume group "vg_system" not found
Expected results:
vg_read() should not fail, and the partial VG should be backed up, as is the
case with LVM2.
Additional info:
It might not be reasonable/possible to fix this, and it is perhaps a lower
priority item. However, this does play into people's trust (or lack thereof)
in LVM2 and "losing data", so we should not just dismiss it, IMO.
I wonder if making check_pv_segments a method of struct metadata_area_ops would
be a good part of the solution. That way we could isolate this change for lvm1
and reduce risk. Seems to make sense too, since those summation checks should
never fail for LVM2 unless something is truly corrupted, whereas on LVM1 they
will fail if a PV is missing.
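A sketch of what the per-format split could look like, in Python for brevity. In the real code this would be a function pointer in a C struct, and everything named here (`FormatOps`, `strict_check`, `lvm1_check`, `vg_totals_ok`) is hypothetical, not existing LVM2 API. The rule that a partial LVM1 VG may only sum to less than the recorded total is also an assumption.

```python
from typing import Callable, NamedTuple


class FormatOps(NamedTuple):
    name: str
    # (summed, recorded, partial) -> True if the totals are acceptable
    check_pv_totals: Callable[[int, int, bool], bool]


def strict_check(summed, recorded, partial):
    # LVM2-style: every PV carries the full metadata, so any mismatch
    # means real corruption, partial mode or not.
    return summed == recorded


def lvm1_check(summed, recorded, partial):
    # LVM1-style: counts live on each PV, so a missing PV makes the sum
    # come up short. Tolerate that only when the partial flag is set.
    return summed == recorded or (partial and summed < recorded)


OPS = {
    "lvm1": FormatOps("lvm1", lvm1_check),
    "lvm2": FormatOps("lvm2", strict_check),
}


def vg_totals_ok(fmt, summed, recorded, partial):
    """Dispatch the totals check through the format's ops table."""
    return OPS[fmt].check_pv_totals(summed, recorded, partial)


print(vg_totals_ok("lvm1", 1417, 2608, partial=True))   # True
print(vg_totals_ok("lvm1", 1417, 2608, partial=False))  # False
print(vg_totals_ok("lvm2", 1417, 2608, partial=True))   # False
```

Keeping the relaxed rule inside the lvm1 entry means the lvm2 path is unchanged, which is the risk reduction suggested above.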
Created attachment 149855
Major hack - just skip failure on PE checks in check_pv_segments and check_lv_segments if partial flag is set
Proof of concept patch. Amazingly, this is all I needed to get past the
failure and write the backup file. Not sure if the backup file is usable but
it looks pretty good. I have 'lvol1' - 'lvol3' that look valid. 'lvol4' and
'lvol5' look like they were on the missing PV(s) - the backup file has a long
list of segments like this for 'lvol4' and 'lvol5':
start_extent = 196
extent_count = 1 # 32 Megabytes
type = "striped"
stripe_count = 1 # linear
stripes = [
There's clearly something wrong with the lvm1 format code. See also the stripe
bug posted to lvm-devel. The code that reads in the stripes is broken - it
misses a factor of #stripes when setting the total segment length and the
algorithm to calculate the area length in the segment looks bogus.
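For reference, the length relationship that the stripe-reading code is said to get wrong can be written down as a small Python sketch. This is only the arithmetic, not the lvm1 import code, and both function names are hypothetical. It assumes that a striped segment's total length is its per-area length times the number of stripes, with linear being the one-stripe case.

```python
def segment_len(area_len, stripes):
    """Total extents in a striped segment, given extents per stripe area."""
    if stripes < 1:
        raise ValueError("a segment needs at least one stripe")
    return area_len * stripes


def area_len(seg_len, stripes):
    """Extents per stripe area, given the total segment length."""
    if stripes < 1:
        raise ValueError("a segment needs at least one stripe")
    if seg_len % stripes:
        raise ValueError("segment length is not a multiple of the stripe count")
    return seg_len // stripes


print(segment_len(10, 3))  # 30
print(area_len(30, 3))     # 10
print(segment_len(1, 1))   # 1 (linear: one stripe, one extent)
```

Dropping the multiplication in `segment_len` is the "missing factor of #stripes" case: a 3-stripe segment with 10 extents per area would be recorded as 10 extents long instead of 30.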
We also wanted it to give the PV IDs instead of 'Missing' and not to create a
separate segment for each extent.
Not sure this is worth doing.
Changing version to '9' as part of upcoming Fedora 9 GA.
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '9'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 9's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 9 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.