Bug 231874

Summary: vgcfgbackup does not work with a partial LVM1 VG despite the fact that LVs exist with all PVs accounted for
Product: Fedora
Component: lvm2
Version: 9
Hardware: All
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: low
Reporter: Dave Wysochanski <dwysocha>
Assignee: Dave Wysochanski <dwysocha>
QA Contact: Corey Marthaler <cmarthal>
CC: agk, dwysocha, jbrassow, mbroz, prockai
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Last Closed: 2009-07-14 15:19:13 UTC
Attachments:
  Major hack - just skip failure on PE checks in check_pv_segments and check_lv_segments if partial flag is set (no flags)

Description Dave Wysochanski 2007-03-12 18:38:49 UTC
Description of problem:
https://www.redhat.com/archives/linux-lvm/2006-November/msg00055.html

The main use case is a user with old drives holding a partial VG (perhaps one of the
disks failed) who is running the latest LVM2 code and wants to recover data from
those old drives.

I'm not exactly sure of the correct fix, but I wanted to write this down before I
forgot about it.

The problem seems to be that vg_read() does not honor the 'partial' flag (there may
be a good reason for this - I'm just not sure yet; some paths look like they try to
automatically correct metadata, so maybe that is why) and fails in
check_pv_segments().  If I understand it correctly, the check_pv_segments() failure
that produces the "PV segment VG free_count mismatch" and "PV segment VG
extent_count mismatch" errors results from the fact that for LVM1 the PE count is
stored per PV, so the summing fails when a PV is missing.  For LVM2 this isn't the
case - all the PE counts are stored in the metadata on each PV, so summing the
PEs to get a consistent count for the VG is not a problem.
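To illustrate that summation failure (a minimal sketch with hypothetical struct and function names, not the actual lvm2 code): the VG-wide totals only reconcile when every PV is present, so one missing PV makes the check fail even though the remaining LVs are intact.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified per-PV record (hypothetical): for LVM1, the extent and
 * free counts live on each PV, so VG-wide totals must be reconstructed
 * by summing over the PVs that were actually found. */
struct pv_info {
    unsigned pe_count;   /* extents on this PV */
    unsigned pe_free;    /* free extents on this PV */
};

/* Sum the counts over the PVs present and compare against the totals
 * recorded in the metadata.  With a PV missing, the sums come up short,
 * which is exactly the "PV segment VG extent_count mismatch" failure. */
static int check_counts(const struct pv_info *pvs, size_t n_found,
                        unsigned vg_extent_count, unsigned vg_free_count)
{
    unsigned sum_extents = 0, sum_free = 0;
    for (size_t i = 0; i < n_found; i++) {
        sum_extents += pvs[i].pe_count;
        sum_free += pvs[i].pe_free;
    }
    return sum_extents == vg_extent_count && sum_free == vg_free_count;
}
```

With six PVs whose counts add up to the recorded totals, the check passes; drop one PV and it fails, mirroring the 1417 != 2608 mismatch in the report below.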

One simple idea for a corrective action on LVM1 VGs like this is to just update the
VG free and extent counts to the values obtained after summing over the PVs that
were found.  Conceptually this would solve the case this bug was filed for and
allow LVs that are contained on existing PVs/VGs to be re-activated.
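A sketch of that corrective action under the same simplified model (hypothetical names, not actual lvm2 code): rather than failing the check, recompute the VG totals from the PVs actually found.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified per-PV record (hypothetical; LVM1 stores these counts on
 * each PV rather than in one VG-wide copy). */
struct pv_counts {
    unsigned pe_count;
    unsigned pe_free;
};

/* Corrective action described above: replace the VG-level extent/free
 * totals with the sums over the PVs that were found, so a partial LVM1
 * VG passes the consistency check and its intact LVs can be
 * re-activated. */
static void fixup_vg_counts(const struct pv_counts *pvs, size_t n_found,
                            unsigned *vg_extent_count, unsigned *vg_free_count)
{
    unsigned sum_extents = 0, sum_free = 0;
    for (size_t i = 0; i < n_found; i++) {
        sum_extents += pvs[i].pe_count;
        sum_free += pvs[i].pe_free;
    }
    *vg_extent_count = sum_extents;
    *vg_free_count = sum_free;
}
```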


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Configure a system with LVM1 PVs (or loopback-based files holding the first 1MB
or so) that contain a partial volume group
2. Run vgcfgbackup -P vg_system (or whatever the VG name is)
** You can also see the problem by just running "vgscan -P" - same result, since
partial VGs will fail because the PE counts are incorrect
  
Actual results:
The VG cannot be "found" because PVs are missing, so no VG operations can be
performed.
Example:
# tools/lvm vgcfgbackup -P vg_system
File descriptor 4 left open
  WARNING: Activation disabled. No device-mapper interaction will be attempted.
  Partial mode. Incomplete volume groups will be activated read-only.
  5 PV(s) found for VG vg_system: expected 6
  Logical volume (lvol4) contains an incomplete mapping table.
  PV segment VG free_count mismatch: 1381 != 2572
  PV segment VG extent_count mismatch: 1417 != 2608
  Internal error: PV segments corrupted in vg_system.
  Volume group "vg_system" not found


Expected results:
vg_read() should not fail, and the partial VG should be backed up, as is the case with LVM2.

Additional info:
It might not be reasonable or possible to fix this, and it is perhaps a
lower-priority item.  However, it does play into people's trust (or lack thereof)
in LVM2 and fears of "losing data", so we should not just dismiss it, IMO.

Comment 1 Dave Wysochanski 2007-03-12 19:03:27 UTC
I wonder if making check_pv_segments a method of struct metadata_area_ops would
be a good part of the solution.  That way we could isolate the change to the lvm1
format and reduce risk.  It also seems to make sense because those summation
checks should never fail for LVM2 unless something is truly corrupted, whereas on
LVM1 they will fail whenever a PV is missing.
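The shape of that idea might look something like this (purely illustrative names; the real struct metadata_area_ops in lvm2 has a different set of methods):

```c
#include <assert.h>
#include <stddef.h>

struct volume_group;   /* opaque for this sketch */

/* Hypothetical per-format ops table: each metadata format supplies its
 * own segment-consistency check. */
struct metadata_area_ops {
    int (*check_pv_segments)(const struct volume_group *vg, int partial);
};

/* LVM2: the summation checks should never fail unless the metadata is
 * truly corrupted, so they stay strict even in partial mode. */
static int lvm2_check_pv_segments(const struct volume_group *vg, int partial)
{
    (void)vg; (void)partial;
    /* ... strict PE summation checks would go here ... */
    return 1;
}

/* LVM1: a missing PV makes the per-PV sums come up short, so skip the
 * PE checks when the partial flag is set. */
static int lvm1_check_pv_segments(const struct volume_group *vg, int partial)
{
    if (partial)
        return 1;   /* tolerate the mismatch; PVs are known to be missing */
    (void)vg;
    /* ... strict PE summation checks would go here ... */
    return 1;
}

static const struct metadata_area_ops lvm1_ops = { lvm1_check_pv_segments };
static const struct metadata_area_ops lvm2_ops = { lvm2_check_pv_segments };
```

Callers would dispatch through the format's ops table, so the relaxed behavior stays confined to the lvm1 code path.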

Comment 2 Dave Wysochanski 2007-03-12 19:25:30 UTC
Created attachment 149855 [details]
Major hack - just skip failure on PE checks in check_pv_segments and check_lv_segments if partial flag is set

Proof-of-concept patch.  Amazingly, this is all I needed to get past the
failure and write the backup file.  I'm not sure the backup file is usable, but
it looks pretty good.  'lvol1' through 'lvol3' look valid.  'lvol4' and
'lvol5' look like they were on the missing PV(s) - the backup file has a long
list of segments like this for 'lvol4' and 'lvol5':
			segment197 {
				start_extent = 196
				extent_count = 1	# 32 Megabytes

				type = "striped"
				stripe_count = 1	# linear

				stripes = [
					"Missing", 0
				]
			}

Comment 3 Alasdair Kergon 2007-03-14 17:05:23 UTC
There's clearly something wrong with the lvm1 format code.  See also the stripe
bug posted to lvm-devel.  The code that reads in the stripes is broken: it
misses a factor of #stripes when setting the total segment length, and the
algorithm that calculates the area length in the segment looks bogus.
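For reference, the length relationship that bug breaks (an illustrative sketch, not the actual lvm1-format reader): each of the stripes contributes its area length to the segment, so the total segment length must include the stripe-count factor.

```c
#include <assert.h>

/* For a striped segment, each of stripe_count areas contributes
 * area_len extents, so the total segment length is their product.
 * The reader bug described above drops the stripe_count factor. */
static unsigned striped_segment_len(unsigned area_len, unsigned stripe_count)
{
    return area_len * stripe_count;
}

/* Conversely, the per-stripe area length is the segment length divided
 * by the number of stripes. */
static unsigned striped_area_len(unsigned seg_len, unsigned stripe_count)
{
    return seg_len / stripe_count;
}
```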

Comment 4 Alasdair Kergon 2007-03-14 17:13:56 UTC
We also wanted it to give the PV IDs instead of 'Missing' and not to create a
separate segment for each extent.

Comment 5 Dave Wysochanski 2008-01-17 22:57:10 UTC
Not sure this is worth doing.

Comment 6 Bug Zapper 2008-05-14 02:40:16 UTC
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 7 Bug Zapper 2009-06-09 22:29:47 UTC
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '9'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 9's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 9 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 8 Bug Zapper 2009-07-14 15:19:13 UTC
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.