Summary:     vgchange -ay takes about 10 minutes
Product:     Fedora
Component:   lvm2
Version:     12
Status:      CLOSED WONTFIX
Reporter:    Joerg <joerg>
Assignee:    Jonathan Earl Brassow <jbrassow>
QA Contact:  Fedora Extras Quality Assurance <extras-qa>
CC:          agk, bmarzins, bmr, dwysocha, heinzm, jbrassow, jonathan, lvm-team, mbroz, msnitzer, prajnoha, prockai, support
Doc Type:    Bug Fix
Last Closed: 2010-12-03 13:50:22 UTC
Description Joerg 2010-06-16 17:20:27 UTC
Created attachment 424528 [details]
output of dmesg command

Description of problem:
During boot, activation of the volume group which holds the root filesystem takes about 10 minutes. I enabled verbose=1 in lvm.conf, ran dracut, and rebooted. The log (dmesg) is attached. Between the lines

  dracut: Resuming VG50-iscsi1_20100414 (253:14)
  dracut: Resuming VG50-iscsi1_20100513 (253:16)
  dracut: Resuming VG50-iscsi1_20100530 (253:18)
  dracut: Resuming VG50-iscsi1 (253:12)

and

  dracut: Found volume group "VG50"

there is a freeze of about 10 minutes.

Version-Release number of selected component (if applicable):

How reproducible:
Just reboot.

Steps to Reproduce:
1. Reboot.

Actual results:
Long boot.

Expected results:
Faster boot.

Additional info:
See attached file.
Comment 1 Jonathan Earl Brassow 2010-06-16 19:46:04 UTC
Is it possible to deactivate VG50 and reactivate it once the machine is up? Then you could collect the output of 'vgchange -ay -vvvv VG50' and attach it. As it is, the log you posted doesn't give us any more information. It may also be possible to collect extra information by setting the log level to '3', the highest setting.
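For reference, a minimal sketch of how such a trace could be collected on a test system where the VG does not hold a mounted filesystem (the log file name is only an example):

  # In lvm.conf, raise the verbosity of messages sent to stderr; 3 is the highest level:
  #   log {
  #       verbose = 3
  #   }
  vgchange -an VG50                              # deactivate the volume group
  vgchange -ay -vvvv VG50 2> vgchange-vvvv.log   # reactivate, capturing the debug output (stderr)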
Comment 2 Joerg 2010-06-16 20:28:14 UTC
Created attachment 424578 [details]
output of dmesg command, reduced to the dracut output

Hi, I cannot deactivate VG50 because it holds the rootfs (/). I modified lvm.conf to increase the log level, but it produced no more information. I have attached the vgchange part of the boot process. I have noticed that after each

  dracut: Loading VG50-iscsi1_20100530 table (253:18)

it freezes for some time (I made comments in the file; search for "---"). I also attach my lvm.conf.
Comment 4 Jonathan Earl Brassow 2010-06-16 20:45:32 UTC
hmmm, ok. You may be forced to update the initrd after making changes to lvm.conf; after all, the system cannot see lvm.conf before the root filesystem is available. Could you also provide some information about the LVs and VGs on the system? We may be able to attempt a reproduction if we have enough information.
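Roughly, that would look like the following on Fedora 12 (the initramfs path shown is only the usual default and may differ):

  # Rebuild the initramfs so the early-boot environment sees the edited lvm.conf
  dracut --force /boot/initramfs-$(uname -r).img $(uname -r)

  # Collect an overview of the PVs, VGs and LVs to attach to the bug
  pvs -v
  vgs -v
  lvs -a -o +devices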
Comment 5 Joerg 2010-06-17 15:59:12 UTC
Created attachment 424866 [details]
outputs of dmesg, pvs, vgs, lvs with highest verbosity

I have managed to get verbose = 3. Here is the result. The system freezes after:

1. dracut: dm reload (253:18) NF
2. dracut: dm reload (253:12) NF
3. dracut: dm reload (253:15) NF
Comment 6 Ray Morris 2010-07-09 00:18:39 UTC
I believe I have the same problem, and a more extreme case. When activating a VG, it hangs for a couple of minutes after printing "dm reload" for each snapshot. We have a lot of LVs, with a lot of snapshots, so the machine takes about 90 minutes to boot.

  #ioctl/libdm-iface.c:1702 dm create clones-c24_s1_snap2-cow LVM-xTknbZ2teX4F1abPCxOOwXiiIrj64dx0kf2agS7V0pyl1R8yJmg3xGH0a89DLLpI-cow NF
  #libdm-common.c:763 clones-c24_s1_snap2-cow: Stacking NODE_ADD (253,92) 0:6 0660
  #libdm-deptree.c:1524 Loading clones-c24_s1_snap2-cow table (253:92)
  #libdm-deptree.c:1470 Adding target to (253:92): 0 2621440 linear 8:48 2559737984
  #libdm-deptree.c:1470 Adding target to (253:92): 2621440 11141120 linear 8:48 2622652544
  #libdm-deptree.c:1470 Adding target to (253:92): 13762560 15728640 linear 8:48 3500441728
  #libdm-deptree.c:1470 Adding target to (253:92): 29491200 1310720 linear 8:48 3126886528
  #ioctl/libdm-iface.c:1702 dm table (253:92) OF
  #ioctl/libdm-iface.c:1702 dm reload (253:92) NF
  #libdm-deptree.c:1564 Table size changed from 0 to 30801920 for clones-c24_s1_snap2-cow
  #libdm-deptree.c:903 Resuming clones-c24_s1_snap2-cow (253:92)
  #ioctl/libdm-iface.c:1702 dm resume (253:92) NF
  #libdm-common.c:791 clones-c24_s1_snap2-cow: Stacking NODE_READ_AHEAD 256 (flags=1)
  #libdm-deptree.c:1196 Creating clones-c24_s1_snap2
  #ioctl/libdm-iface.c:1702 dm create clones-c24_s1_snap2 LVM-xTknbZ2teX4F1abPCxOOwXiiIrj64dx0kf2agS7V0pyl1R8yJmg3xGH0a89DLLpI NF
  #libdm-common.c:763 clones-c24_s1_snap2: Stacking NODE_ADD (253,93) 0:6 0660
  #libdm-deptree.c:1524 Loading clones-c24_s1_snap2 table (253:93)
  #libdm-deptree.c:1470 Adding target to (253:93): 0 109051904 snapshot 253:90 253:92 P 8
  #ioctl/libdm-iface.c:1702 dm table (253:93) OF
  #ioctl/libdm-iface.c:1702 dm reload (253:93) NF
Comment 7 Alasdair Kergon 2010-07-09 02:28:22 UTC
How large & full are the snapshot LVs? At that point it is reading the 'used' part of the snapshot LV to construct the snapshot 'overlay' table in memory.
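As a rough illustration of the cost: the snapshot whose table load is shown in comment 6 uses the default 4 KiB chunk size (the "P 8" in its table line, i.e. 8 sectors), and its COW area of roughly 15 GB is over 80% allocated (see comment 8 below), so on the order of three million chunk mappings have to be read back from the COW device before the snapshot can be resumed. The size and fill level of each snapshot can be checked with something like the following (the VG name "clones" is taken from comment 6; the exact report fields may vary by lvm2 version):

  # Per-snapshot size and COW allocation as reported by LVM
  lvs -o lv_name,origin,lv_size,snap_percent clones

  # The same information at the device-mapper level: a snapshot target's
  # status line reports <allocated>/<total> sectors of its COW device
  dmsetup status --target snapshot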
Comment 8 Ray Morris 2010-07-09 02:42:08 UTC
> How large & full are the snapshot LVs?

I failed to note other examples, but the one LV which I know was slow is as follows:

  LV Name                /dev/clones/c24_s1_snap2
  VG Name                clones
  LV UUID                kf2agS-7V0p-yl1R-8yJm-g3xG-H0a8-9DLLpI
  LV Write Access        read/write
  LV snapshot status     active destination for /dev/clones/c24_s1
  LV Status              available
  # open                 0
  LV Size                52.00 GB
  Current LE             832
  COW-table size         14.69 GB
  COW-table LE           235
  Allocated to snapshot  82.34%

> At that point it is reading the 'used' part of the snapshot LV to
> construct the snapshot 'overlay' table in memory.

I may be barely qualified enough and just motivated enough to add new code to save that information on disk rather than reconstructing it with each activation, if someone were kind enough to point me in the right direction. I haven't read the LVM source, but I do have a pretty good understanding of how it works, I believe.
Comment 9 Ray Morris 2010-07-09 02:44:36 UTC
I should mention most of the other snapshots are not that large. A more typical example might be about 1/3rd that size:

  LV Size                113.00 GB
  Current LE             1808
  COW-table size         4.19 GB
  COW-table LE           67
  Allocated to snapshot  83.07%
Comment 10 Joerg 2010-07-09 16:20:08 UTC
My snapshots are 10 GB in size and filled somewhere between 30% and 90%.
Comment 11 Bug Zapper 2010-11-03 13:09:06 UTC
This message is a reminder that Fedora 12 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 12. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 12 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 12 Bug Zapper 2010-12-03 13:50:22 UTC
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.
Comment 13 Alasdair Kergon 2010-12-03 14:37:24 UTC
The current snapshot implementation is slow to load. That won't change. New implementations are being worked on which will be much faster.