Red Hat Bugzilla – Bug 261521
pvdisplay of 250 luns with 4 paths each (1000 paths) takes many hours or days and consumes 4+GB of RAM
Last modified: 2013-02-28 23:05:50 EST
Basic issue is pvdisplay is taking an unreasonably long time and consuming an
unreasonably large amount of RAM.
I tried a simple test with rhel4u5 xen node connected to iSCSI LUNs (about 100
luns each with 2 paths) and got execution time of around 33s and memory
consumption of 14MB.
Did a quick check of the lvmdump provided (see attached) - checked the
/etc/lvm/lvm.conf filter line and it looks correct, as does the .cache file.
Looks like the pvdisplay process was blocked doing direct IO when the cmd was
run (not surprising).
Might be storage dependent (EMC with powerpath).
Snips from IRC session:
Aug 27 15:40:45 <deepthot> is this a cluster or just single node with
Aug 27 15:40:55 <deepthot> the output from lvmdump will tell a lot of the
Aug 27 15:57:51 <csm-laptop> 64 gig of ram
Aug 27 15:58:05 <csm-laptop> it is not a cluster
Aug 27 15:58:21 <deepthot> could be related to multipath - something like
Aug 27 15:58:37 <deepthot> so you have 1000 luns, 4 paths / LUN, and so
4000 paths total?
Aug 27 15:58:46 <deepthot> That will definitely cause a problem with
Aug 27 15:59:03 <deepthot> I am not sure it is your pvdisplay problem
though - could be another problem altogether
Aug 27 16:00:43 <csm-laptop> i see no mp file in etc at all
Aug 27 16:01:45 <csm-laptop> so this is using powerpath
Aug 27 16:01:54 <csm-laptop> i have the storage guy sitting next to me
Aug 27 16:02:50 <deepthot> csm-laptop: ok, I'm not familiar with powerpath
Aug 27 16:02:59 <deepthot> know what it is, but don't know it really
Aug 27 16:03:44 <csm-laptop> 225 devices * 4 paths = 900 paths
Aug 27 16:05:36 <csm-laptop> okay so pvs is just sort of hanging there too
Aug 27 16:18:52 <deepthot> are you getting any output at all, or is it just
hanging and consuming memory?
Aug 27 16:22:28 <deepthot> I'm looking at the code now and have a rhel4u5
machine I can work with
Aug 27 16:22:45 <deepthot> on my setup, I do see the memory consumption
going up as the command executes - I have 100 LUNs with 2 paths each
Aug 27 16:23:48 <deepthot> Output is somewhat slow, but it does complete -
I'm getting like 14MB consumption and ~33s execution time for ~200 paths so
nowhere near what you are seeing
Aug 27 16:32:38 <deepthot> have you tried tweaking the "filter" line in
/etc/lvm/lvm.conf?
Aug 27 16:32:48 <csm-laptop> this has been running for hours and has not finished
Aug 27 16:33:34 <deepthot> If you know, for instance, that all /dev/sd*
devices are underlying paths, and you just want to scan /dev/foobar* devices
instead (these are the multipath devices), you could add a filter line that
would exclude them
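A sketch of such a filter (device names are assumptions; PowerPath typically presents multipath devices as /dev/emcpower*, but verify the actual names on the affected system):

```
# /etc/lvm/lvm.conf -- illustrative only
# accept the multipath devices, reject the underlying /dev/sd* paths,
# then reject everything else
filter = [ "a|^/dev/emcpower.*|", "r|^/dev/sd.*|", "r|.*|" ]
```

LVM applies the first pattern that matches a device, so the accept rule must come before the catch-all reject.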
Aug 27 16:33:58 <deepthot> email@example.com
Aug 27 16:34:28 <deepthot> and you can't kill it?
Aug 27 16:35:42 <csm-laptop> email on its way
Aug 27 16:37:06 <csm-laptop> I have not tried to kill the process
Aug 27 16:37:31 <csm-laptop> also I have not tweaked the filter
Aug 27 16:37:44 <csm-laptop> take a look at the lvmdump and lets talk tomorrow?
Aug 28 14:49:55 <deepthot> So are you getting any output at all, or does it
Aug 28 15:00:01 <csm-laptop> eventually the process finishes.... it just
takes freaking forever!
Aug 28 15:00:45 <deepthot> so on my system, it looks like I get output for
one pv, followed by a pause of say 500ms, followed by the next one,
Aug 28 15:01:02 <deepthot> but on your system, the pause is in minutes,
hours, or days?
Aug 28 15:01:24 <csm-laptop> hours
Aug 28 15:04:55 <deepthot> did you try running pvdisplay on just one of the
Aug 28 15:06:00 <deepthot> you could try running on one PV in verbose mode,
e.g. "pvdisplay -vvvv /dev/mapper/mpath0 >output.txt 2>&1"
Aug 28 15:07:43 <csm-laptop> well I can't test anything as the host is down
for maintenance right now
Created attachment 177301 [details]
lvmdump file of system with the problem
This problem continues to haunt us here at Bloomberg. Has there been any
progress on this issue?
More importantly, the issue also causes extremely long boot times (1-2 hours)
because the vgscan in the init scripts has similar behavior. It appears to
perform (number of devices)^2 device stats (about ~800k in our case) - for
every device it finds, it appears to recheck every device, including the ones
already checked. Also, the filter in lvm.conf is configured to ignore /dev/sd*
devices (the LUNs); however this appears to only apply as to whether or not it
will consider any metadata on the devices - it still stats the
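As a back-of-the-envelope check on the ~800k figure (a sketch assuming roughly 900 device nodes visible to the scan):

```shell
# If every device found triggers a re-stat of all N devices,
# total stat() calls grow as N*N.
N=900                # ~225 LUNs x 4 paths
echo $((N * N))      # 810000 -- on the order of the ~800k stats observed
```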
I have changed the priority on this to medium instead of low... given the nature
of the problem and the fact that the machine, on boot, is out of service so long
it seems to merit that. Please change it back if I am wrong.
Created attachment 207341 [details]
I have not made any progress on this. I am about to leave on a short vacation
but will try to take at least a brief look when I get back next week.
I think I know roughly why this is but not sure how hard it is to fix. Probably
not easy but maybe there is something we can do to improve the situation.
If there is anything you would like us to provide or test please let us know.
Presuming that vacation is over, do we have anything to report about this?
Not yet - other things getting in the way sorry.
Did you set up the VG specifically to contain a large number of PVs or are you
just using the default settings? (See man pvcreate --metadatacopies and
[We know about the two performance enhancements needed (lack of internal
metadata caching so operations are repeated needlessly; lack of automated VG
metadata area management).]
In our testing here the first point you make about repeated operations seems to
be our likely problem. I am working on getting answers to how this was
created... since I didn't do it I really don't know.
I have confirmed that the default settings were used in creation of the PVs.
Customer in IT 133260 seeing this as well. I've reproduced this internally with
about 500 (small) PVs created with default options. Using pvcreate with
--metadatacopies 0 gets rid of the huge delays on VG/LV/PV operations.
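For reference, a sketch of that workaround (device names are illustrative; try it on scratch devices only):

```
# Keep a metadata area on one PV and none on the rest, so tools no
# longer read and rewrite hundreds of metadata copies per operation.
pvcreate --metadatacopies 1 /dev/mapper/mpatha
pvcreate --metadatacopies 0 /dev/mapper/mpathb /dev/mapper/mpathc
pvcreate --metadatacopies 0 /dev/mapper/mpathd
vgcreate bigvg /dev/mapper/mpath[a-d]
```

At least one PV in the VG must still carry a metadata area, so use --metadatacopies 0 for the majority of PVs but not all of them.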
Does the suggestion in bug 229560 make sense here (add VG name to .cache file)?
What progress do we see on this? Customer wants to know.
Largely the answer is removing the majority of MDAs as discussed. It's the
direction upstream seems to be taking. There are also tool updates coming down
the pipe which will help with managing such a setup.
Anything new to report on this? It's a month on from the last update and I am
sure to get hammered soon!
----- Additional Comments From firstname.lastname@example.org 2008-02-01 03:39 EDT
Is there any update at the Red Hat site for this Bugzilla? Do you need
assistance from IBM?
This event sent from IssueTracker by jkachuck
There are basically two steps we are working on to speed up this process:
1) use internal cache for device labels
2) use internal cache for metadata areas
A solution for problem 1) was just submitted in upstream code (but it needs
some subsequent patches for non-MDA PVs); we are working on issue 2).
I will update this bugzilla when patches are ready.
Then some testing on affected configuration would be nice of course.
Any further updates regarding the availability of patches?
So it's almost 2 months from the last update at this point, the Solaris and AIX
people are laughing about how long this is taking and the lack of patch
availability. I have to admit that this is less than optimal in terms of
support for an "Enterprise" solution.
The fix for this BZ is planned for RHEL 4.7. A prerequisite is to get the change
reviewed and accepted upstream, and thoroughly tested. This work is underway,
and continues to be a high priority.
Setting this bug to POST status because crucial patch (solving the activation
time) is now in upstream CVS.
(Several previous commits were already in tree and solved partial problems -
like caching of device labels; see comment #36.)
Anyway, several steps are needed now to prepare a test package for RHEL4; I
will update this bugzilla when we have packages ready.
A testing build for RHEL4 already exists now.
If anyone wants to test it before it reaches the public beta testing phase,
please contact Red Hat support.
(For reference, upstream package containing fixes is LVM2 2.02.35 release.)
Thanks for your patience.
Added storage-related partners for their heads-up and request for testing.
----- Additional Comments From email@example.com 2008-07-01 05:33 EDT
Hello Red Hat,
Can you please post your test results for the improved fix?
This event sent from IssueTracker by jkachuck
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.