Bug 672782 - lvm2-monitor script does more than necessary
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: lvm2
Version: rawhide
Assigned To: LVM and device-mapper development team
QA Contact: Fedora Extras Quality Assurance
Reported: 2011-01-26 06:13 EST by Peter Hjalmarsson
Modified: 2011-08-26 07:08 EDT
CC: 11 users

Last Closed: 2011-08-26 07:01:02 EDT

Description Peter Hjalmarsson 2011-01-26 06:13:38 EST
Description of problem:
Currently it seems that systemd in rawhide executes lvm2-monitor during boot.
The problem is that, in essence, what this script does is:
"for vg in $(vgs) ; do echo "doing something with $vg" && vgchange --monitor y --poll y $vg ; done"
However, my experiments on this machine and others suggest that the following is enough to enable monitoring and polling on all active volume groups:
"vgchange --monitor y --poll y"

The only difference between the two may be the output, but under systemd that output does not seem to show up anywhere anyway.

My bootchart suggests that dropping the vgs call also removes a process that runs for about 1 second with heavy CPU/IO from the boot process.
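
For clarity, the current behaviour versus the proposed one boils down to roughly this (just a sketch; the exact vgs invocation in the shipped init script is an assumption on my part):

# roughly what lvm2-monitor does today (vgs options illustrative)
for vg in $(vgs --noheadings -o vg_name 2>/dev/null)
do
	vgchange --monitor y --poll y "$vg"
done

# proposed replacement: a single call, no vgs
vgchange --monitor y --poll y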

Version-Release number of selected component (if applicable):
lvm2.x86_64                        2.02.82-2.fc15                       @rawhide
lvm2-libs.x86_64                   2.02.82-2.fc15                       @rawhide
systemd.x86_64                     17-1.fc15                            @rawhide
systemd-units.x86_64               17-1.fc15                            @rawhide
initscripts.x86_64                 9.24-1.fc15                          @rawhide
initscripts-legacy.x86_64          9.24-1.fc15                          @rawhide

How reproducible:
Always

Steps to Reproduce:
1. Utilize LVM2
2. Use bootchart
3. Look at the results
  
Actual results:
You will see a vgs process followed by a vgchange process.

Expected results:
You will only see the vgchange process.

Additional info:
I think this may remove some overhead from boot-up: the whole vgs call seems unnecessary in the first place, and running vgchange multiple times when one call is enough seems like overkill.
Comment 1 Milan Broz 2011-01-26 07:36:30 EST
There is one important difference in vgchange:

vgchange -a y (no explicit vg)
will not activate VG if the device was not previously in lvm device cache.

vgchange -a y VG
will try to run full device scan if it cannot find device in its cache

So this change could cause some VGs not to be activated. But active volume groups should be in the cache anyway. I am not sure why it was written this way.

One day the device scan will be a udev job and this will no longer be a problem...
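
To make the difference concrete, a minimal experiment could look like this (just a sketch; the VG name "somevg" is hypothetical and the cache file path is assumed to be /etc/lvm/cache/.cache):

# drop lvm's device cache so neither form can rely on it
rm -f /etc/lvm/cache/.cache

# no explicit VG: only devices already known to the cache are considered,
# so a VG on a not-yet-seen device may stay inactive
vgchange -a y

# explicit VG: lvm falls back to a full device scan if the VG is not
# found in its cache, so it still gets activated
vgchange -a y somevg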
Comment 2 Peter Hjalmarsson 2011-01-26 08:50:58 EST
(In reply to comment #1)
> There is one important difference in vgchange:
> 
> vgchange -a y (no explicit vg)
> will not activate VG if the device was not previously in lvm device cache.
> 
> vgchange -a y VG
> will try to run full device scan if it cannot find device in its cache
>

We are talking about different scripts.

You are talking about the initial run of vgchange that exists to activate all present volume groups. That run is usually invoked with --sysinit because monitoring and polling may need things that are not yet mounted/remounted rw/available.

I am talking about the later script that is supposed to start monitoring and polling for all volume groups that the former step found, which are thus already activated but do not yet have monitoring and polling enabled.
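
In other words, the two boot-time steps are roughly these (a sketch; flags as used in this discussion):

# step 1: early activation of all present VGs; --sysinit is used because
# monitoring/polling may need things that are not yet available
vgchange -a y --sysinit

# step 2, the lvm2-monitor script discussed here: enable monitoring and
# polling on the VGs that step 1 already activated
vgchange --monitor y --poll y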


I am not currently able to test this, but I am pretty sure that
"vgchange -an test && vgchange --monitor y --poll y test" will fail, since volume group test is not activated and vgchange is not told to activate it (and the lvm2-monitor script only does the latter).
Comment 3 Peter Hjalmarsson 2011-01-26 08:59:59 EST
(In reply to comment #2)
> I am not currently able to test but I am pretty sure that
> "vgchange -an test && vgchange --monitoring y --poll y test" will fail since
> volumegroup test is not activated and vgchange is not told to activate it (and
> lvm2-monitoring only does the latter).

I found an opportunity to try this, and "vgchange --monitor y --poll y test" (which is exactly what lvm2-monitor does) returns without doing anything unless volume group test has been activated beforehand.
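
For reference, the check amounts to something like this (sketch; VG name "test" as above):

vgchange -an test                     # deactivate the VG
vgchange --monitor y --poll y test    # returns without enabling anything
vgchange -ay test                     # activate it again
vgchange --monitor y --poll y test    # now monitoring/polling get enabled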
Comment 4 Peter Hjalmarsson 2011-04-13 05:12:49 EDT
Ok, can anyone have a look and comment on this?

"vgchange --monitor y --poll y $vg" does NOTHING unless $vg has already been marked available before (i.e. "vgchange -ay $vg").
However "vgchange --monitor -y --poll y" enables polling and monitoring for all volumegroups already marked avaible.

So you have a script that runs vgs to find volumegroups (which will only find already cached volumegroups if vgscan has not been run before) to find all volumegroups in the system, so you can run a command on them which does nothing unless the volume group is already in cache and has already been marked available.

The script can just run vgchange which looks into the cache to find available volumegroups and start monitoring/polling on them.

Which sounds more over-engineered?
Which sounds like it does more CPU/IO?
Which sounds like the one booting faster?

According to "systemd-analyze blame", this change cuts about 1000ms from the booting time, on a system running one volumegroup on one pretty fast harddrive.
On my more complex desktop and home-server the impact were bigger.
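
For the record, the comparison was along these lines (the exact service name in the blame output is an assumption; it may differ between releases):

# run before and after the change and compare the lvm2-monitor entry
systemd-analyze blame | grep -i lvm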


So could this change please be made, or could I get a relevant comment on why not (since comment 2 is not relevant to this case)?
Comment 5 Peter Hjalmarsson 2011-04-13 09:58:48 EDT
So after talking in #lvm on freenode I got a comment from asalor about this having something to do with vgchange doing a full scan of some sort when it is not given a VG, which takes precious time.

I think I will question that statement here and now:

The system:
1 Core i7 920 CPU, 4 cores, HT enabled.
6 GB memory.

linux 2.6.38.2
udev-167
lvm2-2.02.84

8 ATA drives connected (6 SATA, 2 PATA on two different controllers), where two SATA drives are connected to a fakeraid for Windows, and two others are connected to a fakeraid for Linux.

The Linux fakeraid hosts a VG with 8 LVs (lillen). Another drive hosts one VG containing one LV (lillen-home).

So two VGs, a lot of LVs and a lot of unrelated disks/partitions for LVM to scan.


Now to the numbers (we all love numbers):

The first while loop is essentially what the monitoring init script does currently (call vgchange once for each VG).
The second one is what the script could do (call vgchange once with all VGs).
The third one is my proposal (remove vgs and just run vgchange once).

I ran this test a couple of times, both with a USB stick connected and without, and it did not alter the results (the +/- 3s spread was there for all results no matter what).


# sh --verbose lvm-test.sh 
vgs="lillen lillen-home"

vgscan && vgchange -ay --monitor y --poll y && udevadm settle && sync
  Reading all physical volumes.  This may take a while...
  Found volume group "lillen-home" using metadata type lvm2
  Found volume group "lillen" using metadata type lvm2
  1 logical volume(s) in volume group "lillen-home" now active
  8 logical volume(s) in volume group "lillen" now active


id=0
time while [ $((id++)) -ne 50 ]
do
	for vg in ${vgs}
	do
		vgchange --monitor n --poll n $vgs > /dev/null
		vgchange --monitor y --poll y $vg > /dev/null
	done
done

real	1m13.471s
user	0m0.517s
sys	0m2.791s


id=0
time while [ $((id++)) -ne 50 ]
do
	vgchange --monitor n --poll n $vgs > /dev/null
	vgchange --monitor y --poll y $vgs > /dev/null
done

real	0m36.839s
user	0m0.286s
sys	0m1.558s


id=0
time while [ $((id++)) -ne 50 ]
do
	vgchange --monitor n --poll n > /dev/null
	vgchange --monitor y --poll y > /dev/null
done

real	0m37.062s
user	0m0.268s
sys	0m1.373s
Comment 6 Milan Broz 2011-04-13 10:52:24 EDT
(In reply to comment #5)
> So after being talking in #lvm@freenode I got a comment from asalor about this
> having something to do with how vgchange did a full scan of some sort if it was
> not given a VG, which took precious time.

As I said, all this will change with the removal of the internal scan and its replacement with the libudev interface, so it really makes no sense to optimise this now (the monitoring script is not optimal; I hope it will be fixed soon).

All the info I have is that the libudev scan is planned for the next upstream release, so let's wait for that and then decide what to do with the monitoring initscript.

> vgscan && vgchange -ay --monitor y --poll y && udevadm settle && sync

"udevadm settle && sync" should be NOOP here

>  for vg in ${vgs}
>  do
>   vgchange --monitor n --poll n $vgs > /dev/null

$vgs ? or $vg?

Anyway, this loop really makes no sense - this is run only once during startup.
Better to compare the -vvvv output for both commands - it is possible that it really does more than necessary.

>   vgchange --monitor y --poll y $vg > /dev/null
>  done
> done
Comment 7 Peter Hjalmarsson 2011-04-13 16:13:42 EDT
(In reply to comment #6)
> As I said, all this will change with removal of internal scan and replacing it
> with libudev interface, so it really make no sense to optimise this now
> (monitoring script is not optimal, I hope it will be fixed soon.)
> 

Well, the problem I have here is that I cannot see why not fix the monitoring script already, since the problem I see is an unnecessary step (vgs), and that does not change whether LVM2 is using its own cache or is able to use the udev db.

> $vgs ? or $vg?
> 

$vgs; both loops un-monitor and un-poll everything, as the thought was that only the "enabling" part should have a chance of impacting the time differently.

Still, after fixing the first loop so the un-monitoring/un-polling only gets done once per while iteration, it still takes about 50% more time than the one calling vgchange only once.
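
Concretely, the "fixed" first loop referred to above looks something like this (sketch):

id=0
time while [ $((id++)) -ne 50 ]
do
	vgchange --monitor n --poll n $vgs > /dev/null
	for vg in ${vgs}
	do
		vgchange --monitor y --poll y $vg > /dev/null
	done
done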

So in short, what I was trying to show was three different scenarios:

1. how the script currently works (uses vgs and one vgchange call per volume group)
2. a trivial change to the script (uses vgs and one vgchange call for all volume groups)
3. how the script should work (no vgs, one vgchange call for all volume groups)

The timing said the following:
1. was about 50% slower than the other two (not counting vgs, probably because of the extra vgchange calls)
2. was about the same speed as 3 (not counting vgs)
3. had the same speed as 2 (but would not need vgs)


> anyway, this loop really makes no sense - this is run only once during startup.

Well, if you want to know how costly one call is, it does make sense.
It tells me that even with hot caches, on available but unmonitored/unpolled volume groups (which is the state they should be in when lvm2-monitor is run), "vgchange --monitor y --poll y lillen && vgchange --monitor y --poll y lillen-home" takes a lot more time than "vgchange --monitor y --poll y lillen lillen-home" (not surprising).
And "vgchange --monitor y --poll y lillen lillen-home" takes the same time as "vgchange --monitor y --poll y" (which of the two came out ahead changed from run to run).

> better compare -vvvv output forboth commands - it is possible that it really do
> more than neccessary.
> 

Well, the only extra thing that seems to get done is one more call to the cache if you do not specify a volume group (probably the call made to find the volume groups).


And here are my two current questions:
What will change if those calls are made to the udev db instead of LVM's own cache?
Which is more time-costly during boot: having vgs do the call to find the volume groups, or having vgchange do that call itself?



lillen ~ # vgchange --monitor n --poll n lillen-home lillen
  1 logical volume(s) in volume group "lillen-home" unmonitored
  8 logical volume(s) in volume group "lillen" unmonitored

lillen ~ # vgchange -vvvv --monitor y --poll y lillen-home lillen > with-vgs 2>&1

lillen ~ # vgchange --monitor n --poll n lillen-home lillen
  1 logical volume(s) in volume group "lillen-home" unmonitored
  8 logical volume(s) in volume group "lillen" unmonitored

lillen ~ # vgchange -vvvv --monitor y --poll y > without-vgs 2>&1

lillen ~ # diff -u with*
--- without-vgs	2011-04-13 21:44:28.472983804 +0200
+++ with-vgs	2011-04-13 21:44:14.589883412 +0200
@@ -1,10 +1,15 @@
-#lvmcmdline.c:1089         Processing: vgchange -vvvv --monitor y --poll y
+#lvmcmdline.c:1089         Processing: vgchange -vvvv --monitor y --poll y lillen-home lillen
 #lvmcmdline.c:1092         O_DIRECT will be used
 #config/config.c:994       Setting global/locking_type to 1
 #config/config.c:994       Setting global/wait_for_locks to 1
 #locking/locking.c:240       File-based locking selected.
 #config/config.c:971       Setting global/locking_dir to /var/lock/lvm
-#toollib.c:566     Finding all volume groups
+#toollib.c:526     Using volume group(s) on command line
+#toollib.c:466     Finding volume group "lillen-home"
+#locking/file_locking.c:235       Locking /var/lock/lvm/V_lillen-home RB
+#locking/file_locking.c:141         _do_flock /var/lock/lvm/V_lillen-home:aux WB
+#locking/file_locking.c:51         _undo_flock /var/lock/lvm/V_lillen-home:aux
+#locking/file_locking.c:141         _do_flock /var/lock/lvm/V_lillen-home RB
 #device/dev-io.c:495         Opened /dev/ram0 RO O_DIRECT
 #device/dev-io.c:134         /dev/ram0: block size is 4096 bytes
 #label/label.c:184       /dev/ram0: No label detected
@@ -220,7 +225,6 @@
 #cache/lvmcache.c:1166         lvmcache: /dev/sde1: now in VG lillen-home with 1 mdas
 #cache/lvmcache.c:947         lvmcache: /dev/sde1: setting lillen-home VGID to Mzz7qN0lw9i4XRrXtghA9uiXebR3lK0r
 #cache/lvmcache.c:1203         lvmcache: /dev/sde1: VG lillen-home: Set creation host to lillen.
-#device/dev-io.c:541         Closed /dev/sde1
 #device/dev-io.c:495         Opened /dev/sdf1 RO O_DIRECT
 #device/dev-io.c:134         /dev/sdf1: block size is 4096 bytes
 #label/label.c:184       /dev/sdf1: No label detected
@@ -233,19 +237,7 @@
 #device/dev-io.c:134         /dev/sdg: block size is 4096 bytes
 #label/label.c:184       /dev/sdg: No label detected
 #device/dev-io.c:541         Closed /dev/sdg
-#toollib.c:466     Finding volume group "lillen-home"
-#locking/file_locking.c:235       Locking /var/lock/lvm/V_lillen-home RB
-#locking/file_locking.c:141         _do_flock /var/lock/lvm/V_lillen-home:aux WB
-#locking/file_locking.c:51         _undo_flock /var/lock/lvm/V_lillen-home:aux
-#locking/file_locking.c:141         _do_flock /var/lock/lvm/V_lillen-home RB
-#device/dev-io.c:495         Opened /dev/sde1 RO O_DIRECT
-#device/dev-io.c:134         /dev/sde1: block size is 512 bytes
-#label/label.c:160       /dev/sde1: lvm2 label detected
-#cache/lvmcache.c:1166         lvmcache: /dev/sde1: now in VG #orphans_lvm2 (#orphans_lvm2) with 1 mdas
-#format_text/format-text.c:1186         /dev/sde1: Found metadata at 6656 size 955 (in area at 4096 size 1044480) for lillen-home (Mzz7qN-0lw9-i4XR-rXtg-hA9u-iXeb-R3lK0r)
-#cache/lvmcache.c:1166         lvmcache: /dev/sde1: now in VG lillen-home with 1 mdas
-#cache/lvmcache.c:947         lvmcache: /dev/sde1: setting lillen-home VGID to Mzz7qN0lw9i4XRrXtghA9uiXebR3lK0r
-#cache/lvmcache.c:1203         lvmcache: /dev/sde1: VG lillen-home: Set creation host to lillen.
+#label/label.c:270         Using cached label for /dev/sde1
 #label/label.c:270         Using cached label for /dev/sde1
 #format_text/format-text.c:524         Read lillen-home metadata (3) from /dev/sde1 at 6656 size 955
 #cache/lvmcache.c:126         Metadata cache: VG lillen-home (Mzz7qN-0lw9-i4XR-rXtg-hA9u-iXeb-R3lK0r) stored (953 bytes).

lillen ~ # grep "Finding volume group" with-vgs 
#toollib.c:466     Finding volume group "lillen-home"
#toollib.c:466     Finding volume group "lillen"

lillen ~ # grep "Finding volume group" without-vgs 
#toollib.c:466     Finding volume group "lillen-home"
#toollib.c:466     Finding volume group "lillen"
Comment 8 Peter Rajnoha 2011-08-26 07:01:02 EDT
lvm2 monitoring has been initialized via a systemd unit since version 2.02.86-2, where we call "vgchange --monitor y" directly, without looping over all VGs (and calling "vgs" beforehand) the way the legacy SysV init script did.

Also, we now use libudev to get the list of usable devices, as already mentioned in comment #6 (so /etc/lvm/cache/.cache is obsolete). I'm closing this bug then.
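
In shell terms the change amounts to this (illustrative only, not the literal unit file contents):

# legacy SysV init script, roughly:
for vg in $(vgs --noheadings -o vg_name 2>/dev/null)
do
	vgchange --monitor y "$vg"
done

# systemd unit since 2.02.86-2: one direct call, no vgs, no loop
vgchange --monitor y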
Comment 9 Peter Rajnoha 2011-08-26 07:08:46 EDT
(In reply to comment #7)
> What will change if those calls are made to the udev db instead of LVM's own cache?

The udev db always has the most recent state - it is updated automatically by udevd any time a new device appears. The LVM .cache file had to be updated explicitly within the lvm command's execution, and that differed based on how the command was called - a full VG scan was not always done, as already pointed out in comment #1.
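
As a side note, whether an lvm2 build of that era uses libudev for the device list can be checked in lvm.conf (a sketch; the option name is an assumption based on the lvm.conf of that time):

# devices/obtain_device_list_from_udev = 1 means lvm asks libudev for the
# device list instead of maintaining /etc/lvm/cache/.cache
grep obtain_device_list_from_udev /etc/lvm/lvm.conf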
