Two problems actually, both with the same consequence:

1 - vgscan bug
==============
If lvmetad is running but does not contain any metadata yet, the initial vgscan does not populate it with the existing PV/VG information. This is a problem mainly for the lvmetad init script, which calls vgscan to initialise lvmetad. If vgscan does not populate it, the PVs/VGs stay invisible to any subsequent LVM commands.

"pvscan --cache" works correctly, though. Any device that appears after lvmetad is initialised is also processed correctly, since the udev rule already calls "pvscan --cache". So it is only vgscan that does not work with lvmetad.

I've changed the vgscan to pvscan in the init script (as per Mornfall's recommendation), but we should fix vgscan as well.

2 - init script not synchronized (a problem I haven't actually hit, but I can imagine someone could hit it sooner or later)
==============
There's a possible race while running the init script: we start the lvmetad daemon and then call vgscan/pvscan. The socket may not be ready yet, in which case vgscan/pvscan falls back to normal scanning without populating lvmetad, and lvmetad then misses this information.

A quick solution would be to (actively) wait for the socket to appear and call vgscan/pvscan only once we're sure the daemon is ready to accept connections.

(Note: systemd is fine here since the socket is already prepared and any request on the socket is buffered until the daemon is ready to process it.)
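A minimal sketch of that active wait, assuming the init script is plain shell and that lvmetad's socket lives at /var/run/lvm/lvmetad.socket; the path, the timeout and the fallback behaviour are assumptions for illustration, not text from the shipped script:

  # Sketch only: actively wait for the lvmetad socket before scanning.
  # Socket path and timeout are assumptions, not the shipped defaults.
  LVMETAD_SOCKET="/var/run/lvm/lvmetad.socket"

  wait_for_lvmetad() {
          tries=0
          # Poll for up to ~5 seconds for the socket to appear.
          while [ ! -S "$LVMETAD_SOCKET" ]; do
                  tries=$((tries + 1))
                  [ "$tries" -ge 5 ] && return 1
                  sleep 1
          done
          return 0
  }

  if wait_for_lvmetad; then
          # Populate lvmetad only once the daemon can accept connections.
          pvscan --cache
  else
          # Fallback choice (assumption): do a direct scan without lvmetad.
          echo "lvmetad socket did not appear, scanning directly" >&2
          vgscan
  fi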
(In reply to comment #0)
> 2 - init script not synchronized (the problem I haven't actually hit, but I
> can imagine someone could hit it sooner or later)
> ==============

Well, looking at the code, we have:

  if (!_systemd_activation && s.socket_path) {
          s.socket_fd = _open_socket(s);
          if (s.socket_fd < 0)
                  failed = 1;
  }

  /* Signal parent, letting them know we are ready to go. */
  if (!s.foreground)
          kill(getppid(), SIGTERM);

The daemon opens the socket before it signals its parent, so fortunately that problem cannot arise. The only remaining problem is #1.
The patch for vgscan is proposed here:
https://www.redhat.com/archives/lvm-devel/2012-March/msg00137.html

It's debatable whether this is the right approach: since we have a separate pvscan and "pvscan --cache", vgscan should probably follow the same principle. In any case, I've already changed the init script to call "pvscan --cache" instead, which reliably does what we need.
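For illustration, a minimal before/after sketch of that init script change; the redirections and surrounding script text are assumptions, only the switch from vgscan to "pvscan --cache" comes from this report:

  # Before (sketch): vgscan did not push existing PV/VG metadata into a
  # freshly started, still-empty lvmetad.
  #vgscan > /dev/null 2>&1

  # After (sketch): "pvscan --cache" scans all visible PVs and feeds their
  # metadata to lvmetad, so subsequent LVM commands see the existing VGs.
  pvscan --cache > /dev/null 2>&1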
I've added the "--cache" option to vgscan as well, so it behaves the same way as pvscan/"pvscan --cache". This is more consistent and less confusing for users. (The original bug is already fixed by using "pvscan --cache" in the init script.)
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
No Documentation needed.
So now we have (using preexisting "vg" as testing volume group):

global/use_lvmetad=0 + lvmetad not running
------------------------------------------
[0] devel/~ # vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "vg" using metadata type lvm2

[0] devel/~ # vgscan --cache
  Cannot proceed since lvmetad is not active.

global/use_lvmetad=0 + lvmetad running
--------------------------------------
[0] devel/~ # vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "vg" using metadata type lvm2

[0] devel/~ # vgscan --cache
  Cannot proceed since lvmetad is not active.

global/use_lvmetad=1 + lvmetad running
--------------------------------------
-> first vgscan run after running lvmetad

[0] devel/~ # vgscan
  Reading all physical volumes.  This may take a while...
  No volume groups found

[0] devel/~ # vgscan --cache
  Reading all physical volumes.  This may take a while...
  Found volume group "vg" using metadata type lvm2

-> second and further vgscan runs after running lvmetad

[0] devel/~ # vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "vg" using metadata type lvm2

[0] devel/~ # vgscan --cache
  Reading all physical volumes.  This may take a while...
  Found volume group "vg" using metadata type lvm2

global/use_lvmetad=1 + lvmetad not running
------------------------------------------
[0] devel/~ # vgscan
  WARNING: Failed to connect to lvmetad: No such file or directory. Falling back to internal scanning.
  Reading all physical volumes.  This may take a while...
  Found volume group "vg" using metadata type lvm2

[0] devel/~ # vgscan --cache
  WARNING: Failed to connect to lvmetad: No such file or directory. Falling back to internal scanning.
  Cannot proceed since lvmetad is not active.

That's the same sort of behaviour as we already have in pvscan.
Verified the test case listed in comment #5 works as expected in the latest rpms.

2.6.32-268.el6.x86_64
lvm2-2.02.95-6.el6                          BUILT: Wed Apr 25 04:39:34 CDT 2012
lvm2-libs-2.02.95-6.el6                     BUILT: Wed Apr 25 04:39:34 CDT 2012
lvm2-cluster-2.02.95-6.el6                  BUILT: Wed Apr 25 04:39:34 CDT 2012
udev-147-2.41.el6                           BUILT: Thu Mar 1 13:01:08 CST 2012
device-mapper-1.02.74-6.el6                 BUILT: Wed Apr 25 04:39:34 CDT 2012
device-mapper-libs-1.02.74-6.el6            BUILT: Wed Apr 25 04:39:34 CDT 2012
device-mapper-event-1.02.74-6.el6           BUILT: Wed Apr 25 04:39:34 CDT 2012
device-mapper-event-libs-1.02.74-6.el6      BUILT: Wed Apr 25 04:39:34 CDT 2012
cmirror-2.02.95-6.el6                       BUILT: Wed Apr 25 04:39:34 CDT 2012

global/use_lvmetad=0 + lvmetad not running
------------------------------------------
[root@taft-01 ~]# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "taft" using metadata type lvm2
  Found volume group "vg_taft01" using metadata type lvm2

[root@taft-01 ~]# vgscan --cache
  Cannot proceed since lvmetad is not active.

global/use_lvmetad=0 + lvmetad running
--------------------------------------
[root@taft-01 ~]# lvmetad
[root@taft-01 ~]# ps -ef | grep lvmetad
root     30221     1  4 16:47 ?        00:00:00 lvmetad
root     30223  1945  0 16:47 pts/0    00:00:00 grep lvmetad

[root@taft-01 ~]# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "taft" using metadata type lvm2
  Found volume group "vg_taft01" using metadata type lvm2

[root@taft-01 ~]# vgscan --cache
  Cannot proceed since lvmetad is not active.

global/use_lvmetad=1 + lvmetad running
--------------------------------------
-> first vgscan run after running lvmetad

[root@taft-01 ~]# vgscan
  Reading all physical volumes.  This may take a while...
  No volume groups found

[root@taft-01 ~]# vgscan --cache
  Reading all physical volumes.  This may take a while...
  Found volume group "taft" using metadata type lvm2
  Found volume group "vg_taft01" using metadata type lvm2

-> second and further vgscan after running lvmetad

[root@taft-01 ~]# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "vg_taft01" using metadata type lvm2
  Found volume group "taft" using metadata type lvm2

[root@taft-01 ~]# vgscan --cache
  Reading all physical volumes.  This may take a while...
  Found volume group "taft" using metadata type lvm2
  Found volume group "vg_taft01" using metadata type lvm2

global/use_lvmetad=1 + lvmetad not running
------------------------------------------
[root@taft-01 ~]# killall lvmetad
[root@taft-01 ~]# ps -ef | grep lvmetad
root     30242  1945  0 16:49 pts/0    00:00:00 grep lvmetad

[root@taft-01 ~]# vgscan
  WARNING: Failed to connect to lvmetad: No such file or directory. Falling back to internal scanning.
  Reading all physical volumes.  This may take a while...
  Found volume group "taft" using metadata type lvm2
  Found volume group "vg_taft01" using metadata type lvm2

[root@taft-01 ~]# vgscan --cache
  WARNING: Failed to connect to lvmetad: No such file or directory. Falling back to internal scanning.
  Cannot proceed since lvmetad is not active.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0962.html