Description of problem:

A system booted by grub from a PV that is directly on disk will see a successful boot, but the volume group on which root resides (e.g., any additional volumes in the VG sitting on the PV) will not be activated by default.

If this activation is normally done using udev following the discovery of a partition (device), such an event might not happen since no partitions will be reported by udev. That's just a foolish assumption on my part but it might be true.

How reproducible:

Always

Steps to Reproduce:
1. Compile Grub using the patch (or grub-version) from https://github.com/drydenp/grub2-pvinstall
2. Create a PV directly on disk (e.g. /dev/sda) using pvcreate --bootloaderareasize 1M /dev/sda
3. Install grub using grub-install /dev/sda -s.
4. Create a VG and at least two volumes (root and boot, for instance)
5. Install a system onto it.
6. Run update-grub or grub-mkconfig -o /boot/grub/grub.cfg
7. Reboot the system

Actual results:

The volumes in the VG will not be activated except for the root volume.

Expected results:

All volumes in the VG are activated.

Additional info:

LVM2 activation is done exclusively for root in the initrd using the lvm2 script. Post-boot activation is done by systemd and udev, probably following device discovery (dmeventd is supposed to handle this?), and vgchange -aay --sysinit refuses to run when lvmetad is active, so I can assume that this activation is done based on an event; there are no manual calls anymore in the systemd boot sequence (?).

Pardon my ignorance here. I would simply assume that my assumption is correct and that there is no udev event being fired for the PV directly on-disk, causing nothing to be activated in the required volume group.

Regards.
Booting from a PV via grub is currently unsupported (and advised against) by lvm2, so you are clearly on your own here. Once lvm2 supports booting from a PV, this BZ will be updated. The advised approach is to create a small boot partition, start the system from there, and proceed.
(In reply to Xen from comment #0)
> LVM2 activation is done exclusively for root in the initrd using the lvm2
> script. Post-boot activation is done by systemd and udev probably following
> device discovery (dmeventd is supposed to handle this?) and vgchange -aay
> --sysinit refuses to run when lvmetad is active, so I can assume that this
> activation is done based on an event; there are no manual calls anymore in
> the systemd boot sequence (?).
>
> Pardon my ignorance here. I would simply assume that my assumption is
> correct and that there is no udev event being fired for the PV directly
> on-disk causing nothing to be activated in the required volume group.

During the initrd stage, only the LV on which the root fs resides is activated. When we switch over to the root fs, any udev daemon (if there was one at all) is stopped (killed). A new instance of the udev daemon is then started from the root fs, and a udev trigger is executed which iterates over all existing devices in the system (it traverses sysfs) and replays events for all of them, causing all udev rules to apply again and fill the udev database for the new udev instance running from the root fs.

Now, when it comes to VG/LV activation, it depends on whether lvmetad is enabled or not, and it also depends on the distribution, where each one can have a few differences in this area... But usually, if lvmetad is enabled, we also make use of LVM autoactivation. That means there's no direct vgchange -aay call to activate the VGs. Instead, we collect PVs by running pvscan --cache -aay on each udev event that notifies about a new device that appeared on the system, and the last PV that makes the VG complete also causes pvscan to activate the whole VG. pvscan knows whether the VG is complete or not by asking lvmetad, which caches all the LVM metadata.
There are slight differences in how pvscan is called in systemd and non-systemd environments, but the important thing here is that pvscan --cache -aay is responsible for the VG autoactivation.

If lvmetad is disabled, then LVM autoactivation can't be used. In this case, there needs to be a direct vgchange -aay call somewhere (again, this differs from distribution to distribution and depends on whether systemd is used or not). If systemd is used, there are lvm2-activation-early.service, lvm2-activation.service and lvm2-activation-net.service to do this at various stages of the boot sequence (where various kinds of devices are available).

So, in summary, the VG should normally end up activated unless the activation is forbidden by filters set in lvm.conf (devices/global_filter, devices/filter, activation/volume_list and activation/auto_activation_volume_list). By default, ALL VGs are activated.
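To make the event-driven flow described above concrete, here is a minimal toy model, not LVM code. The class `MetadataCache`, its method name, and the device and VG names are all hypothetical; in particular, the model is told up front which PVs belong to the VG, whereas the real tools learn this from the LVM metadata. It only illustrates the idea that each "new device" event records one PV and that the event completing the VG is the one that triggers activation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MetadataCache:
    """Toy stand-in for lvmetad: remembers which PVs of each VG were seen."""

    # Assumption for the sketch: the full PV set of each VG is known up front.
    expected: dict[str, set[str]]
    seen: dict[str, set[str]] = field(default_factory=dict)
    active: set[str] = field(default_factory=set)

    def pvscan_cache_aay(self, device: str, vg: str) -> bool:
        """Record one PV; activate the VG once all expected PVs have appeared.

        Returns True only for the call that completes and activates the VG.
        """
        self.seen.setdefault(vg, set()).add(device)
        complete = self.seen[vg] >= self.expected[vg]
        if complete and vg not in self.active:
            self.active.add(vg)
            return True
        return False


# Two hypothetical PVs forming one hypothetical VG.
cache = MetadataCache(expected={"vg0": {"/dev/sda1", "/dev/sdb1"}})
first = cache.pvscan_cache_aay("/dev/sda1", "vg0")   # VG still incomplete
second = cache.pvscan_cache_aay("/dev/sdb1", "vg0")  # last PV completes the VG
```

The point of the model is that if the event for a PV is never acted upon, the call for that PV never happens and the VG never becomes complete, no matter how many other events are replayed.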
Under systemd environment and with lvmetad used, there are lvm2-pvscan@major:minor.service which run the pvscan --cache -aay for each PV with major:minor (and these services are instantiated from within udev rules based on events). So first thing is to check for the state of these services whether they triggered or not...
Thank you for your responses. There is nothing peculiar about the state of the particular device that I mentioned (in this case was /dev/sdb on this system) -- I mean the output of: systemctl status lvm2-pvscan.service As opposed to other devices (and partitions). I have reproduced the thing on two different systems but both Kubuntu 16.04 installs. I would always find that some mounts would not work and the system would not boot because SystemD is a ***** with regards to failing /var or /boot. Then I would enable lvm2.service from SysV and find that lvmetad was running because it wouldn't do anything with -aay --sysinit, until I removed the --sysinit flag. So something goes wrong and I have no clue what. There are no special filters on my system. All are default (e.g, empty or default values). I can see no other reason why this wouldn't work given my limited knowledge. The only thing I can say is that: pvscan --cache -ay apparently doesn't activate it. vgchange -aay does activate it (after the pvscan has already run, in any case). I am not sure what else to check. I could create a system that requires only root (quite easily, from this one) and revert my changes. Then the system will be pristine and I can check the state of any devices after booting, or any logs thereof. In fact I can already check logs. Alright this system is too much modified, I have to boot the other system that is simpler. lvm2.service was actually still disabled and yet everything was getting activated, but my initrd at this point also does more activation in advance. I remember. I reinstalled lvm2 and this recreated the lvm2.service file (symlink to /dev/null). Seeing my boot log I have rebooted since and lvm2 didn't run. And yet everything worked. The only additional thing that happens... but let's forget about that. There is just an additional unrelated lvm lvchange being run, that's all. 
And the current boot and root devices are raid1 mirrors to that other LV, so I needed to manually activate that LV in order to have the PV for my complete RAID1 set. I created a script (hook) that would traverse the chain of required PVs from the running system, created a /conf/rootlvs file in the initrd (initramfs) and then ran a modified version of the lvm2 script (in local-top) that would activate the entire sequence. But this should only activate a single LV on a different disk during the initrd phase. Unless the RAID1 changes things, it should not make a difference. So I can't understand right now why suddenly it does work even without the change I needed to make to the lvm2 service to run as a way of activating everything. Isn't that peculiar. I am dead certain this behaviour that I described occured on two nearly identical systems one of which is still on a HDD in my system I can boot right off the bat. If I disable the lvm2.service change I made (disable the service again, in effect) and it still works then I don't know what has happened, unless there has been an upgrade of LVM2 in the meantime. I'll reboot to that disk and see what happens. I'll post this comment now, if you don't mind.
I can confirm a few things.

- On the raided system (LVM raid 1) I activate the "HDD" volume in advance (I call it hdd/ssd-lv, so to speak), and the "ssd" volume group then has two PVs: one is the aforementioned /dev/sdb (for instance) and the other is that hdd/ssd-lv volume. On this system, apparently since it started using this RAID setup, my "ssd" volumes ARE all getting activated. So my "ssd/boot" is actually getting found, with or without my custom "LVM2" service running. This didn't use to be the case before I started running raid1 on it.

- The original system, which also used a PV directly on disk, still sees the exact same behaviour I saw before: upon boot none of the volumes in the main VG (which also has root) are getting loaded (except root, which was already activated in the initrd).

- On this system I get a rescue shell from SystemD because some volumes cannot be mounted. If I then execute pvscan --cache --activate ay /dev/sda then the required volumes ARE activated. On this system, with lvmetad ostensibly already running, the pvscan WORKS. However it still does NOT work (or run) during boot itself.

So I can think of only two things:

- pvscan is not run at startup (unlikely)
- pvscan behaves differently after lvmetad is running, and this causes it to work differently (not sure here).

If you say that (Peter) lvmetad would already be running before this happens then I do not know what is different.
So:

- original system still exhibits the problem
- original system sees manual pvscan --cache --activate ay succeed
- new system did exhibit the problem
- new system does not (no longer) exhibit the problem now that raid1 is used on both volumes (there is no 3rd volume that doesn't use raid1, it's only those two volumes)
- original system sees a failing mount of /boot because the boot volume is never found (vg/boot), but a manual run of pvscan does find it
- new system no longer sees a failing mount of /boot even when just depending on auto-activation
- I could not see a difference in the state of the lvm2-pvscan@ services on the original system while it exhibited the problem. All services were "loaded" and "dead" and systemctl show displayed a "success" status.
- the only difference the new system still has is an additional call to the activation of "hdd/ssd-lv", which probably doesn't make a difference but I'll have to check.
Even when reverting my changes to the initrd I can no longer reproduce the symptoms on my new system, which means the root volumes (volume group) for that system is/are getting activated. That's the raid1 system. It has a vg with just two LV and both are raid1 now. Before, they did not get activated.

I boot this system now and check the PVs of my old system. It has a single PV with a single VG and all volumes are not activated.

boot xenpc1 -wi------- 500,00m
data xenpc1 -wi------- 400,00g
root xenpc1 -wi------- 30,00g
swap xenpc1 -wc------- 8,00g
var  xenpc1 -wi------- 3,00g

Thus, whether I boot from it or not, my volumes are not getting activated by default.

# systemctl status lvm2-pvscan
● lvm2-pvscan - LVM2 PV scan on device 8-0
   Loaded: loaded (/lib/systemd/system/lvm2-pvscan@.service; static; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:pvscan(8)

There is no difference between the service information of the device that doesn't load, and the device that does load:

# diff -u <(systemctl show lvm2-pvscan) <(systemctl show lvm2-pvscan)
--- /dev/fd/63 2016-08-18 19:42:45.389482812 +0200
+++ /dev/fd/62 2016-08-18 19:42:45.389482812 +0200
@@ -112,15 +112,15 @@
 KillSignal=15
 SendSIGKILL=yes
 SendSIGHUP=no
-Id=lvm2-pvscan
-Names=lvm2-pvscan
+Id=lvm2-pvscan
+Names=lvm2-pvscan
 Requires=system-lvm2\x5cx2dpvscan.slice lvm2-lvmetad.socket
-BindsTo=dev-block-8-0.device
+BindsTo=dev-block-8-16.device
 Conflicts=shutdown.target
 Before=shutdown.target
 After=systemd-journald.socket system-lvm2\x5cx2dpvscan.slice lvm2-lvmetad.socket lvm2-lvmetad.service
 Documentation=man:pvscan(8)
-Description=LVM2 PV scan on device 8-0
+Description=LVM2 PV scan on device 8-16
 LoadState=loaded
 ActiveState=inactive
 SubState=dead

and both report success.

It is very likely that running a manual pvscan now is going to work.
For kicks I disable lvmetad first, but it doesn't matter, it just gets restarted and it loads them:

# pvscan --cache --activate ay /dev/sda
5 logical volume(s) in volume group "xenpc1" now active

When I disable lvmetad completely pvscan --cache no longer works, but that is probably as expected.

lvmetad definitely loads before the devices are supposed to show up (but don't):

aug 18 19:30:15 xenpc2 systemd[1]: Started LVM2 metadata daemon.

And it starts finding other unrelated devices:

aug 18 19:30:17 xenpc2 systemd[1]: Found device /dev/raid/var.

But nothing from the xenpc1 volume group.

Reenabling lvmetad (use_lvmetad in the conf file) immediately restores functionality, and a manual pvscan --cache --activate ay /dev/sda succeeds.

So during boot my on-disk PV of the OTHER system *is* getting found (it is listed in lvs) but its VG is not activated. It *has* been scanned, but the VG is not activated.

Maybe this is actually caused by the installation of Grub, in case the patch is faulty. That seems highly unlikely though. I could wipe the bootloader area and see if it makes a difference. That won't change a thing though.

The only thing I can really do is disable lvmetad, regen the initramfs, and then reboot. But I would also like some feedback, perhaps.

- automatic initializing during boot (systemd) with lvmetad enabled does not activate the volumes
- manually running the same command after boot does activate the same volumes.
- systemctl status and systemctl show give no weird output.
The lvm2-pvscan@ service is not running at all. Or at least the service file is not getting used. I can botch it up and nothing will change. I guess it uses lvmetad udev rules instead.
Okay. So the udev rules require the block device to be identified as LVM2_member, but blkid doesn't report partitionless PVs as LVM2_member.

ENV{ID_FS_TYPE}!="LVM2_member|LVM1_member", GOTO="lvm_end"

It reports them as "dos". (For the longest time I had a promise fasttrack signature at the end of the disk, so it was recognised as a fasttrack raid member. I just wiped that signature, the last 1MB, and now it gets reported as "dos".)

So in this sense it is really a bug in, or limitation of, blkid. I have tried to send email to util-linux-ng@vger (.kernel.org) to report the issue and ask whether they can fix it for us (or for me :P).

That was my first foray into Udev ;-).

Regards.
My previous comment hadn't shown up yet. I wrote the last one not having seen it, I thought it didn't make it through. I botched the udev script to send -v output to some file, and the /dev/sda device was not scanned. Then I realized it was not passing that test for LVM2_member-ship. /lib/udev/rules.d/69-lvm-metad.rules is the culprit here. Regards.
Boy did I spend many hours wasting on this project. Your pvscan-service comment Peter sent me completely in the woods. If it hadn't been for that I would have zoned in on a solution much faster. I spent so much time checking out the differences between the systems and seeing what worked and what didn't work, instead of just checking out UDEV and finding a solution much faster.... They way I had originally intended when creating this bug report. I knew it must have been udev, and it was udev. It just required checking out that file and I spent half a day messing with SystemD. I lost so much today because of that. I wish the endless time-wasting would cease once and for all at some point. You get so much wrong advice in the Linux world. I mean I am sorry for pointing this at you because you were a help of course, it was helpful, but I should have stuck to my original intent and not mess with SystemD services that provide scarcely any information on whether they actually work. There is just no good output there to know if anything has happened. I wasted a good 4 hours of my day on this probably. Just trying to come up with more data for this thinking the problem would be SystemD or its services. When my original assumption was correct all along.... With the slight adjustment that the event was indeed being fired, but just not acted upon. > Under systemd environment and with lvmetad used, there are > lvm2-pvscan@major:minor.service which run the pvscan --cache -aay for each > PV with major:minor (and these services are instantiated from within udev > rules based on events). So first thing is to check for the state of these > services whether they triggered or not... Just not the case. I acted completely on this. How stupid of me. There are no services instantiated from udev rules. If anything they are instantiated by systemd-udevd but not by any rules belonging to LVM. And, with udev in place, they are also not called..... 
I suspect this is what gets used when lvmetad is disabled.

> If systemd is used, there are lvm2-activation-early.service,
> lvm2-activation.service and lvm2-activation-net.service to do this at
> various stages of boot sequence (where various kinds of devices are
> available).

These are not on my system. I suspect the pvscan services are actually meant for that, not these (or not anymore).

I guess I should just be considered a dead man now. (Not related to you).

SystemD always works by doing udev. It's just that the lvm-metad file (rules file) seems only intended for lvmetad, or systems running it. So either it's a "lvmetad" service doing it outside of SystemD, or it is a systemd service inside of SystemD. Pretty much doing the same thing. But in the former case no SystemD service is ever created or even instantiated like that. I guess, not sure (about the instantiation). At least on my system it's like this (Kubuntu 16.04).

For the longest time I did not know that the pvscan services weren't actually being called. And SystemD doesn't tell you much about it either. At least, not when you don't know what information is missing. I even plotted a graph but I didn't realize they were supposed to show up (?) IF they were being called. They were not in the graph. I guess it would have shown execution history. But I didn't know it was supposed to show that.

It seems just a fallback measure for when lvmetad is disabled. Two ways of doing the same thing. Well.

Of course if you'd want to fix this in the 69-lvm-metad.rules file you would just need to add |dos to it. Here is a patch:

--- 69-lvm-metad.rules.orig 2016-08-18 22:53:04.634767493 +0200
+++ 69-lvm-metad.rules 2016-08-18 22:54:41.165432023 +0200
@@ -33,7 +33,7 @@
 ENV{LVM_PV_GONE}=="1", GOTO="lvm_scan"

 # Only process devices already marked as a PV - this requires blkid to be called before.
-ENV{ID_FS_TYPE}!="LVM2_member|LVM1_member", GOTO="lvm_end"
+ENV{ID_FS_TYPE}!="LVM2_member|LVM1_member|dos", GOTO="lvm_end"
 ENV{DM_MULTIPATH_DEVICE_PATH}=="1", GOTO="lvm_end"

 # Inform lvmetad about any PV that is gone.
Created attachment 1191999 [details] Possible way to fix the issue within LVM itself Seems that the solution needs to be made at util-linux (libblkid) but this is the only thing that LVM could do itself. It would just cause scanning on all MS-DOS partition label devices. A PV is not an MS-DOS partition label device, but it is identified as such. That means that all disks (e.g. /dev/sda) that had regular partitions, would also be scanned for a PV. It simply says: Device 8:32 not found. Cleared from lvmetad cache. without a second lost. That's all it would do.
Ehm, that needs to be

--- 69-lvm-metad.rules.orig 2016-08-18 22:53:04.634767493 +0200
+++ 69-lvm-metad.rules 2016-08-18 22:54:41.165432023 +0200
@@ -33,7 +33,7 @@
 ENV{LVM_PV_GONE}=="1", GOTO="lvm_scan"

 # Only process devices already marked as a PV - this requires blkid to be called before.
-ENV{ID_FS_TYPE}!="LVM2_member|LVM1_member", GOTO="lvm_end"
+ENV{ID_FS_TYPE}!="LVM2_member|LVM1_member|", GOTO="lvm_end"
 ENV{DM_MULTIPATH_DEVICE_PATH}=="1", GOTO="lvm_end"

 # Inform lvmetad about any PV that is gone.

The ID_FS_TYPE is actually empty. It's the PTTYPE (or something like it) that's reported as "dos". Matching the empty string does work (I hadn't tested it yet).
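For readers following the two patch attempts, here is a small sketch of why the "|dos" variant changes nothing while the trailing "|" variant does. It is only an approximation of udev's matching, under the assumption that a match value is a list of shell-glob alternatives separated by "|" and that an unset property compares as the empty string; the helper names `udev_value_matches` and `rule_skips_device` are made up for illustration.

```python
from fnmatch import fnmatchcase


def udev_value_matches(value: str, pattern: str) -> bool:
    """Approximate a udev match value: '|'-separated shell-glob alternatives."""
    return any(fnmatchcase(value, alt) for alt in pattern.split("|"))


def rule_skips_device(id_fs_type: str, pattern: str) -> bool:
    """Model ENV{ID_FS_TYPE}!="<pattern>", GOTO="lvm_end".

    The rule jumps to lvm_end (skips the device) when no alternative matches.
    """
    return not udev_value_matches(id_fs_type, pattern)


original = "LVM2_member|LVM1_member"
first_try = "LVM2_member|LVM1_member|dos"   # "dos" is the partition table type, not ID_FS_TYPE
second_try = "LVM2_member|LVM1_member|"     # trailing '|' adds an empty alternative

# On the affected disk ID_FS_TYPE is empty, so only the last pattern lets it through.
skipped_by_original = rule_skips_device("", original)
skipped_by_first_try = rule_skips_device("", first_try)
skipped_by_second_try = rule_skips_device("", second_try)
```

Note that in this model the empty alternative lets through every device whose ID_FS_TYPE is empty, not only whole-disk PVs, so pvscan would be invoked on more devices than before.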
Created attachment 1192002 [details] Version that works Matching the empty string would also match at least regular disks without a partition table and a PV on them. (And having a PV on them).
(In reply to Xen from comment #10)
> Boy did I spend many hours wasting on this project. Your pvscan-service
> comment Peter sent me completely in the woods. If it hadn't been for that I
> would have zoned in on a solution much faster. I spent so much time checking
> out the differences between the systems and seeing what worked and what
> didn't work, instead of just checking out UDEV and finding a solution much
> faster....
>
> They way I had originally intended when creating this bug report. I knew it
> must have been udev, and it was udev. It just required checking out that
> file and I spent half a day messing with SystemD.

Well, I just tried to explain how these things work. The problem reported was that the VG had not been activated. That's why, as a starting point, I tried to explain how event-based LVM activation works, because it seemed it was not clear (you noted and asked "...and vgchange -aay --sysinit refuses to run when lvmetad is active, so I can assume that this activation is done based on an event; there are no manual calls anymore in the systemd boot sequence (?).").

So it's important to go over the principles and make sure they're clear before trying to solve anything.

Please take into consideration that I also spent time with you. So it's reciprocal, as it always is when people are trying to find the source of a problem and help each other the best way they can.

It's very tactless to ask for help first and then just blame people if things didn't work the way you imagined (but that's how real life sometimes goes). Also, I just asked you for debug output, I didn't ask you to spend time investigating this on your own.

(Now I don't want any flame war, so I'll just cease the discussion here.)
(In reply to Xen from comment #11)
> Created attachment 1191999 [details]
> Possible way to fix the issue within LVM itself
>
> Seems that the solution needs to be made at util-linux (libblkid) but this
> is the only thing that LVM could do itself. It would just cause scanning on
> all MS-DOS partition label devices. A PV is not an MS-DOS partition label
> device, but it is identified as such. That means that all disks (e.g.
> /dev/sda) that had regular partitions, would also be scanned for a PV. It
> simply says:
>
> Device 8:32 not found. Cleared from lvmetad cache.
>
> without a second lost. That's all it would do.

Sorry - such a patch is surely not acceptable and is even seriously broken. (Because you fail to understand how it works.)

lvm2 runs 'pvscan' ONLY on devices marked as PV.

When blkid doesn't see a PV signature on a disk and marks it as a 'dos' partitioned device - then it's NOT a PV - plain and simple.

If you were digging into 'device' headers with the 'dd' command - like restoring the partition MBR header (probably to let the system boot) - you broke the logic, so please avoid blaming us for this...

The deal here is pretty 'clear':

When you 'pvcreate' a device - the 1st sector IS cleared!

If you don't have zeros there - it's not a PV!

(and when 'use_blkid_wiping' is enabled, any other signatures are also wiped first)

There is 'historical' reasoning why 'lvm2' lets a device with 0 partitions but still with a 'dos' header pass as a PV - this will likely need to be fixed at some point when more users like you start to play with disk signatures on their own.

You can observe the blkid scan with:

LIBBLKID_DEBUG=all blkid /dev/disk

where you will likely notice that the lvm2 signature is found - and then rejected because there is a 'dos' header on your disk.

So NEXT time you get the great idea to fiddle with your devices with tools like 'dd' - prepare yourself many free hours to experience the results...
(In reply to Peter Rajnoha from comment #14) > Well, I just tried to explain how these things work - the problem reported > was that the VG had not been activated. That's why, as a starting point, I > tried to explain how event-based LVM activation works because it seemed it > was not clear (because you noted and asked "...and vgchange -aay --sysinit > refuses to run when lvmetad is active, so I can assume that this activation > is done based on an event; there are no manual calls anymore in the systemd > boot sequence (?).). > > So it's important to go the principles and make sure they're clear before > trying to solve anything. > > Please, take into consideration that I also spent time with you. So it's > reciprocal, like it is always when people are trying to find a source of the > problem and help each other the best way they can. Well I am glad you take this well. Many people wouldn't right. Sure I was very grateful for your explanation (in the first post, at least, sure, why not, that was extremely helpful and thankful). But you also derailed me in a way, I'm sorry to say. I can't just provide debug output and then wait until someone does it for me. I don't have that liberty you might say. I also feel I have a moral obligation to act on the help that's provided me. > It's very tacktless that you ask for help first and then just blame people > if things didn't work the way you imagined (but that's how real life > sometimes goes). Also, I just asked you for debug output, I didn't ask you > to spend time investigating this on your own. I think it's more than I don't have a good email solution (who can blame me, right) and I didn't see your responses while I had already written a length and quite proficient response to Zdenek that I was content with. When I tried to post, I had to deal with your posts as well, which caused me to disregard or at least shelf what I had written before. That's just this bugzilla not having any ajax based notification. 
Otherwise it probably wouldn't have mattered as much. I mean, it's not always humans' fault. Sometimes it is just "faulty systems" you have to deal with, without wanting to blame any creator thereof here. A lot of misunderstandings just happen as the result of systems that don't quite provide the means necessary to communicate in full and without getting hurt by something. And in the end you are just people who suffer the same thing and try to do their best. So of course, thank you for that. And I just lost the entire day basically. Sure you can blame me for that: use a VM, don't use your own system, learn SystemD better beforehand so you don't get surprised, etc. etc. It's just the way it is right. But I just wish I would have continued my original intuition and would have checked out the udev rules beforehand and disregarded what you have said or what I thought you would have said in the 2nd post. And I know I am blunt right but I also don't have endless seas of time to pick my words nicely, so perhaps I can hope you can have some leniency here. Time is of the essence ;-). Perhaps not to you, but to me, the udev rules file clearly states that it requires a LVM2_member signature (or LVM_member) for it to work in the first place. And well, of course, you coudln't know perhaps, but I have had this suspicion for a long time that the blkid was not entirely right and possible could be causing errors somewhere, I just didn't know what or when. Or if at all. You didn't know but I knew it didn't say "LVM2_member". So upon seeing the udev file it was just immediately clear to me. But it's just my fault for not sticking to what I know or something or keeping a little bit of a clarity as to what I am doing I guess and not waste endless seas of time on the pursuit of something that isn't very rewarding. I even bought an alarm timer at some point to help me with this ;-). 
I will just set it to 10 minutes and it will start beeping and I will know when to step back for a moment to think and reconsider if what I am doing is actually fruitful now here ;-). Not trying to derail myself here, just human things. So I really wouldn't consider myself tactless, just blunt. Of course you were helpful in writing and you wrote a lot and I thank you for that. It's just that I blame myself for listening to other people that might be sending me into the woods when my original intuition was already correct, and that is what happens most of the time everywhere.... :(. I was once on the ##kernel channel asking for an answer. The guy didn't want to provide me an answer, but only a place to look. So I started looking and pieced together a solution or answer but it still wasn't solid. My question was "is something NOT happening" and all I could find where things that did happen, so in order to get an answer I would have to exhaustively search all the things that do happen and then conclude that my thing isn't among them. And the guy already had the answer he just wanted me to search for it on my own. And I wasted countless hours not getting anywhere. Again. Yes. And of course that's not the case here and you were friendly and helpful. Yes. Sorry. I am sorry for my bluntless or lack of tact as you perceive it. Not trying to flame you and I don't think it will happen here. I mean I think we can have an understanding here of sorts. But. You are actually much more helpful than most people have ever been. I'll just leave it at that then. But I will add that I just suffered immensely because of it ;-). Haha. Sure I got ahead. But I didn't mention that I actually didn't need the solution all that much because I already had a working system by using /etc/init.d/lvm2 and bypassing systemd and udev almost entirely. I was just reporting a bug. You said "the first thing to do is to..." so I did that. And I just carried on with it... 
because after a first thing comes a second thing ;-). Anyway, enough. I am sorry if I sound pettiful here in the last paragraph. Got a little confused there. I had just created a robust alternative that I actually liked better than the solution I have now (which is that patch, effected). Regards anyway. Anyway. : I will just say right away that the util-linux maintainer and developer Karel Zak has said he will implement the feature for blkid. > > (Now I don't want any flame war, so I'll just cease the discussion here.)
(In reply to Zdenek Kabelac from comment #15)
> lvm2 does 'pvscan' ONLY devices marked as PV.
>
> When blkid doesn't see PV signature on a disk and mark this as 'dos'
> partitioned device - then it's NOT a PV - plain simple.

No, "dos" is a PARTTYPE or something, not an FSTYPE.

> If you were digging into 'device' headers with 'dd' command - like
> restoring partition MBR header (to probably let the system boot) - you
> broke the logic so please avoid blaming us on this...

I wasn't. The system boots because Grub2 can use the --bootloaderarea(size) area, which is that 1M area you see in the output of:

Actually I don't remember what command. There is a command that shows the BA_offset and BA_length or something.

Grub can install into the PV as I've said; it will install itself into the boot sector, so there is probably some MBR related code there. Oh I see, you think the blkid logic breaks because of that boot sector. But there is no partition table, just a boot sector. blkid doesn't scan the second sector (where the PV signature is) after it finds the MBR boot code in the first sector, and that is the issue here.

> The deal here is pretty 'clear'
>
> When you 'pvcreate' device - 1st sector IS cleared!
>
> If you don't have there zeros - it's not a PV!

That's just stupid. Why default to the second sector then? It is clear: to allow room for a boot sector or boot code to exist there.

So yes, my friend, it IS a PV, but thanks for clearing it up here. It is a PV and LVM will absolutely be able to work with it *just fine*. In fact, from the way I understand it, the PV signature can sit in any of the first 4 sectors and it will be as designed, according to specification. It's just the second sector by default to make room for a boot sector; and this works. It works just fine. It's just that blkid doesn't yet scan the second sector. I could use an offset for that to see if that indeed works now.
> There is 'historical' reasoning why 'lvm2' let pass a device with 0
> partitions but still with 'dos' header as a PV - this likely will need to
> be fixed at some point when more users like you will start to play with disk
> signatures on their own.

I don't play with disk signatures on my own. I install Grub2 into it. I mean you can stop the condescension here you know. "More users like you". More of those scumbags, right.

So meaning, you just want to disrupt those users from getting their way, even if it is not an impediment to the system as it stands, but just to cut down on the support burden for people doing unsupported things? Or people doing stuff you don't like? I don't know. Cut down on the number of possible use cases, you know. Stuff like that. You would actually prevent people from using Grub2 on a PV when a Bootloader Area was clearly designed years ago for it? Really?

Maybe I am just getting confused here. Sorry about that.

> You can observe blkid scan with:
>
> LIBBLKID_DEBUG=all blkid /dev/disk
>
> where you will likely notice - lvm2 signature is found - and then rejected
> because there is a 'dos' header on your disk.
You mean this:

5182: libblkid: LOWPROBE: [16] LVM2_member:
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: magic sboff=536, kboff=0
5182: libblkid: LOWPROBE: call probefunc()
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: assigning UUID [superblocks]
5182: libblkid: LOWPROBE: wiper set to superblocks::LVM2_member off=0 size=8192
5182: libblkid: LOWPROBE: assigning TYPE [superblocks]
5182: libblkid: LOWPROBE: <-- leaving probing loop (type=LVM2_member) [SUBLKS idx=16]
5182: libblkid: LOWPROBE: freeing values list
5182: libblkid: LOWPROBE: chain safeprobe topology DISABLED
5182: libblkid: LOWPROBE: chain safeprobe partitions ENABLED
5182: libblkid: LOWPROBE: reseting partitions values
5182: libblkid: LOWPROBE: --> starting probing loop [PARTS idx=-1]
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: magic sboff=510, kboff=0
5182: libblkid: LOWPROBE: dos: ---> call probefunc()
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: magic sboff=0, kboff=0
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: reuse buffer: off=0 len=1024 pr=0x21e3160
5182: libblkid: LOWPROBE: previously wiped area modified -- ignore previous results
5182: libblkid: LOWPROBE: zeroize wiper
5182: libblkid: LOWPROBE: reseting superblocks values
5182: libblkid: LOWPROBE: free value UUID
5182: libblkid: LOWPROBE: free value TYPE
5182: libblkid: LOWPROBE: assigning PTUUID [partitions]
5182: libblkid: LOWPROBE: dos: <--- (rc = 0)
5182: libblkid: LOWPROBE: assigning PTTYPE [partitions]
5182: libblkid: LOWPROBE: <-- leaving probing loop (type=dos) [PARTS idx=3]

> So NEXT time you get the great idea to fiddle with your devices with tools
> like 'dd' - prepare yourself many free hours to experience results...

I did not, my friend. And you can stop being an ass you know. You were perfectly aware I think that I was using Grub on it. And it's a perfectly valid use case, too.
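To make the two offsets in that log concrete (sboff=510 for the 'dos' magic, sboff=536 for LVM2_member), here is a minimal Python sketch of the layout being argued about. It assumes the LVM2 label header starts with "LABELONE" and carries the type string "LVM2 001" at byte 24 of its sector, and that LVM2 looks in the first four 512-byte sectors. The function names and the fake header are made up for illustration; this is not how blkid or lvm2 is implemented.

```python
SECTOR = 512
LABEL_SCAN_SECTORS = 4  # assumption: LVM2 searches the first four sectors


def describe_header(buf: bytes) -> dict:
    """Report the boot signature and the PV label sector in a device header."""
    if len(buf) < SECTOR * LABEL_SCAN_SECTORS:
        raise ValueError("need at least the first 2048 bytes of the device")
    # The 'dos' magic that blkid matches at sboff=510.
    has_boot_signature = buf[510:512] == b"\x55\xaa"
    label_sector = None
    for n in range(LABEL_SCAN_SECTORS):
        start = n * SECTOR
        # "LABELONE" opens the label; the type string sits 24 bytes in.
        if (buf[start:start + 8] == b"LABELONE"
                and buf[start + 24:start + 32] == b"LVM2 001"):
            label_sector = n
            break
    return {"boot_signature": has_boot_signature,
            "pv_label_sector": label_sector}


def fake_header(with_boot_code: bool) -> bytes:
    """Build a synthetic header: PV label in sector 1, optional boot magic."""
    buf = bytearray(SECTOR * LABEL_SCAN_SECTORS)
    buf[SECTOR:SECTOR + 8] = b"LABELONE"
    buf[SECTOR + 24:SECTOR + 32] = b"LVM2 001"  # absolute offset 536
    if with_boot_code:
        buf[510:512] = b"\x55\xaa"
    return bytes(buf)


if __name__ == "__main__":
    print(describe_header(fake_header(with_boot_code=True)))
    print(describe_header(fake_header(with_boot_code=False)))
```

With boot code in sector 0 both findings are true at once: the boot magic is present and the PV label is still intact in sector 1, which is the case where blkid ends up reporting only the 'dos' result. To try it on a real disk you would pass in the first 2048 bytes read from the device node (needs root).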
(In reply to Zdenek Kabelac from comment #15)
> So NEXT time you get the great idea to fiddle with your devices with tools
> like 'dd' - prepare yourself many free hours to experience results...

Anyway, thanks Zdenek. You pinpointed the issue exactly. Good boy. JUST KIDDING. Jeez.
(In reply to Zdenek Kabelac from comment #15)
> Sorry - such patch is surely not acceptable and is even seriously broken.
> (Because you fail to understand how it works)

Oh and. I do understand how the patch works. I will just agree that it is not the right place to fix it. "Seriously broken" doesn't mean a lot when no system actually breaks and it just does a lot of unnecessary scanning.

In any case, I was not proposing for it to be included (I would probably never do that) but then again, being robust and resilient is not a bad thing either. Linux works the moment all things work; there are a lot of weakest links and anything can break the system. There is almost no redundancy in these things. Personally I feel that a system that will continue working even if people deviate slightly from the Right Path is an asset. But then, I'm an ass who says these things.

It's just that if the system did scan everything with a bit of leniency, that would mean slightly more "unnecessary" work being done at system boot (in this case) but it would make your system independent of both blkid and people doing exactly as they are told, you know. I seriously don't think it is a good thing that everything has to be picture perfect or the system won't work.

Not only did the "dos" signature prevent the thing from being recognised as LVM2_member, a remaining raid signature that I didn't know how to fix ALSO did that. See, "promise fasttrack raid member" took precedence over LVM2_member. It was a signature at the end of the disk that didn't hurt anyone but that was created by a different motherboard etc. etc. so I couldn't even use this motherboard to wipe it. I didn't know how these things worked and let it stand for many months, so to speak. In the end I figured it had to be at the end of the disk and I zeroed it, then it was gone.
Even if the blkid of the Grub boot image + PV would have been fixed, I still would not have had a booting system. Or an activating system; it did actually boot just fine. From the perspective of the root volume at least.

Also the initrd does the most minimal activation, and that is also a weakest link. The whole system only does barely just what is required. If any of those barely just things fail, you have a broken system. The Linux boot order is just a sequence of weakest links, and any one of them can break (if you adjust something). That's just not very robust you know, you will agree.

So now Karel Zak is fixing the grub2 boot.img problem. My other disk actually has some silicon medley raid signature at the end (but I'm not using it). So it also will fail to work. Nothing to do with DD here. DD is my only solution. For some reason at this point the system does get activated without my intervention, probably because a prior PV is getting activated that does not suffer from this, causing an automatic scan and activation of the current one, without having to depend on that signature or the BLKID of it. Taking that away, it would probably fail as well.

I am not trying to insult or criticise anyone here by criticising Linux. I am just saying that given my system (and all those before it) the boot order is very fragile because it does the minimal thing at every step. There is no redundancy there, so in reality the choice to do away with blkid and just do it yourself wouldn't be all that terrible.

pvscan already performs that function. It already filters devices that don't match. You are just causing udev to filter it also, to prevent unnecessary activations of the pvscan binary. But that's it. You are merely cutting down on a minimal amount of work in executing your pvscan 3 more times on your average or non-average system. You save 3 freaking pvscan invocations on the ordinary system.
That's all your precious udev rules do for you you know. All of that work and it only prevents a small bunch of unnecessary invocations. And the result is that your system will break the moment a blkid signature is wrong, because your PVSCAN actually CAN do the thing just fine and is not broken, but blkid is, in a way.

And now the question becomes:

* Can we reconcile the system dysfunctioning whenever someone has in their firmware BIOS created a RAID disk that they don't use, with the fact that we depend on some external tool to give us the right id for our devices before we do anything?

* Are we okay with a system dysfunctioning and possibly causing a boot failure the moment some garbage (or not garbage, but unused) RAID signature is sitting at the end of some disk not harming anyone?

People can create such a signature very easily using just a BIOS-based firmware configuration screen. But Linux ordinarily won't activate the RAID just like that, or you might not use it in your default configuration. Grub appears to also have issues that have been solved in a later version than what is on my system by default (with the firmware raid). So you can expect people to not easily use the raid thing.

The raid-activated device would not have the signature at the end; the disk would be slightly smaller. But the non-raid-activated device (the ordinary disk, as presented by the BIOS and usable by the system without issues) (no driver is going to interfere with that, not even if you have dmraid installed) ---- is going to be identified as not LVM2_member just because someone added it to a motherboard-based RAID array, whether you boot from it or not, not knowing what would happen, and not having anything to do with grub here.
So any Linux system now using partitionless PV (I mean, blame me here, right, I think it is a valid use case), even if it is not the boot disk, will fail to see those partitions activated (those LVs) based on the presence of a RAID signature at the end of the disk that doesn't really harm anyone. That's pretty fallible. That's pretty non-robust. Your pvscan does work, your blkid doesn't. That's why I don't like the solution here so much. BLKID in this sense is a liability. Anyway, too much, signing off, thanks.
This is a patch that would do the above a little more nicely :p.

--- 69-lvm-metad.rules.orig 2016-08-18 22:53:04.634767493 +0200
+++ 69-lvm-metad.rules 2016-08-21 00:57:41.552262081 +0200
@@ -32,9 +32,11 @@
 ENV{ID_FS_TYPE}="$env{.ID_FS_TYPE_NEW}"
 ENV{LVM_PV_GONE}=="1", GOTO="lvm_scan"
 
-# Only process devices already marked as a PV - this requires blkid to be called before.
-ENV{ID_FS_TYPE}!="LVM2_member|LVM1_member", GOTO="lvm_end"
+# Only create symlinks and process remove events for devices that have a
+# LVM2_member or LVM1_member signature as determined by blkid.
+
 ENV{DM_MULTIPATH_DEVICE_PATH}=="1", GOTO="lvm_end"
+ENV{ID_FS_TYPE}!="LVM2_member|LVM1_member", GOTO="next"
 
 # Inform lvmetad about any PV that is gone.
 ACTION=="remove", GOTO="lvm_scan"
@@ -42,6 +44,22 @@
 # Create /dev/disk/by-id/lvm-pv-uuid-<PV_UUID> symlink for each PV
 ENV{ID_FS_UUID_ENC}=="?*", SYMLINK+="disk/by-id/lvm-pv-uuid-$env{ID_FS_UUID_ENC}"
 
+LABEL="next"
+
+# But if the device is a hd, sd or vd device, process it regardless in case it
+# might be a PV after all. It will not have a symlink in that case, but it will
+# properly activate.
+
+KERNEL=="hd[a-z]|sd[a-z]|vd[a-z]", GOTO="lvm_scan"
+
+# This may cause devices to be activated twice, once for the raw device, and
+# once for the dm-raid device that is activated based on it. However, I believe
+# LVM already deals with this sufficiently.
+
+# Only continue at this point if the signature is actually there.
+ENV{ID_FS_TYPE}!="LVM2_member|LVM1_member", GOTO="lvm_end"
+
+
 # If the PV is a special device listed below, scan only if the device is
 # properly activated. These devices are not usable after an ADD event,
 # but they require an extra setup and they are ready after a CHANGE event.

Let's say this works for me but I still think I should just depend on an actual physical vgscan service that will work regardless of udev and will just be a simple systemd service for lvm2 (lvm2.service) calling vgscan at system boot.
To me depending on events and triggers is just not robust. Too many things can go wrong and it is just not required because vgscan handles it well. At that point probably the default pv-scanning (via udev) would already have happened and it would just be a failsafe. pvscan --cache -a ay would probably suffice as well.

That wouldn't deal with any kind of hotplugging but it would ensure that at boot everything would be activated regardless of udev or blkid rules. Less dependency = better, and a systemd service would be better, requiring only a single file. vgscan works regardless of a pvscan having been done on the device yet. I don't know the great difference so I will use vgscan.

Then I won't need this patch nor require the change to blkid, but it would still be nice to have the correct blkid. This patch wouldn't be required at all if blkid worked correctly, but it is difficult to reconcile that with any disks that might have RAID signatures. A vgscan as a systemd service would just catch all of that with like zero effort and no chance for failure of any kind. I could even let it be followed by an lvchange --refresh <volume group> in case some raid device failed to load properly ;-).
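For anyone trying to follow the GOTO jumps in the rules patch from this comment, here is a rough Python model of the decision flow as I read it from the diff. It only covers the part of 69-lvm-metad.rules that the patch shows. The function name and the "continue" return value are made up for illustration, and udev itself obviously does not work this way internally.

```python
import re

PV_TYPES = {"LVM2_member", "LVM1_member"}
# Same match as KERNEL=="hd[a-z]|sd[a-z]|vd[a-z]": whole disks only.
WHOLE_DISK = re.compile(r"(hd|sd|vd)[a-z]")


def patched_rule_outcome(action: str, kernel: str, env: dict) -> tuple:
    """Return (label_reached, symlink_created) for one udev event."""
    if env.get("LVM_PV_GONE") == "1":
        return ("lvm_scan", False)
    if env.get("DM_MULTIPATH_DEVICE_PATH") == "1":
        return ("lvm_end", False)

    is_pv = env.get("ID_FS_TYPE") in PV_TYPES
    symlink = False
    if is_pv:
        # Only devices blkid marked as a PV get remove handling and a symlink.
        if action == "remove":
            return ("lvm_scan", False)
        symlink = bool(env.get("ID_FS_UUID_ENC"))

    # LABEL="next": whole hd/sd/vd disks are scanned whatever blkid said.
    if WHOLE_DISK.fullmatch(kernel):
        return ("lvm_scan", symlink)
    if not is_pv:
        return ("lvm_end", False)
    # Falls through to the special-device checks further down the file.
    return ("continue", symlink)


if __name__ == "__main__":
    # Whole disk that blkid reported as 'dos' only: still scanned.
    print(patched_rule_outcome("add", "sda", {"ID_PART_TABLE_TYPE": "dos"}))
    # Partition without a PV signature: skipped as before.
    print(patched_rule_outcome("add", "sda1", {}))
```

The point of the model is the first example: a bare /dev/sda without ID_FS_TYPE=LVM2_member still reaches lvm_scan, just without the by-id symlink, while partitions and other device types keep the original blkid-gated behaviour.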
We have pointed out to kzak (util-linux) that blkid is misinterpreting devices. lvm2 makes an exception here: a device with no partitions but with a 'dos' signature on it, followed by a regular PV header, is still a PV. And since this existed long before udev, this needs a fix in blkid to be compliant with these semantics.

However we still claim that boot out of PV from 'grub2 lvm2' parser is completely unsupported by lvm2 and may simply ruin your data - you've been warned... There is a reason we suggest to boot out of a standard partition. Once we provide a proper parser of lvm2 metadata we will mark this as a supported feature. Meanwhile you may open bugs against blkid where blkid is seeing a different disk type than lvm2....
(In reply to Xen from comment #17)
> > If you were digging into 'device' headers with 'dd' command - like
> > restoring partition MBR header (to probably let the system boot) - you
> > broke the logic so please avoid blaming us on this...
>
> I wasn't. The system boots because Grub2 can use the --bootloaderarea(size)
> area which is that 1M area you see in the output of:
>

Just repeating once again - in case it's not been yet explicitly said:

ATM lvm2 does NOT support boot out of PV through grub.

Yes - there exists an 'unauthorized' grub2 hack for parsing a small subset of lvm2 metadata - which may handle a couple of device types and eventually create a usable disk mapping for them - but this hack is far away from a supportable state.

So please - if you want to use it - you need to prepare yourself for many many hours.. If you do not want to lose your precious time - switch to a partition.

I hope I've been now as explicit as possible.

> > So NEXT time you get the great idea to fiddle with your devices with tools
> > like 'dd' - prepare yourself many free hours to experience results...
>
> I did not my friend. And you can stop being an ass you know. You were
> perfectly aware I think that I was using Grub on it. And it's a perfectly
> valid use case, too.

As it's unsupported (by lvm2), it's been Ubuntu's hack which provided this as 'a feature' (assuming without even knowing what they are doing...)

Lvm2 tests and supports only 'pvcreate' and no further fiddling with the PV device. It's plain admin fault to pass a PV to a tool like grub! Please try to understand this...

Boot from a PV will be supported through native lvm2 commands - since only these are authorized to manipulate them (with proper locking). Accessing a PV device with any other tools simply cannot work in a reliable way!