Description of problem:

When a hypervisor with FC storage is rebooted, the host sees all the LVs of the storage domain, which is fine. But then an LVM PV scan scans all these devices and finds the VM's internal LVM metadata when the disk is raw. So we end up with dm-xxx devices on the host pointing to the VM's internal LVs. These devices are never cleared/removed, so the host ends up with lots of stale LVs and mapper devices, which can lead to disk corruption.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Hypervisor release 7.2 (20160711.0.el7ev)

How reproducible:
100% on customer side

Steps to Reproduce:
1. Reboot host

Actual results:
- vdsm will fail to lvchange -an
- stale LVs on all hosts

Expected results:
- LV only open on the host running the VM
- No child dm-xxx devices from the VM's internal LVM.

Additional info:

Logs, host booting:

lvm: 140 logical volume(s) in volume group "b442d48e-0398-4cad-b9bf-992a3e663573" now active   <--- storage domain
systemd: Started LVM2 PV scan on device 253:249.   <----- dm-249 is the LV of the VM disk (raw, prealloc)
lvm: 1 logical volume(s) in volume group "vg_pgsql" now active   <--- this is VM internal!

Having this VM-internal child will make vdsm's lvchange -an fail when the VM stops.

$ cat sos_commands/devicemapper/dmsetup_info_-c | egrep 'pgsql|5853cd' | awk -F' ' '{print $1" dm-"$3}'
b442d48e--0398--4cad--b9bf--992a3e663573-5853cdf8--7b84--487e--ab70--827bf5b00140 dm-249   <--- LV of VM image
vg_pgsql-lv_pgsql dm-272   <--- internal VM stuff

vg_pgsql-lv_pgsql (253:272)
 `-b442d48e--0398--4cad--b9bf--992a3e663573-5853cdf8--7b84--487e--ab70--827bf5b...
    `-36000144000000010706222888cc3683f (253:107)
       |- (131:880)
       |- (67:992)
       |- (132:592)
       |- (68:704)
       |- (133:304)
       |- (69:416)
       |- (134:16)
       `- (70:128)

So on the HOST we have dm-272, which is the VM's internal business, relying on dm-249, which is the VM disk (LV). vdsm fails to deactivate dm-249 as well. The result is a full Data Center where ALL hosts have these LVs active, asking for trouble.

jsonrpc.Executor/4::ERROR::2016-09-06 19:53:08,144::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Cannot deactivate Logical Volume: (\'General Storage Exception: ("5 [] [\\\' Logical volume b442d48e-0398-4cad-b9bf-992a3e663573/5853cdf8-7b84-487e-ab70-827bf5b00140 is used by another device.\\\']\\\\nb442d48e-0398-4cad-b9bf-992a3e663573/[\\\'ffd27b7d-5525-4126-9f59-5a26dedad157\\\', \\\'5853cdf8-7b84-487e-ab70-827bf5b00140\\\']",)\',)', 'code': 552}}

I think the root cause here is that https://gerrit.ovirt.org/#/c/21291/ fails to deactivate LVs when the disk LV already has children (VM internal).

This comment was originally posted by gveitmic
(In reply to Germano Veit Michel from comment #0)

Germano, thanks for this detailed report.

I don't know if we can prevent systemd from scanning LVs inside other LVs, but we can prevent it from auto-activating LVs. Can you check if disabling auto activation fixes this issue?

Edit /etc/lvm/lvm.conf:

    auto_activation_volume_list = []

This comment was originally posted by nsoffer
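As an illustration of the suggestion above, a minimal sketch of the host-side change, assuming the stock RHEL 7 lvm.conf layout (only the relevant section is shown):

    # /etc/lvm/lvm.conf (host) - sketch, not a complete file
    activation {
        # Empty list: nothing is auto-activated by "vgchange/lvchange -aay";
        # explicit "vgchange/lvchange -ay" still works as before.
        auto_activation_volume_list = []
    }

For the setting to also take effect during early boot, the copy of lvm.conf inside the initramfs would need to be regenerated as well (see the later comments about the initramfs).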
> systemd: Started LVM2 PV scan on device 253:249

Peter, can you explain why systemd is looking for LVs inside another LV, and why it automatically activates these LVs? Can we configure lvm to avoid this scan?

This comment was originally posted by nsoffer
See also bug 1253640; both seem to be caused by the LVM auto-activation.

This comment was originally posted by nsoffer
(In reply to Nir Soffer from comment #10)
> Can you check if disabling auto activation fixes this issue?
>
> Edit /etc/lvm/lvm.conf:
>
> auto_activation_volume_list = []

Hi Nir,

I just asked the customer to try it. I will keep you updated.

Cheers

This comment was originally posted by gveitmic
(In reply to Nir Soffer from comment #11)
> > systemd: Started LVM2 PV scan on device 253:249
>
> Peter, can you explain why systemd is looking for lvs inside another lv, and
> why it is automatically activates these lvs?

It's because the internal LV, if found, is just like any other LV. Unless you mark it somehow, LVM has no way to know whether the LV is one that should not be activated (...you might as well have a stack of LVs - an LV on top of another LV without any VMs, so from this point of view nothing is "internal" and you do want to activate the LV in that case).

By default, LVM autoactivates all VGs/LVs it finds.

> Can we configure lvm to avoid this scan?

You have several ways:

- You can set devices/global_filter to include only PVs which should be scanned and LVs activated on the host, and reject everything else (this also prevents any scans of all the other devices/LVs which contain further PVs/VGs inside).

- You can mark LVs with tags and then set activation/auto_activation_volume_list to activate only LVs with a certain tag. Or, without tagging, directly list the VGs/LVs which should be autoactivated. But this way, the VM's PVs inside are still going to be scanned, just the VGs/LVs not autoactivated.

- You can mark individual LVs to be skipped on autoactivation (lvchange -k|--setactivationskip y). But this way, you will also prevent the autoactivation within the guest system, because the flag to skip activation is stored directly in the VG metadata!!! So then you'd need someone to call "vgchange/lvchange -ay" - the direct activation (in contrast to "vgchange/lvchange -aay" - the autoactivation, which is used by boot scripts) inside the guest to activate the LV.

- You can use the new "LVM system ID" feature (see man lvmsystemid) which marks VGs with a system ID automatically, and then only the VGs created on that system are visible/accessible (again, in this case, the PVs inside are still going to be scanned because we need to get the "ID" from the VG metadata to be able to do the comparison of system IDs).

If you really want to avoid scanning of internal PVs which happen to be inside a VG/LV, the best is probably to use the global_filter to include only the PVs you know are safe to access.

This comment was originally posted by prajnoha
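To make the first option above concrete, a hedged sketch of a whitelist-style global_filter; the device path is a hypothetical local system disk, not a value from this report:

    # /etc/lvm/lvm.conf (host) - sketch, assuming /dev/sda2 is the only local PV
    devices {
        # accept the host's own PV, reject everything else (including RHEV LVs,
        # so PVs created inside guests are never even scanned)
        global_filter = [ "a|^/dev/sda2$|", "r|.*|" ]
    }

The tag-based variant of the second option is sketched further down in this thread.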
(In reply to Peter Rajnoha from comment #14)
> (In reply to Nir Soffer from comment #11)
> > > systemd: Started LVM2 PV scan on device 253:249
> >
> > Peter, can you explain why systemd is looking for lvs inside another lv, and
> > why it is automatically activates these lvs?
> >
>
> It's because the internal LV, if found, is just like any other LV. Unless
> you mark that somehow, LVM has no way to know whether this the LV is the one
> that should not be activated (...you might as well have a stack of LVs - LV
> on top of another LV without any VMs so from this point of view nothing is
> "internal" and you want to activate the LV in this case).
>
> By default, LVM autoactivates all VGs/LVs it finds.

(Also, when using autoactivation, the activation is based on udev events, so when a PV appears, it's scanned for VG metadata and its LVs are autoactivated. Each LV activation generates another udev event and the procedure repeats - if there's any PV found inside the LV, the autoactivation triggers for that PV too. It's because the LV is like any other block device and, as such, it can contain further PVs/VGs/LVs stacked inside... So when it comes to autoactivation, it's a domino effect - the activation triggers another activation and so on, unless you stop LVM from doing that by the means I described in comment #14.)

This comment was originally posted by prajnoha
(In reply to Peter Rajnoha from comment #14)
> (In reply to Nir Soffer from comment #11)
> > > systemd: Started LVM2 PV scan on device 253:249
> >
> > Peter, can you explain why systemd is looking for lvs inside another lv, and
> > why it is automatically activates these lvs?
> >
>
> It's because the internal LV, if found, is just like any other LV. Unless
> you mark that somehow, LVM has no way to know whether this the LV is the one
> that should not be activated (...you might as well have a stack of LVs - LV
> on top of another LV without any VMs so from this point of view nothing is
> "internal" and you want to activate the LV in this case).
>
> By default, LVM autoactivates all VGs/LVs it finds.
>
> > Can we configure lvm to avoid this scan?
>
> You have several ways:
>
> - You can set devices/global_filter to include only PVs which should be
> scanned and any LVs activated on the host and reject everything else (this
> also prevents any scans for all the other devices/LVs which contain further
> PVs/VGs inside.

This is an issue since we don't know which PVs must be scanned on this host. We want to avoid scanning any PV created by vdsm, but there is no easy way to detect these - basically anything under /dev/mapper/guid may be a PV owned by vdsm.

I don't think we can change the multipath configuration / udev rules to link devices elsewhere, since it can break other software using multipath devices. Also, we cannot use global_filter since it overrides the filter used by vdsm commands.

> - You can mark LVs with tags and then set
> activation/auto_actiavation_volume_list to activate only LVs with certain
> tag. Or, without tagging, directly listing the VGs/LVs which should be
> autoactivated only. But this way, the VMs PVs inside are going to be scanned
> still, just the VGs/LVs not autoactivated.

We plan to disable auto activation (see comment 4), so this seems to be the best option. Can you confirm that this should resolve this issue?

> - You can use new "LVM system id" (see man lvmsystemid) feature which
> marks VGs with system id automatically and then only the VGs created on that
> system are visible/accessible (again, in this case, the PVs inside are going
> to be scanned still because we need to get the "ID" from VG metadata to be
> able to do the comparison of system IDs.

This will not work for shared storage; the VGs/LVs created on the SPM host must be accessible on other hosts.

This comment was originally posted by nsoffer
(In reply to Nir Soffer from comment #16)
> We plan to disable auto activation (see comment 4), so this seems to be
> the best option. Can you confirm that this should resolve this issue?

The disks (LVs inside which the PV is found) are still going to be scanned; only devices/global_filter prevents this scan. But yes, the LVs found inside won't get activated. However, if you disable autoactivation completely, no LV will get activated at boot, not even the ones on the host, if you have any LVs there.

> > - You can use new "LVM system id" (see man lvmsystemid) feature which
> > marks VGs with system id automatically and then only the VGs created on that
> > system are visible/accessible (again, in this case, the PVs inside are going
> > to be scanned still because we need to get the "ID" from VG metadata to be
> > able to do the comparison of system IDs.
>
> This will not work for shared storage, the vg/lvs created on the spm host
> must be accessible on other hosts.

You can share the same ID for all the hosts where you need the VGs/LVs to be visible and accessible (see also "lvmlocal" or "file" system_id_source in man lvmsystemid).

This comment was originally posted by prajnoha
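A hedged sketch of the shared "LVM system ID" idea from the last paragraph; the ID string is a made-up example, and the same value would have to be set on every host that needs to see the shared VGs:

    # /etc/lvm/lvm.conf
    global {
        system_id_source = "lvmlocal"
    }

    # /etc/lvm/lvmlocal.conf - identical on every participating host
    local {
        system_id = "rhv-dc-1"
    }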
(In reply to Peter Rajnoha from comment #17)
> (In reply to Nir Soffer from comment #16)
> > We plan to disable auto activation (see comment 4), so this seems to be
> > the best option. Can you confirm that this should resolve this issue?
> >
>
> The disks (LVs inside which the PV is found) is still going to be scanned.
> Only activation/global_filter prevents this scan. But yes, the LVs found
> inside won't get activated. However, if you disable autoactivation
> completely, no LV will get activated at boot, not even the ones on the host,
> if you have any LVs there.

The admin can fix this by adding the needed lvs to the auto_activation_volume_list, right?

We cannot work with an "auto activate everything" policy, only with "auto activate only the volumes specified by the admin".

> > > - You can use new "LVM system id" (see man lvmsystemid) feature which
> > > marks VGs with system id automatically and then only the VGs created on that
> > > system are visible/accessible (again, in this case, the PVs inside are going
> > > to be scanned still because we need to get the "ID" from VG metadata to be
> > > able to do the comparison of system IDs.
> >
> > This will not work for shared storage, the vg/lvs created on the spm host
> > must be accessible on other hosts.
>
> You can share the same ID for all the hosts where you need the VGs/LVs to be
> visible and accessible (see also "lvmlocal" or "file" system_id_source in
> man lvmsystemid).

Ok, this way looks good to solve bug 1202595 and . Can you confirm on that bug?

This comment was originally posted by nsoffer
(In reply to Nir Soffer from comment #18)
> The admin can fix this by adding the needed lvs to the
> auto_activation_volume_list, right?
>
> We cannot work with auto activate everything policy, only with auto activate
> only the volumes specified by the admin.

Sure, if that's what the configuration setting is for... (the only issue is that it requires some manual actions/configuration from admins).

> > > > - You can use new "LVM system id" (see man lvmsystemid) feature which
> > > > marks VGs with system id automatically and then only the VGs created on that
> > > > system are visible/accessible (again, in this case, the PVs inside are going
> > > > to be scanned still because we need to get the "ID" from VG metadata to be
> > > > able to do the comparison of system IDs.
> > >
> > > This will not work for shared storage, the vg/lvs created on the spm host
> > > must be accessible on other hosts.
> >
> > You can share the same ID for all the hosts where you need the VGs/LVs to be
> > visible and accessible (see also "lvmlocal" or "file" system_id_source in
> > man lvmsystemid).
>
> Ok, this way looks good to solve bug 1202595 and . Can you confirm on that
> bug?

Yes, this should resolve the issue (see also bug #867333), as long as the VG metadata is readable so we can read the ID and then decide whether the VG is allowed or not on that system.

This comment was originally posted by prajnoha
I renamed the bug to reflect the important issue here. We already know that guest-created lvs are accessible via lvm, see bug 1202595.

This comment was originally posted by nsoffer
David, in this bug we recommended disabling lvm auto activation by setting:

    auto_activation_volume_list = []

Based on your comment:
https://bugzilla.redhat.com/show_bug.cgi?id=1303940#c52

Do you think we should also recommend disabling lvmetad by setting:

    use_lvmetad = 0

The results are not clear yet, see comment 32.

This comment was originally posted by nsoffer
Sorry, my comment in the other bz was misleading. You still want to set auto_activation_volume_list = [] to prevent the system (lvm/systemd/udev) from automatically activating LVs, whether use_lvmetad is 0 or 1.

So in your case, you want to both disable caching by setting use_lvmetad = 0 in lvm.conf, and disable autoactivation by setting auto_activation_volume_list = [] in lvm.conf.

This comment was originally posted by teigland
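Putting the two recommendations together, a sketch of the host-side changes; the service names are the RHEL 7 ones used elsewhere in this thread:

    # /etc/lvm/lvm.conf
    global {
        use_lvmetad = 0
    }
    activation {
        auto_activation_volume_list = []
    }

    # stop and mask the lvmetad daemon and its socket
    systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
    systemctl mask lvm2-lvmetad.service lvm2-lvmetad.socket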
Is auto_activation_volume_list = [] set in the copy of lvm.conf used during boot (initramfs)? Something must be calling vgchange or lvchange to activate LVs, but nothing comes to mind at the moment. Will have to look into that more on Monday.

This comment was originally posted by teigland
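One possible way to answer that question on a dracut-based host (the image path is an example; RHEV-H stores its initrd elsewhere, as shown later in this thread):

    # print the lvm.conf packed into the current initramfs
    lsinitrd -f etc/lvm/lvm.conf /boot/initramfs-$(uname -r).img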
I could partially reproduce this issue on rhel 7.3 beta and vdsm master with iscsi storage.

1. Setup standard lvm.conf:
   - use_lvmetad=1
   - no auto_activation_volume_list option

2. Create a preallocated disk on the iscsi storage domain and attach it to a vm (4df47a96-8a1b-436e-8a3e-3a638f119b48)

3. In the guest, create pv, vg and lvs:

   pvcreate /dev/vdb
   vgcreate guest-vg /dev/vdb
   lvcreate -n guest-lv -L 10g guest-vg
   lvcreate -n guest-lv-2 -L 5g guest-vg

4. Shutdown vm
5. Put host to maintenance
6. Activate host

   pvscan --cache
   lvs

  3628a407-ef01-417f-8b5e-e88e87896477 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
  4df47a96-8a1b-436e-8a3e-3a638f119b48 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 20.00g
  d502b6ad-e623-472f-bdc7-453765089f55 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
  ids        bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 128.00m
  inbox      bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
  leases     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 2.00g
  master     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 1.00g
  metadata   bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 512.00m
  outbox     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
  guest-lv   guest-vg -wi-a----- 10.00g
  guest-lv-2 guest-vg -wi-a----- 5.00g
  lv_home    vg0 -wi-ao---- 736.00m
  lv_root    vg0 -wi-ao---- 7.37g
  lv_swap    vg0 -wi-ao---- 7.36g

- All lvs were activated
- the raw lv used as guest PV is active and open
- guest-created lvs are active

On this system, disabling lvmetad fixes the issue with open lvs:

1. Disable lvmetad

   systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
   systemctl mask lvm2-lvmetad.service lvm2-lvmetad.socket

2. Edit /etc/lvm/lvm.conf:

   use_lvmetad = 0

   Note: I did not set auto_activation_volume_list, since this host won't boot with this setting. Boot fails with a dependency error for /home.

3. Move host to maintenance
4. Reboot host
5. Activate host

After boot, all disk lvs are inactive, and guest lvs do not show in lvm commands:

  3628a407-ef01-417f-8b5e-e88e87896477 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m
  4df47a96-8a1b-436e-8a3e-3a638f119b48 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 20.00g
  d502b6ad-e623-472f-bdc7-453765089f55 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m
  ids        bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 128.00m
  inbox      bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
  leases     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 2.00g
  master     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 1.00g
  metadata   bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 512.00m
  outbox     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
  lv_home    vg0 -wi-ao---- 736.00m
  lv_root    vg0 -wi-ao---- 7.37g
  lv_swap    vg0 -wi-ao---- 7.36g

6. Starting the vm using the raw disk with guest lvs

The guest lvs show when the raw lv is activated (opened by qemu):

  3628a407-ef01-417f-8b5e-e88e87896477 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m
  4df47a96-8a1b-436e-8a3e-3a638f119b48 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 20.00g
  d502b6ad-e623-472f-bdc7-453765089f55 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m
  ids        bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 128.00m
  inbox      bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
  leases     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 2.00g
  master     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 1.00g
  metadata   bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 512.00m
  outbox     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
  guest-lv   guest-vg -wi------- 10.00g
  guest-lv-2 guest-vg -wi------- 5.00g
  lv_home    vg0 -wi-ao---- 736.00m
  lv_root    vg0 -wi-ao---- 7.37g
  lv_swap    vg0 -wi-ao---- 7.36g

7. Shutting down the vm hides the guest lvs again:

  3628a407-ef01-417f-8b5e-e88e87896477 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m
  4df47a96-8a1b-436e-8a3e-3a638f119b48 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 20.00g
  d502b6ad-e623-472f-bdc7-453765089f55 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m
  ids        bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 128.00m
  inbox      bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
  leases     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 2.00g
  master     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 1.00g
  metadata   bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 512.00m
  outbox     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
  lv_home    vg0 -wi-ao---- 736.00m
  lv_root    vg0 -wi-ao---- 7.37g
  lv_swap    vg0 -wi-ao---- 7.36g

To hide both guest lvs and rhev lvs from the host, I tried this filter in lvm.conf:

   filter = [ "r|^/dev/disk/by-id/dm-uuid-mpath-.*|" ]

With this filter, lvs shows only local lvs:

  lv_home vg0 -wi-ao---- 736.00m
  lv_root vg0 -wi-ao---- 7.37g
  lv_swap vg0 -wi-ao---- 7.36g

But when starting the vm using the raw lv with a guest pv, the guest lvs appear again:

  guest-lv   guest-vg -wi------- 10.00g
  guest-lv-2 guest-vg -wi------- 5.00g
  lv_home    vg0 -wi-ao---- 736.00m
  lv_root    vg0 -wi-ao---- 7.37g
  lv_swap    vg0 -wi-ao---- 7.36g

To keep the guest lvs hidden, I tried this filter:

   filter = [ "r|^/dev/disk/by-id/dm-uuid-mpath-.*|", "r|^/dev/disk/by-id/dm-uuid-LVM-.*|" ]

lvs now shows only the local lvs.

This filter may not work with RHEVH; you may need to add the internal lvs to the filter. Vdsm is not affected by this filter since it overrides the filter in all lvm commands, using the --config option.

Note that sos commands need to override this filter to be able to see shared rhev lvs:

   lvs --config 'devices { filter = [ "a|.*|" ] }'

So it seems that filter is the best way now.

Germano, see also comment 47.

This comment was originally posted by nsoffer
Peter had a good explanation of the options above, so I'll just repeat some of that.

Ideally, you don't want RHEV PVs/LVs to be scanned or activated by lvm run from the host (by which I mean lvm commands not run explicitly by RHEV). This means that the host's lvm.conf global_filter (on the root fs and in the initramfs) should exclude RHEV PVs. (Instead of excluding RHEV PVs, you could also whitelist only non-RHEV PVs. I'm not sure whether a whitelist or a blacklist would work better for the host here.)

Without using the global_filter, the next best option, as you've found, may be to disable autoactivation in the host's lvm.conf (on the root fs and in the initramfs). This has limitations:

- It will not protect RHEV PVs/LVs from being seen by the host's lvm.
- It will not protect RHEV LVs from being activated by an unknown vgchange/lvchange -ay command run from the host that doesn't include the extra 'a' flag.
- It will protect RHEV LVs from being autoactivated by the host's own vgchange/lvchange -aay commands.

If you use one of these two methods, and RHEV LVs are still being activated by the host outside of your own control, then those methods are not set up correctly, or there is a rogue vgchange/lvchange being run, or there's an lvm bug.

Peter also mentioned some more exotic options (e.g. system ID) which would probably take more effort to get working, but may be worth trying in a future version. For now, global_filter or auto_activation_volume_list should be able to solve the problem of unwanted activation.

This comment was originally posted by teigland
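To make the second and third limitations concrete, a small illustration; the VG name is hypothetical:

    # with auto_activation_volume_list = [] set on the host:
    vgchange -aay rhev_vg   # autoactivation: consults the list, activates nothing
    vgchange -ay  rhev_vg   # direct activation: ignores the list, still activates LVs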
This bug reveals two issues:
1. systemd and lvm are too eager to activate anything on a system - this is a regression in RHEL 7 compared with RHEL 6.
2. vdsm startup deactivation does not handle ovirt lvs with guest lvs

The root cause is 1. We will work on configuring lvm during vdsm configuration. This seems to be very delicate, requiring a special filter and regenerating the initramfs.

For 4.0 we can improve vdsm deactivation to handle lvs which are used as guest pvs.

Workarounds:
- setting up an lvm.conf filter (see comment 48) and regenerating the initramfs
- or avoiding creating pvs directly on guest devices (without creating a partition table first) - sketched below

This comment was originally posted by nsoffer
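A sketch of the second workaround as it might look inside the guest, assuming the RHV raw disk shows up there as /dev/sdb; the idea from the comment above is that the PV label then sits inside a partition rather than at the start of the device, so a plain scan of the RHV LV on the host does not find it:

    # in the guest: partition first, then create the PV on the partition
    parted --script /dev/sdb mklabel msdos mkpart primary 1MiB 100%
    pvcreate /dev/sdb1
    vgcreate guest-vg /dev/sdb1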
Nir / David,

Is it possible to filter by lvm tags? If yes, maybe it would give us more flexibility?

This comment was originally posted by mkalinin
global_filter/filter in lvm.conf operate at the device level, and only take device path names. Tags operate at the VG level and can be used with autoactivation.

This comment was originally posted by teigland
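For completeness, a hedged sketch of the tag-based approach mentioned above; the tag and VG names are placeholders:

    # tag the VGs this host is allowed to auto-activate
    vgchange --addtag host_local vg0

    # /etc/lvm/lvm.conf
    activation {
        auto_activation_volume_list = [ "@host_local" ]
    }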
lol, I just now looked into Nir's commits. I am sorry. He is using the tags.

This comment was originally posted by mkalinin
To add to the use cases this bug affects, I believe a direct LUN attached to the guest with a VG on top of it will have the same issue, right?

This comment was originally posted by mkalinin
(In reply to Nir Soffer from comment #56)
> This bug reveals two issues:
> 1. systemd and lvm are too eager to activate anything on a system - this is
> a regression in rhel 7 compared with rhel 6.
> 2. vdsm startup deactivation does not handle ovirt lvs with guest lvs
>
> The root cause is 1. We will work on configuring lvm during vdsm
> configuration.
> This seems to be very delicate, requiring special filter and regenerating
> initramfs.
>
> For 4.0 we can improve vdsm deactivation to handle lvs which are used as
> guest pvs.
>
> Workarounds:
> - setting up a lvm.conf filter (see comment 48) and regenerating initramfs
> - or avoiding creating pvs directly on guest devices (without creating
> partition table).

Thank you Nir. I'm also happy about 2; quite sure it will prevent related issues in the future!

I have created a customer-facing solution explicitly for the workarounds, both on RHEL/RHV-H and RHEV-H:
https://access.redhat.com/solutions/2662261

* Will do some tests regarding the required filter on RHEV-H and update the solution as well.
* I'll suggest the customer also evaluate the workaround on a test host.

Cheers,

This comment was originally posted by gveitmic
(In reply to Nir Soffer from comment #56)
> Workarounds:
> - setting up a lvm.conf filter (see comment 48) and regenerating initramfs

I'm testing this on RHEV-H, to come up with a proper filter.

1. Added this to a RHEV-H host:

   filter = [ "r|^/dev/disk/by-id/dm-uuid-mpath-.*|", "r|^/dev/disk/by-id/dm-uuid-LVM-.*|", "a|^/dev/disk/by-id/dm-name-HostVG-.*|", "a|^/dev/disk/by-id/dm-name-live-.*|"]

2. ovirt-node-rebuild-initramfs
3. Reboot
4. lv-guest is active right after boot, but not open.

   lvs --config 'devices { filter = [ "a|.*|" ] }' | grep lv-guest
     lv-guest vg-guest -wi-a----- 4.00m

5. VDSM did not deactivate it because it was open(!?)?

   storageRefresh::DEBUG::2016-09-28 04:27:09,074::lvm::661::Storage.LVM::(bootstrap) Skipping open lv: vg=76dfe909-20a6-4627-b6c4-7e16656e89a4 lv=6aacc711-0ecf-4c68-b64d-990ae33a54e3

Just to confirm it's the guest one being seen in the host:

vg--guest-lv--guest (253:52)
 `-76dfe909--20a6--4627--b6c4--7e16656e89a4-6aacc711--0ecf--4c68--b64d--990ae33a54e3 (253:47)
    `-360014380125989a10000400000480000 (253:7)
       |- (8:64)
       `- (8:16)

Thoughts:
A) Is this happening exclusively on RHEV-H?
B) Should the workaround on RHEV-H also include some of the previously discussed options? Like the volume activation one?

This comment was originally posted by gveitmic
Just to complement with more data: even after applying that filter + regenerating the initramfs, the disk LV boots up open.

lvs --config 'devices { filter = [ "a|.*|" ] }' | grep 6aacc
  6aacc711-0ecf-4c68-b64d-990ae33a54e3 76dfe909-20a6-4627-b6c4-7e16656e89a4 -wi-ao---- 1.00g

The guest LV is active:

# lvs --config 'devices { filter = [ "a|.*|" ] }' | grep lv-guest
  lv-guest vg-guest -wi-a----- 4.00m

vg--guest-lv--guest (253:52)
 `-76dfe909--20a6--4627--b6c4--7e16656e89a4-6aacc711--0ecf--4c68--b64d--990ae33a54e3 (253:47)
    `-360014380125989a10000400000480000 (253:7)
       |- (8:64)
       `- (8:16)

lvm.conf seems to be working as intended:

# lvs
  LV      VG        Attr       LSize
  Config  HostVG    -wi-ao---- 8.00m
  Data    HostVG    -wi-ao---- 39.92g
  Logging HostVG    -wi-ao---- 2.00g
  Swap    HostVG    -wi-ao---- 17.68g
  lv_home vg_rhevh1 -wi------- 39.68g
  lv_root vg_rhevh1 -wi------- 50.00g
  lv_swap vg_rhevh1 -wi------- 9.83g

Probably the initramfs was not updated properly?

This comment was originally posted by gveitmic
OK, got it. ovirt-node-rebuild-initramfs is not pulling in the modified lvm.conf; it seems to be using its own.

# cp /run/initramfs/live/initrd0.img .
# mv initrd0.img initrd0.gz
# gunzip initrd0.gz
# file initrd0
initrd0: ASCII cpio archive (SVR4 with no CRC)
# cpio -i -F initrd0
# find ./ -name lvm.conf
./etc/lvm/lvm.conf
# cat etc/lvm/lvm.conf
global {
     locking_type = 4
     use_lvmetad = 0
}

Explains my results... Any ideas on how to update this in a supported way that customers can use? Or is the only option to do it manually (mount, dracut, umount)?

This comment was originally posted by gveitmic
You always want to use the standard lvm.conf as the starting point and modify fields in that.

This comment was originally posted by teigland
Reply to comment 69:

My idea would be to allow passing some arguments to dracut, e.g. --lvmconf in this case. We can work on such a solution once we know that the filtering is working correctly.

Btw, I do not see a technical reason why the filters should not work in RHEV-H 3.6's initrd.

This comment was originally posted by fdeutsch
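On a plain RHEL host the equivalent would be a dracut rebuild that pulls in the host's /etc/lvm/lvm.conf (dracut provides an --lvmconf option for this); a sketch for the running kernel, while RHEV-H needs its own tooling as discussed above:

    dracut --force --lvmconf /boot/initramfs-$(uname -r).img $(uname -r)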
From discussion with Nir Soffer - here are a couple of points for better functionality of this vdsm system:

Every host in the system should have the lvm.conf 'filter' & 'global_filter' settings set in a way that it will NOT see a 'SharedVG' mpath device. (Such a change needs to be reflected in the initramdisk - so regeneration is needed.)

This is IMHO the most 'complex' step - since lvm2 does not support any 'filter' chaining, the filter has to be properly configured on every host. I'd always advise 'white-list' logic - so if the host knows it's using only 'sda' & 'sdb' as PVs, only those 2 devices should be 'accepted' and all other devices 'rejected'. But I've already seen way more complex filters - so this part of the advice is not 'trivial' to automate easily. It's always possible to check, by looking at 'lvs -vvvv' output, whether devices are rejected accordingly. To validate which settings are in use by a command - see 'man lvmconfig'.

Once the 'host' is set to never ever see the SharedVG mpath device - it's mostly done. The easier part is then to ensure every executed 'vdsm' command comes with a special --config option which DOES make only the SharedVG mpath visible and rejects every other device - and it should also go with 'locking_type=4' to ensure the host is not able to accidentally modify anything on the VG (even if it would be some internal lvm2 bug issue).

This should lead to a system where 'individual' lvm2 commands executed on a host in such a system NEVER influence the state of the 'SharedVG' - they will never try to auto-activate LVs, never try to fix invalid metadata, and so on... Also 'vdsm' will clearly keep FULL control over which command across the whole system may be working with SharedVG metadata.

I'd like to emphasize - while 'vdsm' is running e.g. an activation command on any host - it should NOT try to modify VG metadata anywhere else - especially if such a VG consists of multiple PVs - there is a possibility of hitting the 'race' where a read-only metadata user could see partially updated metadata. So 'update' of VG metadata requires exclusive access.

And a final comment - IMHO such a configured host system could then possibly use 'lvmetad' locally for locally available devices - since there shall be no interference. Just the vdsm commands need to go with --config use_lvmetad=0.

This comment was originally posted by zkabelac
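A sketch of the kind of per-command --config override described here; the mpath WWID is a placeholder, and in practice vdsm builds this filter itself:

    lvs --config 'devices { filter = [ "a|^/dev/mapper/36001405abcdef0123456789$|", "r|.*|" ] }
                  global  { locking_type = 4  use_lvmetad = 0 }'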
(In reply to Zdenek Kabelac from comment #74)
> Every host in the system should have lvm.conf 'filter'&'global_filter'
> setting set in a way - it will NOT see a 'SharedVG' mpath device.
> (Such change needs to be reflected in initramdisk - so regeneration is
> needed)

Unfortunately we cannot use global_filter with current vdsm, since it would override the vdsm filter, and vdsm would not be able to use shared storage. We will consider switching to global_filter in a future release.

The reason we are using filter is to allow an admin to reject certain devices from our filter.

This comment was originally posted by nsoffer
(In reply to Zdenek Kabelac from comment #74)
> And a final comment - IMHO such configured host system then could possibly
> use 'lvmetad' locally for locally available devices - since there shall be
> no interference.
>
> Just vdsm commands needs to go with --config lvmetad=0

This requires modifying global_filter, which is not compatible with current vdsm (3.5, 3.6, 4.0).

This comment was originally posted by nsoffer
Fabian, maybe we should open a separate bug (RFE?) for node? I think it will be easier to get good filtering baked into node before we can get a general-purpose solution for RHEL or other systems.

This comment was originally posted by nsoffer
Yes - if you seek a solution without mods on the vdsm side - then just skip the 'global_filter' & 'locking_type=4' advice.

One just has to make sure that no host is masking an mpath device needed by vdsm in its local lvm.conf file.

And since it's not possible to exclude vdsm mpath devices via global_filter, the user also MAY NOT use lvmetad locally.

This comment was originally posted by zkabelac
Hi Nir,

Can we consider steps 1-6 from comment #48 as the steps to reproduce this bug? If not, can you please provide them?

This comment was originally posted by ratamir
The old patches were trying to clean up after lvm during vdsm bootstrap.

The new patch <https://gerrit.ovirt.org/66893> handles the root cause by adding a global filter rejecting ovirt lvs.

The configuration is generic and should work on any host, assuming that only ovirt uses uuids for vg names.

After testing this, we will work on installing this file during vdsm configuration.

Germano, can we test this configuration in some of the relevant cases?

This comment was originally posted by nsoffer
(In reply to Raz Tamir from comment #84)
> Can we consider steps 1 - 6 from comment #48 as a steps to reproduce this
> bug?

Yes

This comment was originally posted by nsoffer
(In reply to Nir Soffer from comment #85)
> The old patches were trying to cleanup after lvm during vdsm bootstrap.
>
> The new patch <https://gerrit.ovirt.org/66893> is handling the root cause by
> adding a global filter rejecting ovirt lvs.
>
> The configuration is generic and should work on any host, assuming that only
> ovirt uses uuids for vg names.
>
> After testing this, we will work on installing this file during vdsm
> configuration.
>
> Germano, can we test this configuration in some the relevant cases?

Hi Nir,

Absolutely. We currently have a Sev4 case open; I can check whether that customer wants to help testing. We can also use our own reproducer from comment #54 (re-installing with RHEL). Or even better, do both.

But first, I have two questions:

1) You want me to cherry-pick both the new and old patches, not just the new one, right? (all the gerrits attached to the bz into latest stable vdsm + any dependency) Not sure if we should go for latest master, especially if we are asking a customer to help testing.

2) Does the filter in the new patch need to go into the initrd as well?

Thanks,
Germano

This comment was originally posted by gveitmic
(In reply to Germano Veit Michel from comment #87)
> (In reply to Nir Soffer from comment #85)
> 1) You want me to cherry-pick both new and old patches, not just the new
> right?

No, just the new configuration. This is trivial to deploy on a customer machine and compatible with any version of RHV.

> 2) Does that filter in the new patch needs to go into initrd as well?

It should, so guest lvs on RHV raw lvs are never active on the host.

If we find that this configuration is a good solution for this issue, we will integrate it into vdsm-tool configure later.

This comment was originally posted by nsoffer
Hi Nir,

I added the global filter to both the initrd's and /etc's lvm.conf, and the raw disk LVs are not activated on the host upon reboot. So it looks good.

I am not sure if this will work for Direct LUNs though (Roman's bug), as AFAIK they don't follow the regex you specified in the global filter. Still, it looks like a solid step forward.

Below is only the test data.

# cat /etc/redhat-release
Red Hat Enterprise Virtualization Hypervisor release 7.2 (20160711.0.el7ev)

initrd:

[root@rhevh-2 ~]# dir=`mktemp -d` && cd $dir
[root@rhevh-2 tmp.NShI7bl4lg]# cp /run/initramfs/live/initrd0.img .
[root@rhevh-2 tmp.NShI7bl4lg]# mv initrd0.img initrd0.gz
[root@rhevh-2 tmp.NShI7bl4lg]# gunzip initrd0.gz
[root@rhevh-2 tmp.NShI7bl4lg]# cpio -i -F initrd0
236815 blocks
[root@rhevh-2 tmp.NShI7bl4lg]# cat etc/lvm/lvm.conf
global {
     locking_type = 4
     use_lvmetad = 0
}
devices {
     global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]-.+|" ]
}

etc:

cat /etc/lvm/lvm.conf | grep global_filter | grep -v '#'
    global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]-.+|" ]

# lvs | awk -F' ' '{print $1,$3}' | egrep '\-wi\-a'
ids -wi-ao----
inbox -wi-a-----
leases -wi-a-----
master -wi-ao----
metadata -wi-a-----
outbox -wi-a-----
ids -wi-a-----
inbox -wi-a-----
leases -wi-a-----
master -wi-a-----
metadata -wi-a-----
outbox -wi-a-----
Config -wi-ao----
Data -wi-ao----
Logging -wi-ao----
Swap -wi-ao----

# dmsetup ls --tree
HostVG-Logging (253:69)
 `-2a802d0e800d00000p4 (253:4)
    `-2a802d0e800d00000 (253:0)
       `- (8:0)
603f851b--7388--49f1--a8cc--095557ae0a20-ids (253:18)
 `-360014380125989a100004000004c0000 (253:8)
    |- (8:80)
    `- (8:32)
HostVG-Swap (253:67)
 `-2a802d0e800d00000p4 (253:4)
    `-2a802d0e800d00000 (253:0)
       `- (8:0)
76dfe909--20a6--4627--b6c4--7e16656e89a4-inbox (253:28)
 `-360014380125989a10000400000480000 (253:7)
    |- (8:64)
    `- (8:16)
603f851b--7388--49f1--a8cc--095557ae0a20-master (253:20)
 `-360014380125989a100004000004c0000 (253:8)
    |- (8:80)
    `- (8:32)
HostVG-Data (253:70)
 `-2a802d0e800d00000p4 (253:4)
    `-2a802d0e800d00000 (253:0)
       `- (8:0)
603f851b--7388--49f1--a8cc--095557ae0a20-outbox (253:16)
 `-360014380125989a100004000004c0000 (253:8)
    |- (8:80)
    `- (8:32)
603f851b--7388--49f1--a8cc--095557ae0a20-metadata (253:15)
 `-360014380125989a100004000004c0000 (253:8)
    |- (8:80)
    `- (8:32)
603f851b--7388--49f1--a8cc--095557ae0a20-inbox (253:19)
 `-360014380125989a100004000004c0000 (253:8)
    |- (8:80)
    `- (8:32)
live-base (253:6)
 `- (7:1)
360014380125989a10000400000500000p2 (253:11)
 `-360014380125989a10000400000500000 (253:9)
    |- (8:96)
    `- (8:48)
76dfe909--20a6--4627--b6c4--7e16656e89a4-leases (253:26)
 `-360014380125989a10000400000480000 (253:7)
    |- (8:64)
    `- (8:16)
360014380125989a10000400000500000p1 (253:10)
 `-360014380125989a10000400000500000 (253:9)
    |- (8:96)
    `- (8:48)
76dfe909--20a6--4627--b6c4--7e16656e89a4-ids (253:27)
 `-360014380125989a10000400000480000 (253:7)
    |- (8:64)
    `- (8:16)
76dfe909--20a6--4627--b6c4--7e16656e89a4-metadata (253:24)
 `-360014380125989a10000400000480000 (253:7)
    |- (8:64)
    `- (8:16)
603f851b--7388--49f1--a8cc--095557ae0a20-leases (253:17)
 `-360014380125989a100004000004c0000 (253:8)
    |- (8:80)
    `- (8:32)
HostVG-Config (253:68)
 `-2a802d0e800d00000p4 (253:4)
    `-2a802d0e800d00000 (253:0)
       `- (8:0)
live-rw (253:5)
 |- (7:2)
 `- (7:1)
2a802d0e800d00000p3 (253:3)
 `-2a802d0e800d00000 (253:0)
    `- (8:0)
76dfe909--20a6--4627--b6c4--7e16656e89a4-master (253:29)
 `-360014380125989a10000400000480000 (253:7)
    |- (8:64)
    `- (8:16)
2a802d0e800d00000p2 (253:2)
 `-2a802d0e800d00000 (253:0)
    `- (8:0)
76dfe909--20a6--4627--b6c4--7e16656e89a4-outbox (253:25)
 `-360014380125989a10000400000480000 (253:7)
    |- (8:64)
    `- (8:16)
2a802d0e800d00000p1 (253:1)
 `-2a802d0e800d00000 (253:0)
    `- (8:0)

This comment was originally posted by gveitmic
*** This bug has been marked as a duplicate of bug 1374545 ***
In this bug we track only the issue of dynamic activation of guest logical volumes on RHV raw volumes when using iSCSI storage. This issue is caused by lvmetad, which is not compatible with RHV shared storage and also causes other trouble. This service is disabled in 4.0.7.

The issue of activation of guest logical volumes during boot when using FC storage will be tracked in another bug (we have several lvm bugs related to this).
Nir, please define whether or not this bug requires doc text.
Add doc text.
We have 2 issues that we can reproduce and verify here:

A. All lvs are activated after connecting to an iSCSI storage domain

1. Setup a system with one host and one iSCSI storage domain.
2. Put the host to maintenance
3. Activate the host
4. Check the state of all lvs on this storage domain
   on 4.0 - all lvs will be active
   on 4.1 - only the special lvs, OVF_STORE lvs and lvs used by running vms will be active

B. Guest lvs created inside a vm on a raw volume are activated on the host; this will cause a failure to deactivate lvs when shutting down a vm.

1. Setup a system with one host and one iSCSI storage domain.
2. Create and start one vm running linux
3. Create a raw volume and attach it to the running vm
4. Login to the vm, and create a pv, vg and lv using the new raw disk, assuming that the disk is connected using virtio-scsi at /dev/sdb:

   pvcreate /dev/sdb
   vgcreate guest-vg /dev/sdb
   lvcreate --name test-lv --size 1g guest-vg

5. On the host, check if the guest lv is active

   On 4.0, run:
   pvscan --cache
   lvs guest-vg

   expected result: guest-lv will be active

   On 4.1, run:
   lvs guest-vg
   (pvscan --cache is not needed, since lvmetad is disabled)

   expected result: guest-lv will not be active

6. Stop the vm

   On 4.0, expected results:
   - you should see an error in the vdsm log about deactivating the raw volume, "logical volume xxxyyy in use"
   - both the raw volume lv and guest-lv will remain active

   On 4.1, expected results:
   - the raw volume lv should be deactivated without error
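For step 5, one quick way to check the activation state on the host, assuming the guest VG name used in the steps above (lv_active is a standard lvs reporting field):

    lvs -o vg_name,lv_name,lv_active guest-vg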
Verified with code:
-----------------------
vdsm-4.18.24-3.el7ev.x86_64
ovirt-engine-4.0.7.4-0.1.el7ev.noarch
rhevm-4.0.7.4-0.1.el7ev.noarch

Verified with the scenario above.

Moving to VERIFIED!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2017-0544.html