Bug 1374545

Summary: Guest LVs created in ovirt raw volumes are auto activated on the hypervisor in RHEL 7
Product: Red Hat Enterprise Virtualization Manager
Reporter: Germano Veit Michel <gveitmic>
Component: vdsm
Assignee: Nir Soffer <nsoffer>
Status: CLOSED ERRATA
QA Contact: Natalie Gavrielov <ngavrilo>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 3.6.7
CC: agk, amureini, baptiste.agasse, bazulay, bcholler, boruvka.michal, cshao, dojones, dougsland, eberman, fdeutsch, gveitmic, gwatson, hannsj_uhl, jbrassow, jcall, jcoscia, jforeman, lsurette, markus.oswald, mkalinin, ngavrilo, nsoffer, obockows, otheus.uibk, prajnoha, rabraham, ratamir, rbarry, rhev-integ, rhodain, shipatil, srevivo, talayan, tcarlin, teigland, tnisan, ycui, ykaul, ylavi, zkabelac
Target Milestone: ovirt-4.1.1
Keywords: Performance, ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1398918 (view as bug list)
Environment:
Last Closed: 2017-04-25 00:43:56 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1130527, 1202595, 1253640, 1303940, 1313588, 1325844, 1326828, 1331978, 1342786, 1371939, 1373118, 1374549, 1377157, 1398918, 1400446, 1400528, 1403839, 1411197, 1454287
Attachments:
Description (Germano Veit Michel, 2016-09-09 00:40:40 UTC)
(In reply to Germano Veit Michel from comment #0)

Germano, thanks for this detailed report. I don't know if we can prevent systemd from scanning lvs inside other lvs, but we can prevent it from auto activating lvs. Can you check if disabling auto activation fixes this issue?

Edit /etc/lvm/lvm.conf:

    auto_activation_volume_list = []

> systemd: Started LVM2 PV scan on device 253:249

Peter, can you explain why systemd is looking for lvs inside another lv, and why it automatically activates these lvs?

Can we configure lvm to avoid this scan?

See also bug 1253640, both seem to be caused by the lvm auto-activation.

(In reply to Nir Soffer from comment #10)
> Can you check if disabling auto activation fixes this issue?
>
> Edit /etc/lvm/lvm.conf:
>
>     auto_activation_volume_list = []

Hi Nir, I just asked the customer to try it. I will keep you updated. Cheers

(In reply to Nir Soffer from comment #11)
> > systemd: Started LVM2 PV scan on device 253:249
>
> Peter, can you explain why systemd is looking for lvs inside another lv, and
> why it automatically activates these lvs?

It's because the internal LV, if found, is just like any other LV. Unless you mark it somehow, LVM has no way to know whether this LV is one that should not be activated (you might as well have a stack of LVs - an LV on top of another LV without any VMs - so from this point of view nothing is "internal" and you do want to activate the LV in that case).

By default, LVM autoactivates all VGs/LVs it finds.

> Can we configure lvm to avoid this scan?

You have several ways:

- You can set devices/global_filter to include only the PVs which should be scanned and any LVs activated on the host, and reject everything else (this also prevents any scans of all the other devices/LVs which contain further PVs/VGs inside).

- You can mark LVs with tags and then set activation/auto_activation_volume_list to activate only LVs with a certain tag. Or, without tagging, directly list the VGs/LVs which should be autoactivated. But this way the VM's PVs inside are still going to be scanned, just the VGs/LVs not autoactivated.

- You can mark individual LVs to be skipped on autoactivation (lvchange -K|--setactivationskip y). But this way you will also prevent autoactivation inside the guest system, because the flag to skip activation is stored directly in the VG metadata! So then you would need someone to call "vgchange/lvchange -ay" - the direct activation (in contrast to "vgchange/lvchange -aay" - the autoactivation, which is used by boot scripts) - inside the guest to activate the LV.

- You can use the new "LVM system id" feature (see man lvmsystemid), which marks VGs with a system id automatically; then only the VGs created on that system are visible/accessible (again, in this case the PVs inside are still going to be scanned, because we need to read the "ID" from the VG metadata to be able to compare system IDs).

If you really want to avoid scanning of internal PVs which happen to be inside a VG/LV, the best is probably to use the global_filter to include only the PVs you know are safe to access.
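The first two options can be combined in lvm.conf. The following is only an illustrative sketch: the accepted device path (/dev/sda2) and the VG name (vg0) stand for a host's own boot/root storage and are not values from this bug, and (as discussed further down) global_filter cannot currently be used on vdsm hosts because it would override the filter vdsm passes with --config:

    devices {
        # whitelist the host's own PV(s), reject everything else, so PVs
        # sitting inside RHV raw LVs are never scanned by the host
        global_filter = [ "a|^/dev/sda2$|", "r|.*|" ]
    }
    activation {
        # autoactivate only the host's own VG at boot
        auto_activation_volume_list = [ "vg0" ]
    }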
(In reply to Peter Rajnoha from comment #14)

(Also, when using autoactivation, the activation is based on udev events: when a PV appears, it is scanned for VG metadata and its LVs are autoactivated. Each LV activation generates another udev event and the procedure repeats - if there is any PV found inside the LV, autoactivation triggers for that PV too. The LV is like any other block device, and as such it can contain further PVs/VGs/LVs stacked inside. So when it comes to autoactivation it is a domino effect - one activation triggers another activation and so on, unless you stop LVM from doing that by the means I described in comment #14.)

(In reply to Peter Rajnoha from comment #14)
> - You can set devices/global_filter to include only PVs which should be
> scanned and any LVs activated on the host and reject everything else (this
> also prevents any scans for all the other devices/LVs which contain further
> PVs/VGs inside).

This is an issue since we don't know which PVs must be scanned on this host. We want to avoid scanning any PV created by vdsm, but there is no easy way to detect these - basically anything under /dev/mapper/guid may be a PV owned by vdsm. I don't think we can change the multipath configuration / udev rules to link devices elsewhere, since that can break other software using multipath devices. Also, we cannot use global_filter since it overrides the filter used by vdsm commands.

> - You can mark LVs with tags and then set
> activation/auto_activation_volume_list to activate only LVs with certain
> tag. Or, without tagging, directly listing the VGs/LVs which should be
> autoactivated only. But this way, the VMs PVs inside are going to be scanned
> still, just the VGs/LVs not autoactivated.

We plan to disable auto activation (see comment 4), so this seems to be the best option. Can you confirm that this should resolve this issue?

> - You can use new "LVM system id" (see man lvmsystemid) feature which
> marks VGs with system id automatically and then only the VGs created on that
> system are visible/accessible (again, in this case, the PVs inside are going
> to be scanned still because we need to get the "ID" from VG metadata to be
> able to do the comparison of system IDs).

This will not work for shared storage, the VGs/LVs created on the SPM host must be accessible on other hosts.

(In reply to Nir Soffer from comment #16)
> We plan to disable auto activation (see comment 4), so this seems to be
> the best option. Can you confirm that this should resolve this issue?

The disks (the LVs inside which the PV is found) are still going to be scanned; only the global_filter prevents this scan. But yes, the LVs found inside won't get activated. However, if you disable autoactivation completely, no LV will get activated at boot, not even the ones on the host, if you have any LVs there.

> > This will not work for shared storage, the vg/lvs created on the spm host
> > must be accessible on other hosts.

You can share the same ID for all the hosts where you need the VGs/LVs to be visible and accessible (see also "lvmlocal" or "file" system_id_source in man lvmsystemid).

(In reply to Peter Rajnoha from comment #17)
> However, if you disable autoactivation completely, no LV will get activated
> at boot, not even the ones on the host, if you have any LVs there.

The admin can fix this by adding the needed LVs to the auto_activation_volume_list, right?

We cannot work with an "auto activate everything" policy, only with "auto activate only the volumes specified by the admin".

> You can share the same ID for all the hosts where you need the VGs/LVs to be
> visible and accessible (see also "lvmlocal" or "file" system_id_source in
> man lvmsystemid).

Ok, this way looks good to solve bug 1202595 and . Can you confirm on that bug?

(In reply to Nir Soffer from comment #18)
> The admin can fix this by adding the needed LVs to the
> auto_activation_volume_list, right?
>
> We cannot work with an "auto activate everything" policy, only with "auto
> activate only the volumes specified by the admin".

Sure, if that's what the configuration setting is for... (the only issue is that it requires some manual actions/configuration from admins).

> Ok, this way looks good to solve bug 1202595 and . Can you confirm on that
> bug?

Yes, this should resolve the issue (see also bug #867333), as long as the VG metadata is readable so we can read the ID and then decide whether the VG is allowed or not on that system.
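A minimal sketch of the shared system ID variant Peter describes (see man lvmsystemid); the ID string "rhv-dc-1" is an invented placeholder, and this is not something vdsm configures today:

    # /etc/lvm/lvm.conf on every host that must access the shared VGs
    global {
        system_id_source = "lvmlocal"
    }

    # /etc/lvm/lvmlocal.conf - the same id on all of those hosts
    local {
        system_id = "rhv-dc-1"
    }

    # VGs record the id when created (or later via vgchange --systemid);
    # check with:
    #   vgs -o vg_name,vg_systemid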
I renamed the bug to reflect the important issue in this bug. We already know that guest-created lvs are accessible via lvm, see bug 1202595.

David, in this bug we recommended to disable lvm auto activation by setting:

    auto_activation_volume_list = []

Based on your comment https://bugzilla.redhat.com/show_bug.cgi?id=1303940#c52, do you think we should also recommend disabling lvmetad by setting use_lvmetad = 0? The results are not clear yet, see comment 32.

Sorry, my comment in the other bz was misleading. You still want to set auto_activation_volume_list = [] to prevent the system (lvm/systemd/udev) from automatically activating LVs, whether use_lvmetad is 0 or 1. So in your case, you want to both disable caching by setting use_lvmetad=0 in lvm.conf, and disable autoactivation by setting auto_activation_volume_list = [ ] in lvm.conf.

Is auto_activation_volume_list = [] set in the copy of lvm.conf used during boot (initramfs)? Something must be calling vgchange or lvchange to activate LVs, but nothing is coming to mind at the moment. Will have to look into that more on Monday.

I could partially reproduce this issue on rhel 7.3 beta and vdsm master with iscsi storage.

1. Setup standard lvm.conf:
   - use_lvmetad=1
   - no auto_activation_volume_list option
2. Create a preallocated disk on an iscsi storage domain and attach it to a vm (4df47a96-8a1b-436e-8a3e-3a638f119b48)
3. In the guest, create pv, vg and lvs:

       pvcreate /dev/vdb
       vgcreate guest-vg /dev/vdb
       lvcreate -n guest-lv -L 10g guest-vg
       lvcreate -n guest-lv-2 -L 5g guest-vg

4. Shutdown vm
5. Put host to maintenance
6. Activate host

       pvscan --cache
       lvs

       3628a407-ef01-417f-8b5e-e88e87896477 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
       4df47a96-8a1b-436e-8a3e-3a638f119b48 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao----  20.00g
       d502b6ad-e623-472f-bdc7-453765089f55 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
       ids        bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 128.00m
       inbox      bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
       leases     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a-----   2.00g
       master     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao----   1.00g
       metadata   bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 512.00m
       outbox     bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m
       guest-lv   guest-vg -wi-a-----  10.00g
       guest-lv-2 guest-vg -wi-a-----   5.00g
       lv_home    vg0      -wi-ao---- 736.00m
       lv_root    vg0      -wi-ao----   7.37g
       lv_swap    vg0      -wi-ao----   7.36g

- All lvs were activated
- the raw lv used as a guest pv is active and open
- guest-created lvs are active

On this system, disabling lvmetad fixes the issue with open lvs:

1. Disable lvmetad

       systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
       systemctl mask lvm2-lvmetad.service lvm2-lvmetad.socket

2. Edit /etc/lvm/lvm.conf:

       use_lvmetad = 0

   Note: I did not set auto_activation_volume_list, since this host won't boot with that setting. Boot fails with a dependency error for /home.

3. Move host to maintenance
4. Reboot host
5.
Activate host After boot, all disk lvs are inactive, and guest lvs do not show in lvm commands: 3628a407-ef01-417f-8b5e-e88e87896477 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m 4df47a96-8a1b-436e-8a3e-3a638f119b48 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 20.00g d502b6ad-e623-472f-bdc7-453765089f55 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m ids bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 128.00m inbox bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m leases bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 2.00g master bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 1.00g metadata bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 512.00m outbox bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m lv_home vg0 -wi-ao---- 736.00m lv_root vg0 -wi-ao---- 7.37g lv_swap vg0 -wi-ao---- 7.36g 6. Starting the vm using the raw disk with guest lvs The guest lvs show when the raw lv is activated (opened by qemu) 3628a407-ef01-417f-8b5e-e88e87896477 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m 4df47a96-8a1b-436e-8a3e-3a638f119b48 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 20.00g d502b6ad-e623-472f-bdc7-453765089f55 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m ids bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 128.00m inbox bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m leases bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 2.00g master bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 1.00g metadata bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 512.00m outbox bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m guest-lv guest-vg -wi------- 10.00g guest-lv-2 guest-vg -wi------- 5.00g lv_home vg0 -wi-ao---- 736.00m lv_root vg0 -wi-ao---- 7.37g lv_swap vg0 -wi-ao---- 7.36g 7. Shutting down the vm hide the guest lvs again 3628a407-ef01-417f-8b5e-e88e87896477 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m 4df47a96-8a1b-436e-8a3e-3a638f119b48 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 20.00g d502b6ad-e623-472f-bdc7-453765089f55 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m ids bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 128.00m inbox bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m leases bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 2.00g master bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 1.00g metadata bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 512.00m outbox bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m lv_home vg0 -wi-ao---- 736.00m lv_root vg0 -wi-ao---- 7.37g lv_swap vg0 -wi-ao---- 7.36g To hide both guest lvs and rhev lvs from the host, I tried this filter in lvm.conf: filter = [ "r|^/dev/disk/by-id/dm-uuid-mpath-.*|" ] With this filter, lvs show only local lvs: lv_home vg0 -wi-ao---- 736.00m lv_root vg0 -wi-ao---- 7.37g lv_swap vg0 -wi-ao---- 7.36g But when starting the vm using the raw lv with guest pv, the guest lvs appear again: guest-lv guest-vg -wi------- 10.00g guest-lv-2 guest-vg -wi------- 5.00g lv_home vg0 -wi-ao---- 736.00m lv_root vg0 -wi-ao---- 7.37g lv_swap vg0 -wi-ao---- 7.36g To keep the guest lvs hidden, I tried this filter: filter = [ "r|^/dev/disk/by-id/dm-uuid-mpath-.*|", "r|^/dev/disk/by-id/dm-uuid-LVM-.*|" ] lvs show now only the local lvs. This filter may not work with RHEVH, you may need to add the internal lvs to the filter. Vdsm is not effected by this filter since it overrides the filter in all lvm commands, using --config option. 
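As a quick, generic way to see which paths a by-id based filter like the one above is matched against (not a step that was actually run in this bug), the udev-created symlinks can be listed on the host:

    # multipath devices and LVs as seen through /dev/disk/by-id,
    # i.e. the names the dm-uuid-mpath-* / dm-uuid-LVM-* patterns reject
    ls -l /dev/disk/by-id/ | grep -E 'dm-uuid-(mpath|LVM)-'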
Note that sos commands need to override this filter to be able to see shared rhev lvs:

    lvs --config 'devices { filter = [ "a|.*|" ] }'

So it seems that the filter is the best way now. Germano, see also comment 47.

Peter had a good explanation of the options above, so I'll just repeat some of that. Ideally, you don't want RHEV PVs/LVs to be scanned or activated by lvm run from the host (by which I mean lvm commands not run explicitly by RHEV). This means that the host's lvm.conf global_filter (on the root fs and in the initramfs) should exclude RHEV PVs. (Instead of excluding RHEV PVs, you could also whitelist only non-RHEV PVs. I'm not sure whether a whitelist or a blacklist would work better for the host here.)

Without using the global_filter, the next best option, as you've found, may be to disable autoactivation in the host's lvm.conf (on the root fs and in the initramfs). This has limitations:

- It will not protect RHEV PVs/LVs from being seen by the host's lvm.
- It will not protect RHEV LVs from being activated by an unknown vgchange/lvchange -ay command run from the host that doesn't include the extra 'a' flag.
- It will protect RHEV LVs from being autoactivated by the host's own vgchange/lvchange -aay commands.

If you use one of these two methods, and RHEV LVs are still being activated by the host, outside of your own control, then those methods are not set up correctly, or there is a rogue vgchange/lvchange being run, or there's an lvm bug. Peter also mentioned some more exotic options (e.g. system ID) which would probably take more effort to get working, but may be worth trying in a future version. For now, global_filter or auto_activation_volume_list should be able to solve the problem of unwanted activation.

This bug reveals two issues:

1. systemd and lvm are too eager to activate anything on a system - this is a regression in rhel 7 compared with rhel 6.
2. vdsm startup deactivation does not handle ovirt lvs with guest lvs

The root cause is 1. We will work on configuring lvm during vdsm configuration. This seems to be very delicate, requiring a special filter and regenerating the initramfs.

For 4.0 we can improve vdsm deactivation to handle lvs which are used as guest pvs.

Workarounds:
- setting up a lvm.conf filter (see comment 48) and regenerating the initramfs
- or avoiding creating pvs directly on guest devices (without creating a partition table).

Nir / David, is it possible to filter by lvm tags? If yes - maybe it will give us more flexibility?

global_filter/filter in lvm.conf operate at the device level, and only take device path names. Tags operate at the VG level and can be used with autoactivation.

lol, I just now looked into Nir's commits. I am sorry. He is using the tags.

To add to the use cases this bug affects, I believe a direct lun attached to the guest with a VG on top of it will have the same issue, right?
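For reference, the tag-based variant mentioned above would look roughly like this; the tag name (host_boot) and VG name (vg0) are invented for illustration and are not the tags vdsm itself uses:

    # tag the VGs the host itself needs at boot
    vgchange --addtag host_boot vg0

    # in lvm.conf, autoactivate only VGs carrying that tag
    activation {
        auto_activation_volume_list = [ "@host_boot" ]
    }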
(In reply to Nir Soffer from comment #56)
> Workarounds:
> - setting up a lvm.conf filter (see comment 48) and regenerating initramfs
> - or avoiding creating pvs directly on guest devices (without creating a
>   partition table).

Thank you Nir. I'm also happy about 2, quite sure it will prevent related issues in the future!

I have created a customer-facing solution explicitly for the workarounds, both on RHEL/RHV-H and RHEV-H: https://access.redhat.com/solutions/2662261

* Will do some tests regarding the required filter on RHEV-H and update the solution as well.
* I'll suggest that the customer also evaluate the workaround on a test host.

Cheers,

(In reply to Nir Soffer from comment #56)
> Workarounds:
> - setting up a lvm.conf filter (see comment 48) and regenerating initramfs

I'm testing this in RHEV-H, to come up with a proper filter.

1. Added this to a RHEV-H host:

       filter = [ "r|^/dev/disk/by-id/dm-uuid-mpath-.*|", "r|^/dev/disk/by-id/dm-uuid-LVM-.*|", "a|^/dev/disk/by-id/dm-name-HostVG-.*|", "a|^/dev/disk/by-id/dm-name-live-.*|" ]

2. ovirt-node-rebuild-initramfs
3. Reboot
4. lv-guest is active right after boot, but not open.

       lvs --config 'devices { filter = [ "a|.*|" ] }' | grep lv-guest
       lv-guest vg-guest -wi-a----- 4.00m

5. VDSM did not deactivate it because it was open(!?)?

       storageRefresh::DEBUG::2016-09-28 04:27:09,074::lvm::661::Storage.LVM::(bootstrap) Skipping open lv: vg=76dfe909-20a6-4627-b6c4-7e16656e89a4 lv=6aacc711-0ecf-4c68-b64d-990ae33a54e3

Just to confirm it's the guest one being seen in the host:

    vg--guest-lv--guest (253:52)
     `-76dfe909--20a6--4627--b6c4--7e16656e89a4-6aacc711--0ecf--4c68--b64d--990ae33a54e3 (253:47)
        `-360014380125989a10000400000480000 (253:7)
           |- (8:64)
           `- (8:16)

Thoughts:
A) Is this happening exclusively in RHEV-H?
B) Should the workaround in RHEV-H also include some of the previously discussed options? Like the volume activation one?

Just to complement with more data: even after applying that filter + regenerating the initramfs, the disk LV boots up open.

    lvs --config 'devices { filter = [ "a|.*|" ] }' | grep 6aacc
    6aacc711-0ecf-4c68-b64d-990ae33a54e3 76dfe909-20a6-4627-b6c4-7e16656e89a4 -wi-ao---- 1.00g

The guest LV is active:

    # lvs --config 'devices { filter = [ "a|.*|" ] }' | grep lv-guest
    lv-guest vg-guest -wi-a----- 4.00m

    vg--guest-lv--guest (253:52)
     `-76dfe909--20a6--4627--b6c4--7e16656e89a4-6aacc711--0ecf--4c68--b64d--990ae33a54e3 (253:47)
        `-360014380125989a10000400000480000 (253:7)
           |- (8:64)
           `- (8:16)

lvm.conf seems to be working as intended:

    # lvs
    LV      VG        Attr       LSize
    Config  HostVG    -wi-ao----   8.00m
    Data    HostVG    -wi-ao----  39.92g
    Logging HostVG    -wi-ao----   2.00g
    Swap    HostVG    -wi-ao----  17.68g
    lv_home vg_rhevh1 -wi-------  39.68g
    lv_root vg_rhevh1 -wi-------  50.00g
    lv_swap vg_rhevh1 -wi-------   9.83g

Probably the initramfs was not updated properly?

OK, got it. ovirt-node-rebuild-initramfs is not pulling the modified lvm.conf, it seems to be using its own.

    # cp /run/initramfs/live/initrd0.img .
    # mv initrd0.img initrd0.gz
    # gunzip initrd0.gz
    # file initrd0
    initrd0: ASCII cpio archive (SVR4 with no CRC)
    # cpio -i -F initrd0
    # find ./ -name lvm.conf
    ./etc/lvm/lvm.conf
    # cat etc/lvm/lvm.conf
    global {
      locking_type = 4
      use_lvmetad = 0
    }

Explains my results... Any ideas on how to update this in a supported way that customers can use? Or is the only option to do it manually (mount, dracut, umount)?

You always want to use the standard lvm.conf as the starting point and modify fields in that.

Reply to comment 69: My idea would be to allow passing some arguments to dracut, i.e. --lvmconf in this case.
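On a plain RHEL/dracut host (not RHEV-H, whose initrd is rebuilt by ovirt-node-rebuild-initramfs as shown above), the initramfs copy of lvm.conf can be inspected and refreshed with the standard dracut tools; a generic sketch, not a procedure taken from this bug:

    # show the lvm.conf baked into the running kernel's initramfs
    lsinitrd /boot/initramfs-$(uname -r).img -f etc/lvm/lvm.conf

    # after editing /etc/lvm/lvm.conf, rebuild the initramfs so that
    # boot-time activation uses the same filter
    dracut -f /boot/initramfs-$(uname -r).img $(uname -r)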
We can work on such a solution once we know that the filtering is working correctly. Btw, I do not see a technical reason why the filters should not work in RHEV-H 3.6's initrd.

From discussion with Nir Soffer - here are a couple of points for better functionality of this vdsm system:

Every host in the system should have the lvm.conf 'filter' & 'global_filter' settings set in a way that it will NOT see a 'SharedVG' mpath device. (Such a change needs to be reflected in the initramdisk - so regeneration is needed.) This is IMHO the most 'complex' step - since lvm2 does not support 'filter' chaining, the filter has to be properly configured on every host. I'd always advise 'white-list' logic - so if the host knows it's using only 'sda' & 'sdb' as PVs, only those 2 devices should be 'accepted' and all other devices 'rejected'. But I've already seen way more complex filters - so this part of the advice is not 'trivial' to automate. It's always possible to check via the 'lvs -vvvv' output whether devices are rejected accordingly. To validate which settings are in use by a command - see 'man lvmconfig'.

Once the 'host' is set to never ever see the SharedVG mpath device - it's mostly done. The easier part is then to ensure every executed 'vdsm' command comes with a special --config option which DOES make only the SharedVG mpath visible and rejects every other device - and it should also go with 'locking_type=4' to ensure the host is not able to accidentally modify anything on the VG (even if it would be some internal lvm2 bug issue).

This should lead to a system where 'individual' lvm2 commands executed on a host NEVER influence the state of the 'SharedVG' - and never try to auto-activate LVs, never try to fix invalid metadata, and so on. Also, 'vdsm' will clearly have FULL control over which command across the whole system may be working with the SharedVG metadata.

I'd like to emphasize - while 'vdsm' is running e.g. an activation command on any host - it should NOT try to modify the VG metadata anywhere else - especially if such a VG consists of multiple PVs - it is possible to hit a 'race' where a read-only metadata user could see partially updated metadata. So an 'update' of VG metadata requires exclusive access.

And a final comment - IMHO such a configured host system could then possibly use 'lvmetad' locally for locally available devices - since there shall be no interference. Just the vdsm commands need to go with --config lvmetad=0.

(In reply to Zdenek Kabelac from comment #74)
> Every host in the system should have lvm.conf 'filter'&'global_filter'
> setting set in a way - it will NOT see a 'SharedVG' mpath device.
> (Such change needs to be reflected in initramdisk - so regeneration is
> needed)

Unfortunately we cannot use global_filter with current vdsm, since it will override the vdsm filter, and vdsm will not be able to use shared storage. We will consider switching to global_filter in a future release. The reason we are using filter is to allow an admin to reject certain devices from our filter.

(In reply to Zdenek Kabelac from comment #74)
> And a final comment - IMHO such configured host system then could possibly
> use 'lvmetad' locally for locally available devices - since there shall be
> no interference.
>
> Just vdsm commands needs to go with --config lvmetad=0

This requires modifying global_filter, and this is not compatible with current vdsm (3.5, 3.6, 4.0).

Fabian, maybe we should open a separate bug (RFE?) for node?
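Zdenek's advice above about validating the settings a command actually uses can be followed with lvmconfig and verbose output; a generic check, with no values specific to this bug:

    # print the filter settings currently in effect (merged from lvm.conf,
    # lvmlocal.conf and any other config sources)
    lvmconfig devices/global_filter
    lvmconfig devices/filter

    # watch the per-device filter decisions in the debug output
    lvs -vvvv 2>&1 | grep -i filter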
I think it will be easier to get good filtering on node baked into node, before we can get a general purpose solution for rhel or other systems. Yes - if you seek solution without mods on vdsm side - then just skip 'global_filter' & 'locking_type=4' advise. One has to just make sure that any host is NOT masking mpath device needed by vdsm in its local lvm.conf file. And since it's not possible to exclude vdsm mpath devices via global_filter user as well MAY NOT use lvmetad locally. Hi Nir, Can we consider steps 1 - 6 from comment #48 as a steps to reproduce this bug? If not, can you please provide it? The old patches were trying to cleanup after lvm during vdsm bootstrap. The new patch <https://gerrit.ovirt.org/66893> is handling the root cause by adding a global filter rejecting ovirt lvs. The configuration is generic and should work on any host, assuming that only ovirt uses uuids for vg names. After testing this, we will work on installing this file during vdsm configuration. Germano, can we test this configuration in some the relevant cases? (In reply to Raz Tamir from comment #84) > Can we consider steps 1 - 6 from comment #48 as a steps to reproduce this > bug? Yes (In reply to Nir Soffer from comment #85) > The old patches were trying to cleanup after lvm during vdsm bootstrap. > > The new patch <https://gerrit.ovirt.org/66893> is handling the root cause by > adding a global filter rejecting ovirt lvs. > > The configuration is generic and should work on any host, assuming that only > ovirt uses uuids for vg names. > > After testing this, we will work on installing this file during vdsm > configuration. > > Germano, can we test this configuration in some the relevant cases? Hi Nir, Absolutely. We currently have a Sev4 case open, I can check with that customer is he wants to help testing. We can also use our own reproducer from comment #54 (re-installing with RHEL). Or even better, do both. But first, I have two questions: 1) You want me to cherry-pick both new and old patches, not just the new right? (all the gerrits attached to the bz into latest stable vdsm + any dependency) Not sure if we should go for latest master, especially if we are asking for a customer to help testing. 2) Does that filter in the new patch needs to go into initrd as well? Thanks, Germano (In reply to Germano Veit Michel from comment #87) > (In reply to Nir Soffer from comment #85) > 1) You want me to cherry-pick both new and old patches, not just the new > right? No, just the new configuration. This is trivial to deploy on a customer machine and compatible with any version of RHV. > 2) Does that filter in the new patch needs to go into initrd as well? It should, so guest lvs on RHV raw lvs are never active on the host. If we find that this configuration is a good solution for this issue we will integrate this in vdsm-tool configure later. Hi Nir, I added the global filter to both initrd's and etc's lvm.conf and the raw disks LVs are not activated in Host upon reboot. So it looks good. I am not sure if this will work for Direct LUNs though (Roman's bug). As AFAIK they don't follow the regex you specified in the global filter. Still, it looks like a solid step forward. Below is only the test data. # cat /etc/redhat-release Red Hat Enterprise Virtualization Hypervisor release 7.2 (20160711.0.el7ev) initrd: [root@rhevh-2 ~]# dir=`mktemp -d` && cd $dir [root@rhevh-2 tmp.NShI7bl4lg]# cp /run/initramfs/live/initrd0.img . 
[root@rhevh-2 tmp.NShI7bl4lg]# mv initrd0.img initrd0.gz [root@rhevh-2 tmp.NShI7bl4lg]# gunzip initrd0.gz [root@rhevh-2 tmp.NShI7bl4lg]# cpio -i -F initrd0 236815 blocks [root@rhevh-2 tmp.NShI7bl4lg]# cat etc/lvm/lvm.conf global { locking_type = 4 use_lvmetad = 0 } devices { global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]-.+|" ] } etc: cat /etc/lvm/lvm.conf | grep global_filter | grep -v '#' global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]-.+|" ] # lvs | awk -F' ' '{print $1,$3}' | egrep '\-wi\-a' ids -wi-ao---- inbox -wi-a----- leases -wi-a----- master -wi-ao---- metadata -wi-a----- outbox -wi-a----- ids -wi-a----- inbox -wi-a----- leases -wi-a----- master -wi-a----- metadata -wi-a----- outbox -wi-a----- Config -wi-ao---- Data -wi-ao---- Logging -wi-ao---- Swap -wi-ao---- # dmsetup ls --tree HostVG-Logging (253:69) `-2a802d0e800d00000p4 (253:4) `-2a802d0e800d00000 (253:0) `- (8:0) 603f851b--7388--49f1--a8cc--095557ae0a20-ids (253:18) `-360014380125989a100004000004c0000 (253:8) |- (8:80) `- (8:32) HostVG-Swap (253:67) `-2a802d0e800d00000p4 (253:4) `-2a802d0e800d00000 (253:0) `- (8:0) 76dfe909--20a6--4627--b6c4--7e16656e89a4-inbox (253:28) `-360014380125989a10000400000480000 (253:7) |- (8:64) `- (8:16) 603f851b--7388--49f1--a8cc--095557ae0a20-master (253:20) `-360014380125989a100004000004c0000 (253:8) |- (8:80) `- (8:32) HostVG-Data (253:70) `-2a802d0e800d00000p4 (253:4) `-2a802d0e800d00000 (253:0) `- (8:0) 603f851b--7388--49f1--a8cc--095557ae0a20-outbox (253:16) `-360014380125989a100004000004c0000 (253:8) |- (8:80) `- (8:32) 603f851b--7388--49f1--a8cc--095557ae0a20-metadata (253:15) `-360014380125989a100004000004c0000 (253:8) |- (8:80) `- (8:32) 603f851b--7388--49f1--a8cc--095557ae0a20-inbox (253:19) `-360014380125989a100004000004c0000 (253:8) |- (8:80) `- (8:32) live-base (253:6) `- (7:1) 360014380125989a10000400000500000p2 (253:11) `-360014380125989a10000400000500000 (253:9) |- (8:96) `- (8:48) 76dfe909--20a6--4627--b6c4--7e16656e89a4-leases (253:26) `-360014380125989a10000400000480000 (253:7) |- (8:64) `- (8:16) 360014380125989a10000400000500000p1 (253:10) `-360014380125989a10000400000500000 (253:9) |- (8:96) `- (8:48) 76dfe909--20a6--4627--b6c4--7e16656e89a4-ids (253:27) `-360014380125989a10000400000480000 (253:7) |- (8:64) `- (8:16) 76dfe909--20a6--4627--b6c4--7e16656e89a4-metadata (253:24) `-360014380125989a10000400000480000 (253:7) |- (8:64) `- (8:16) 603f851b--7388--49f1--a8cc--095557ae0a20-leases (253:17) `-360014380125989a100004000004c0000 (253:8) |- (8:80) `- (8:32) HostVG-Config (253:68) `-2a802d0e800d00000p4 (253:4) `-2a802d0e800d00000 (253:0) `- (8:0) live-rw (253:5) |- (7:2) `- (7:1) 2a802d0e800d00000p3 (253:3) `-2a802d0e800d00000 (253:0) `- (8:0) 76dfe909--20a6--4627--b6c4--7e16656e89a4-master (253:29) `-360014380125989a10000400000480000 (253:7) |- (8:64) `- (8:16) 2a802d0e800d00000p2 (253:2) `-2a802d0e800d00000 (253:0) `- (8:0) 76dfe909--20a6--4627--b6c4--7e16656e89a4-outbox (253:25) `-360014380125989a10000400000480000 (253:7) |- (8:64) `- (8:16) 2a802d0e800d00000p1 (253:1) 
`-2a802d0e800d00000 (253:0) `- (8:0) Nir, we have a problem. In a customer environment we deployed that latest filter from comment #85 and #89. I have checked both lvm.conf and initrd and the filter is there. Problem: we still have Guest LVs activated. The following are all VM internal. Same story, RAW disk, VM PV directly on top of the Image LV. MESSAGE=1 logical volume(s) in volume group "vg_cache_pulp" now active MESSAGE=1 logical volume(s) in volume group "vg_pulp" now active MESSAGE=1 logical volume(s) in volume group "vg_mongodb" now active MESSAGE=1 logical volume(s) in volume group "vg_apps1" now active cat etc/lvm/lvm.conf | grep global_filter | grep -v '#' global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]-.+|" ] The lvm.conf other settings are RHEV-H default. And this is in the initrd (extracted manually to confirm). global { locking_type = 4 use_lvmetad = 0 } devices { global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]-.+|" ] } The filter does match (with egrep) the underlying device of vg_pulp-lv_pulp.. vg_pulp-lv_pulp (253:289) `-b442d48e--0398--4cad--b9bf--992a3e663573-0f902453--5caf--473f--8df7--f3c4a4c... `-36000144000000010706222888cc3684d (253:104) Ideas? Is the initrd correct? By the way, it was on: Red Hat Enterprise Virtualization Hypervisor release 7.2 (20160711.0.el7ev) *** Bug 1326828 has been marked as a duplicate of this bug. *** *** Bug 1374549 has been marked as a duplicate of this bug. *** *** Bug 1398918 has been marked as a duplicate of this bug. *** (In reply to Germano Veit Michel from comment #91) > In a customer environment we deployed that latest filter from comment #85 > and #89. I have checked both lvm.conf and initrd and the filter is there. What is the difference between this environment and the one form comment 89 that seems to work? (In reply to Nir Soffer from comment #98) > What is the difference between this environment and the one form comment 89 > that seems to work? I can't see any big differences. Both are late 3.6 7.2 RHV-H with just the filter added. This one that it did not work is a real production environment, with many more disks though, possibly the boot sequence regarding the FC devices is slightly different (could timing be related?) > locking_type = 4 I will fix this. You think it might be worth trying again? So please provide version of lvm2 in use here. Please attach to this BZ lvm.conf. Also please attach debug trace from such system for this command: pvscan -vvvv --aay --cache /dev/mapper/your_problematic_device (make sure LVs from this device are NOT active before running this command (i.e. deactive them with 'dmsetup remove xxxxxxx') Hi Zdenek, (In reply to Zdenek Kabelac from comment #103) > So please provide version of lvm2 in use here. lvm2-2.02.130-5.el7_2.5.x86_64 > Please attach to this BZ lvm.conf. Attaching now. 
> Also please attach debug trace from such system for this command: > > pvscan -vvvv --aay --cache /dev/mapper/your_problematic_device > I cannot ask the customer to run this, as it may activate everything, unless you can confirm nothing will be activated. They just lost another VM over the holidays due to a RHV LSM over a disk affected by this BZ. Any other command that would help? If there isn't another option then I will have to ask them for a maintenance window in one of the hosts to perform this. Thanks Why is this bug in POST if it is unresolved? Do we have an in-house reproducer for this issue so we can get the debug trace requested in comment 103? *** Bug 1412900 has been marked as a duplicate of this bug. *** (In reply to Germano Veit Michel from comment #104) > Hi Zdenek, > > (In reply to Zdenek Kabelac from comment #103) > > So please provide version of lvm2 in use here. > lvm2-2.02.130-5.el7_2.5.x86_64 > > > Please attach to this BZ lvm.conf. > Attaching now. > > > Also please attach debug trace from such system for this command: > > > > pvscan -vvvv --aay --cache /dev/mapper/your_problematic_device > > > > I cannot ask the customer to run this, as it may activate everything, unless > you can confirm nothing will be activated. We really cannot processed unless we see the exact trace of problematic behavior. Try to capture full lvmdump. And possibly system logs from the moment customer is noticing some unwanted activating. > If there isn't another option then I will have to ask them for a maintenance > window in one of the hosts to perform this. Yep - we need to get the info - we can't guess.. Hi Zdenek, Thanks for the help. Two things: 1. We have a maintenance window to troubleshoot this with the customer. They have separated a Host to troubleshoot this and we created a test VM (with PV directly on top of Disk LV), all production VMs have been moved to PVs on top of partitions on top of Disk LV. 2. In our labs, I've got a similar system to the customers and I managed to confirm it's not working on FC (but it seems fine on iSCSI). It looks to me the filter is fine, but there is still something wrong at boot time which scans activates these LVs. They seem to be activated on boot and then filtered. And it doesn't happen anymore if I deactivate them and scan again. This matches my iscsi/FC observation I guess, since iscsi storage attachment is at a much later stage. So, this one is coming up active, and it shouldn't: (lvs): LV VG Attr LSize baa40da9-d2e7-4b56-9e4d-7654bb854ec1 603f851b-7388-49f1-a8cc-095557ae0a20 -wi-ao---- 1.00g myvg is a VG on a PV directly on top of baa40da9, no partitions. This VG has 3 LVs, (lvol0, 1 and 2). (dmsetup ls --tree): myvg-lvol2 (253:81) `-603f851b--7388--49f1--a8cc--095557ae0a20-baa40da9--d2e7--4b56--9e4d--7654bb854ec1 (253:26) `-360014380125989a100004000004c0000 (253:8) |- (8:80) `- (8:32) myvg-lvol1 (253:80) `-603f851b--7388--49f1--a8cc--095557ae0a20-baa40da9--d2e7--4b56--9e4d--7654bb854ec1 (253:26) `-360014380125989a100004000004c0000 (253:8) |- (8:80) `- (8:32) myvg-lvol0 (253:79) `-603f851b--7388--49f1--a8cc--095557ae0a20-baa40da9--d2e7--4b56--9e4d--7654bb854ec1 (253:26) `-360014380125989a100004000004c0000 (253:8) |- (8:80) `- (8:32) As expected, vdsm fails to deactivate it on boot, because of those 3 LVs: # lvchange -an /dev/603f851b-7388-49f1-a8cc-095557ae0a20/baa40da9-d2e7-4b56-9e4d-7654bb854ec1 Logical volume 603f851b-7388-49f1-a8cc-095557ae0a20/baa40da9-d2e7-4b56-9e4d-7654bb854ec1 is used by another device. 
/dev/myvg/lvol0 -> ../dm-79 /dev/myvg/lvol1 -> ../dm-80 /dev/myvg/lvol2 -> ../dm-81 Interestingly I don't see these 3 LVs on 'lvs'. Because the filter is actually working? The command you requested always returns empty output (because of lvmetad?): I've grabbed the following for you: - etc/lvm.conf - initrd lvm.conf - lvm2-2.02.130-5.el7_2.5.x86_64 - kernel-3.10.0-327.22.2.el7.x86_64 - systemd-219-19.el7_2.11.x86_64 - Red Hat Enterprise Virtualization Hypervisor release 7.2 (20160711.0.el7ev) - pvscan -vvvv - pvscan -vvvv -aay --cache (empty) - pvscan -vvvv -aay --cache /dev/mapper/360014380125989a100004000004c0000 (empty) - journalctl --all --this-boot --no-pager -o verbose (pvscan on 253:26?) I also changed locking_type to 1 as per comment #97 and #99, just to confirm it's not related. Same results. I'm still wondering what is difference between this and comment #89, I even used the same system and the outputs of the commands are clearly different from what I pasted at that time. Any idea what is wrong? What else do you need to troubleshoot this? So from logs in comment 111 - there is disabled lvmetad - in which case the auto-activation is handled via lvm2-activation-early.service which calls: 'vgchange -aay --ignoreskippedcluster' global_filter|filter does CONTROL only visibility of block device for lvm2 command. Auto activation is DEFAULT behavior for any LV you have. If users want to disable it completely - as been already said here in comment 10 - in lvm.conf use: activation { auto_activation_volume_list = [] } I'm really lost here what is the problem being solved: 1. Are we solving problem that LV is activating volumes from devices it should not see ? (in this case we need trace from 'vgchange -aay -vvvv') 2. Or there is just misunderstanding how to control auto activation ? (if it's unwanted it should be disabled by config option). (In reply to Zdenek Kabelac from comment #112) > 1. Are we solving problem that LV is activating volumes from devices it > should not see ? (in this case we need trace from 'vgchange -aay -vvvv') Yes, but on boot only. Let me try to summarize and give it current status: With current configuration (just the filter), after removing (dmsetup) deactivating (lvchange) all the unwanted volumes, vgchange -aay -vvvv don't seem to activate anything we don't want it to. I suppose this is because our filters are working fine here. This is good. Our problem seems to be restricted to boot time. So are you saying lvm2-activation-early.service doesn't care about our filter? (dmsetup ls --tree after boot): myvg-lvol0 (253:79) `-603f851b--7388--49f1--a8cc--095557ae0a20-baa40da9.... (253:26) `-360014380125989a100004000004c0000 (253:8) |- (8:80) `- (8:32) After boot all LVs are active and vdsm is supposed to deactivate them. But vdsm fails to deactivate a few ones[1]. In the case above, myvg-lvol0 prevents 603f851b from being deactivated by vdsm on startup. We don't want that myvg-lvol0 to be found or activated. [1] ones that have a PV/VG directly on top of them (no partitions), which were scanner and had internal LVs activated. As you found out, we actually tried auto_activation_volume_list = [] in a previous version of the solution, but AFAIK it's not present in the latest fix, that's why I did not include it. I have now added auto_activation_volume_list = [] to both initrd and lvm.conf and it seems to be working! No unwanted LVs are active and it did not find that myvg-lvol0. 
Once Nir confirms this might be the fix, I will forward it to one of our customers to give it another round of testing. @Nir, don't we also need auto_activation_volume_list = [] in https://gerrit.ovirt.org/66893? See Zdenek comment #112 above. I do think there are couple 'terms' mixed together and the bad results are taken out of it. So from 'comment 113' - let me explain: lvm2 filters do serve only the purpose to 'hide' devices from being SEEN by lvm2 command - so the volumes which cannot be seen cannot be used for i.e. activation or listing or.... Auto-activation is a feature which is turned ON by default for LVs. If user does not like/does not want it - lvm2 must be configured to not do it (it cannot guess it). What I can see here is likely the customer does not want auto-activation - so he must configure lvm.conf to disable it. Now the interesting bit is - out of where 'vgchange -aay' got into ramdisk. AFAIK this command is executed AFTER ramdisk switches to /rootfs. Standard RHEL dracut boot process does ONLY activate rootLV and shall not activate anything else - unless instrumented on grub config line. So since here we are being told the initramdisk lvm.conf must have been also configured to NOT make any auto activation - it's quite weird. Since such thing should NOT really be needed - nothing should be using auto activation from ramdisk - unless there has been some nonstandard 'tweaking' of ramdisk boot sequence! In such case - please attach 'ramdisk' in-use and the history how it's been created... (In reply to Zdenek Kabelac from comment #114) > I do think there are couple 'terms' mixed together and the bad results are > taken out of it. > > So from 'comment 113' - let me explain: > > lvm2 filters do serve only the purpose to 'hide' devices from being SEEN by > lvm2 command - so the volumes which cannot be seen cannot be used for i.e. > activation or listing or.... > > Auto-activation is a feature which is turned ON by default for LVs. > If user does not like/does not want it - lvm2 must be configured to not do > it (it cannot guess it). Zdenek, are you saying that having a filter hiding a device, will *not* prevent activation of the device and other devices inside this device? This does not make sense. Can you point me to the documentation specifying this behavior? > What I can see here is likely the customer does not want auto-activation - > so he must configure lvm.conf to disable it. But we cannot auto_activation_list, since we vdsm cannot guess what are the host devices, it knows only what are ovirt devices. auto_activation_list does not support excluding, only including. If it will support exclusion, e.g. (r"@RHT_STORAGE_DOMAIN") we can use it to prevent auto activation. > Now the interesting bit is - out of where 'vgchange -aay' got into ramdisk. > AFAIK this command is executed AFTER ramdisk switches to /rootfs. > Standard RHEL dracut boot process does ONLY activate rootLV and shall not > activate anything else - unless instrumented on grub config line. > > So since here we are being told the initramdisk lvm.conf must have been > also configured to NOT make any auto activation - it's quite weird. Since > such thing should NOT really be needed - nothing should be using auto > activation from ramdisk - unless there has been some nonstandard 'tweaking' > of ramdisk boot sequence! > > In such case - please attach 'ramdisk' in-use and the history how it's been > created... 
(In reply to Germano Veit Michel from comment #113) > @Nir, don't we also need auto_activation_volume_list = [] in > https://gerrit.ovirt.org/66893? See Zdenek comment #112 above. We cannot, this will prevent activation of lvs needed by the host. When I tested this, vg0-lv_home was not activated after boot. This can be configured with the host lvs (or using tags) on user machine manually as a temporary workaround until we find a better solution. For example: auto_activation_volume_list = ["vg0"] This should probably update the support article about configuring hypervisors with FC storage. (In reply to Nir Soffer from comment #115) > (In reply to Zdenek Kabelac from comment #114) > > I do think there are couple 'terms' mixed together and the bad results are > > taken out of it. > > > > So from 'comment 113' - let me explain: > > > > lvm2 filters do serve only the purpose to 'hide' devices from being SEEN by > > lvm2 command - so the volumes which cannot be seen cannot be used for i.e. > > activation or listing or.... > > > > Auto-activation is a feature which is turned ON by default for LVs. > > If user does not like/does not want it - lvm2 must be configured to not do > > it (it cannot guess it). > > Zdenek, are you saying that having a filter hiding a device, will *not* > prevent > activation of the device and other devices inside this device? This does not > make sense. You are misreading my comment. Hiding device will OF COURSE avoid auto activation of ANY LV on such device. Command will simply not see such VG so cannot activate LV on it. But the auto activated devices presented in multiple comments in this BZ were clearly activated from devices (PVs) which have passed through filters. So either filters are wrong - or user has 'devices' which are not excluded from filtering and thus are candidates for auto activation. > > > What I can see here is likely the customer does not want auto-activation - > > so he must configure lvm.conf to disable it. > > But we cannot auto_activation_list, since we vdsm cannot guess what are > the host devices, it knows only what are ovirt devices. Your 'vdsm' commands simply have to be running with any auto activation disabled (assuming this logic is unwanted for you vdsm master server) + filtering out all users devices. While users local guest commands must filter our all vdsm devices. Here could be an idea - lvm2 fully supports 'separate' configuration with envvar - so you could your own /etc/lvm directory set via LVM_SYSTEM_DIR (see man lvm.8) So all commands running via 'vdsm' remotely may 'enjoy' your special prepared configuration which must 'white-list' only your storage devices and reject everything else. While the guest/vdsm-slave (well not sure what terminology is used here) is continuing using his OWN /etc/lvm config. The ONLY mandatory condition he needs to ensure is - he will reject/exclude/not touch/not see/.... ANY vdsm device. So admin must 'update' his local lvm.conf (unavoidable). So when such user runs lvm2 command on his box, he will NEVER see vdsm LVs. When 'vdsm' runs its commands on user's box (With its own $LVM_SYSTEM_DIR) - it will NEVER see/touch/interact with any user local disks. Is this making it more clear? (In reply to Zdenek Kabelac from comment #117) > (In reply to Nir Soffer from comment #115) > > (In reply to Zdenek Kabelac from comment #114) > > > I do think there are couple 'terms' mixed together and the bad results are > > > taken out of it. 
> > > > > > So from 'comment 113' - let me explain: > > > > > > lvm2 filters do serve only the purpose to 'hide' devices from being SEEN by > > > lvm2 command - so the volumes which cannot be seen cannot be used for i.e. > > > activation or listing or.... > > > > > > Auto-activation is a feature which is turned ON by default for LVs. > > > If user does not like/does not want it - lvm2 must be configured to not do > > > it (it cannot guess it). > > > > Zdenek, are you saying that having a filter hiding a device, will *not* > > prevent > > activation of the device and other devices inside this device? This does not > > make sense. > > > You are misreading my comment. > > Hiding device will OF COURSE avoid auto activation of ANY LV on such device. > Command will simply not see such VG so cannot activate LV on it. > > But the auto activated devices presented in multiple comments in this BZ > were clearly activated from devices (PVs) which have passed through > filters. > > So either filters are wrong - or user has 'devices' which are not excluded > from filtering and thus are candidates for auto activation. OK, thanks for clarifying this, we will debug the filter. > > > > > What I can see here is likely the customer does not want auto-activation - > > > so he must configure lvm.conf to disable it. > > > > But we cannot auto_activation_list, since we vdsm cannot guess what are > > the host devices, it knows only what are ovirt devices. > > Your 'vdsm' commands simply have to be running with any auto activation > disabled (assuming this logic is unwanted for you vdsm master server) + > filtering out all users devices. > > While users local guest commands must filter our all vdsm devices. We don't have any issue in vdsm, we are using --config devices/filter including only the shared storage relevant for the operation, and rejecting anything else. (In reply to Zdenek Kabelac from comment #118) > Here could be an idea - > > lvm2 fully supports 'separate' configuration with envvar - > so you could your own /etc/lvm directory set via LVM_SYSTEM_DIR > (see man lvm.8) > > So all commands running via 'vdsm' remotely may 'enjoy' your special > prepared configuration which must 'white-list' only your storage devices > and reject everything else. > > While the guest/vdsm-slave (well not sure what terminology is used here) is > continuing using his OWN /etc/lvm config. The ONLY mandatory condition he > needs to ensure is - he will reject/exclude/not touch/not see/.... ANY > vdsm device. > So admin must 'update' his local lvm.conf (unavoidable). This is the issue, how to configure the system automatically. To add a host to RHV system, you click "add host", fill in the host address and root password, and click "ok". The system does the rest for you. We don't want to require manual configuration of each host in the cluster. We don't ask lvm to guess, only provide the necessary configuration options that will allow vdsm to automatically configure the host in safe way. > So when such user runs lvm2 command on his box, he will NEVER see vdsm LVs. > > When 'vdsm' runs its commands on user's box (With its own $LVM_SYSTEM_DIR) - > it will NEVER see/touch/interact with any user local disks. > > Is this making it more clear? We are using --config for this, maybe using our own lvm directory is a better idea, will keep the vdsm commands more clear. 
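A rough sketch of the LVM_SYSTEM_DIR idea Zdenek describes, for comparison with the --config call shown next; the directory path and filter value below are assumptions for illustration, not something vdsm currently does (the multipath WWID is the one from the vdsm command shown below):

    # a dedicated config directory whose lvm.conf whitelists only the
    # shared-storage multipath device and rejects everything else
    mkdir -p /etc/lvm-vdsm
    cat > /etc/lvm-vdsm/lvm.conf <<'EOF'
    devices {
        filter = [ "a|^/dev/mapper/360014057ce1a1afffd744dc8c34643d7$|", "r|.*|" ]
    }
    global {
        use_lvmetad = 0
    }
    EOF

    # commands run with this environment see only that device
    LVM_SYSTEM_DIR=/etc/lvm-vdsm lvs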
Here is example vdsm command (taken from vdsm.log): lvchange --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ '\''a|/dev/mapper/360014057ce1a1afffd744dc8c34643d7|'\'', '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --refresh 40825394-bb03-4b66-8a82-e6ddbb789ec3/leases Germano, please check my reply in comment 116. Germano, please also provide the info requested by Zdenek in comment 114. Updating bug name, this is not about auto activation of ovirt lvs, but about auto activation of guest lvs inside ovirt raw lvs. We want to solve also the activation of ovirt lvs, the current solution we have, deactivating them in vdsm during startup is a workaround, but this issue is not in the scope of this bug. So let's make clear further more: Auto activation is the feature which is running from udev rules using 'pvscan -aay....' command when 'new' device appears. You will NOT get auto activation from lvm2 command by some 'magical' autonomous daemon. So if your guest system has devices auto-activated from your 'vdsm' active volumes - then such guest system is not filtering out properly vdsm devices. i.e. you activate vdsm LV - and such device is scanned by udev and pvscan can see it via /dev/dm-XXX device. This can be rather very easily 'verified' - when 'guest' runs a simple command: 'pvs -a' - it MUST NOT report any of your 'vdsm' PV volumes and it also MUST NOT report any of your activate volume/LV from vdsm system. If you can see some /dev/XXX/YYY listed which belongs to 'vdsm' the filter is set wrong. And here I'm always 'advocating' for 'white-list' filter logic - which is IMHO the most human understandable. Means - you list explicitly devices you want to see on guest's lvm.conf (where using filters like a|/dev/mapper/*| is really BAD idea) and then you reject everything else (r|.*| being the last rule after 'a' rules first) NB: creating correctly working reject/(aka black-list) filter is way more complicated as you would have to preserve some 'strict' naming convention rules within vdsm. Hi Zdenek and Nir! Right, I think I have digested most of your feedback. After reading your comments I decided to go for 9 different config tests on the system I was previously using to run these tests (RHEV-H). I ran them (several variations of initrd and root/lvm.conf) and got some very confusing results, which wouldn't make sense. Then I decided to repeat them and to my surprise many configs gave different results when compared to the first round! I have no idea what causes this, maybe due to the way etc/lvm/lvm.conf is "written" in RHEV-H? It's actually mounted there (this is a Read-Only RHEL based appliance): # mount | grep lvm.conf /dev/mapper/HostVG-Config on /etc/lvm/lvm.conf type ext4 (rw,noatime,seclabel,data=ordered) This was on: - RHEV-H 7.2 (20160711.0.el7ev) - kernel-3.10.0-327.22.2.el7.x86_64 - lvm2-2.02.130-5.el7_2.5.x86_64 - systemd-219-19.el7_2.11.x86_64 My conclusion on RHEV-H. - There is some weird behavior or maybe a race condition on RHEV-H. Maybe sometimes lvm.conf is mounted before vgchange -aay, sometimes after? Sometimes the filter works and sometimes not? Most of the times it doesn't. I'm sorry but I'm deeply confused on what is happening on RHEV-H. I can't get consistent results from it. Most of the times it seems the filter doesn't work on boot. Then I gave up on that RHEV-H and I moved to RHEL. 
I got consistent results and also in line with what Zdenek is saying. And everything is working just fine just with Nir's global_filter. So there is indeed something different on RHEV-H. - RHEL 7.2 - kernel-3.10.0-327.el7.x86_64 - lvm2-2.02.130-5.el7_2.5.x86_64 - systemd-219-19.el7.x86_64 Test 10------------------------------------------------------------------------ lvm.conf: auto_activation_volume_list = [ "rhel_rhevh-1" ] global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9]..... initrd: auto_activation_volume_list = [ "rhel_rhevh-1" ] global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9]..... Result = GOOD Test 11------------------------------------------------------------------------ lvm.conf: global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9]..... initrd: global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9]..... Result = GOOD Test 12------------------------------------------------------------------------ lvm.conf: global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9]..... initrd: default Result = GOOD Test 13------------------------------------------------------------------------ lvm.conf: auto_activation_volume_list = [ "rhel_rhevh-1" ] global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9]..... initrd: default Result = GOOD. Test 14------------------------------------------------------------------------ lvm.conf: auto_activation_volume_list = [ "rhel_rhevh-1" ] initrd: default Result = GOOD. Conclusion and notes on RHEL: - 14 was just for fun :), doesn't sound like a good idea as the ovirt LVs are not filtered. - Consistent results after rebooting many times (RHEL) - The filter seems to be enough to prevent activation of Guest LVs on RAW ovirt LVs, which is the aim of this bug. So I think it's indeed solved - on RHEL. And Zdenek is correct, auto_activation_volume_list on initrd makes no difference. It looks like we are all good on RHEL. But on RHEV-H things are not that simple. Hopefully RHVH 4.0 behaves the same way as RHEL. TODO: - Find a solution for legacy RHEV-H * maybe inject Nir's filter on root/lvm.conf in the base image? - Check behavior on RHVH 4 @Fabian 1. Is there any easy way to modify RHEV-H (legacy) appliance to inject a custom lvm.conf? I want to run a test without the persist mechanism. 2. I also found we have two lvm.conf in the squashfs. /etc/lvm.conf and /etc/lvm/lvm.conf, not sure if this is correct. 3. Do you have any idea at what time (during boot) the HostVG-Config files are mounted? I couldn't find it from the journal logs. Zdenek, base on comment 124, do we really need to include the lvmlocal.conf in initrd? We are using these settings: global { use_lvmetad = 0 } devices { global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]-.+|" ] } Hi Nir It doesn't really matter where you place the filter - it gets all merged. The important thing here is - the setting is REPLACING previous setting in this configuration merging (overriding). So when user has set some 'global_filter' in his lvm.conf - and then you 'just override' such filter in lvmlocal.conf - then you ensure your matching device gets filter out BUT all previously preset filters by user in his lvm.conf are simply gone/lost/vanished. There is no such thing as 'combing' setting' or applying filter in chain. It's always - the latest setting wins. 
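As a quick way to see which value actually won after all the configuration files are merged (a convenience check, not a vdsm step - lvmconfig is part of lvm2):

# lvmconfig devices/global_filter
# lvmconfig --type full activation/auto_activation_volume_list

The first command prints the filter currently in effect; --type full also shows the compiled-in default when a setting is not set explicitly in any file.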
The purpose of 'lvmlocal.conf' is different and mainly targeted for cluster where you wand to set local difference but the 1 single admin has still full control over all nodes and there is not some dedicated 'local life' on cluster nodes. So I'd here rather avoid using lvmlocal.conf - and rather always properly configure lvm.conf. The next important issue which might deserve explicit sentence - there is NOT any 'transitivity' applied on filters. So if you 'filter-out' your multipath device for local usage - but 'vdsm' then activates some LV from this multipath device (with --config on cmdline) then such device becomes visible local - and since your filter is excluding PURELY a /dev/mapper/some-special-number-device then all other DM devices will be locally process (so auto activate normally applies to them) In general you are stacking VG on top of another VG - there is NO support in lvm2 to handle this - we are not stopping users to use such configuration - but we are not 'advertising' such stacking either - as it has many complex issues which are ATM out-of-scope for single lvm2 commands. So if skilled admin can handle rough edges - fine - but it needs very careful handle otherwise there are unavoidable deadlocks. Technically I've forget to explicitly answer the question whether there is need of lvmlocal.conf inside initramdisk. If the lvm command executed inside ramdisk shall not see such device - and such device is already present in ramdisk (assuming it's multipath started in ramdisk) - then yes - needs to be there. But as said in comment 128 - it's likely better to 'combine' filter settings in a single file lvm.conf which is then copied into ramdisk. As a useful hint lvm2 provides command 'lvmconfig' (man lvmconfig) So at anytime you could 'query' what will be some lvm2 setting set to. I.e. in dracut you can use rd.break=pre-mount and check: 'lvm lvmconfig devices/global_filter' (In reply to Zdenek Kabelac from comment #128) > So I'd here rather avoid using lvmlocal.conf - and rather always properly > configure lvm.conf. But configuring lvm.conf means that changes to lvm.conf during upgrades will never be applied, and the system administrator will have merge the conf changes manually using lvm.conf.rpmnew files. Also this means we have to modify existing lvm.conf. There is no way to do that in a clean way with a proper comments. I tried to use augtool to edit lvm.conf, but it seems impossible to have your edit in the write place in the file. So this leave the option to replace the entire file with a new file as we do for /etc/multipath.conf. This means that vdsm will have follow every change in lvm.conf, and include a new file each time lvm change this file. I don't want this dependency in vdsm. Using lvmlocal.conf, we can override in a clean way only the changes that matter to us, and we can document properly each override. See https://github.com/oVirt/vdsm/blob/master/static/usr/share/vdsm/lvmlocal.conf So I don't plan to touch lvm.conf, and I expect lvm to use lvmlocal.conf as documented in lvm.conf(5). > The next important issue which might deserve explicit sentence - there is > NOT any 'transitivity' applied on filters. 
> > So if you 'filter-out' your multipath device for local usage - but 'vdsm' > then activates some LV from this multipath device (with --config on cmdline) > then such device becomes visible local - and since your filter is > excluding PURELY a /dev/mapper/some-special-number-device then all other > DM devices will be locally process (so auto activate normally applies to > them) We don't filter out multipath devices, we filter out ovirt-lvs, see See https://github.com/oVirt/vdsm/blob/master/static/usr/share/vdsm/lvmlocal.conf > In general you are stacking VG on top of another VG - there is NO support in > lvm2 to handle this - we are not stopping users to use such configuration - > but we are not 'advertising' such stacking either - as it has many complex > issues which are ATM out-of-scope for single lvm2 commands. > So if skilled admin can handle rough edges - fine - but it needs very > careful handle otherwise there are unavoidable deadlocks. We are not stacking vg on top of another vg - it is the guest vm. This may happen on raw volumes when the guest vm is using the raw volume as a pv. The flow is: 1. run a vm with a raw block volume (/dev/vgname/lvname) 2. in the guest admin uses the volume e.g. /dev/vdb as a pv, adding it a a vg and creating lvs 3. on the host, lvm scan the vm volume, find the new pv/vg/lvs and activate them. There needs to be 'someone' to combine filters. Otherwise when you just basically 'replace' user's filter setting with yours - then previously working user commands starts to fail as filter from lvm.conf is not applied and is overruled by lvmlocal.conf setting. So effectively what you do with lvmlocal.conf could be equally well provided by 'sed' scripting where you replace one setting in lvm.conf with your new setting. As said - there is no cascade of filters applied from different config files - it's always one single value for setting which is applied. --- Now to explain stacking issue - when you 'pass' /dev/vgname/lvname to guest VM running on this vdsm guest system - the content of this LV is still FULLY visible on guest. Nothing stops you can to do i.e. 'dd' on such device while it's in use by guest VM (and cause some major discrepancy and data instability issue for VM for obvious reason...) With the very same logic when 'guestVM' will create a VG & LV inside on it's /dev/vdb device - you could easily see such 'stacked' VG on your vdsm guest machine. That's why it is important on your vdsm guest to precisely filter out such device. Lvm2 itself is not guessing disk hierarchy and not estimating which devices it should see or not see. So back to your filter in comment 127 - it will filter out likely your uniquely named multipath device - but it will accept everything else (so very other i.e. /dev/dm-XXX except your multipath will PASS) So when you run on vdsm guest lvm2 command - it will not read data out of your multipath device - but there is nothing stopping lvm2 from reading data out of /dev/vgname/lvname running on top if this device. Probably repeating myself here again - but lvm2 is not doing any 'device stack' in-depth analysis - so it has no idea that /dev/vgname/lvname has backend on multipath device which you filtered out by the explicit reject filter. 
So with such filter settings - if someone inside the guestVM makes a VG + LV - such an LV could then be seen and auto-activated on the guest - and I'm almost 99.999% sure you do not want this (unless the guestVM is killed and you want to inspect its content). (Hopefully I still remember properly your description of the 'vdsm' master system and satellite guest systems.)

Created attachment 1242551 [details]
Script for generating lvm filter
Germano, can you test the attached script (attachment 1242551 [details]) for generating the lvm filter automatically?

Example usage - on a hypervisor:

# ./gen-lvm-filter
filter = [ "a|^/dev/vda2$|", "r|.*|" ]

This filter replaces the blacklist filter suggested earlier. Using a whitelist we prevent:
- auto activation of lvs on shared storage
- auto activation of guest lvs (since lvm cannot access ovirt lvs)
- accidental access of shared storage on a hypervisor

Using this should also improve the boot time of hypervisors with a lot of devices, storage domains and volumes. See also https://gerrit.ovirt.org/70899 if you have ideas how to improve this. The generated filter may not be good enough; we need to test it in the field before we use it as part of the vdsm configuration.

Hi Nir,

Sounds like the best idea so far. Not sure if you want this to work on a 4.0 node; I suppose you do, as the user can now have some sort of customization over the lvm layout. I added some comments on gerrit about it.

I'm having trouble running this on 3.6 hosts. The import from vdsm is failing on 'commands'. I copied it over and then it started failing on compat/CPopen. Our reproducers are currently on 3.6 and the customers who are willing to help are also on 3.6. Can we have a version (or make sure the same code) that runs on 3.6 too? I ran it on a 4.0 host attached to iSCSI and it seems to work fine, but it's a simpler case than those we have on 3.6.

@Fabian please see comment #125.
@Nir please see comment #135.

(In reply to Germano Veit Michel from comment #125)
> @Fabian
>
> 1. Is there any easy way to modify RHEV-H (legacy) appliance to inject a
> custom lvm.conf? I want to run a test without the persist mechanism.

Not easy, but possible. The initrd is stored in a writable partition so you can modify or regenerate it.

> 2. I also found we have two lvm.conf in the squashfs. /etc/lvm.conf and
> /etc/lvm/lvm.conf, not sure if this is correct.

That's a good question. I am not sure why there are two; it needs to be investigated.

> 3. Do you have any idea at what time (during boot) the HostVG-Config files
> are mounted? I couldn't find it from the journal logs.

Pretty early - what are you looking for? For all questions I'd suggest reaching out to dougsland@ in an email and CC'ing sbonazzo@ and myself.

(In reply to Germano Veit Michel from comment #135)

Germano, thanks for testing this! The latest version should work with any rhev version on rhel 7.

Created attachment 1242784 [details]
Script for generating lvm filter
This version removes the unneeded dependencies on vdsm code that limited the usage to 4.1.
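For anyone testing this by hand, one possible way to apply the generated filter (a suggestion based on the initrd discussion above, not something the script does itself): paste the emitted filter line into the devices { } section of /etc/lvm/lvm.conf, regenerate the initramfs so the filter is also used during early boot, and verify the effective value:

# dracut -f
# lvm lvmconfig devices/filter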
Thanks Nir! I added two more details on gerrit. I fixed those with a "dumb patch" just to be able to run on our reproducer, and it gives the correct filter for that host, which is great:

# ./gen-lvm-filter
filter = [ "a|^/dev/mapper/2a802d0e800d00000p4$|", "r|.*|" ]

# pvs
  PV                                            VG                                   Fmt  Attr PSize  PFree
  /dev/mapper/2a802d0e800d00000p4               HostVG                               lvm2 a--  60.00g 404.00m
  /dev/mapper/360014380125989a10000400000480000 76dfe909-20a6-4627-b6c4-7e16656e89a4 lvm2 a--  99.62g   5.38g
  /dev/mapper/360014380125989a100004000004c0000 603f851b-7388-49f1-a8cc-095557ae0a20 lvm2 a--  99.62g  79.50g

I think that 2a802d0e800d00000p2 is quite a specific problem (EFI) and the live-rw one only affects vintage RHEV-H. But nothing prevents someone from hitting similar issues on RHEL with custom layouts, so I guess we need to fix them anyway.

Once we have this fixed I'll forward it to a customer who is willing to help.

Created attachment 1243580 [details]
Script for generating lvm filter
Created attachment 1243605 [details]
Script for generating lvm filter
(In reply to Germano Veit Michel from comment #140)

Please test the next version. I simplified the way devices are found - we no longer look at mounted filesystems; instead we find all vgs which are not ovirt vgs (those that do not have the RHAT_storage_domain tag), and we include the devices backing these vgs in the filter, except ovirt lvs (used as a pv inside a guest). I think this should be pretty safe, limiting the chance of breaking esoteric user setups. After this filter is applied, if the user wants to add a new device, the device must be added manually to the filter.

Nir, thanks again. I agree the new logic is safer, but it looks like we still have problems.

1) The regex is not working anymore. Please check my comment on gerrit for more details. I am also attaching the output of the vgs command on an affected host to help you develop this. This host is our reproducer and it's got 2 ovirt LVs as PVs.

2) Not sure how we will handle Direct LUNs, but currently it seems to be a problem (not sure if you want to handle this later though). In the attached vgs output you will see /dev/mapper/360014380125989a10000400000500000, that's the "Direct LUN". In fact it's not really a direct LUN on RHEV; this is a blade system which sees the other blade's LUN. But from our perspective it could well be a direct LUN for RHV that is presented via FC. We need to find out it's not in use and filter it out too, I guess.

Hopefully this helps.

Created attachment 1243821 [details]
vgs -o vg_name,pv_name --noheading --select 'vg_tags != {RHAT_storage_domain}'
output of command for debugging/developing
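For reference, a rough shell sketch of the enumeration logic described above - building a whitelist from the PVs of every VG that does not carry the RHAT_storage_domain tag. This is only an illustration of the idea; it skips the special cases the real script has to handle (ovirt LVs used as guest PVs, EFI/partition layouts, direct LUNs):

vgs --noheadings -o pv_name --select 'vg_tags != {RHAT_storage_domain}' | sort -u |
    awk 'BEGIN { printf "filter = [ " } { printf "\"a|^%s$|\", ", $1 } END { print "\"r|.*|\" ]" }'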
> After boot all LVs are active and vdsm is supposed to deactivate them. But
> vdsm fails to deactivate a few ones[1]. In the case above, myvg-lvol0
> prevents 603f851b from being deactivated by vdsm on startup. We don't want
> that myvg-lvol0 to be found or activated.
>
> [1] ones that have a PV/VG directly on top of them (no partitions), which
> were scanned and had internal LVs activated.
As it is, the analysis of this problem is incomplete. Why should vdsm fail to deactivate VGs on raw disks, as opposed to those on partitioned disks? To wit, are you certain that putting VGs on a partition is an actual work-around, as opposed to a happy accident?
(In reply to Otheus from comment #148) > As it is, the analysis of this problem is incomplete. Why should vdsm fail > to deactivate VGs on raw disks, as opposed to those on parititoned disks? > To wit, are you certain that putting VGs on a partition is an actual > work-around, as opposed to a happy accident? First of all vdsm doesn't deactivate VGs, it deactivates some ovirt LVs. The partitioned disks prevent auto activation of Guests LVs internal to the ovirt LVs which are raw disks because the internal VGs are not found. This allows vdsm to deactivate the ovirt LVs as intended. (In reply to Germano Veit Michel from comment #149) > ... The > partitioned disks prevent auto activation of Guests LVs internal to the > ovirt LVs which are raw disks because the internal VGs are not found. This smells like danger. I admit there's a lot I don't understand here, but this I understand the least: how is it "internal VGs" (you mean, internal to a Guest VM, right?) are NOT found on partitioned disks that are (at LVM2-scan time) visible to the (host) system? > This allows vdsm to deactivate the ovirt LVs as intended. Sorry, but B does not logically follow from A. If somehow a partitioned disk prevents auto-activation of an LV, then vdsm cannot deactivate it -- it was never activated. However, if vdsm is deactivating LVs of VGs that came from partitioned disks, as opposed to VGs that came from non-partitioned disks, I ask: why/how the hell does it do one but not the other? Regardless of the previous question, I think the first is the most pertinent. (In reply to Otheus from comment #150) > This smells like danger. I admit there's a lot I don't understand here, but > this I understand the least: how is it "internal VGs" (you mean, internal to > a Guest VM, right?) are NOT found on partitioned disks that are (at > LVM2-scan time) visible to the (host) system? From what I understand it simply finds a partition table signature and skips the device. > Sorry, but B does not logically follow from A. If somehow a partitioned disk > prevents auto-activation of an LV, then vdsm cannot deactivate it -- it was > never activated. However, if vdsm is deactivating LVs of VGs that came from > partitioned disks, as opposed to VGs that came from non-partitioned disks, I > ask: why/how the hell does it do one but not the other? I think you did not understand the bug. Maybe because most of the comments are private. The partitioned Guest disk prevents the host from discovering AND activating the internal guest LVs. Regardless of the Guest LVM, the ovirt LVs are always activated on block storage on connection/boot, and vdsm deactivates the ovirt LVs when it starts. Now, if a guest LV was scanned and is active vdsm will always fail to deactivate the ovirt LV because it's open. This is our problem as it leads to numerous other issues. I suggest you to read our customer portal article [1], especially the diagnostic steps which shows what holds what from being deactivated. [1] https://access.redhat.com/solutions/2662261 (In reply to Otheus from comment #148) > > After boot all LVs are active and vdsm is supposed to deactivate them. But > > vdsm fails to deactivate a few ones[1]. In the case above, myvg-lvol0 > > prevents 603f851b from being deactivated by vdsm on startup. We don't want > > that myvg-lvol0 to be found or activated. > > > > [1] ones that have a PV/VG directly on top of them (no partitions), which > > were scanner and had internal LVs activated. > > As it is, the analysis of this problem is incomplete. 
Why should vdsm fail > to deactivate VGs on raw disks, as opposed to those on parititoned disks? To > wit, are you certain that putting VGs on a partition is an actual > work-around, as opposed to a happy accident? Hi Otheus, When a guest is using ovirt raw volume, the guest see the same device seen by the host. If the guest create a pv on the lv directly, both the guest and the host are seeing a device with a pv header. It looks like this (filtered to show the interesting items): # lvs -o vg_name,lv_name,devices,attr VG LV Devices Attr 7df95b16-1bd3-4c23-bbbe-b21d403bdcd8 e288d725-f620-457e-b9e9-ea8a4edf89c4 /dev/mapper/360014052c81462a280847e8a6e3af8cd(239) -wi-ao---- guest-vg-vol guest-lv-1 /dev/7df95b16-1bd3-4c23-bbbe-b21d403bdcd8/e288d725-f620-457e-b9e9-ea8a4edf89c4(0) -wi------- guest-vg-vol guest-lv-2 /dev/7df95b16-1bd3-4c23-bbbe-b21d403bdcd8/e288d725-f620-457e-b9e9-ea8a4edf89c4(512) -wi------- (Note that the guest lvs are not active, this system run with lvmetad disabled) When ovirt lv is activated (e.g, when starting a vm), udev rules trigger a pvscan of the new device (/dev/vgname/lvname). If the device is used as a pv (e.g. guest created a pv from the device), lvm will activate the lvs inside the ovirt raw lvs. The active guest lvs are "holding" the ovirt raw lvs: # realpath /dev/7df95b16-1bd3-4c23-bbbe-b21d403bdcd8/e288d725-f620-457e-b9e9-ea8a4edf89c4 /dev/dm-38 # ls /sys/block/dm-38/holders/ dm-39 dm-40 # realpath /dev/guest-vg-vol/guest-lv-* /dev/dm-39 /dev/dm-40 When you stop the vm, deactivation of the raw ovirt volume will fail (from vdsm log): CannotDeactivateLogicalVolume: Cannot deactivate Logical Volume: ('General Storage Exception: ("5 [] [\' Logical volume 7df95b16-1bd3-4c23-bbbe-b21d403bdcd8/e288d725-f620-457e-b9e9-ea8a4edf89c4 is used by another device.\']\\n7df95b16-1bd3-4c23-bbbe-b21d403bdcd8/[\'e288d725-f620-457e-b9e9-ea8a4edf89c4\']",)',) To deactivate it, you must deactivate first the lvs "holding it": # vgchange -an guest-vg-vol 0 logical volume(s) in volume group "guest-vg-vol" now active For iSCSI storage, the fix to this issue is to disable lvmetad service, which is doing event based auto activation. Guest lvs will still be seen on the host when running lvm commands, but lvm will not try to activate the lvs. For FC storage, activation is done during boot. The fix is to have a lvm filter blacklisting ovirt lvs, or better, whitelisting the devices needed by the host. A whitelist filter will also solve the issue with guest lvs on luns not used as ovirt lvs. These luns may be used by vms directly or unused luns that were just added to the system. *** Bug 1421424 has been marked as a duplicate of this bug. *** Ryan - are you sure it's a dup? Don't we need to do any changes in RHVH (initrd?) to ensure there's no auto-activation? Is the work on VDSM enough? I'm positive this is a dup -- confirmed with Sahina as well. The bug against RHVH is adding the host to the engine, which has spurious output from lvmetad. Installation itself is ok, so we don't need any changes to the initrd. In this bug we track only the issue of dynamic activation of guest logical volumes on RHV raw volumes when using iSCSI storage. This issue is caused by lvmetad, which is not compatible with RHV shared storage and causes also other trouble. This service is disabled in 4.1.1. The issue of activation of guest logical volumes during boot on when using FC storage will be track in another bug (we have several lvm bugs related to this). 
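As an illustration of what "disabling lvmetad" amounts to on a RHEL 7 host (the exact mechanism vdsm uses to configure this may differ):

# systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
# systemctl disable lvm2-lvmetad.service lvm2-lvmetad.socket

together with the lvm configuration quoted earlier in this bug:

    global {
        use_lvmetad = 0
    }

With lvmetad off there is no event-driven pvscan auto-activation when an iSCSI LUN or an ovirt LV appears, so guest LVs remain visible to manual lvm commands but are not activated by udev events.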
Nir, can you please add some doctext explain what was done here? Thanks! Nir, I saw in comment 48 a way to partially reproduce this issue, should I use this scenario for verification procedure, or is there something else? and what is the expected result after the fix? another thing - should it be tested on both RHEL 7.3 and RHV-H? or one of them is enough? (In reply to Natalie Gavrielov from comment #159) > Nir, > I saw in comment 48 a way to partially reproduce this issue, should I use > this scenario for verification procedure, or is there something else? Yes, this is the best way to reproduce the issue of guest lvs visible on a host. > and what is the expected result after the fix? After the fix, guests lvs will not be visible on the host when the vm disk lv is not active. With iSCSI storage, the vm disk lv should be active only when the vm is running, or when performing storage operations on this lv (e.g. copy disk to another storage domain). With FC storage, all lvs are activated during boot, and vdsm deactivate all unused lvs during vdsm startup. After a host becomes "UP" in engine, all unused lvs should be in active. Note that special lvs used by vdsm (e.g. inbox, outbox) are always active. When starting a vm, the vm lvs will be activated. At this point, you will be able to see guest when using "lvs" command. But the guest lvs should not be active. Also, since the guest lvs are not active, when stopping the vm, the vm lvs should be deactivated. Previously if guest lv was active, the vm lv would fail to deactivate. > another thing - should it be tested on both RHEL 7.3 and RHV-H? or one of > them is enough? You should test with both so make sure we son't have any surprises. Scenario used: described in comment 48, (the first one) Environment: Red Hat Enterprise Linux Server release 7.3 (Maipo) Step 6, output: ---------------------------------------------------------------------------- [root@storage-ge9-vdsm2 images]# pvscan --cache [root@storage-ge9-vdsm2 images]# lvs LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert 2c09d589-01a9-4a86-9342-95fb2b6f57e7 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 11.00g <-- This one is attached to the vm 3ab6a9ae-63b3-4624-8b08-1cf59e8c1698 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 1.00g 568226c0-8343-48f8-a74b-dbdf7d8a4105 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 6.00g 5c291905-f617-46ce-b167-2351aeeebf1d 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 1.00g cc23e1a4-0df0-4d97-962b-2b39503e2ee9 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 10.00g d8ddae5e-bf6c-4eef-b0e9-9ac570ed927e 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 128.00m db636fb0-caaa-4764-bfe0-c202dae20328 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 128.00m ids 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-ao---- 128.00m inbox 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 128.00m leases 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 2.00g master 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 1.00g metadata 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 512.00m outbox 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 128.00m xleases 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 1.00g 568226c0-8343-48f8-a74b-dbdf7d8a4105 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi------- 6.00g 79f3acdf-66ad-4836-87c2-37c21d7871b8 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi------- 128.00m aa0a43d7-4652-4203-a3eb-86199e6c7b77 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi------- 128.00m ids 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-ao---- 128.00m inbox 
577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 128.00m leases 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 2.00g master 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 1.00g metadata 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 512.00m outbox 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 128.00m xleases 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 1.00g 568226c0-8343-48f8-a74b-dbdf7d8a4105 800983a3-51f8-4977-ab08-b8d391c9da77 -wi------- 6.00g a0b0fd34-f61f-4563-8260-7bc160f186b1 800983a3-51f8-4977-ab08-b8d391c9da77 -wi------- 128.00m aabd877e-c506-4e95-897e-6d472eeeb529 800983a3-51f8-4977-ab08-b8d391c9da77 -wi------- 128.00m ids 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-ao---- 128.00m inbox 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 128.00m leases 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 2.00g master 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 1.00g metadata 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 512.00m outbox 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 128.00m xleases 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 1.00g root VolGroup01 -wi-ao---- 17.31g guest-lv guest-vg -wi-a----- 10.00g guest-lv-2 guest-vg -wi-a----- 5.00g guest-lv-3 guest-vg-2 -wi-a----- 10.00g guest-lv-4 guest-vg-2 -wi-a----- 5.00g ------------------------------------------------------------------------------ 1. 2c09d589-01a9-4a86-9342-95fb2b6f57e7 is the volume that's attached to the vm. When the vm is not running it shows: -wi------- and when the vm is running: -wi-ao---- 2. The lvs I created: guest-lv, guest-lv-2, guest-lv-3, guest-lv-4 show: -wi-a----- (regardless of the vm state) when running lvs on the same host I created the lvs on. and show -wi------- when running lvs command from another host (regardless of the vm state). In comment 160 it says: > After the fix, guests lvs will not be visible on the host when the vm disk lv is not active. I don't really understand why should I expect any lvs to not be visible? and also: > When starting a vm, the vm lvs will be activated. At this point, you will be able to see guest when using "lvs" command. But the guest lvs should not be active. When creating a lv it's automatically activated.. so I see it as active regardless of the vm state (-wi-a-----). Anything I'm missing here? (In reply to Natalie Gavrielov from comment #161) > Scenario used: described in comment 48, (the first one) This first scenario is how to reproduce this *without* the fix. Which version of vdsm did you test? Can you show the output of: systemctl status lvm2-lvmetad.service > Step 6, output: > ---------------------------------------------------------------------------- > [root@storage-ge9-vdsm2 images]# pvscan --cache > [root@storage-ge9-vdsm2 images]# lvs > LV VG > Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert > guest-lv guest-vg > -wi-a----- 10.00g > guest-lv-2 guest-vg > -wi-a----- 5.00g > guest-lv-3 guest-vg-2 > -wi-a----- 10.00g > guest-lv-4 guest-vg-2 > -wi-a----- 5.00g Can you share the output of: lvs -o vg_name,lv_name,attr,devices guest-vg guest-vg-2 These lvs are active, this is what we expect when lvmetad is active. > > When starting a vm, the vm lvs will be activated. At this point, you will be > able to see guest when using "lvs" command. But the guest lvs should not be > active. > > When creating a lv it's automatically activated.. so I see it as active > regardless of the vm state (-wi-a-----). If you create the lv inside the guest, it should not be activated on the host. Maybe you created the lvs on the host? 
It will be best if I can check the host myself. Scenario used: 1. Create a preallocated disk on iscsi storage domain, and attach it to a vm with an OS (this disk will not be running the OS). 2. Start vm, connect to it and create a pv,vg and lv on the disk attached from step 1: root@localhost ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 12G 0 disk sr0 11:0 1 1024M 0 rom vda 253:0 0 10G 0 disk ├─vda1 253:1 0 200M 0 part /boot ├─vda2 253:2 0 2G 0 part [SWAP] └─vda3 253:3 0 7.8G 0 part / [root@localhost ~]# pvcreate /dev/sda Physical volume "/dev/sda" successfully created. [root@localhost ~]# vgcreate bug-verification-guest-vg /dev/sda Volume group "bug-verification-guest-vg" successfully created [root@localhost ~]# lvcreate -n bug-verification-guest-lv-1 -L 4g bug-verification-guest-vg Logical volume "bug-verification-guest-lv-1" created. [root@localhost ~]# lvcreate -n bug-verification-guest-lv-2 -L 6g bug-verification-guest-vg Logical volume "bug-verification-guest-lv-2" created. 3. Shutdown vm. 4. Put host to maintenance. 5. Activate host, connect to it and run: [root@storage-ge9-vdsm1 ~]# pvscan --cache [root@storage-ge9-vdsm1 ~]# lvs ------------------------------------------------------------------------------- LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert 2c09d589-01a9-4a86-9342-95fb2b6f57e7 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 11.00g 3ab6a9ae-63b3-4624-8b08-1cf59e8c1698 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 1.00g 568226c0-8343-48f8-a74b-dbdf7d8a4105 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 6.00g 5c291905-f617-46ce-b167-2351aeeebf1d 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 1.00g 660cd2c3-88ee-48ac-ae13-74ca1b485052 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 12.00g cc23e1a4-0df0-4d97-962b-2b39503e2ee9 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 10.00g d8ddae5e-bf6c-4eef-b0e9-9ac570ed927e 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 128.00m db636fb0-caaa-4764-bfe0-c202dae20328 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 128.00m ids 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-ao---- 128.00m inbox 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 128.00m leases 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 2.00g master 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 1.00g metadata 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 512.00m outbox 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 128.00m xleases 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 1.00g 568226c0-8343-48f8-a74b-dbdf7d8a4105 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi------- 6.00g 79f3acdf-66ad-4836-87c2-37c21d7871b8 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi------- 128.00m aa0a43d7-4652-4203-a3eb-86199e6c7b77 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi------- 128.00m ids 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-ao---- 128.00m inbox 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 128.00m leases 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 2.00g master 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 1.00g metadata 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 512.00m outbox 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 128.00m xleases 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 1.00g 568226c0-8343-48f8-a74b-dbdf7d8a4105 800983a3-51f8-4977-ab08-b8d391c9da77 -wi------- 6.00g a0b0fd34-f61f-4563-8260-7bc160f186b1 800983a3-51f8-4977-ab08-b8d391c9da77 -wi------- 128.00m aabd877e-c506-4e95-897e-6d472eeeb529 800983a3-51f8-4977-ab08-b8d391c9da77 -wi------- 128.00m ids 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-ao---- 128.00m inbox 
800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 128.00m leases 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 2.00g master 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 1.00g metadata 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 512.00m outbox 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 128.00m xleases 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 1.00g root VolGroup01 -wi-ao---- 17.31g ------------------------------------------------------------------------------- * The 2 lvs created, bug-verification-guest-lv-1 and bug-verification-guest-lv-2 are not displayed in the list ** The preallocated disk created in step 1, appears as inactive (-wi-------). 660cd2c3-88ee-48ac-ae13-74ca1b485052 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 12.00g 6. Start the vm, and run the following commands on the host: [root@storage-ge9-vdsm1 ~]# pvscan --cache [root@storage-ge9-vdsm1 ~]# lvs ------------------------------------------------------------------------------- LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert 2c09d589-01a9-4a86-9342-95fb2b6f57e7 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 11.00g 3ab6a9ae-63b3-4624-8b08-1cf59e8c1698 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 1.00g 568226c0-8343-48f8-a74b-dbdf7d8a4105 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 6.00g 5c291905-f617-46ce-b167-2351aeeebf1d 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 1.00g 660cd2c3-88ee-48ac-ae13-74ca1b485052 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-ao---- 12.00g cc23e1a4-0df0-4d97-962b-2b39503e2ee9 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 10.00g d8ddae5e-bf6c-4eef-b0e9-9ac570ed927e 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 128.00m db636fb0-caaa-4764-bfe0-c202dae20328 4742316d-4091-4aad-99c9-6521c94ac3fd -wi------- 128.00m ids 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-ao---- 128.00m inbox 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 128.00m leases 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 2.00g master 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 1.00g metadata 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 512.00m outbox 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 128.00m xleases 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-a----- 1.00g 568226c0-8343-48f8-a74b-dbdf7d8a4105 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi------- 6.00g 79f3acdf-66ad-4836-87c2-37c21d7871b8 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi------- 128.00m aa0a43d7-4652-4203-a3eb-86199e6c7b77 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi------- 128.00m ids 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-ao---- 128.00m inbox 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 128.00m leases 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 2.00g master 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 1.00g metadata 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 512.00m outbox 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 128.00m xleases 577b4f73-fa4d-4ead-b614-30c3ec3e9fdc -wi-a----- 1.00g 568226c0-8343-48f8-a74b-dbdf7d8a4105 800983a3-51f8-4977-ab08-b8d391c9da77 -wi------- 6.00g a0b0fd34-f61f-4563-8260-7bc160f186b1 800983a3-51f8-4977-ab08-b8d391c9da77 -wi------- 128.00m aabd877e-c506-4e95-897e-6d472eeeb529 800983a3-51f8-4977-ab08-b8d391c9da77 -wi------- 128.00m ids 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-ao---- 128.00m inbox 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 128.00m leases 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 2.00g master 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 1.00g metadata 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 512.00m outbox 
800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 128.00m xleases 800983a3-51f8-4977-ab08-b8d391c9da77 -wi-a----- 1.00g root VolGroup01 -wi-ao---- 17.31g bug-verification-guest-lv-1 bug-verification-guest-vg -wi------- 4.00g bug-verification-guest-lv-2 bug-verification-guest-vg -wi------- 6.00g ------------------------------------------------------------------------------- * The 2 lvs created, bug-verification-guest-lv-1 and bug-verification-guest-lv-2 are displayed, and are not shown as active: -wi-------. ** The preallocated disk created in step 1, should appear as active and open (with -wi-ao----). 660cd2c3-88ee-48ac-ae13-74ca1b485052 4742316d-4091-4aad-99c9-6521c94ac3fd -wi-ao---- 12.00g Environment: Red Hat Enterprise Linux Server release 7.3 (Maipo) Builds: vdsm-4.19.10.1-1.el7ev.x86_64 rhevm-4.1.1.7-0.1.el7.noarch Moving on to testing this on RHV-H (In reply to Nir Soffer from comment #162) > (In reply to Natalie Gavrielov from comment #161) Natalie, this looks fine so far. Scenario performed using RHV-H 1. [root@camel-vdsb ~]# cat /etc/lvm/lvm.conf | grep "use_lvmetad = 1" use_lvmetad = 1 2. Create a preallocated disk on iscsi storage domain, and attach it to a vm with an OS (this disk will not be running the OS). 3. Start vm, connect to it and create a pv,vg and lv on the disk attached from step 1. (will attach snapshot of the issued commands) 4. Shutdown vm. 5. Put host to maintenance. 6. Activate host, connect to it and run: [root@camel-vdsb ~]# pvscan --cache [root@camel-vdsb ~]# lvs ----------------------------------------------------------------------------- LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert 6a2c2fcd-bcc3-457a-b7dd-059ca14422e6 5b723c72-2428-4230-a77d-2580637fa263 -wi------- 8.00g 8ea07646-035b-4987-bed3-30a230686ced 5b723c72-2428-4230-a77d-2580637fa263 -wi------- 128.00m c227148a-3487-42cf-901a-709ef0cdd5c1 5b723c72-2428-4230-a77d-2580637fa263 -wi------- 128.00m ids 5b723c72-2428-4230-a77d-2580637fa263 -wi-ao---- 128.00m inbox 5b723c72-2428-4230-a77d-2580637fa263 -wi-a----- 128.00m leases 5b723c72-2428-4230-a77d-2580637fa263 -wi-a----- 2.00g master 5b723c72-2428-4230-a77d-2580637fa263 -wi-a----- 1.00g metadata 5b723c72-2428-4230-a77d-2580637fa263 -wi-a----- 512.00m outbox 5b723c72-2428-4230-a77d-2580637fa263 -wi-a----- 128.00m xleases 5b723c72-2428-4230-a77d-2580637fa263 -wi-a----- 1.00g 200ae2b3-11af-460a-bbe1-49967dd6d0cf c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 128.00m 4d423b3f-caf2-4690-ace7-40a99967c53a c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 128.00m 9dcb10bb-72bd-4c9c-89e3-4a3ed1a8e733 c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 7.00g ids c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 128.00m inbox c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 128.00m leases c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 2.00g master c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 1.00g metadata c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 512.00m outbox c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 128.00m xleases c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 1.00g pool00 rhvh_camel-vdsb twi-aotz-- 106.05g 4.80 0.33 rhvh-4.1-0.20170403.0 rhvh_camel-vdsb Vwi---tz-k 91.05g pool00 root rhvh-4.1-0.20170403.0+1 rhvh_camel-vdsb Vwi-aotz-- 91.05g pool00 rhvh-4.1-0.20170403.0 3.91 root rhvh_camel-vdsb Vwi-a-tz-- 91.05g pool00 3.88 swap rhvh_camel-vdsb -wi-ao---- 13.68g var rhvh_camel-vdsb Vwi-aotz-- 15.00g pool00 3.72 ----------------------------------------------------------------------------- * The 2 lvs created, 
bug-ver-guest-lv-1 and bug-ver-guest-lv-2 are not displayed in the list ** The preallocated disk created in step 1, appears as inactive (-wi-------). 6a2c2fcd-bcc3-457a-b7dd-059ca14422e6 5b723c72-2428-4230-a77d-2580637fa263 -wi------- 8.00g 6. Start the vm, and run the following commands on the host: [root@camel-vdsb ~]# pvscan --cache [root@camel-vdsb ~]# lvs ----------------------------------------------------------------------------- LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert 6a2c2fcd-bcc3-457a-b7dd-059ca14422e6 5b723c72-2428-4230-a77d-2580637fa263 -wi-ao---- 8.00g 8ea07646-035b-4987-bed3-30a230686ced 5b723c72-2428-4230-a77d-2580637fa263 -wi------- 128.00m c227148a-3487-42cf-901a-709ef0cdd5c1 5b723c72-2428-4230-a77d-2580637fa263 -wi------- 128.00m ids 5b723c72-2428-4230-a77d-2580637fa263 -wi-ao---- 128.00m inbox 5b723c72-2428-4230-a77d-2580637fa263 -wi-a----- 128.00m leases 5b723c72-2428-4230-a77d-2580637fa263 -wi-a----- 2.00g master 5b723c72-2428-4230-a77d-2580637fa263 -wi-a----- 1.00g metadata 5b723c72-2428-4230-a77d-2580637fa263 -wi-a----- 512.00m outbox 5b723c72-2428-4230-a77d-2580637fa263 -wi-a----- 128.00m xleases 5b723c72-2428-4230-a77d-2580637fa263 -wi-a----- 1.00g bug-ver-guest-lv-1 bug-ver-guest-vg -wi------- 3.00g bug-ver-guest-lv-2 bug-ver-guest-vg -wi------- 2.00g 200ae2b3-11af-460a-bbe1-49967dd6d0cf c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 128.00m 4d423b3f-caf2-4690-ace7-40a99967c53a c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 128.00m 9dcb10bb-72bd-4c9c-89e3-4a3ed1a8e733 c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 7.00g ids c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 128.00m inbox c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 128.00m leases c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 2.00g master c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 1.00g metadata c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 512.00m outbox c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 128.00m xleases c874fd8d-6d69-4882-9ac9-61eff12d7a9b -wi------- 1.00g pool00 rhvh_camel-vdsb twi-aotz-- 106.05g 4.80 0.33 rhvh-4.1-0.20170403.0 rhvh_camel-vdsb Vwi---tz-k 91.05g pool00 root rhvh-4.1-0.20170403.0+1 rhvh_camel-vdsb Vwi-aotz-- 91.05g pool00 rhvh-4.1-0.20170403.0 3.91 root rhvh_camel-vdsb Vwi-a-tz-- 91.05g pool00 3.88 swap rhvh_camel-vdsb -wi-ao---- 13.68g var rhvh_camel-vdsb Vwi-aotz-- 15.00g pool00 3.72 ----------------------------------------------------------------------------- * The two lv's now displayed, bug-ver-guest-lv-1 and bug-ver-guest-lv-2, and are not shown as active: -wi-------. ** The preallocated disk appears as active and open: 6a2c2fcd-bcc3-457a-b7dd-059ca14422e6 5b723c72-2428-4230-a77d-2580637fa263 -wi-ao---- 8.00g Environment: hosted engine with 2 hosts Red Hat Virtualization Host 4.1 (el7.3) redhat-release-virtualization-host-content-4.1-0.14.el7.x86_64 redhat-virtualization-host-image-update-placeholder-4.1-0.14.el7.noarch redhat-release-virtualization-host-4.1-0.14.el7.x86_64 vdsm-4.19.10.1-1.el7ev.x86_64 rhevm-4.1.1.8-0.1.el7.noarch Created attachment 1271442 [details] command issued for creating pv,vg and lv in verification process, step 3 comment 165 The command to activate LVs is "pvscan --cache -aay", not "pvscan --cache". (In reply to David Teigland from comment #167) > The command to activate LVs is "pvscan --cache -aay", not "pvscan --cache". Thanks David, but we were not trying to activate lvs. pvscan --cache was needed to update lvmetad with changes made by vdsm using --config "global {use_lvmetad = 0}". 
It is not needed now that we disable lvmetad. |
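A short recap of the checks used in the verification above (nothing new, just collected in one place; in the lv_attr column the fifth character 'a' means active and the sixth 'o' means open):

# systemctl status lvm2-lvmetad.service
  (should be inactive/disabled on a fixed 4.1.1 host)

# lvs -o vg_name,lv_name,attr guest-vg
  (replace guest-vg with the guest's VG name; on the host the guest lvs should show "-wi-------", i.e. visible while the vm runs but never active)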