Bug 1398918 - [z-stream clone - 4.0.7] All LVs are auto activated on the hypervisor in RHEL 7
Summary: [z-stream clone - 4.0.7] All LVs are auto activated on the hypervisor in RHEL 7
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.6.7
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.0.7
Assignee: Nir Soffer
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard:
Depends On: 1374545
Blocks:
 
Reported: 2016-11-27 10:04 UTC by rhev-integ
Modified: 2020-07-16 09:01 UTC
CC: 27 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, the lvmetad daemon dynamically activated logical volumes on multipath devices, including logical volumes created inside the virtual machine on top of Red Hat Virtualization Manager’s (RHV) logical volumes. This caused many issues, including the following:
- Increasing the number of devices on the Red Hat Virtualization Host, slowing down any operation that needed to enumerate or scan devices.
- Failure to deactivate RHV logical volumes because the guest logical volumes on top of them were active. This could lead to data corruption.
- Errors when running lvm commands on the host, because guest logical volumes may use physical volumes not available on the Red Hat Virtualization Host.
In this release, vdsm disables the lvmetad service and logical volumes are no longer activated dynamically. vdsm activates and deactivates the logical volumes as needed.
Clone Of: 1374545
Environment:
Last Closed: 2017-03-16 15:35:29 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 2610081 0 None None None 2016-11-27 10:19:14 UTC
Red Hat Knowledge Base (Solution) 2662261 0 None None None 2016-11-27 10:19:14 UTC
Red Hat Product Errata RHBA-2017:0544 0 normal SHIPPED_LIVE vdsm 4.0.7 bug fix and enhancement update 2017-03-16 19:25:18 UTC
oVirt gerrit 64328 0 master ABANDONED tests: Rename lvmTests to new naming convention 2017-08-20 01:58:35 UTC
oVirt gerrit 64329 0 master MERGED tests: Add loopback module 2017-05-28 11:36:11 UTC
oVirt gerrit 64330 0 master POST guest-lvs: Add failing test for guest lvs 2017-03-25 16:04:17 UTC
oVirt gerrit 64367 0 master POST guest-lvs: Add lvm bootstrap tests 2017-03-25 16:04:05 UTC
oVirt gerrit 64368 0 master POST guest-lvs: Deactivate guest lvs during bootstrap 2017-03-25 16:04:30 UTC
oVirt gerrit 64369 0 master POST guest-lvs: Deactivate guest lvs during deactivation 2017-03-25 16:04:36 UTC
oVirt gerrit 64370 0 master POST guest-lvs: Skip foreign vgs during bootstrap 2017-03-25 16:04:24 UTC
oVirt gerrit 66893 0 None None None 2016-11-27 10:19:14 UTC

Comment 1 rhev-integ 2016-11-27 10:05:01 UTC
Description of problem:
When a hypervisor with FC storage is rebooted, the host sees all the LVs of the storage domain, which is fine. But then the LVM PV scan scans all these devices and finds the VM's LVM metadata when the disk is raw! So we end up with dm-xxx devices on the host pointing to the VM's internal LVs. These devices are never cleared/removed, so the host ends up with lots of stale LVs and mapper devices, which can lead to disk corruption.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Hypervisor release 7.2 (20160711.0.el7ev)

How reproducible:
100% on customer side

Steps to Reproduce:
1. Reboot host

Actual results:
- vdsm will fail to lvchange -an
- stale LVs in all hosts

Expected results:
- LV only open in the host running the VM
- No child dm-xxx devices from VM internal LVM.

Additional info:

Logs Host booting:
lvm: 140 logical volume(s) in volume group "b442d48e-0398-4cad-b9bf-992a3e663573" now active <--- storage domain
systemd: Started LVM2 PV scan on device 253:249.  <----- dm-249 is the LV of the VM disk (raw, prealloc)
lvm: 1 logical volume(s) in volume group "vg_pgsql" now active <--- this is VM internal!

Having this VM-internal child will make vdsm's lvchange -an fail when the VM stops.

$ cat sos_commands/devicemapper/dmsetup_info_-c | egrep 'pgsql|5853cd' | awk -F' ' '{print $1" dm-"$3}'
b442d48e--0398--4cad--b9bf--992a3e663573-5853cdf8--7b84--487e--ab70--827bf5b00140 dm-249 <--- LV of VM image
vg_pgsql-lv_pgsql dm-272                                                                 <--- internal VM stuff

vg_pgsql-lv_pgsql (253:272)
 `-b442d48e--0398--4cad--b9bf--992a3e663573-5853cdf8--7b84--487e--ab70--827bf5b...
    `-36000144000000010706222888cc3683f (253:107)
       |- (131:880)
       |- (67:992)
       |- (132:592)
       |- (68:704)
       |- (133:304)
       |- (69:416)
       |- (134:16)
       `- (70:128)

So in the HOST we have dm-272, which is the VM's internal business, relying on dm-249, which is the VM disk (LV). vdsm fails to deactivate dm-249 as well. The result is a full Data Center where ALL hosts have these LVs active, asking for trouble.

jsonrpc.Executor/4::ERROR::2016-09-06 19:53:08,144::dispatcher::76::Storage.Dispatcher::(wrapper) {'status': {'message': 'Cannot deactivate Logical Volume: (\'General Storage Exception: ("5 [] [\\\'  Logical volume b442d48e-0398-4cad-b9bf-992a3e663573/5853cdf8-7b84-487e-ab70-827bf5b00140 is used by another device.\\\']\\\\nb442d48e-0398-4cad-b9bf-992a3e663573/[\\\'ffd27b7d-5525-4126-9f59-5a26dedad157\\\', \\\'5853cdf8-7b84-487e-ab70-827bf5b00140\\\']",)\',)', 'code': 552}}

I think the root cause here is that this change, https://gerrit.ovirt.org/#/c/21291/, will fail to deactivate LVs in case the disk LV already has children (VM internal).

This comment was originally posted by gveitmic

Comment 12 rhev-integ 2016-11-27 10:06:32 UTC
(In reply to Germano Veit Michel from comment #0)

Germano, thanks for this detailed report.

I don't know if we can prevent systemd from scanning 
lvs inside other lvs, but we can prevent it from auto
activating lvs.

Can you check if disabling auto activation fixes this issue?

Edit /etc/lvm/lvm.conf:

    auto_activation_volume_list = []

This comment was originally posted by nsoffer

Comment 13 rhev-integ 2016-11-27 10:06:41 UTC
> systemd: Started LVM2 PV scan on device 253:249

Peter, can you explain why systemd is looking for lvs inside another lv, and
why it automatically activates these lvs?

Can we configure lvm to avoid this scan?

This comment was originally posted by nsoffer

Comment 14 rhev-integ 2016-11-27 10:06:49 UTC
See also bug 1253640; both seem to be caused by the lvm auto-activation.

This comment was originally posted by nsoffer

Comment 15 rhev-integ 2016-11-27 10:06:59 UTC
(In reply to Nir Soffer from comment #10)
> Can you check if disabling auto activation fixes this issue?
> 
> Edit /etc/lvm/lvm.conf:
> 
>     auto_activation_volume_list = []

Hi Nir,

I just asked the customer to try it. I will keep you updated.

Cheers

This comment was originally posted by gveitmic

Comment 16 rhev-integ 2016-11-27 10:07:09 UTC
(In reply to Nir Soffer from comment #11)
> > systemd: Started LVM2 PV scan on device 253:249
> 
> Peter, can you explain why systemd is looking for lvs inside another lv, and
> why it automatically activates these lvs?
> 

It's because the internal LV, if found, is just like any other LV. Unless you mark that somehow, LVM has no way to know whether this LV is the one that should not be activated (...you might as well have a stack of LVs - LV on top of another LV without any VMs so from this point of view nothing is "internal" and you want to activate the LV in this case).

By default, LVM autoactivates all VGs/LVs it finds.

> Can we configure lvm to avoid this scan?

You have several ways:

  - You can set devices/global_filter to include only the PVs which should be scanned and any LVs activated on the host, and reject everything else (this also prevents any scans of all the other devices/LVs which contain further PVs/VGs inside).

  - You can mark LVs with tags and then set activation/auto_activation_volume_list to activate only LVs with a certain tag. Or, without tagging, directly list the VGs/LVs which should be autoactivated. But this way, the VM's PVs inside are still going to be scanned; the VGs/LVs are just not autoactivated.

  - You can mark individual LVs to be skipped on autoactivation (lvchange -k|--setactivationskip y). But this way, you will also prevent the autoactivation within the guest system, as the flag to skip activation is stored directly in VG metadata!!! So then you'd need someone to call "vgchange/lvchange -ay" - the direct activation (in contrast to "vgchange/lvchange -aay" - the autoactivation, which is used by boot scripts) - inside the guest to activate the LV.

  - You can use the new "LVM system id" feature (see man lvmsystemid) which marks VGs with a system id automatically, and then only the VGs created on that system are visible/accessible (again, in this case, the PVs inside are still going to be scanned because we need to get the "ID" from VG metadata to be able to compare system IDs).


If you really want to avoid scanning internal PVs which happen to be inside a VG/LV, the best option is probably to use the global_filter to include only the PVs you know are safe to access.
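For illustration only, here is a rough lvm.conf sketch combining the first two options; the device path and tag name below are placeholders, not values taken from this bug:

    devices {
        # whitelist: accept only the host's own PV(s), reject everything else
        global_filter = [ "a|^/dev/sda2$|", "r|.*|" ]
    }
    activation {
        # autoactivate only VGs/LVs carrying this tag (added with vgchange/lvchange --addtag)
        auto_activation_volume_list = [ "@host_local" ]
    }

Either mechanism can also be used on its own, as described above.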

This comment was originally posted by prajnoha

Comment 17 rhev-integ 2016-11-27 10:07:19 UTC
(In reply to Peter Rajnoha from comment #14)
> (In reply to Nir Soffer from comment #11)
> > > systemd: Started LVM2 PV scan on device 253:249
> > 
> > Peter, can you explain why systemd is looking for lvs inside another lv, and
> why it automatically activates these lvs?
> > 
> 
> It's because the internal LV, if found, is just like any other LV. Unless
> you mark that somehow, LVM has no way to know whether this LV is the one
> that should not be activated (...you might as well have a stack of LVs - LV
> on top of another LV without any VMs so from this point of view nothing is
> "internal" and you want to activate the LV in this case).
> 
> By default, LVM autoactivates all VGs/LVs it finds.

(Also, when using autoactivation, the activation is based on udev events, so if a PV appears, it's scanned for VG metadata and its LVs are autoactivated. Each LV activation generates another udev event and the procedure repeats - if there's any PV found inside the LV, the autoactivation triggers for that PV too. It's because the LV is like any other block device and as such it can contain further PVs/VGs/LVs stacked inside... So when it comes to autoactivation, it's a domino effect - the activation triggers another activation and so on, unless you stop LVM from doing that by the means I described in comment #14.)

This comment was originally posted by prajnoha

Comment 18 rhev-integ 2016-11-27 10:07:29 UTC
(In reply to Peter Rajnoha from comment #14)
> (In reply to Nir Soffer from comment #11)
> > > systemd: Started LVM2 PV scan on device 253:249
> > 
> > Peter, can you explain why systemd is looking for lvs inside another lv, and
> why it automatically activates these lvs?
> > 
> 
> It's because the internal LV, if found, is just like any other LV. Unless
> you mark that somehow, LVM has no way to know whether this LV is the one
> that should not be activated (...you might as well have a stack of LVs - LV
> on top of another LV without any VMs so from this point of view nothing is
> "internal" and you want to activate the LV in this case).
> 
> By default, LVM autoactivates all VGs/LVs it finds.
> 
> > Can we configure lvm to avoid this scan?
> 
> You have several ways:
> 
>   - You can set devices/global_filter to include only PVs which should be
> scanned and any LVs activated on the host and reject everything else (this
> also prevents any scans for all the other devices/LVs which contain further
> PVs/VGs inside.

This is an issue since we don't know which pvs must be scanned
on this host.

We want to avoid scanning any pv created by vdsm, but there is no easy way
to detect these - basically anything under /dev/mapper/guid may be a pv owned
by vdsm.

I don't think we can change multipath configuration / udev rules to link
devices elsewhere, since it can break other software using multipath devices.

Also we cannot use global_filter since it overrides the filter used by vdsm
commands.

>   - You can mark LVs with tags and then set
> activation/auto_activation_volume_list to activate only LVs with certain
> tag. Or, without tagging, directly listing the VGs/LVs which should be
> autoactivated only. But this way, the VMs PVs inside are going to be scanned
> still, just the VGs/LVs not autoactivated.

We plan to disable auto activation (see comment 4), so this seems to be
the best option. Can you confirm that this should resolve this issue?

>   - You can use new "LVM system id" (see man lvmsystemid) feature which
> marks VGs with system id automatically and then only the VGs created on that
> system are visible/accessible (again, in this case, the PVs inside are going
> to be scanned still because we need to get the "ID" from VG metadata to be
> able to do the comparison of system IDs.

This will not work for shared storage; the vg/lvs created on the spm host
must be accessible on other hosts.

This comment was originally posted by nsoffer

Comment 19 rhev-integ 2016-11-27 10:07:39 UTC
(In reply to Nir Soffer from comment #16)
> We plan to disable auto activation (see comment 4), so this seems to be
> the best option. Can you confirm that this should resolve this issue?
> 

The disks (LVs inside which the PV is found) are still going to be scanned. Only devices/global_filter prevents this scan. But yes, the LVs found inside won't get activated. However, if you disable autoactivation completely, no LV will get activated at boot, not even the ones on the host, if you have any LVs there.

> >   - You can use new "LVM system id" (see man lvmsystemid) feature which
> > marks VGs with system id automatically and then only the VGs created on that
> > system are visible/accessible (again, in this case, the PVs inside are going
> > to be scanned still because we need to get the "ID" from VG metadata to be
> > able to do the comparison of system IDs.
> 
> This will not work for shared storage, the vg/lvs created on the spm host
> must be accessible on other hosts.

You can share the same ID for all the hosts where you need the VGs/LVs to be visible and accessible (see also "lvmlocal" or "file" system_id_source in man lvmsystemid).
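As a rough sketch of that shared-ID setup (the ID string below is hypothetical): with system_id_source = "lvmlocal", each host takes its ID from lvmlocal.conf, so giving all hosts the same value would make the shared VGs accessible everywhere:

    # /etc/lvm/lvm.conf
    global {
        system_id_source = "lvmlocal"
    }

    # /etc/lvm/lvmlocal.conf
    local {
        system_id = "rhv-shared-id"
    }

See man lvmsystemid for details and caveats.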

This comment was originally posted by prajnoha

Comment 20 rhev-integ 2016-11-27 10:07:47 UTC
(In reply to Peter Rajnoha from comment #17)
> (In reply to Nir Soffer from comment #16)
> > We plan to disable auto activation (see comment 4), so this seems to be
> > the best option. Can you confirm that this should resolve this issue?
> > 
> 
> The disks (LVs inside which the PV is found) is still going to be scanned.
> Only activation/global_filter prevents this scan. But yes, the LVs found
> inside won't get activated. However, if you disable autoactivation
> completely, no LV will get activated at boot, not even the ones on the host,
> if you have any LVs there.

The admin can fix this by adding the needed lvs to the
auto_activation_volume_list, right?

We cannot work with an auto-activate-everything policy, only with auto-activation of
only the volumes specified by the admin.

> > >   - You can use new "LVM system id" (see man lvmsystemid) feature which
> > > marks VGs with system id automatically and then only the VGs created on that
> > > system are visible/accessible (again, in this case, the PVs inside are going
> > > to be scanned still because we need to get the "ID" from VG metadata to be
> > > able to do the comparison of system IDs.
> > 
> > This will not work for shared storage, the vg/lvs created on the spm host
> > must be accessible on other hosts.
> 
> You can share the same ID for all the hosts where you need the VGs/LVs to be
> visible and accessible (see also "lvmlocal" or "file" system_id_source in
> man lvmsystemid).

Ok, this way looks good to solve bug 1202595 and . Can you confirm on that bug?

This comment was originally posted by nsoffer

Comment 21 rhev-integ 2016-11-27 10:07:58 UTC
(In reply to Nir Soffer from comment #18)
> The admin can fix this by adding the needed lvs to the
> auto_activation_volume_list, right?
> 
> We cannot work with auto activate everything policy, only with auto activate
> only the volumes specified by the admin.
> 

Sure, if that's what the configuration setting is for... (...the only issue is that it requires some manual actions/configuration from admins).

> > > >   - You can use new "LVM system id" (see man lvmsystemid) feature which
> > > > marks VGs with system id automatically and then only the VGs created on that
> > > > system are visible/accessible (again, in this case, the PVs inside are going
> > > > to be scanned still because we need to get the "ID" from VG metadata to be
> > > > able to do the comparison of system IDs.
> > > 
> > > This will not work for shared storage, the vg/lvs created on the spm host
> > > must be accessible on other hosts.
> > 
> > You can share the same ID for all the hosts where you need the VGs/LVs to be
> > visible and accessible (see also "lvmlocal" or "file" system_id_source in
> > man lvmsystemid).
> 
> Ok, this way looks good to solve bug 1202595 and . Can you confirm on that
> bug?

Yes, this should resolve the issue (see also bug #867333) as long as the VG metadata is readable, so we can read the ID and then decide whether the VG is allowed or not on that system.

This comment was originally posted by prajnoha

Comment 27 rhev-integ 2016-11-27 10:08:50 UTC
I renamed the bug to reflect the important issue in this bug. We already know that
guest-created lvs are accessible via lvm; see bug 1202595.

This comment was originally posted by nsoffer

Comment 36 rhev-integ 2016-11-27 10:10:10 UTC
David, in this bug we recommended disabling lvm auto activation by setting:

    auto_activation_volume_list = []

Based on your comment:
https://bugzilla.redhat.com/show_bug.cgi?id=1303940#c52

Do you think we should also recommend disabling lvmetad by setting

    use_lvmetad = 0

The results are not clear yet, see comment 32.

This comment was originally posted by nsoffer

Comment 37 rhev-integ 2016-11-27 10:10:21 UTC
Sorry, my comment in the other bz was misleading.  You still want to set

auto_activation_volume_list = []

to prevent the system (lvm/systemd/udev) from automatically activating LVs, whether use_lvmetad is 0 or 1.  So in your case, you want to both disable caching by setting use_lvmetad=0 in lvm.conf, and disable autoactivation by setting auto_activation_volume_list = [ ] in lvm.conf.
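Put together, the relevant lvm.conf fragment would look something like this (a sketch of the recommendation above, showing the sections these options normally live in):

    global {
        # disable the lvmetad cache
        use_lvmetad = 0
    }
    activation {
        # never autoactivate any LV on this host
        auto_activation_volume_list = []
    }

The lvm2-lvmetad service and socket should also be stopped and masked, as shown later in this bug, so nothing restarts the daemon.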

This comment was originally posted by teigland

Comment 49 rhev-integ 2016-11-27 10:12:08 UTC
Is auto_activation_volume_list = [] set in the copy of lvm.conf used during boot (initramfs)?  Something must be calling vgchange or lvchange to activate LVs, but nothing is coming to mind at the moment.  Will have to look into that more on Monday.

This comment was originally posted by teigland

Comment 50 rhev-integ 2016-11-27 10:12:18 UTC
I could partially reproduce this issue on rhel 7.3 beta and vdsm master with
iscsi storage.

1. Setup standard lvm.conf:
   - use_lvmetad=1
   - no auto_activation_volume_list option

2. Create a preallocated disk on the iscsi storage domain and attach it to a vm
   (4df47a96-8a1b-436e-8a3e-3a638f119b48)

3. In the guest, create pv, vg and lvs:

   pvcreate /dev/vdb
   vgcreate guest-vg /dev/vdb
   lvcreate -n guest-lv -L 10g guest-vg
   lvcreate -n guest-lv-2 -L 5g guest-vg

4. Shutdown vm

5. Put host to maintenance

6. Activate host

   pvscan --cache
   lvs

  3628a407-ef01-417f-8b5e-e88e87896477 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m                                                    
  4df47a96-8a1b-436e-8a3e-3a638f119b48 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao----  20.00g                                                    
  d502b6ad-e623-472f-bdc7-453765089f55 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m                                                    
  ids                                  bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 128.00m                                                    
  inbox                                bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m                                                    
  leases                               bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a-----   2.00g                                                    
  master                               bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a-----   1.00g                                                    
  metadata                             bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 512.00m                                                    
  outbox                               bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m                                                    
  guest-lv                             guest-vg                             -wi-a-----  10.00g                                                    
  guest-lv-2                           guest-vg                             -wi-a-----   5.00g                                                    
  lv_home                              vg0                                  -wi-ao---- 736.00m                                                    
  lv_root                              vg0                                  -wi-ao----   7.37g                                                    
  lv_swap                              vg0                                  -wi-ao----   7.36g                                                    

   - All lvs were activated
   - the raw lv used as a guest pv is active and open
   - guest created lvs are active


On this system, disabling lvmetad fixes the issue with open lvs:

1. Disable lvmetad

    systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
    systemctl mask lvm2-lvmetad.service lvm2-lvmetad.socket
 
2. Edit /etc/lvm/lvm.conf:

   use_lvmetad = 0
   
   Note: I did not set auto_activation_volume_list, since this host
   won't boot with that setting. Boot fails with a dependency error for /home.

3. Move host to maintenance

4. Reboot host

5. Activate host

After boot, all disk lvs are inactive, and guest lvs do not show
in lvm commands:

  3628a407-ef01-417f-8b5e-e88e87896477 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m                                                    
  4df47a96-8a1b-436e-8a3e-3a638f119b48 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-------  20.00g                                                    
  d502b6ad-e623-472f-bdc7-453765089f55 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m                                                    
  ids                                  bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 128.00m                                                    
  inbox                                bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m                                                    
  leases                               bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a-----   2.00g                                                    
  master                               bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a-----   1.00g                                                    
  metadata                             bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 512.00m                                                    
  outbox                               bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m                                                    
  lv_home                              vg0                                  -wi-ao---- 736.00m                                                    
  lv_root                              vg0                                  -wi-ao----   7.37g                                                    
  lv_swap                              vg0                                  -wi-ao----   7.36g     

6. Starting the vm using the raw disk with guest lvs

The guest lvs show up when the raw lv is activated (opened by qemu):

  3628a407-ef01-417f-8b5e-e88e87896477 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m                                                    
  4df47a96-8a1b-436e-8a3e-3a638f119b48 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao----  20.00g                                                    
  d502b6ad-e623-472f-bdc7-453765089f55 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m                                                    
  ids                                  bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 128.00m                                                    
  inbox                                bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m                                                    
  leases                               bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a-----   2.00g                                                    
  master                               bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a-----   1.00g                                                    
  metadata                             bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 512.00m                                                    
  outbox                               bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m                                                    
  guest-lv                             guest-vg                             -wi-------  10.00g                                                    
  guest-lv-2                           guest-vg                             -wi-------   5.00g                                                    
  lv_home                              vg0                                  -wi-ao---- 736.00m                                                    
  lv_root                              vg0                                  -wi-ao----   7.37g                                                    
  lv_swap                              vg0                                  -wi-ao----   7.36g                                                    
  
7. Shutting down the vm hides the guest lvs again:

  3628a407-ef01-417f-8b5e-e88e87896477 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m                                                    
  4df47a96-8a1b-436e-8a3e-3a638f119b48 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-------  20.00g                                                    
  d502b6ad-e623-472f-bdc7-453765089f55 bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi------- 128.00m                                                    
  ids                                  bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-ao---- 128.00m                                                    
  inbox                                bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m                                                    
  leases                               bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a-----   2.00g                                                    
  master                               bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a-----   1.00g                                                    
  metadata                             bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 512.00m                                                    
  outbox                               bb85ee2f-d674-489f-9377-3eb1f176e8fb -wi-a----- 128.00m                                                    
  lv_home                              vg0                                  -wi-ao---- 736.00m                                                    
  lv_root                              vg0                                  -wi-ao----   7.37g                                                    
  lv_swap                              vg0                                  -wi-ao----   7.36g                                                    


To hide both guest lvs and rhev lvs from the host, I tried this
filter in lvm.conf:

    filter = [ "r|^/dev/disk/by-id/dm-uuid-mpath-.*|" ] 

With this filter, lvs shows only the local lvs:

  lv_home vg0 -wi-ao---- 736.00m                                                    
  lv_root vg0 -wi-ao----   7.37g                                                    
  lv_swap vg0 -wi-ao----   7.36g                                                    

But when starting the vm using the raw lv with guest pv,
the guest lvs appear again:

  guest-lv   guest-vg -wi-------  10.00g                                                    
  guest-lv-2 guest-vg -wi-------   5.00g                                                    
  lv_home    vg0      -wi-ao---- 736.00m                                                    
  lv_root    vg0      -wi-ao----   7.37g                                                    
  lv_swap    vg0      -wi-ao----   7.36g                                                    

To keep the guest lvs hidden, I tried this filter:

    filter = [ "r|^/dev/disk/by-id/dm-uuid-mpath-.*|", "r|^/dev/disk/by-id/dm-uuid-LVM-.*|" ]

lvs now shows only the local lvs.

This filter may not work with RHEV-H; you may need to add the
internal lvs to the filter.

Vdsm is not affected by this filter since it overrides the filter
in all lvm commands, using the --config option.

Note that sos commands need to override this filter to be able to
see shared rhev lvs:

    lvs --config 'devices { filter = [ "a|.*|" ] }'

So it seems that a filter is the best way for now.

Germano, see also comment 47.

This comment was originally posted by nsoffer

Comment 53 rhev-integ 2016-11-27 10:12:46 UTC
Peter had a good explanation of the options above, so I'll just repeat some of that.

Ideally, you don't want RHEV PVs/LVs to be scanned or activated by lvm run from the host (by which I mean lvm commands not run explicitly by RHEV).  This means that the host's lvm.conf/global_filter (on root fs and initramfs) should exclude RHEV PVs.  (Instead of excluding RHEV PVs, you could also whitelist only non-RHEV PVs.  I'm not sure if whitelist or blacklist for the host would work better here.)

Without using the global_filter, the next best option as you've found may be to disable autoactivation in the host's lvm.conf (on root fs and initramfs).  This has limitations:
- It will not protect RHEV PVs/LVs from being seen by the host's lvm.
- It will not protect RHEV LVs from being activated by an unknown vgchange/lvchange -ay command run from the host that doesn't include the extra 'a' flag.
- It will protect RHEV LVs from being autoactivated by the host's own vgchange/lvchange -aay commands.

If you use one of these two methods, and RHEV LVs are still being activated by the host, outside of your own control, then those methods are not set up correctly, or there is a rogue vgchange/lvchange being run, or there's an lvm bug.

Peter also mentioned some more exotic options (e.g. system ID) which would probably take more effort to get working, but may be worth trying in a future version.  For now, global_filter or auto_activation_volume_list should be able to solve the problem of unwanted activation.
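A minimal illustration of the -aay vs -ay distinction described above (the VG name is a placeholder):

    # autoactivation - honors auto_activation_volume_list (what boot scripts and pvscan use)
    vgchange -aay myvg

    # plain activation - ignores auto_activation_volume_list
    vgchange -ay myvg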

This comment was originally posted by teigland

Comment 58 rhev-integ 2016-11-27 10:13:33 UTC
This bug reveals two issues:
1. systemd and lvm are too eager to activate anything on a system - this is
   a regression in rhel 7 compared with rhel 6.
2. vdsm startup deactivation does not handle ovirt lvs with guest lvs

The root cause is 1. We will work on configuring lvm during vdsm configuration.
This seems to be very delicate, requiring a special filter and regenerating
the initramfs.

For 4.0 we can improve vdsm deactivation to handle lvs which are used as guest pvs.

Workarounds:
- setting up an lvm.conf filter (see comment 48; a rough outline follows below) and regenerating the initramfs
- or avoiding creating pvs directly on guest devices (without creating partition
  table).
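A rough outline of the first workaround on a plain RHEL host, using the reject filter shown earlier in this bug (treat this as a sketch rather than a tested procedure, and adjust the filter for the local devices first):

    # 1. Add the filter to the devices section of /etc/lvm/lvm.conf
    filter = [ "r|^/dev/disk/by-id/dm-uuid-mpath-.*|", "r|^/dev/disk/by-id/dm-uuid-LVM-.*|" ]

    # 2. Regenerate the initramfs so the same configuration is used at boot
    dracut -f

    # 3. Reboot, then verify with:
    lvs --config 'devices { filter = [ "a|.*|" ] }'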

This comment was originally posted by nsoffer

Comment 59 rhev-integ 2016-11-27 10:13:44 UTC
Nir / David,
Is it possible to filter by lvm tags?
If yes - maybe it will give us more flexibility?

This comment was originally posted by mkalinin

Comment 60 rhev-integ 2016-11-27 10:13:53 UTC
global_filter/filter in lvm.conf operate at the device level and only take device path names. Tags operate at the VG level and can be used with autoactivation.

This comment was originally posted by teigland

Comment 61 rhev-integ 2016-11-27 10:14:02 UTC
lol, I just now looked into Nir's commits. I am sorry. He is using the tags.

This comment was originally posted by mkalinin

Comment 62 rhev-integ 2016-11-27 10:14:12 UTC
To add to the use cases this bug affects, I believe a direct LUN attached to the guest with a VG on top of it will have the same issue, right?

This comment was originally posted by mkalinin

Comment 63 rhev-integ 2016-11-27 10:14:23 UTC
(In reply to Nir Soffer from comment #56)
> This bug reveals two issues:
> 1. systemd and lvm are too eager to activate anything on a system - this is
>    a regression in rhel 7 compared with rhel 6.
> 2. vdsm startup deactivation does not handle ovirt lvs with guest lvs
> 
> The root cause is 1. We will work on configuring lvm during vdsm
> configuration.
> This seems to be very delicate, requiring special filter and regenerating 
> initramfs.
> 
> For 4.0 we can improve vdsm deactivation to handle lvs which are used as
> guest pvs.
> 
> Workarounds:
> - setting up a lvm.conf filter (see comment 48) and regenerating initramfs
> - or avoiding creating pvs directly on guest devices (without creating
> partition
>   table).

Thank you Nir. I'm also happy about 2, quite sure it will prevent related issues in the future!

I have created a customer facing solution explicitly for the workarounds, both on RHEL/RHV-H and RHEV-H
https://access.redhat.com/solutions/2662261

* Will do some tests regarding the required filter on RHEV-H and update the solution as well.
* I'll suggest that the customer also evaluate the workaround on a test host.

Cheers,

This comment was originally posted by gveitmic

Comment 65 rhev-integ 2016-11-27 10:14:42 UTC
(In reply to Nir Soffer from comment #56)
> Workarounds:
> - setting up a lvm.conf filter (see comment 48) and regenerating initramfs

I'm testing this in RHEV-H, to come up with a proper filter.

1. Added this to a RHEV-H host:

filter = [ "r|^/dev/disk/by-id/dm-uuid-mpath-.*|", "r|^/dev/disk/by-id/dm-uuid-LVM-.*|", "a|^/dev/disk/by-id/dm-name-HostVG-.*|", "a|^/dev/disk/by-id/dm-name-live-.*|"]

2. ovirt-node-rebuild-initramfs

3. Reboot

4. lv-guest is active right after boot, but not open.

lvs --config 'devices { filter = [ "a|.*|" ] }' | grep lv-guest
  lv-guest                             vg-guest                             -wi-a-----   4.00m 

5. VDSM did not deactivate it because it was open(!?)?

storageRefresh::DEBUG::2016-09-28 04:27:09,074::lvm::661::Storage.LVM::(bootstrap) Skipping open lv: vg=76dfe909-20a6-4627-b6c4-7e16656e89a4 lv=6aacc711-0ecf-4c68-b64d-990ae33a54e3

Just to confirm it's the Guest one being seen in the host:

vg--guest-lv--guest (253:52)
 `-76dfe909--20a6--4627--b6c4--7e16656e89a4-6aacc711--0ecf--4c68--b64d--990ae33a54e3 (253:47)
    `-360014380125989a10000400000480000 (253:7)
       |- (8:64)
       `- (8:16)

Thoughts:

A) Is this happening exclusively in RHEV-H? 
B) Should the workaround in RHEV-H also include some of the previously discussed options? Like the volume activation one?

This comment was originally posted by gveitmic

Comment 66 rhev-integ 2016-11-27 10:14:52 UTC
Just to complement with more data:

Even after applying that filter and regenerating the initramfs:

The disk LV still boots up open.

lvs --config 'devices { filter = [ "a|.*|" ] }' | grep 6aacc
  6aacc711-0ecf-4c68-b64d-990ae33a54e3 76dfe909-20a6-4627-b6c4-7e16656e89a4 -wi-ao----   1.00g 

Guest LV is active

# lvs --config 'devices { filter = [ "a|.*|" ] }' | grep lv-guest
  lv-guest   vg-guest -wi-a----- 4.00m

vg--guest-lv--guest (253:52)
 `-76dfe909--20a6--4627--b6c4--7e16656e89a4-6aacc711--0ecf--4c68--b64d--990ae33a54e3 (253:47)
    `-360014380125989a10000400000480000 (253:7)
       |- (8:64)
       `- (8:16)

lvm.conf seems to be working as intended:
# lvs
  LV      VG        Attr       LSize  
  Config  HostVG    -wi-ao----  8.00m
  Data    HostVG    -wi-ao---- 39.92g
  Logging HostVG    -wi-ao----  2.00g
  Swap    HostVG    -wi-ao---- 17.68g
  lv_home vg_rhevh1 -wi------- 39.68g
  lv_root vg_rhevh1 -wi------- 50.00g
  lv_swap vg_rhevh1 -wi-------  9.83g

Probably initramfs was not updated properly?

This comment was originally posted by gveitmic

Comment 67 rhev-integ 2016-11-27 10:15:03 UTC
OK, got it. ovirt-node-rebuild-initramfs is not pulling in the modified lvm.conf; it seems to be using its own.

# cp /run/initramfs/live/initrd0.img .
# mv initrd0.img initrd0.gz
# gunzip initrd0.gz
# file initrd0 
initrd0: ASCII cpio archive (SVR4 with no CRC)
# cpio -i -F initrd0
# find ./ -name lvm.conf
./etc/lvm/lvm.conf
# cat etc/lvm/lvm.conf              
global {
locking_type = 4
use_lvmetad = 0
}

That explains my results... Any ideas on how to update this in a supported way that customers can use? Or is the only option to do it manually (mount, dracut, umount)?

This comment was originally posted by gveitmic

Comment 68 rhev-integ 2016-11-27 10:15:14 UTC
You always want to use the standard lvm.conf as the starting point and modify fields in that.

This comment was originally posted by teigland

Comment 75 rhev-integ 2016-11-27 10:16:23 UTC
Reply to comment 69: My idea would be to allow passing some arguments to dracut, i.e. --lvmconf in this case.

We can work on such a solution once we know that the filtering is working correctly. Btw, I do not see a technical reason why the filters should not work in RHEV-H 3.6's initrd.

This comment was originally posted by fdeutsch

Comment 76 rhev-integ 2016-11-27 10:16:35 UTC
From a discussion with Nir Soffer - here are a couple of points
for better functionality of this vdsm system:

Every host in the system should have its lvm.conf 'filter' & 'global_filter' settings set in a way that it will NOT see a 'SharedVG' mpath device.
(Such a change needs to be reflected in the initramdisk - so regeneration is needed.)

This is IMHO the most 'complex' step - since lvm2 does not support any 'filter' chaining, the filter has to be properly configured on every host.

I'd always advise 'white-list' logic - so if the host knows it's using only 'sda' & 'sdb' as PVs, only those 2 devices should be 'accepted' and all other devices 'rejected'. But I've already seen way more complex filters - so this part of the advice is not 'trivial' to automate.

It's always possible to check, by looking at 'lvs -vvvv' output, whether devices are rejected accordingly.

To validate which settings are in use by a command - see 'man lvmconfig'.

Once the host is set to never ever see the SharedVG mpath device - it's mostly done. The easier part is now to ensure every executed 'vdsm' command comes with a special --config option which DOES make only the SharedVG mpath visible and rejects every other device - and it should also go with 'locking_type=4' to ensure the host is not able to accidentally modify anything on the VG (even if there were some internal lvm2 bug).

This should lead to a system where 'individual' lvm2 commands executed on a host NEVER influence the state of the 'SharedVG' - they will never try to auto-activate LVs, never try to fix invalid metadata, and so on.

Also, 'vdsm' will clearly have FULL control over which command across the whole system may be working with SharedVG metadata.

I'd like to emphasize - while 'vdsm' is running e.g. an activation command on any host, it should NOT try to modify VG metadata anywhere else - especially if such a VG consists of multiple PVs - as it is possible to hit a 'race' where a read-only metadata user could see partially updated metadata.
So an 'update' of VG metadata requires exclusive access.


And a final comment - IMHO a host system configured this way could then possibly use 'lvmetad' locally for locally available devices - since there should be no interference.

Just the vdsm commands need to go with --config lvmetad=0.
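For illustration, such a per-command override could look roughly like this (the mpath WWID in the filter is a placeholder, and vdsm builds its own filter internally):

    lvs --config 'global { use_lvmetad = 0 } devices { filter = [ "a|^/dev/mapper/36001438012598xxxxx$|", "r|.*|" ] }'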

This comment was originally posted by zkabelac

Comment 77 rhev-integ 2016-11-27 10:16:45 UTC
(In reply to Zdenek Kabelac from comment #74)
> Every host in the system should have lvm.conf 'filter'&'global_filter'
> setting  set in a way - it will NOT see a  'SharedVG' mpath device.
> (Such change needs to be reflected in initramdisk - so regeneration is
> needed)

Unfortunately we cannot use global_filter with current vdsm, since it will
override the vdsm filter, and vdsm will not be able to use shared storage.

We will consider switching to global_filter in a future release. The reason we are
using filter is to allow an admin to reject certain devices from our filter.

This comment was originally posted by nsoffer

Comment 78 rhev-integ 2016-11-27 10:16:55 UTC
(In reply to Zdenek Kabelac from comment #74)
> And a final comment - IMHO such configured host system then could possibly
> use 'lvmetad' locally for locally available devices - since there shall be
> no interference.
> 
> Just vdsm commands needs to go with --config  lvmetad=0

This requires modifying global_filter, which is not compatible with current vdsm
(3.5, 3.6, 4.0).

This comment was originally posted by nsoffer

Comment 79 rhev-integ 2016-11-27 10:17:04 UTC
Fabian, maybe we should open a separate bug (RFE?) for node? I think it will be
easier to get good filtering baked into node before we can get a general-purpose
solution for rhel or other systems.

This comment was originally posted by nsoffer

Comment 80 rhev-integ 2016-11-27 10:17:15 UTC
Yes - if you seek a solution without mods on the vdsm side - then just skip the 'global_filter' & 'locking_type=4' advice.

One just has to make sure that no host is masking an mpath device needed by vdsm in its local lvm.conf file.

And since it's not possible to exclude vdsm mpath devices via global_filter, the user also MAY NOT use lvmetad locally.

This comment was originally posted by zkabelac

Comment 86 rhev-integ 2016-11-27 10:18:09 UTC
Hi Nir,
Can we consider steps 1 - 6 from comment #48 as the steps to reproduce this bug?
If not, can you please provide it?

This comment was originally posted by ratamir

Comment 87 rhev-integ 2016-11-27 10:18:19 UTC
The old patches were trying to clean up after lvm during vdsm bootstrap.

The new patch <https://gerrit.ovirt.org/66893> is handling the root cause by
adding a global filter rejecting ovirt lvs.

The configuration is generic and should work on any host, assuming that only
ovirt uses uuids for vg names.

After testing this, we will work on installing this file during vdsm configuration.

Germano, can we test this configuration in some of the relevant cases?

This comment was originally posted by nsoffer

Comment 88 rhev-integ 2016-11-27 10:18:28 UTC
(In reply to Raz Tamir from comment #84)
> Can we consider steps 1 - 6 from comment #48 as a steps to reproduce this
> bug?
Yes

This comment was originally posted by nsoffer

Comment 89 rhev-integ 2016-11-27 10:18:39 UTC
(In reply to Nir Soffer from comment #85)
> The old patches were trying to cleanup after lvm during vdsm bootstrap.
> 
> The new patch <https://gerrit.ovirt.org/66893> is handling the root cause by
> adding a global filter rejecting ovirt lvs.
> 
> The configuration is generic and should work on any host, assuming that only
> ovirt uses uuids for vg names.
> 
> After testing this, we will work on installing this file during vdsm
> configuration.
> 
> Germano, can we test this configuration in some the relevant cases?

Hi Nir,

Absolutely. 

We currently have a Sev4 case open; I can check with that customer if he wants to help with testing. We can also use our own reproducer from comment #54 (re-installing with RHEL). Or even better, do both.

But first, I have two questions:

1) You want me to cherry-pick both the new and the old patches, not just the new one, right?
(all the gerrits attached to the bz into latest stable vdsm + any dependency)

Not sure if we should go for latest master, especially if we are asking a customer to help with testing.

2) Does that filter in the new patch need to go into the initrd as well?

Thanks,
Germano

This comment was originally posted by gveitmic

Comment 90 rhev-integ 2016-11-27 10:18:48 UTC
(In reply to Germano Veit Michel from comment #87)
> (In reply to Nir Soffer from comment #85)
> 1) You want me to cherry-pick both new and old patches, not just the new
> right?

No, just the new configuration. This is trivial to deploy on a customer machine
and compatible with any version of RHV.

> 2) Does that filter in the new patch need to go into the initrd as well?

It should, so guest lvs on RHV raw lvs are never active on the host.

If we find that this configuration is a good solution for this issue we will 
integrate this in vdsm-tool configure later.

This comment was originally posted by nsoffer

Comment 91 rhev-integ 2016-11-27 10:18:56 UTC
Hi Nir,

I added the global filter to both the initrd's and /etc's lvm.conf, and the raw disks' LVs are not activated on the host upon reboot. So it looks good.

I am not sure if this will work for Direct LUNs though (Roman's bug), as AFAIK they don't follow the regex you specified in the global filter. Still, it looks like a solid step forward.

Below is only the test data.

# cat /etc/redhat-release 
Red Hat Enterprise Virtualization Hypervisor release 7.2 (20160711.0.el7ev)

initrd:
[root@rhevh-2 ~]# dir=`mktemp -d` && cd $dir
[root@rhevh-2 tmp.NShI7bl4lg]# cp /run/initramfs/live/initrd0.img .
[root@rhevh-2 tmp.NShI7bl4lg]# mv initrd0.img initrd0.gz
[root@rhevh-2 tmp.NShI7bl4lg]# gunzip initrd0.gz
[root@rhevh-2 tmp.NShI7bl4lg]# cpio -i -F initrd0
236815 blocks
[root@rhevh-2 tmp.NShI7bl4lg]# cat etc/lvm/lvm.conf
global {
locking_type = 4
use_lvmetad = 0
}
devices {
global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]-.+|" ]
}

etc:
cat /etc/lvm/lvm.conf | grep global_filter | grep -v '#'
	global_filter = [ "r|^/dev/mapper/[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9]--[a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9][a-f0-9]-.+|" ]

# lvs | awk -F' ' '{print $1,$3}' | egrep '\-wi\-a'
ids -wi-ao----
inbox -wi-a-----
leases -wi-a-----
master -wi-ao----
metadata -wi-a-----
outbox -wi-a-----
ids -wi-a-----
inbox -wi-a-----
leases -wi-a-----
master -wi-a-----
metadata -wi-a-----
outbox -wi-a-----
Config -wi-ao----
Data -wi-ao----
Logging -wi-ao----
Swap -wi-ao----

# dmsetup ls --tree
HostVG-Logging (253:69)
 `-2a802d0e800d00000p4 (253:4)
    `-2a802d0e800d00000 (253:0)
       `- (8:0)
603f851b--7388--49f1--a8cc--095557ae0a20-ids (253:18)
 `-360014380125989a100004000004c0000 (253:8)
    |- (8:80)
    `- (8:32)
HostVG-Swap (253:67)
 `-2a802d0e800d00000p4 (253:4)
    `-2a802d0e800d00000 (253:0)
       `- (8:0)
76dfe909--20a6--4627--b6c4--7e16656e89a4-inbox (253:28)
 `-360014380125989a10000400000480000 (253:7)
    |- (8:64)
    `- (8:16)
603f851b--7388--49f1--a8cc--095557ae0a20-master (253:20)
 `-360014380125989a100004000004c0000 (253:8)
    |- (8:80)
    `- (8:32)
HostVG-Data (253:70)
 `-2a802d0e800d00000p4 (253:4)
    `-2a802d0e800d00000 (253:0)
       `- (8:0)
603f851b--7388--49f1--a8cc--095557ae0a20-outbox (253:16)
 `-360014380125989a100004000004c0000 (253:8)
    |- (8:80)
    `- (8:32)
603f851b--7388--49f1--a8cc--095557ae0a20-metadata (253:15)
 `-360014380125989a100004000004c0000 (253:8)
    |- (8:80)
    `- (8:32)
603f851b--7388--49f1--a8cc--095557ae0a20-inbox (253:19)
 `-360014380125989a100004000004c0000 (253:8)
    |- (8:80)
    `- (8:32)
live-base (253:6)
 `- (7:1)
360014380125989a10000400000500000p2 (253:11)
 `-360014380125989a10000400000500000 (253:9)
    |- (8:96)
    `- (8:48)
76dfe909--20a6--4627--b6c4--7e16656e89a4-leases (253:26)
 `-360014380125989a10000400000480000 (253:7)
    |- (8:64)
    `- (8:16)
360014380125989a10000400000500000p1 (253:10)
 `-360014380125989a10000400000500000 (253:9)
    |- (8:96)
    `- (8:48)
76dfe909--20a6--4627--b6c4--7e16656e89a4-ids (253:27)
 `-360014380125989a10000400000480000 (253:7)
    |- (8:64)
    `- (8:16)
76dfe909--20a6--4627--b6c4--7e16656e89a4-metadata (253:24)
 `-360014380125989a10000400000480000 (253:7)
    |- (8:64)
    `- (8:16)
603f851b--7388--49f1--a8cc--095557ae0a20-leases (253:17)
 `-360014380125989a100004000004c0000 (253:8)
    |- (8:80)
    `- (8:32)
HostVG-Config (253:68)
 `-2a802d0e800d00000p4 (253:4)
    `-2a802d0e800d00000 (253:0)
       `- (8:0)
live-rw (253:5)
 |- (7:2)
 `- (7:1)
2a802d0e800d00000p3 (253:3)
 `-2a802d0e800d00000 (253:0)
    `- (8:0)
76dfe909--20a6--4627--b6c4--7e16656e89a4-master (253:29)
 `-360014380125989a10000400000480000 (253:7)
    |- (8:64)
    `- (8:16)
2a802d0e800d00000p2 (253:2)
 `-2a802d0e800d00000 (253:0)
    `- (8:0)
76dfe909--20a6--4627--b6c4--7e16656e89a4-outbox (253:25)
 `-360014380125989a10000400000480000 (253:7)
    |- (8:64)
    `- (8:16)
2a802d0e800d00000p1 (253:1)
 `-2a802d0e800d00000 (253:0)
    `- (8:0)

This comment was originally posted by gveitmic

Comment 92 Raz Tamir 2016-12-26 14:28:44 UTC

*** This bug has been marked as a duplicate of bug 1374545 ***

Comment 93 Nir Soffer 2017-02-27 14:54:06 UTC
In this bug we track only the issue of dynamic activation of guest logical volumes
on RHV raw volumes when using iSCSI storage.

This issue is caused by lvmetad, which is not compatible with RHV shared storage
and causes also other trouble. This service is disabled in 4.0.7.

The issue of activation of guest logical volumes during boot when using FC
storage will be tracked in another bug (we have several lvm bugs related to this).

Comment 95 Emma Heftman 2017-03-02 10:06:12 UTC
Nir, please define whether or not this bug requires doc text.

Comment 96 Nir Soffer 2017-03-02 14:46:00 UTC
Add doc text.

Comment 98 Nir Soffer 2017-03-06 19:05:50 UTC
We have 2 issues that we can reproduce and verify here:

A. All lvs are activated after connecting to iSCSI storage domain

1. Setup system with one host, and one iSCSI storage domain.
2. Put host to maintenance
3. Activate host
4. Check the state of all lvs on this storage domain

on 4.0 - all lvs will be active

on 4.1 - only the special lvs, OVF_STORE lvs and lvs used by running
vms will be active

B. Guest lvs created inside a vm on a raw volume are activated on the host;
   this will cause a failure to deactivate lvs when shutting down the vm.

1. Setup system with one host, and one iSCSI storage domain.
2. Create and start one vm running linux
3. Create a raw volume and attach it to the running vm
4. Log in to the vm, and create a pv, vg and lv using the new raw disk,
   assuming that the disk is connected using virtio-scsi at /dev/sdb:

    pvcreate /dev/sdb
    vgcreate guest-vg /dev/sdb
    lvcreate --name test-lv --size 1g guest-vg

5. On the host, check if the guest lv is active

On 4.0, run:

    pvscan --cache
    lvs guest-vg

expected results:
guest-lv will be active

On 4.1 run:

    lvs guest-vg

(pvscan --cache is not needed, since lvmetad is disabled)

expected result:
guest-lv will not be active

6. Stop the vm

On 4.0:

expected result:
- you should see an error in the vdsm log about deactivating the raw volume,
  "logical volume xxxyyy in use"
- Both the raw volume lv and guest-lv will remain active.

On 4.1:

expected results:
- the raw volume lv should be deactivated without error

Comment 99 Kevin Alon Goldblatt 2017-03-07 16:40:05 UTC
Verified with code:
-----------------------
vdsm-4.18.24-3.el7ev.x86_64
ovirt-engine-4.0.7.4-0.1.el7ev.noarch
rhevm-4.0.7.4-0.1.el7ev.noarch

Verified with the scenario above:

Moving to VERIFIED!

Comment 101 errata-xmlrpc 2017-03-16 15:35:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0544.html

