Bug 1411197 - Guest LVM gets wiped on RHEL6 Hosts
Summary: Guest LVM gets wiped on RHEL6 Hosts
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.7
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.1.0-beta
Target Release: ---
Assignee: Nir Soffer
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On: 1374545
Blocks:
 
Reported: 2017-01-09 04:32 UTC by Germano Veit Michel
Modified: 2020-03-11 15:43 UTC (History)
10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-16 08:16:35 UTC
oVirt Team: Storage
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 2859661 None None None 2017-01-16 05:47:39 UTC

Description Germano Veit Michel 2017-01-09 04:32:16 UTC
Description of problem:

The VM's internal LVM VG goes missing. This was initially reported as an LVM bug via [2]. After adding filters to the host, as per [1], the problem goes away. So the host is involved - therefore opening this BZ to track it.

But [1] was only known to cause problems on RHEL7 hosts. This is happening on RHEL6 hosts and the symptoms are slightly different:
* On RHEL6, no Guest LVs are seen or active.
* But adding the filters makes the problem go away.

So there must still be some sort of interaction/unprotected access between RHEL6 hosts and guests with that disk configuration - different from RHEL7, but apparently still an issue.

The environment to hit this is exactly the same as in [1]:
* RAW+Preallocated Disks
* No partitions from guest side. (VM's PV on top of RHV SD LV)

The customer can easily reproduce the issue by running pvs in a loop within the VM.

AFAIK, RHEL6 is pretty much EOL'd in RHV from a bugfix perspective. So the questions here are:

1. Should we apply the [1] filters on RHEL6 hosts too and "fix" this via the knowledge base, then recommend the upgrade once [1] is out?

2. Any pointers to what might actually be happening? We don't fully understand it yet.

[1] BZ #1374545 (RHEL7 Host BZ)
[2] BZ #1387819 (Guest LVM BZ - see this BZ to see in more detail how it happens)

Versions:
kernel-2.6.32-504.23.4.el6.x86_64
vdsm-4.16.20-1.el6ev.x86_64
lvm2-2.02.111-2.el6_6.3.x86_64

Comment 1 Germano Veit Michel 2017-01-09 04:33:57 UTC
Filter used is an initial version from BZ #1374545

filter = [ "r|^/dev/disk/by-id/dm-uuid-mpath-.*|", "r|^/dev/disk/by-id/dm-uuid-LVM-.*|" ]

Comment 3 Nir Soffer 2017-01-09 08:07:28 UTC
(In reply to Germano Veit Michel from comment #1)
> Filter used is an initial version from BZ #1374545
> 
> filter = [ "r|^/dev/disk/by-id/dm-uuid-mpath-.*|",
> "r|^/dev/disk/by-id/dm-uuid-LVM-.*|" ]

This filter should not be used according to lvm developers.

The best would be to filter out ovirt lvs, see 
https://gerrit.ovirt.org/#/c/66893/1/static/etc/lvm/lvmlocal.conf

Or, if we can, have a filter whitelisting only the devices used
by the host - this is always the best solution if we can deploy it.
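
As a sketch only (device paths are made up for illustration), a host whitelist in /etc/lvm/lvm.conf accepts just the devices the host itself needs and rejects everything else, so guest PVs inside RHV storage domain LVs are never scanned:

```
filter = [ "a|^/dev/sda2$|", "r|.*|" ]
```

The trailing "r|.*|" is what makes it a whitelist: any device not explicitly accepted is rejected.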


Zdenek, can you confirm that this filter should work on rhel 6?

Comment 4 Germano Veit Michel 2017-01-09 23:54:16 UTC
(In reply to Nir Soffer from comment #3)
> This filter should not be used according to lvm developers.
> 
> The best would be to filter out ovirt lvs, see 
> https://gerrit.ovirt.org/#/c/66893/1/static/etc/lvm/lvmlocal.conf
> 
> Or, if we can, have a filter whitelisting only the devices used
> by the host - this is always the best solution if we can deploy it.

Indeed. We are only using that old filter because it was suggested to the customer before we knew about the newer one. Since I'm having problems with the new filter in a different case, I decided to leave it this way.

But if you say whitelisting is better, I will go with your suggestion; it's possible in this specific case.

Thanks (again!) Nir

Comment 5 Germano Veit Michel 2017-01-13 01:36:06 UTC
Nir,

All good with white-listing. 

Any plans to have filters on RHEL6 hosts via vdsm at this point? Probably not, right? We can handle this via the knowledge base and close the BZ. Just please confirm.

Thanks

Comment 6 Nir Soffer 2017-01-13 13:44:07 UTC
(In reply to Germano Veit Michel from comment #5)
> Any plans to have filters on RHEL6 hosts via vdsm at this point? Probably
> not right. We can handle this via Knowledge base, closing the BZ. Just
> please confirm.

We plan to introduce a blacklist filter for ovirt volumes on 4.1 and 4.z. There
are no plans to backport this to older versions.

We also want to explore the option of an automatically generated whitelist based on
fstab contents.
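
Such a generator could be sketched roughly like this (a hypothetical helper, not vdsm code; UUID=/LABEL= entries would additionally need resolving to device paths via blkid):

```python
def build_lvm_filter(fstab_text):
    """Build an lvm.conf whitelist filter line from fstab-style content.

    Accepts only plain /dev/... device paths found in fstab and rejects
    everything else, so devices not used by the host are never scanned.
    """
    devices = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        device = line.split()[0]
        # Only plain device paths can be whitelisted directly;
        # UUID=/LABEL= entries would need resolving via blkid first.
        if device.startswith("/dev/"):
            devices.append(device)
    # Accept each device exactly, then reject everything else.
    patterns = ['"a|^%s$|"' % d for d in devices] + ['"r|.*|"']
    return "filter = [ %s ]" % ", ".join(patterns)
```

For example, an fstab with a root on /dev/sda2 and a /boot mounted by UUID would yield a filter accepting only /dev/sda2 and rejecting all other devices.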

Comment 7 Germano Veit Michel 2017-01-16 05:47:39 UTC
(In reply to Nir Soffer from comment #6)
> We plan to introduce a blacklist filter for ovirt volumes on 4.1 and 4.z.
> There
> are no plans to backport this to older versions.

Thank you. So I've written a knowledge base solution for this.

> We also want to explore the option of automatically generated whitelist
> based on fstab contents.

Interesting. Let me know when we have it as we have a few customers who would benefit and possibly be interested in helping to develop/test it.

Feel free to close this BZ.

Comment 8 Allon Mureinik 2017-01-16 08:16:35 UTC
(In reply to Germano Veit Michel from comment #7)
> (In reply to Nir Soffer from comment #6)
> > We plan to introduce a blacklist filter for ovirt volumes on 4.1 and 4.z.
> > There
> > are no plans to backport this to older versions.
> 
> Thank you. So I've written a knowledge base solution for this.
> 
> > We also want to explore the option of automatically generated whitelist
> > based on fstab contents.
> 
> Interesting. Let me know when we have it as we have a few customers who
> would benefit and possibly be interested in helping to develop/test it.
> 
> Feel free to close this BZ.

As noted above, RHV 3.5.z is the last version to support EL6 hosts. RHV 3.6 and above only support EL7 hosts.
With RHV 4.0 up to its seventh zstream and RHV 4.1 coming out soon, we won't ship any more fixes in the 3.5 generation.
Closing as suggested above.

