Bug 1371939

Summary: thin_check run on VM disks by host (Guest LVs visible by host - LVM auto-activation bug?)
Product: [oVirt] vdsm
Component: Core
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Version: 4.18.15
Target Milestone: ovirt-4.1.2
Target Release: 4.19.10
Hardware: Unspecified
OS: Unspecified
Reporter: Rik Theys <rik.theys>
Assignee: Nir Soffer <nsoffer>
QA Contact: Elad <ebenahar>
CC: amureini, bugs, ebenahar, mjankula, nsoffer, rik.theys, tomas.vanderka, ylavi
Flags: amureini: ovirt-4.1?
       rule-engine: planning_ack?
       rule-engine: devel_ack+
       rule-engine: testing_ack+
oVirt Team: Storage
Type: Bug
Last Closed: 2017-05-23 08:13:44 UTC
Bug Depends On: 1374545
Attachments: journalctl

Description Rik Theys 2016-08-31 13:17:35 UTC
Description of problem:

[As requested on the oVirt users mailing list I'm filing this bug. For details, see the "thin_check run on VM disk by host on startup ?!" thread in the archives.]

We have VMs that use LVM thin provisioning. During maintenance on our oVirt hosts we migrated the VM to another node and rebooted the node. During startup I noticed it was running a "thin_check" process on the LV of the VM. When the VM was rebooted it failed to bring up the thin-provisioned LVs and they needed recovery.

We experienced this problem before but were unable to pinpoint what the cause was at the time.

When looking at the lvs output on the host, the LVs of the VM (the LVs from the VG inside the VM) were active on the host (the VM was running on another host at the time).

It seems the host scans the LVs (of the storage domain) for physical volumes and tries to auto-activate any VGs it finds on them.

A workaround that seems to work is to disable lvmetad and set auto_activation_volume_list to an empty list in lvm.conf.

systemctl stop lvm2-lvmetad.service lvm2-lvmetad.socket
systemctl mask lvm2-lvmetad.service lvm2-lvmetad.socket

edit /etc/lvm/lvm.conf
   use_lvmetad = 0
   auto_activation_volume_list = []

To be sure I've also configured a filter in lvm.conf to only whitelist the physical volume needed to boot the host.
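
For reference, the filter I configured looks roughly like the following; /dev/sda2 is specific to my host and the device path will differ elsewhere:

   # /etc/lvm/lvm.conf (host-specific example)
   devices {
       filter = [ "a|^/dev/sda2$|", "r|.*|" ]
   }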


Version-Release number of selected component (if applicable):
vdsm from oVirt 4.0.3, but we had a similar issue in the past (oVirt 3.5 or 3.6)

How reproducible:
Not sure; we have rebooted the hosts more often without triggering it, but we have hit this issue at least twice so far.

Comment 2 Nir Soffer 2016-08-31 15:43:49 UTC
The thread in the user list mailing list:
http://lists.ovirt.org/pipermail/users/2016-August/042333.html

Like bug 1202595.

Comment 3 Allon Mureinik 2016-09-04 06:30:09 UTC
(In reply to Nir Soffer from comment #2)
> The thread in the user list mailing list:
> http://lists.ovirt.org/pipermail/users/2016-August/042333.html
> 
> Like bug 1202595.

Can we close one as a duplicate of the other?

Comment 4 Rik Theys 2016-09-05 07:14:27 UTC
Hi,

(In reply to Allon Mureinik from comment #3)
> (In reply to Nir Soffer from comment #2)
> > The thread in the user list mailing list:
> > http://lists.ovirt.org/pipermail/users/2016-August/042333.html
> > 
> > Like bug 1202595.
> 
> Can we close one as a duplicate of the other?

I'm not authorized to see that bug. Can you add me to the CC list of that bug so I can check if it matches my issue?

Rik

Comment 5 Allon Mureinik 2016-09-05 08:08:37 UTC
(In reply to Rik Theys from comment #4)
> Hi,
> 
> (In reply to Allon Mureinik from comment #3)
> > (In reply to Nir Soffer from comment #2)
> > > The thread in the user list mailing list:
> > > http://lists.ovirt.org/pipermail/users/2016-August/042333.html
> > > 
> > > Like bug 1202595.
> > 
> > Can we close one as a duplicate of the other?
> 
> I'm not authorized to see that bug. Can you add me to the CC list of that
> bug so I can check if it matches my issue?
> 
> Rik

Apologies for that.
Bug 1202595 seems to be a RHEV-M bug (it contains customer details, so I'm not sure I should share it), and as such, the two can't be duplicates anyway.
If anything, the RHEV-M bug should depend on the oVirt bug.

Comment 6 Yaniv Lavi 2017-02-23 11:25:18 UTC
Moving out all non-blockers/exceptions.

Comment 7 Nir Soffer 2017-02-27 15:22:28 UTC
Rik, I believe this issue should be solved in 4.1. We disable the lvmetad service
by default.

The fix is available in vdsm >= 4.19.6.

Can you test that this works for you?
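
To check the state on a host, something like the following should be enough (output omitted; a masked unit reports "masked", and lvm should report use_lvmetad=0):

   systemctl is-enabled lvm2-lvmetad.service lvm2-lvmetad.socket
   lvmconfig global/use_lvmetad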

Comment 8 Rik Theys 2017-02-28 10:08:05 UTC
Hi Nir,

What I implemented was:

 1. disable lvmetad in /etc/lvm/lvm.conf by setting use_lvmetad=0
 2. configure auto_activation_volume_list in lvm.conf with the VG of the root disk, so only that one gets auto-activated (see the sketch after this list)
 3. mask the lvm2-lvmetad service and socket unit
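
For step 2, the relevant lvm.conf line looks roughly like this ("vg_root" is a placeholder for whatever the host's root VG is actually called):

   # in the activation section of /etc/lvm/lvm.conf -- "vg_root" is a placeholder
   auto_activation_volume_list = [ "vg_root" ]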

If the new implementation masks the lvm2-lvmetad service and socket, that should also be sufficient, as I understand that automatically determining the correct value for the auto_activation_volume_list parameter is error-prone.

Unfortunately I don't have a host available to test your settings on right now. Our hosts already have those things disabled, so I can't tell whether the upgrade to 4.1 changed anything for me.

When you say the service is "disabled", I assume it is now masked?

Regards,

Rik

Comment 9 Nir Soffer 2017-03-19 14:06:14 UTC
Rik, in 4.1 we disable and mask lvmetad, and we set use_lvmetad = 0 in
/etc/lvm/lvmlocal.conf (overriding lvm.conf).

We don't set auto_activation_volume_list since it cannot be done automatically; we
cannot guess how your host-related VGs are named.
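
For reference, the override amounts to roughly this (a sketch of the setting, not the exact file vdsm writes):

   # /etc/lvm/lvmlocal.conf
   global {
       use_lvmetad = 0
   }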

Comment 10 Rik Theys 2017-03-20 09:17:29 UTC
Hi Nir,

I missed the lvmlocal.conf part. I believe the changes you mention are sufficient to prevent this issue.

Regards,

Rik

Comment 11 Allon Mureinik 2017-03-30 11:32:00 UTC
Nir - was anything done here beyond the changes in bug 1374545? If not, this should be ON_QA alongside with 1374545.

Comment 12 Nir Soffer 2017-04-03 22:55:23 UTC
(In reply to Allon Mureinik from comment #11)
> Nir - was anything done here beyond the changes in bug 1374545? If not, this
> should be ON_QA alongside with 1374545.

No, updating to on qa.

Comment 13 Allon Mureinik 2017-04-03 22:57:01 UTC
(In reply to Nir Soffer from comment #12)
> (In reply to Allon Mureinik from comment #11)
> > Nir - was anything done here beyond the changes in bug 1374545? If not, this
> > should be ON_QA alongside with 1374545.
> 
> No, updating to on qa.
Thanks Nir.
I'm also setting requires-doctext- here based on this response.
Any doctext we may need should be handled in bug 1374545.

Comment 14 rhev-integ 2017-04-26 10:51:15 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[NO RELEVANT PATCHES FOUND]

For more info please contact: infra

Comment 18 Elad 2017-05-01 10:52:37 UTC
Nir, can we move to VERIFIED based on https://bugzilla.redhat.com/show_bug.cgi?id=1374545#c165 ?

Comment 19 Nir Soffer 2017-05-03 08:20:19 UTC
(In reply to Elad from comment #18)
> Nir, can we move to VERIFIED based on
> https://bugzilla.redhat.com/show_bug.cgi?id=1374545#c165 ?

In that bug we did not test thin LVM inside guests. Maybe we can reproduce this
using thin LVs inside a guest?

Basically it is the same verification as for bug 1374545, but when creating the guest LVs,
create thin LVs.

Comment 20 Tomas Vanderka 2017-05-05 17:29:00 UTC
FYI this is still an issue on FC storage, or any block storage connected at boot, regardless of 1374545.

Is there an issue tracking this (or 1374545) specifically for FC storage? I could not find any.

Comment 21 Nir Soffer 2017-05-08 10:11:31 UTC
(In reply to Tomas Vanderka from comment #20)
> FYI this is still an issue on FC storage, or any block storage connected at
> boot, regardless of 1374545.
> 
> Is there an issue tracking this (or 1374545) specifically for FC storage? I
> could not find any.

We have these issues for FC related problems:

- bug 1253640
- bug 1130527
- bug 1400446

Can you give more info why this issue is still relevant with FC? Guest lvs
are activated during boot without proper lvm filter, but vdsm is deactivating
all lvs during boot.

If you have data to prove this, I suggest to create a new bug for this issue
with FC storage.

Comment 22 Rik Theys 2017-05-08 11:09:32 UTC
Hi Nir,

(In reply to Nir Soffer from comment #21)
> (In reply to Tomas Vanderka from comment #20)
> > FYI this is still an issue on FC storage, or any block storage connected at
> > boot, regardless of 1374545.
> > 
> > Is there an issue tracking this (or 1374545) specifically for FC storage? I
> > could not find any.
> 
> We have these issues for FC related problems:
> 
> - bug 1253640
> - bug 1130527
> - bug 1400446
> 
> Can you give more info why this issue is still relevant with FC? Guest lvs
> are activated during boot without proper lvm filter, but vdsm is deactivating
> all lvs during boot.

If I read this statement correct, the system will activate the lvs during boot of the host, but vdsm will deactivate them again?? Unless an LVM filter is configured?

If that is the case, then I don't think this fully fixes this issue as the system will run checks on lvs (especially on thin lvs) upon activation?

Regards,

Rik

Comment 23 Nir Soffer 2017-05-08 11:23:18 UTC
(In reply to Rik Theys from comment #22)
> If I read this statement correct, the system will activate the lvs during
> boot of the host, but vdsm will deactivate them again?? Unless an LVM filter
> is configured?
> 
> If that is the case, then I don't think this fully fixes this issue as the
> system will run checks on lvs (especially on thin lvs) upon activation?

Yes, this is the case with FC storage, when LUNs are connected during boot.

The only way to avoid this issue is to use an lvm filter that allows lvm to use
only the devices needed for boot.

For example:

   devices {
       filter = [ "a|^/dev/sda2$|", "r|.*|" ] 
   }

This filter may be different on different hosts, depending on the deployment.

With iSCSI, the LUNs are not available during boot, and the checks and activation
were triggered by lvmetad, which has been disabled since 4.0.7 and 4.1.0.

Comment 24 Rik Theys 2017-05-08 11:37:51 UTC
Hi,

(In reply to Nir Soffer from comment #23)
> (In reply to Rik Theys from comment #22)
> > If I read this statement correct, the system will activate the lvs during
> > boot of the host, but vdsm will deactivate them again?? Unless an LVM filter
> > is configured?
> > 
> > If that is the case, then I don't think this fully fixes this issue as the
> > system will run checks on lvs (especially on thin lvs) upon activation?
> 
> Yes this is the case with FC storage, when LUNs are connected during boot.
> 
> The only way to avoid this issue is to use an lvm filter that allows lvm to use
> only the devices needed for boot.

Wow, that's bad. We've experienced data loss in the past where a VM was using LVM thin provisioning and was running on one host, and we rebooted a different host in the cluster. The rebooting host ran a thin_check on the LVs during boot, which it should not do.

I would not consider this bug fixed unless the system is configured to use an LVM filter (or auto-activation list) [1], or the documentation at least indicates very strongly that this must be done by the admin for FC storage.

[1] I understand this is very hard to automate correctly for all situations.

Rik

Comment 25 Elad 2017-05-08 11:43:44 UTC
Nir, based on the latest comments here, is the reproduction scenario from comment #19 still relevant?

Comment 26 Nir Soffer 2017-05-08 12:44:51 UTC
(In reply to Elad from comment #25)
> Nir, based on the latest comments here, is the reproduction scenario from
> comment #19 still relevant?

Yes, for iSCSI-based storage, where this should be fixed now. We will have
a separate bug for the FC case, since it requires a different solution.

Comment 27 Tomas Vanderka 2017-05-08 17:59:28 UTC
FYI LVM autoactivation is an issue with shared storage since forever (bug 1009812). This is not really an issue with vdsm, more of a dangerous LVM default behavior in RHEL (made worse by thinp volumes -> dataloss).

Our current workaround on RHVH is to disable autoactivation by setting "auto_activation_volume_list = []" and adding "rd.lvm.lv=hostvg/var" to the kernel command line. Anaconda/imgbase already does this for the root and swap volumes. This avoids using filters.
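
Concretely, the workaround amounts to roughly this (the hostvg/var name comes from our RHVH layout and will differ on other installs):

   # /etc/lvm/lvm.conf, activation section
   auto_activation_volume_list = []

   # add the host LVs that must be activated early to the kernel command line, e.g.:
   grubby --update-kernel=ALL --args="rd.lvm.lv=hostvg/var"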

Comment 28 Nir Soffer 2017-05-09 13:34:32 UTC
(In reply to Tomas Vanderka from comment #27)
> FYI LVM autoactivation is an issue with shared storage since forever (bug
> 1009812). This is not really an issue with vdsm, more of a dangerous LVM
> default behavior in RHEL (made worse by thinp volumes -> dataloss).

Indeed.

> Our current workaround on RHVH is to disable autoactivation by setting
> "auto_activation_volume_list = []" and adding "rd.lvm.lv=hostvg/var" to
> the kernel command line. Anaconda/imgbase already does this for the root and
> swap volumes. This avoids using filters.

auto_activation_volume_list helps, but filtering is better. With a filter,
oVirt or guest LVs do not appear in lvm commands, and a host or a user cannot
shoot itself in the foot without overriding the filter on the command line
(using --config, which is what vdsm does).
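
For illustration, overriding the filter for a single command looks roughly like this (a wide-open filter just as an example; vdsm passes its own, more specific configuration):

   lvs --config 'devices { filter = [ "a|.*|" ] }'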

It is also more efficient: if you have 300 LUNs, the host will not try to pvscan
them during boot or when you run lvm commands.

In both cases, vdsm cannot guess which devices are needed by lvm or which VGs
should be activated on a host; only the system admin can decide.

Comment 29 Tomas Vanderka 2017-05-10 10:44:10 UTC
(In reply to Nir Soffer from comment #21)
> Can you give more info why this issue is still relevant with FC? Guest lvs
> are activated during boot without proper lvm filter, but vdsm is deactivating
> all lvs during boot.

The data corruption of thin volumes happens when activated at boot by lvm2-activation*.service. It does not matter if vdsm deactivates them later.

Also, vdsm fails to deactivate them anyway, because there are plenty of duplicated LVM names/UUIDs caused by cloned VMs. There can also be completely broken or even maliciously set up LVM volumes inside any VM.

> 
> If you have data to prove this, I suggest to create a new bug for this issue
> with FC storage.

We requested a new bug through the Red Hat support case we have open; I can provide more data there.

Comment 30 Nir Soffer 2017-05-10 10:51:02 UTC
(In reply to Tomas Vanderka from comment #29)
> (In reply to Nir Soffer from comment #21)
> > Can you give more info why this issue is still relevant with FC? Guest lvs
> > are activated during boot without proper lvm filter, but vdsm is deactivating
> > all lvs during boot.
> 
> The data corruption of thin volumes happens when activated at boot by
> lvm2-activation*.service. It does not matter if vdsm deactivates them later.
>...
> We requested a new bug through the Red Hat support case we have open; I can
> provide more data there.

Tomas, if you can provide instructions for reproducing this it would be great.

Comment 31 Elad 2017-05-10 11:47:33 UTC
Tested the following (with iSCSI):

- Created a VM with OS installed (RHEL7.3)
- Attached 2 iSCSI connected direct LUNs to the VM
- Created a PV on top of each of the LUNs
- Created a VG on top of the 2 PVs
- Created a thin LV in the new VG:

[root@localhost ~]# lvcreate -L 100M -T test-vg/thinpool -V1G -n thinvolume   
 Using default stripesize 64.00 KiB. 
 WARNING: Sum of all thin volume sizes (1.00 GiB) exceeds the size of thin pool test-vg/thinpool (100.00 MiB)! 
 For thin pool auto extension activation/thin_pool_autoextend_threshold should be below 100. 
 Logical volume "thinvolume" created. 
[root@localhost ~]# lvs 
 LV         VG      Attr       LSize   Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert 
 thinpool   test-vg twi-aotz-- 100.00m                 0.00   0.98                             
 thinvolume test-vg Vwi-a-tz--   1.00g thinpool        0.00                                    
[root@localhost ~]# vgs 2>/dev/null 
 VG      #PV #LV #SN Attr   VSize  VFree  
 test-vg   2   2   0 wz--n- 59.99g 59.89g 
[root@localhost ~]# pvs 2>/dev/null   
 PV         VG      Fmt  Attr PSize  PFree  
 /dev/sda   test-vg lvm2 a--  30.00g 29.89g 
 /dev/sdb   test-vg lvm2 a--  30.00g 29.99g 


- Created an ext4 file system on LV 'thinvolume' and mounted it
- Created files in the new file system
- Got checksum for one of the files:

[root@localhost new-fs]# md5sum vdsm-4.20.0-768.git20a7209.el7.x86_64.rpm 
a93b3dae1cd0ccf7f234b00b541aa71d  vdsm-4.20.0-768.git20a7209.el7.x86_64.rpm

- Rebooted the VM several times and took a checksum before each reboot. The checksum remained the same.
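
For reference, the file system steps were along these lines (the mount point name is arbitrary):

   mkfs.ext4 /dev/test-vg/thinvolume
   mkdir /new-fs && mount /dev/test-vg/thinvolume /new-fs
   # copy a few files in and record a checksum, e.g.:
   md5sum /new-fs/*.rpm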


Used:

XtremIO as storage backend

Engine:
rhevm-4.1.2.1-0.1.el7.noarch

Hypervisor:
Red Hat Virtualization Host 4.1 (el7.3)
vdsm-4.19.12-1.el7ev.x86_64
libvirt-daemon-2.0.0-10.el7_3.5.x86_64
sanlock-3.4.0-1.el7.x86_64
selinux-policy-3.13.1-102.el7_3.16.noarch
qemu-kvm-rhev-2.6.0-28.el7_3.9.x86_64
iscsi-initiator-utils-6.2.0.873-35.el7.x86_64
lvm2-2.02.166-1.el7_3.4.x86_64

Guest:
Red Hat Enterprise Linux 7.3
lvm2-2.02.166-1.el7_3.3.x86_64


Nir, is this test correct? Can I verify?

Comment 32 Nir Soffer 2017-05-10 16:23:31 UTC
(In reply to Elad from comment #31)
Elad, can you attach output of journalctl for the last boot or recent boots?

Comment 33 Elad 2017-05-11 13:48:04 UTC
Created attachment 1277907 [details]
journalctl

Comment 34 Nir Soffer 2017-05-11 14:15:56 UTC
I think we can move this to verified. We will create a new bug for FC.

Comment 35 Nir Soffer 2017-05-11 14:25:16 UTC
Marian, this bug is not about FC storage. On FC this issue requires a very different
solution.

This issue is fixed for users on iSCSI storage, but not for FC storage. We need
a separate bug for FC storage.

Please remove the customer case from this bug, and create a downstream bug for 
this case.

Comment 36 Marian Jankular 2017-05-12 11:07:13 UTC
Hi Nir,

I have already created a new bugzilla and removed the case from this one.

Link to the new bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1449968