Bug 1523152 - [downstream clone - 4.2.1] Guest LVs created on raw volumes are auto activated on the hypervisor with FC storage (lvm filter?)
Summary: [downstream clone - 4.2.1] Guest LVs created on raw volumes are auto activated on the hypervisor with FC storage (lvm filter?)
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Nir Soffer
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard:
Depends On: 1449968
Blocks: 1524500
 
Reported: 2017-12-07 10:19 UTC by rhev-integ
Modified: 2022-02-17 17:21 UTC
CC: 26 users

Fixed In Version: vdsm v4.20.13
Doc Type: Bug Fix
Doc Text:
Cause: During boot, LVM scans and activates RHV raw volumes. It then scans and activates guest logical volumes created inside guests on top of RHV raw volumes, as well as guest logical volumes inside LUNs that are not part of any RHV storage domain. Consequence: Thousands of logical volumes that should not be active are active on the host. This leads to very slow boot and may later lead to data corruption if a logical volume that is active on the host is extended on another host. Fix: The user should configure an LVM filter using vdsm-tool config-lvm-filter. The LVM filter prevents scanning and activation of logical volumes not required by the host. Result: Fast boot; LVM activates only the logical volumes required by the host.
Clone Of: 1449968
Cloned to: 1524500
Environment:
Last Closed: 2018-02-01 15:54:14 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments: none


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 83524 0 master MERGED lvmfilter: Introduce the storage.lvmfilter module 2017-12-07 10:23:08 UTC
oVirt gerrit 84396 0 master MERGED tool: Add config-lvm-filter command 2017-12-07 10:23:08 UTC
oVirt gerrit 84694 0 master MERGED spec: Require python-augeas 2017-12-07 10:23:08 UTC
oVirt gerrit 84695 0 master MERGED lvmconf: Introduce the storage lvmconf module 2017-12-07 10:23:08 UTC
oVirt gerrit 84826 0 master MERGED lvmfilter: Analyze current LVM filter 2017-12-07 10:23:08 UTC
oVirt gerrit 84829 0 master MERGED config-lvm-filter: Configure LVM filter 2017-12-07 10:23:08 UTC
oVirt gerrit 84918 0 ovirt-4.1 MERGED lvmfilter: Introduce the storage.lvmfilter module 2017-12-11 16:33:19 UTC
oVirt gerrit 84919 0 ovirt-4.1 MERGED tool: Add config-lvm-filter command 2017-12-11 16:47:58 UTC
oVirt gerrit 84920 0 ovirt-4.1 MERGED spec: Require python-augeas 2017-12-11 17:36:10 UTC
oVirt gerrit 84921 0 ovirt-4.1 MERGED lvmconf: Introduce the storage lvmconf module 2017-12-11 18:04:52 UTC
oVirt gerrit 84922 0 ovirt-4.1 MERGED lvmfilter: Analyze current LVM filter 2017-12-12 08:49:53 UTC
oVirt gerrit 84923 0 ovirt-4.1 MERGED config-lvm-filter: Configure LVM filter 2017-12-12 09:10:13 UTC
oVirt gerrit 85126 0 master MERGED lvmfilter: Filter out the master LV on the SPM 2017-12-07 10:23:08 UTC
oVirt gerrit 85127 0 master MERGED lvmfilter: Remove double [yes/No]? 2017-12-07 10:23:08 UTC
oVirt gerrit 85233 0 ovirt-4.1 MERGED tests: Add minimal storage test infrastructure 2017-12-11 15:25:19 UTC
oVirt gerrit 85234 0 ovirt-4.1 MERGED storaqge: fix failing lvmfilter test 2017-12-12 09:19:02 UTC
oVirt gerrit 85235 0 ovirt-4.1 MERGED lvmfilter: Remove double [yes/No]? 2017-12-12 11:01:53 UTC
oVirt gerrit 85236 0 ovirt-4.1 MERGED lvmfilter: Filter out the master LV on the SPM 2017-12-17 07:52:19 UTC

Description rhev-integ 2017-12-07 10:19:45 UTC
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1449968 +++
======================================================================

Description of problem:
Guest LVs created on raw volumes are auto activated on the hypervisor with FC storage

The host still auto-activates everything at boot, and LVM freaks out seeing all the guest VGs/LVs, many of which have duplicate names/GUIDs (because of cloned VMs). This breaks:
1. RHVH installation - anaconda fails on LVM errors if storage is connected to server
2. Upgrade - nodectl/imgbase used to fail on LVM errors, not sure if still fails or ignores the errors
3. VDSM - cannot correctly deactivate powered off VM disks, because guest LVs are still active on them
4. boot/shutdown issues as lvm services fail to start/stop correctly
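
For illustration, the symptom can be observed on the host after boot with plain LVM commands (a sketch; the guest VG name is hypothetical):

    # guest LVs from VM disks show up as ACTIVE in the host's LVM view
    lvscan

    # deactivating a powered-off VM's leaked guest VG by hand
    vgchange -an guestvg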

Version-Release number of selected component (if applicable):
rhvh-4.1-0.20170421

How reproducible:


Steps to Reproduce:
1. Install a host and add it to the manager.
2. Add FC storage to the data center.
3. Create a VM on a raw device with LVM on top of it (see the sketch below).
4. Shut down the VM and put the host into maintenance.
5. Reboot the host.
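
Step 3, inside the guest, amounts to something like the following (a sketch; the disk name /dev/vdb and the LVM names are hypothetical):

    # inside the guest, on the raw disk backed by the FC data domain
    pvcreate /dev/vdb
    vgcreate guestvg /dev/vdb
    lvcreate -n guestlv -L 10G guestvg
    mkfs.xfs /dev/guestvg/guestlv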


Actual results:
Guest VGs and LVs are auto-activated on boot.

Expected results:
Guest VGs and LVs are not auto-activated on boot.

Additional info:

(Originally by Marian Jankular)

Comment 1 rhev-integ 2017-12-07 10:19:58 UTC
There are several bugs here:

1) We use anaconda from platform, so I'd expect this bug to also be reproducible on RHEL. Can you please open a bug against Anaconda to track this?

2) nodectl/imgbase now ignores LVM errors.

3/4) These look like duplicates of rhbz#1374545, but that shipped with 4.1.1-1. Nir, can you please look at this? RHVH doesn't have any special/different handling for LVs, so I'd expect this to be vdsm/storage.

(Originally by Ryan Barry)

Comment 7 rhev-integ 2017-12-07 10:20:39 UTC
(In reply to Ryan Barry from comment #1)
> There are several bugs here:
> 
> 1) We use anaconda from platform, so I'd expect this bug to also be
> reproducible on RHEL. Can you please open a bug against Anaconda to track
> this?

I agree, issues with installing on a host with connected FC storage
are a platform issue; we need a RHEL bug for this.

> 2) nodectl/imgbase now ignores LVM errors.

I don't know what these errors are, but failing on LVM errors sounds reasonable;
ignoring LVM errors is very risky.

> 3/4) These look like duplicates of rhbz#1374545, but that shipped with
> 4.1.1-1. Nir, can you please look at this? RHVH doesn't have any
> special/different handling for LVs, so I'd expect this to be vdsm/storage.

We have handled only iSCSI issues so far; we have a couple of bugs open for FC
issues, see bug 1450114.

(Originally by Nir Soffer)

Comment 8 rhev-integ 2017-12-07 10:20:50 UTC
Marian, I would like to focus on a single issue in this bug. If you look in the
tracker bug 1450114 we already have bugs about:

- duplicate vg/lvs - bug 1130527
- slow boot - bug 1400446
- dirty luns not usable - bug 1253640

We lack a bug about thin_check breaking thin guest lvs with FC storage
(see bug 1371939). Do you have a relevant case for this issue? If not, maybe
the case you added is relevant to one of the other bugs?

(Originally by Nir Soffer)

Comment 9 rhev-integ 2017-12-07 10:20:58 UTC
(In reply to Nir Soffer from comment #6)
> (In reply to Ryan Barry from comment #1)
> I agree, issues with installing on a host with connected FC storage
> are a platform issue; we need a RHEL bug for this.

I'll open a platform bug

> I don't know what these errors are, but failing on LVM errors sounds
> reasonable; ignoring LVM errors is very risky.

Well, we're not strictly ignoring LVM errors. It's simply that with the lvmetad changes, stderr emits a lot of messages. We drop that output to get a clean parse; we do not ignore non-zero return codes.
 
> We have handled only iSCSI issues so far; we have a couple of bugs open
> for FC issues, see bug 1450114.

Since this bug is probably not RHVH specific, do you want to grab this for vdsm/storage?

(Originally by Ryan Barry)

Comment 10 rhev-integ 2017-12-07 10:21:06 UTC
(In reply to Ryan Barry from comment #8)
> Since this bug is probably not RHVH specific, do you want to grab this for
> vdsm/storage?

Let's first understand what unique issue in this bug is not already tracked
in one of the other bugs. Waiting for Marian's reply to comment 7.

(Originally by Nir Soffer)

Comment 11 rhev-integ 2017-12-07 10:21:14 UTC
Hello Nir,

I think none of the bugs in comment #7 is relevant for this one. The guest LVs that are auto-activated are not on direct LUNs assigned to the VMs;
they are on raw volumes from the data storage domain.

(Originally by Marian Jankular)

Comment 13 rhev-integ 2017-12-07 10:21:29 UTC
Based on comment 10, moving this to vdsm.

(Originally by Nir Soffer)

Comment 14 rhev-integ 2017-12-07 10:21:37 UTC
4.1.4 is planned as a minimal, fast, z-stream version to fix any open issues we may have in supporting the upcoming EL 7.4.

Pushing out anything unrelated, although if there's a minimal/trivial, SAFE fix that's ready on time, we can consider introducing it in 4.1.4.

(Originally by Allon Mureinik)

Comment 17 rhev-integ 2017-12-07 10:22:00 UTC
Hello Nir,

thank you for responding, I appreciate it even more as you are on PTO.
Would it be possible to add this as a default setting on our RHVH hosts? I think it should not break anything, as we activate RHV storage domains with VDSM. Or at least we should highlight this in the documentation.

What do you think?

(Originally by Marian Jankular)

Comment 18 rhev-integ 2017-12-07 10:22:10 UTC
I'm also on PTO, but from an RHVH perspective, we'd rather see this fixed in vdsm or the platform, since this problem is not at all unique to RHVH.

Highlighting this in the documentation would be ideal, though

(Originally by Ryan Barry)

Comment 19 rhev-integ 2017-12-07 10:22:19 UTC
(In reply to Marian Jankular from comment #16)
> Hello Nir,
> 
> thank you for responding, I appreciate it even more as you are on PTO.
> Would it be possible to add this as a default setting on our RHVH hosts?
> I think it should not break anything, as we activate RHV storage domains
> with VDSM. Or at least we should highlight this in the documentation.
> 
> What do you think?

There's no magic solution here. The current solution is manual configuration. A better, longer term solution would be a script (Ansible?) that creates this LVM filter and keeps it up-to-date.

(Originally by Yaniv Kaul)
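
For reference, the manual configuration mentioned above boils down to a whitelist filter in the devices section of /etc/lvm/lvm.conf; a minimal sketch, assuming the host's only local LVM device is /dev/sda2:

    # /etc/lvm/lvm.conf -- accept only the host's local device and
    # reject everything else, including shared FC LUNs owned by Vdsm
    devices {
        filter = [ "a|^/dev/sda2$|", "r|.*|" ]
    }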

Comment 20 rhev-integ 2017-12-07 10:22:26 UTC
*** Bug 1507691 has been marked as a duplicate of this bug. ***

(Originally by Daniel Erez)

Comment 21 rhev-integ 2017-12-07 10:22:34 UTC
Current state:

Vdsm now includes a new vdsm-tool config-lvm-filter command. This command
analyzes the current host and recommends an LVM filter for it.

Here is an example run of this command:

    # vdsm-tool config-lvm-filter
    Found these mounted logical volumes on this host:

      logical volume:  /dev/mapper/vg0-lv_home
      mountpoint:      /home
      devices:         /dev/vda2

      logical volume:  /dev/mapper/vg0-lv_root
      mountpoint:      /
      devices:         /dev/vda2

      logical volume:  /dev/mapper/vg0-lv_swap
      mountpoint:      [SWAP]
      devices:         /dev/vda2

    This is the recommended LVM filter for this host:

      filter = [ "a|^/dev/vda2$|", "r|.*|" ]

    To use this LVM filter please edit /etc/lvm/lvm.conf
    and set the 'filter' option in the 'devices' section.
    It is recommended to reboot the hypervisor to verify the
    filter.

    This filter will allow LVM to access the local devices used
    by the hypervisor, but not shared storage owned by Vdsm.
    If you want to add another local device you will have to
    add the device manually to the LVM filter.

The next step is analyzing the currently configured LVM filter, and configuring
the LVM filter when the user confirms.

We hope to perform this configuration automatically as part of host deployment.

(Originally by Nir Soffer)
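
After applying the recommended filter and rebooting, a quick sanity check (a sketch) is that LVM no longer reports the guest VGs:

    # only host-local VGs and LVs should be listed now
    vgs
    lvs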

Comment 22 rhev-integ 2017-12-07 10:22:43 UTC
I posted 2 additional patches, improving this configuration tool.

`vdsm-tool config-lvm-filter` analyzes the current LVM configuration to decide
whether a filter needs to be configured, and configures the filter automatically
if possible.

If the host is already configured, the tool does nothing:

    # vdsm-tool config-lvm-filter
    Analyzing host...
    LVM filter is already configured for Vdsm

On a host that is not yet configured, the tool configures LVM automatically after
the user confirms the operation.

    # vdsm-tool config-lvm-filter
    Analyzing host...
    Found these mounted logical volumes on this host:

      logical volume:  /dev/mapper/vg0-lv_home
      mountpoint:      /home
      devices:         /dev/vda2

      logical volume:  /dev/mapper/vg0-lv_root
      mountpoint:      /
      devices:         /dev/vda2

      logical volume:  /dev/mapper/vg0-lv_swap
      mountpoint:      [SWAP]
      devices:         /dev/vda2

    This is the recommended LVM filter for this host:

      filter = [ "a|^/dev/vda2$|", "r|.*|" ]

    This filter will allow LVM to access the local devices used by the
    hypervisor, but not shared storage owned by Vdsm. If you add a new
    device to the volume group, you will need to edit the filter manually.

    Configure LVM filter? [yes,NO] ? [NO/yes] yes
    Configuration completed successfully!

    Please reboot to verify the LVM configuration.

If the host configuration does not match the configuration Vdsm requires,
the user needs to configure the LVM filter manually.

    # vdsm-tool config-lvm-filter
    Analyzing host...
    Found these mounted logical volumes on this host:

      logical volume:  /dev/mapper/vg0-lv_home
      mountpoint:      /home
      devices:         /dev/vda2

      logical volume:  /dev/mapper/vg0-lv_root
      mountpoint:      /
      devices:         /dev/vda2

      logical volume:  /dev/mapper/vg0-lv_swap
      mountpoint:      [SWAP]
      devices:         /dev/vda2

    This is the recommended LVM filter for this host:

      filter = [ "a|^/dev/vda2$|", "r|.*|" ]

    This filter will allow LVM to access the local devices used by the
    hypervisor, but not shared storage owned by Vdsm. If you add a new
    device to the volume group, you will need to edit the filter manually.

    This is the current LVM filter:

      filter = [ "a|^/dev/vda2$|", "a|^/dev/vdb1$|", "r|.*|" ]

    WARNING: The current LVM filter does not match the recommended filter,
    Vdsm cannot configure the filter automatically.

    Please edit /etc/lvm/lvm.conf and set the 'filter' option in the
    'devices' section to the recommended value.

    It is recommended to reboot after changing the LVM filter.

This completes the work planned for 4.2.0.

After additional testing, we can integrate this into host deployment, so in the
common case the LVM filter will be configured automatically.

In some cases, when a user has configured an incompatible LVM filter, they will
have to modify the configuration manually.

(Originally by Nir Soffer)
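
In the mismatch case above, the currently configured filter can be inspected directly before editing it (a sketch):

    # show any filter currently set in lvm.conf (ignores commented-out lines)
    grep -E '^[[:space:]]*filter' /etc/lvm/lvm.conf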

Comment 23 rhev-integ 2017-12-07 10:22:52 UTC
Two additional patches were merged, fixing issues we found in the last week.

(Originally by Nir Soffer)

Comment 24 Allon Mureinik 2017-12-07 13:41:35 UTC
Tal/Nir, can you please explain why we need this clone?

Comment 25 Allon Mureinik 2017-12-07 13:42:10 UTC
And is "NEW" the correct status?

Comment 26 Nir Soffer 2017-12-07 13:57:33 UTC
(In reply to Allon Mureinik from comment #24)
> Tal/Nir, can you please explain why we need this clone?

Needed for backporting the LVM filter patches to 4.1.9, as we discussed.

Comment 27 Sandro Bonazzola 2017-12-20 08:26:55 UTC
According to git log this bug may be included in 4.2.0. Can you please cross check?

Comment 28 Nir Soffer 2018-01-02 09:12:36 UTC
(In reply to Sandro Bonazzola from comment #27)
> According to git log this bug may be included in 4.2.0. Can you please cross
> check?

I already replied about this multiple times in the other related bugs. All the
patches are included in 4.2.0, except 2 fixes for the new code introduced after
4.2.0 was built. So this will be available in 4.2.1.

Comment 29 RHV bug bot 2018-01-05 16:59:00 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 30 RHV bug bot 2018-01-12 14:41:24 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 31 RHV bug bot 2018-01-18 17:39:47 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 32 RHV bug bot 2018-01-24 22:08:21 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 33 RHV bug bot 2018-01-30 11:23:19 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{}', ]

For more info please contact: rhv-devops

Comment 35 Franta Kust 2019-05-16 12:54:41 UTC
BZ <-> Jira re-sync

