Bug 1337220 - Document mirrored LV does not get activated if one of its legs is missing
Summary: Document mirrored LV does not get activated if one of its legs is missing
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.2
Hardware: All
OS: Linux
unspecified
medium
Target Milestone: rc
: ---
Assignee: Peter Rajnoha
QA Contact: cluster-qe@redhat.com
Steven J. Levine
URL:
Whiteboard:
Keywords:
Depends On:
Blocks: 1420851 1546181
 
Reported: 2016-05-18 14:12 UTC by Marko Karg
Modified: 2018-07-11 15:27 UTC (History)
LVM does not support event-based autoactivation of incomplete volume groups


If a volume group is not complete and physical volumes are missing, LVM does not support automatic LVM event-based activation of that volume group. This implies a setting of "--activationmode complete" whenever autoactivation takes place. For information on the "--activationmode complete" option and automatic activation, see the "vgchange(8)" and "pvscan(8)" man pages.

Note that the event-driven autoactivation hooks are enabled when `lvmetad` is enabled with the `global/use_lvmetad=1` setting in the `/etc/lvm/lvm.conf` configuration file.  Also note that without autoactivation, there is a direct activation hook at the exact time during boot at which the volume groups are activated with only the physical volumes that are available at that time. Any physical volumes that appear later are not taken into account.

This issue does not affect early boot in `initramfs` (`dracut`) nor does this affect direct activation from the command line using "vgchange" and "lvchange" calls, which default to `degraded` activation mode.
Clone Of:
Last Closed: 2018-07-11 14:29:34 UTC



Description Marko Karg 2016-05-18 14:12:14 UTC
Description of problem:

On a RHEL 7.2 machine with use_lvmetad = 1 set in lvm.conf, a mirrored logical volume (1 mirror) does not get activated automatically during boot, even though activation_mode = degraded is set in lvm.conf.

Manual activation, on the other hand, works once the system is booted.

Version-Release number of selected component (if applicable):

lvm2 
lvmetad

How reproducible:

always

Steps to Reproduce:
1. Set up a mirrored LV in a VG whose 2 PVs live on 2 (multipathed) block devices

[root@localhost ~]# pvs
pvs  PV                                  VG        Fmt  Attr PSize   PFree 
pvs  /dev/mapper/mpath-lvm-mirror_dg-004 mirror_dg lvm2 a--   10.00g  4.99g
pvs  /dev/mapper/mpath-lvm-mirror_dg-005 mirror_dg lvm2 a--   10.00g  4.99g

[root@localhost ~]# lvs
lvs  LV        VG        Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
lvs  mirror_lv mirror_dg rwi-a-r---   5.00g                                    100.00          

LVM filters set like this:

global_filter = [
          "a|/dev/disk/by-path/pci-0000:00:1f.2-ata-1.0.*|", # operating system
          "a|/dev/disk/by-id/dm-name-local-lvm-.*|", # configured single path application storage
          "a|/dev/disk/by-id/dm-name-mpath-lvm-.*|", # configured multipath application storage
          "r|.*|"
        ]

(customer setting)

The PVs live on 2 multipath devices:

[root@localhost ~]# multipath -v 1 -ll
mpath-lvm-mirror_dg-005
mpath-lvm-mirror_dg-004


2. Rename one of the aliases in /etc/multipath.conf 

multipaths {
        multipath {
                wwid 36001405dbfc95b79c2449f4adb0a5d6c
                alias mpath-lvm-mirror_dg-004
        }

        multipath {
                wwid 36001405361d3002adfb483f82d9ce75c
                alias Xmpath-lvm-mirror_dg-005
        }
}

Ensure that use_lvmetad = 1 is set in /etc/lvm/lvm.conf

3. Reboot and check status of LVs:

[root@localhost ~]# lvs -a
lvs  WARNING: Device for PV vZV3Gt-TzT3-Rcd3-K46A-OSM7-umLv-XVJpid not found or rejected by a filter.
lvs  LV                   VG        Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
lvs  mirror_lv            mirror_dg rwi---r-p-   5.00g                                                    
lvs  [mirror_lv_rimage_0] mirror_dg Iwi---r---   5.00g                                                    
lvs  [mirror_lv_rimage_1] mirror_dg Iwi---r-p-   5.00g                                                    
lvs  [mirror_lv_rmeta_0]  mirror_dg ewi---r---   4.00m                                                    
lvs  [mirror_lv_rmeta_1]  mirror_dg ewi---r-p-   4.00m  

Manual activation works:

[root@localhost ~]# lvchange -ay mirror_dg/mirror_lv
lvchange  WARNING: Device for PV vZV3Gt-TzT3-Rcd3-K46A-OSM7-umLv-XVJpid not found or rejected by a filter.
[root@localhost ~]# lvs
lvs  WARNING: Device for PV vZV3Gt-TzT3-Rcd3-K46A-OSM7-umLv-XVJpid not found or rejected by a filter.
lvs  LV        VG        Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
lvs  mirror_lv mirror_dg rwi-a-r-p-   5.00g                                    100.00          
lvs  home      rhel      -wi-ao---- 875.08g                                                    
lvs  root      rhel      -wi-ao----  50.00g                                                    
lvs  swap      rhel      -wi-ao----   5.88g   
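
For reading the Attr column in the lvs output above: character 5 is the activation state ("a" means active) and character 9 is the volume health ("p" means partial, i.e. a PV is missing). A minimal sketch of a helper that decodes just these two characters follows; the function name describe_lv_attr is made up for illustration and is not part of lvm2.

```shell
#!/bin/sh
# Decode two characters of an lv_attr string as printed by lvs.
# describe_lv_attr is a hypothetical helper name, not an lvm2 command.
describe_lv_attr() {
    attr=$1
    state=$(printf '%s' "$attr" | cut -c5)    # 5th character: activation state
    health=$(printf '%s' "$attr" | cut -c9)   # 9th character: volume health
    if [ "$state" = "a" ]; then active=yes; else active=no; fi
    if [ "$health" = "p" ]; then partial=yes; else partial=no; fi
    printf 'active=%s partial=%s\n' "$active" "$partial"
}

describe_lv_attr "rwi---r-p-"   # mirror_lv after the reboot
describe_lv_attr "rwi-a-r-p-"   # mirror_lv after the manual lvchange -ay
```

The first call reports the LV as not active and partial, the second as active and still partial, which matches the two lvs listings in this step.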


Actual results:

The mirror_lv does not get activated. 

Expected results:

Automatic activation of mirror_lv in degraded mode.

Additional info:

I've also tried moving the filter settings from global_filter to filter, since global_filter is the one that applies to lvmetad, but the result is the same.

Comment 2 Peter Rajnoha 2016-06-02 13:51:09 UTC
Currently, we don't support automatic LVM activation if the VG is not fully complete and PVs are missing - it only works if all the PVs making up the VG are present in the system.

To autoactivate LVs in partial/degraded mode, we'd need to enhance the autoactivation code so that it can detect whether the partial/degraded activation is safe or not (in the case of mirrors, it may still be considered safe, as we have a complete copy of the data on the remaining mirror leg).

Comment 4 Marko Karg 2016-07-26 10:16:45 UTC
I just re-tested to confirm the customer isn't doing anything wrong and I ended up with the same issue - although the raid1 LV has one leg fully available and activation_mode is set to "degraded" the LV does not get activated automatically. Please let me know if you need any details.

Comment 5 Marko Karg 2016-07-26 13:01:30 UTC
So, let me try to get this right: We do have an activation_mode called degraded, with a comment that says:

    #   degraded
    #     Like complete, but additionally RAID LVs of segment type raid1,
    #     raid4, raid5, radid6 and raid10 will be activated if there is no
    #     data loss, i.e. they have sufficient redundancy to present the
    #     entire addressable range of the Logical Volume.

Please correct me if my understanding is wrong, but I would say that if a VG is lacking one of two underlying PVs, we still have enough redundancy to fulfil the above requirements, and thus a RAID LV of type raid1 should get activated automatically.

Assuming my understanding is correct, this is either a bug or documented incorrectly.

Comment 6 Peter Rajnoha 2016-07-26 13:27:42 UTC
Currently, pvscan --cache -aay (which is called within lvm2-pvscan@major:minor.service and is responsible for VG autoactivation) triggers autoactivation only when it gets information from lvmetad that the VG is complete, in the sense that all of its PVs are present. This procedure doesn't evaluate degraded RAID mode at all, in which the VG/LV could still be activated even though PVs are missing.

It works only at the level of the PV count: autoactivation is triggered once the complete set of PVs in the VG is present. This autoactivation hook needs to be enhanced to take the possibility of such degraded RAID LVs into account, and for that we can't rely only on the information that lvmetad sends today (lvmetad sends a flag in its response to pvscan once all the PVs are present for the VG, and this triggers autoactivation).
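
To illustrate the gate described above, here is a toy model in shell. It is only a sketch of the decision as explained in this comment, not lvm2 code; the function name should_autoactivate and its two arguments are assumptions made for the example.

```shell
#!/bin/sh
# Toy model of the autoactivation gate; not taken from lvm2.
# should_autoactivate PV_TOTAL PV_PRESENT
# Succeeds (exit status 0) only when every PV of the VG has been seen.
# Whether a RAID LV could run degraded is deliberately not considered,
# which mirrors the limitation discussed in this bug.
should_autoactivate() {
    pv_total=$1
    pv_present=$2
    [ "$pv_present" -ge "$pv_total" ]
}

# mirror_dg from this report: 2 PVs in the VG, 1 visible after the rename.
if should_autoactivate 2 1; then
    echo "autoactivate"
else
    echo "skip: VG incomplete"
fi
```

With one of two PVs present, the check fails and nothing is activated, regardless of the activation_mode setting.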

Comment 7 Peter Rajnoha 2016-07-26 13:29:43 UTC
(In reply to Marko Karg from comment #5)
> So, let me try to get this right: We do have an activation_mode called
> degraded, with a comment that says:
> 
>     #   degraded
>     #     Like complete, but additionally RAID LVs of segment type raid1,
>     #     raid4, raid5, radid6 and raid10 will be activated if there is no
>     #     data loss, i.e. they have sufficient redundancy to present the
>     #     entire addressable range of the Logical Volume.
> 
> Please correct me if my understanding is wrong, but I would say that if a VG
> is lacking one of two underlying PVs, we still have enough redundancy to
> fulfil the above requirements, and thus a RAID LV of type raid1 should get
> activated automatically.

It can still be activated manually by calling vgchange/lvchange -ay or -aay. However, this doesn't work with event-based automatic activation, where udev triggers the pvscan --cache -aay call with the additional help of the lvmetad daemon.

Comment 8 Peter Rajnoha 2016-07-26 13:36:18 UTC
Since autoactivation is event-based and devices (PVs) are processed one by one as they appear in the system, we also need a policy that defines how long LVM should wait for another device to appear before activating the VG/LV in degraded mode.

Without such a "timeout policy", the RAID LV would be activated as soon as the first device that makes activation possible appears in the system. And since udev events are processed one by one, we would always end up activating in degraded mode this way, even if more PVs appear in the system later on.

Also, autoactivation currently works per VG, not per LV - so either the VG is activated as a whole or not at all (honoring activation/auto_activation_volume_list and volume_list when iterating over the LVs to autoactivate in a given VG). So there needs to be a solution for the case where the VG contains both RAID LVs that could be activated in degraded mode and ordinary non-RAID LVs that require all the PVs to be present first.

All this area is still not covered by LVM autoactivation (the "waiting" policy, timeouts, partial/degraded activations etc.).
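
A rough sketch of what such a "waiting" policy could look like, purely as an illustration: nothing like this exists in lvm2 today, and the function name decide_activation, its arguments, and the 30-second timeout are all invented for the example. It covers only the timeout question, not the mixed RAID/non-RAID VG case.

```shell
#!/bin/sh
# Hypothetical timeout policy; not implemented anywhere in lvm2.
# decide_activation PV_TOTAL PV_PRESENT RAID_USABLE WAITED_SECONDS TIMEOUT_SECONDS
# Prints one of: complete, degraded, wait.
decide_activation() {
    total=$1
    present=$2
    raid_usable=$3   # "yes" if the RAID LV has enough legs to run degraded
    waited=$4
    timeout=$5
    if [ "$present" -ge "$total" ]; then
        echo complete    # all PVs arrived: normal activation
    elif [ "$raid_usable" = yes ] && [ "$waited" -ge "$timeout" ]; then
        echo degraded    # waited long enough: give up on the missing PV
    else
        echo wait        # keep waiting for further udev events
    fi
}

decide_activation 2 1 yes 0 30    # first PV just appeared
decide_activation 2 1 yes 30 30   # timeout expired with one PV still missing
decide_activation 2 2 yes 5 30    # second PV appeared in time
```

The three calls print wait, degraded, and complete in turn. The point of the timeout is the first case: without it, the first PV event would already lead to a degraded activation.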

Comment 9 Marko Karg 2016-07-26 13:49:37 UTC
Peter are there any plans to make this work with lvmetad? If so, what's the timeline? Customer has valid requirements for degraded but complete (from a data perspective) LVs to be automatically activated during boot in combination with lvmetad.

If that's not planned we should at least get this documented.

Thanks!

Comment 10 Peter Rajnoha 2016-07-26 14:14:55 UTC
(In reply to Marko Karg from comment #9)
> Peter are there any plans to make this work with lvmetad? If so, what's the
> timeline? Customer has valid requirements for degraded but complete (from a
> data perspective) LVs to be automatically activated during boot in
> combination with lvmetad.
> 

We plan to cover this area in a more robust way with a different activation scheme that uses a new specialized instantiation daemon, where all the timeouts and related policies for activating partly available VGs could be defined. However, that is not planned for RHEL7, only for later releases.

For now, a quick solution which I can think of at the moment is either:

  - create a systemd unit that runs vgchange/lvchange directly for the VG/LV where we can expect the LV to be a degraded RAID, and position this unit within the boot sequence so that the VG/LV gets activated this way (I could help with creating such a unit if needed)

  - switch to use_lvmetad=0, in which case we disable autoactivation and rely on direct vgchange -aay calls during boot, which honour the degraded activation for RAIDs (however, switching to use_lvmetad=0 could have disadvantages: lower performance and the loss of autoactivation itself - so if any PV appears while the system is running, outside the bootup sequence, the VG/LV is not activated unless vgchange/lvchange is called directly)

It also depends on where the customer needs such an LV to be activated - whether there is a mount point on it that is needed during the boot sequence, or whether it is just a standalone LV with no "system" mount point, only custom mount points with additional data (which are not required during bootup and can be activated during normal system run).
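
As a starting point for the first option, the following shell sketch prints a oneshot systemd unit that calls vgchange with --activationmode degraded for a single VG. The helper name print_degraded_activation_unit, the ordering directives, and the choice of sysinit.target are assumptions for illustration only; the right position in the boot sequence depends on where the LV is needed and would have to be verified on the target system.

```shell
#!/bin/sh
# Print a oneshot unit that activates one VG in degraded mode during boot.
# The unit layout and ordering below are illustrative assumptions, not a
# tested or supported configuration.
print_degraded_activation_unit() {
    vg=$1
    cat <<EOF
[Unit]
Description=Activate $vg in degraded mode (workaround sketch)
DefaultDependencies=no
Wants=systemd-udev-settle.service
After=systemd-udev-settle.service
Before=local-fs-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/vgchange -ay --activationmode degraded $vg

[Install]
WantedBy=sysinit.target
EOF
}

print_degraded_activation_unit mirror_dg
```

The output could be saved under /etc/systemd/system/ with a name of your choosing and enabled with systemctl enable. Waiting for systemd-udev-settle.service is a coarse way to let the surviving PV appear first; it does nothing for PVs that show up later.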

> If that's not planned we should at least get this documented.

Yes, indeed, I will document this better so it doesn't cause confusion.

Comment 16 Peter Rajnoha 2018-06-05 14:17:33 UTC
As I already said in comment #10, we don't support autoactivation of incomplete VGs (even if there are degraded RAID LVs which could be activated).

For now, we'll just document this issue.

Comment 20 Steven J. Levine 2018-07-10 19:59:41 UTC
Peter:

I did some editing of your description -- gave this a title for the release notes and made some other small edits.  Could you look this over to be sure it still reflects the issue accurately?

Comment 21 Peter Rajnoha 2018-07-11 08:39:42 UTC
(In reply to Steven J. Levine from comment #20)
> Peter:
> 
> I did some editing of your description -- gave this a title for the release
> notes and made some other small edits.  Could you look this over to be sure
> it still reflects the issue accurately?

Yes, the text looks good. Thanks!

Comment 22 Jonathan Earl Brassow 2018-07-11 14:29:34 UTC
This is a known limitation of LVM.  Once the machine is booted, if an incomplete volume group is discovered, it is not autoactivated.

Note that this is a bit different from the boot-up sequence, which does have the ability to activate volumes (e.g. the root volume) if there are missing PVs.

We will not be fixing this in RHEL7.

