Bug 1451459
| Summary: | Attempting to activate a 7.4 cache metadata 2 format volume on a shared storage 7.3 machine results in the VG being deleted | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | lvm2 | Assignee: | Zdenek Kabelac <zkabelac> |
| lvm2 sub component: | Cache Logical Volumes | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | agk, heinzm, heri, jbrassow, mcsontos, msnitzer, prajnoha, teigland, zkabelac |
| Version: | 7.4 | | |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | lvm2-2.02.171-5.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-08-01 21:54:18 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Corey Marthaler
2017-05-16 16:58:40 UTC
The VG remains intact when lvmetad is not running on the 7.3 machine.
[root@harding-03 ~]# systemctl status lvm2-lvmetad
lvm2-lvmetad.service - LVM2 metadata daemon
Loaded: loaded (/usr/lib/systemd/system/lvm2-lvmetad.service; disabled; vendor preset: enabled)
Active: active (running) since Tue 2017-05-16 10:50:04 CDT; 1h 23min ago
Docs: man:lvmetad(8)
Main PID: 32643 (lvmetad)
CGroup: /system.slice/lvm2-lvmetad.service
32643 /usr/sbin/lvmetad -f
May 16 10:50:04 harding-03.lab.msp.redhat.com systemd[1]: Started LVM2 metadata daemon.
May 16 10:50:04 harding-03.lab.msp.redhat.com systemd[1]: Starting LVM2 metadata daemon...
[root@harding-02 ~]# systemctl status lvm2-lvmetad
lvm2-lvmetad.service - LVM2 metadata daemon
Loaded: loaded (/usr/lib/systemd/system/lvm2-lvmetad.service; disabled; vendor preset: enabled)
Active: inactive (dead) since Tue 2017-05-16 12:13:19 CDT; 15s ago
Docs: man:lvmetad(8)
Main PID: 822 (code=exited, status=0/SUCCESS)
May 16 15:12:13 harding-02.lab.msp.redhat.com systemd[1]: Started LVM2 metadata daemon.
May 16 15:12:13 harding-02.lab.msp.redhat.com systemd[1]: Starting LVM2 metadata daemon...
May 16 12:13:19 harding-02.lab.msp.redhat.com systemd[1]: Stopping LVM2 metadata daemon...
May 16 12:13:19 harding-02.lab.msp.redhat.com lvmetad[822]: Failed to accept connection errno 11.
May 16 12:13:19 harding-02.lab.msp.redhat.com systemd[1]: Stopped LVM2 metadata daemon.
# 7.3 machine (with lvmetad stopped)
[root@harding-02 ~]# pvscan --cache
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
[root@harding-02 ~]# lvs
WARNING: Failed to connect to lvmetad. Falling back to device scanning.
Unknown status flag 'METADATA_FORMAT'.
Could not read status flags.
Couldn't read status flags for logical volume VG/pool.
Couldn't read all logical volume names for volume group VG.
Unknown status flag 'METADATA_FORMAT'.
Could not read status flags.
Couldn't read status flags for logical volume VG/pool.
Couldn't read all logical volume names for volume group VG.
Unknown status flag 'METADATA_FORMAT'.
Could not read status flags.
Couldn't read status flags for logical volume VG/pool.
Couldn't read all logical volume names for volume group VG.
Unknown status flag 'METADATA_FORMAT'.
Could not read status flags.
Couldn't read status flags for logical volume VG/pool.
Couldn't read all logical volume names for volume group VG.
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
home rhel_harding-02 -wi-ao---- 200.52g
root rhel_harding-02 -wi-ao---- 50.00g
swap rhel_harding-02 -wi-ao---- 27.95g
# Back to 7.4 machine
[root@harding-03 ~]# pvscan --cache
[root@harding-03 ~]# lvchange -ay VG/origin
[root@harding-03 ~]# lvs -a -o +devices
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Devices
[lvol0_pmspare] VG ewi------- 12.00m /dev/mapper/mpatha1(5)
origin VG Cwi-a-C--- 20.00m [pool] [origin_corig] 0.00 0.39 0.00 origin_corig(0)
[origin_corig] VG owi-aoC--- 20.00m /dev/mapper/mpatha1(0)
[pool] VG Cwi---C--- 12.00m 0.00 0.39 0.00 pool_cdata(0)
[pool_cdata] VG Cwi-ao---- 12.00m /dev/mapper/mpathb1(0)
[pool_cmeta] VG ewi-ao---- 12.00m /dev/mapper/mpathb1(3)
The METADATA_FORMAT LV flag is added to the VG metadata so that older versions of lvm will not try to use the LV with the incompatible cache metadata format. Unfortunately, this is not a good way of handling backward compatibility, because old versions fail to parse the VG metadata and return NULL. When using lvmetad, there is then no VG metadata to send to lvmetad: pvscan --cache sends the PV info to lvmetad, but no VG metadata follows it. In lvmetad, this looks like a PV that exists with no VG using it. The recent "pv in use" flag does its job and tells lvm that the PV really is used by a VG, even though there's no VG metadata available for it. The next command, e.g. pvs, gets the PVs and VGs from lvmetad, and sees there is no VG using the PV marked in-use (and holding an mda). _check_or_repair_orphan_pv_ext() mistakenly thinks that this is a situation where the in-use flag was wrongly set and "repairs" the PV, and the VG is clobbered.

There are multiple problems:

1. There needs to be a way of reading and returning VG metadata with unknown parts, so that the VG metadata can exist in lvm but not be used. Something similar is done for unknown segment types. The unknown segment types logic needs to be expanded to cover other cases, or the same idea needs to be implemented for other parts of the VG metadata that are unknown. With this in place, pvscan in the example above would get a kind of "opaque" VG metadata back from scanning, and send it to lvmetad.

2. vg_read should not be repairing PVs when it doesn't think the in-use flag is correct. This should be a manual operation. There are too many cases where a mistake or bug can trigger this and remove a VG (I've suggested this before, when this happened to me in a different situation). Even a user error, such as a mistake in zoning devices, could cause lvm to clobber a VG just by reading it.

3. The cache metadata format 2 compatibility needs to use some mechanism described in 1, or maybe there's some other mechanism. Adding to the problem is that it should use mechanisms that already exist in lvm versions that have been released. I'm wondering what methods we've used before when adding new features that are not compatible with old versions.

When not using lvmetad, lvm from 7.3 reports the errors in comment 2 and shows the device as not even being a PV. Because it's not reported as a PV, lvm does not attempt to "repair" the in-use flag, so it doesn't automatically clobber the VG. However, by not thinking the device is a PV, it will also happily allow you to run pvcreate on it. So, the difference between with-lvmetad and without-lvmetad is that with lvmetad, lvm sees the device as an in-use PV, and without lvmetad, lvm sees the device as completely unused. The with-lvmetad state is more correct, and actually what we want, but unfortunately it also happens to trigger the incorrect repair code.

We're discussing alternative solutions, such as appending these flags to the segment type in the on-disk metadata, e.g. "cache+v2", so the old version will invoke the unknown segment type code, which we believe still works. I assume raid is similarly affected by this problem.

> 2. vg_read should not be repairing PVs when it doesn't think the in-use flag
> is correct.
In general, this repair only applies to PVs with no mdas on them, which makes it not quite as problematic (although I still think it's better to require a manual action to remove the PV). To make the repair safer (i.e. less likely to be triggered by other bugs in lvm, as shown in this bz), _check_or_repair_orphan_pv_ext() could read the mda itself directly from disk to verify that no VG is using the PV being repaired. As it stands now, unknown bugs elsewhere in lvm are liable to trigger the repair and wrongly remove a PV.
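As a rough illustration of that hardening, here is a toy model only (not lvm2 code; the struct and helper names are invented for this sketch): before the orphan-PV repair clears the "PV in use" flag, it re-checks, from a direct read of the PV's own metadata area, whether any VG metadata is present at all, and refuses to touch the flag if it is - even when that metadata could not be parsed.

/* Toy model of the hardening proposed above -- not lvm2 code.  All names
 * (struct pv, repair_in_use_flag, mda_bytes_on_disk) are invented here. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct pv {                       /* heavily simplified PV state              */
        const char *name;
        bool in_use;              /* "PV in use" flag from the PV header      */
        bool seen_as_orphan;      /* what the (possibly wrong) cache thinks   */
        size_t mda_bytes_on_disk; /* raw metadata found by a direct disk read */
};

/* Returns true if the flag was cleared, false if repair was skipped. */
static bool repair_in_use_flag(struct pv *pv)
{
        if (!pv->in_use || !pv->seen_as_orphan)
                return false;                    /* nothing to repair */

        if (pv->mda_bytes_on_disk > 0) {
                /* Metadata exists but was not understood -- exactly the case
                 * in this bug -- so leave the PV alone.                      */
                printf("WARNING: %s holds metadata that cannot be used; "
                       "not clearing the in-use flag.\n", pv->name);
                return false;
        }

        pv->in_use = false;                      /* genuinely unused orphan */
        return true;
}

int main(void)
{
        struct pv bugged = { "/dev/mapper/mpatha1", true, true, 4096 };
        repair_in_use_flag(&bugged);             /* repair is skipped */
        return 0;
}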
> > 2. vg_read should not be repairing PVs when it doesn't think the in-use flag
> > is correct.
>
> In general, this repair only applies to PVs with no mda's on them, which
> makes it not quite as problematic
This is still a misunderstanding of the pv-in-use repair case. It's in-use orphan PVs with mdas that are repaired (since in-use and orphan are contradictory, the in-use flag is cleared). lvm expects that since the PV has an mda, it will know about the VG using it, and the PV won't look like an orphan. In this bug, the VG found in the mda is not understood (new flag unrecognized), and bad error handling makes lvm think there is no VG (instead of a VG that's not understood), which triggers the repair case: a PV marked in-use, with an mda, that appears unused by any VG.
I no longer think there are "too many cases" where this repair could be wrongly triggered. I still think there are some (not least of which are unknown lvm bugs), which make me prefer an option for manual repair in the future.
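The misfire described above comes from the read path collapsing "metadata present but not understood" into the same empty result as "no metadata at all". As a toy sketch only (invented names such as read_vg_metadata and vg_read_status; this is not the lvm2 parser), a return contract that keeps the two apart might look like this:

/* Toy sketch, not lvm2 code: make the metadata read path report *why* it has
 * no usable VG, so callers can tell "no VG references this PV" apart from
 * "a VG references it but we could not understand the metadata". */
#include <stdio.h>

enum vg_read_status {
        VG_READ_OK,            /* parsed fine                               */
        VG_READ_NONE,          /* no VG metadata at all                     */
        VG_READ_UNSUPPORTED,   /* e.g. unknown status flag from a newer LVM */
        VG_READ_BAD_CHECKSUM,  /* metadata present but damaged              */
};

struct vg;                     /* opaque in this sketch */

/* Hypothetical parser wrapper: fills *vg on success, otherwise says why not. */
static enum vg_read_status read_vg_metadata(const char *dev, struct vg **vg)
{
        (void)dev;
        *vg = NULL;
        return VG_READ_UNSUPPORTED;   /* what a 7.3 parser would hit here */
}

int main(void)
{
        struct vg *vg;

        switch (read_vg_metadata("/dev/mapper/mpathb1", &vg)) {
        case VG_READ_OK:
                break;
        case VG_READ_NONE:
                /* Only this case may ever be treated as a true orphan. */
                break;
        case VG_READ_UNSUPPORTED:
        case VG_READ_BAD_CHECKSUM:
                /* PV is in use by a VG we cannot read: no auto-repair,
                 * no pvcreate, just report it.                         */
                printf("VG metadata present but unusable; leaving PV alone.\n");
                break;
        }
        return 0;
}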
I think it's impossible for the PV in-use repair code to be made both automatic and correct in all cases. The problematic case is when the VG metadata is damaged on a PV, and no other PVs with metadata are available at the moment. In this case, lvm will identify the PV with damaged metadata as an in-use orphan (the case that the repair code currently repairs automatically by clearing the in-use flag). The in-use flag is the one thing that identifies a PV as used when no metadata referencing it is available, i.e. exactly the kind of indicator that would be needed to protect the PV from being clobbered by the repair code. The automatic repair code does not account for the case where the PV does hold VG metadata, but the metadata has been damaged and can't be used to reference the PV.

I think comment 9 isn't quite correct. lvm should keep a list of PVs with metadata that it can't read properly, and not do auto-repair of the in-use flag on those. This list of problematic PVs should also be used to prevent pvcreate/vgextend from being run on them (like duplicate PVs), and to display them with some new flag in 'pvs'.

(In reply to David Teigland from comment #10)
> I think comment 9 isn't quite correct. lvm should keep a list of PVs with
> metadata that it can't read properly, and not do auto-repair of the in-use
> flag on those.

I agree with this - it's simply not an "orphan PV" anymore if we know there is metadata available but we failed to read it, either because the checksum failed or because it contains metadata that the LVM version in use doesn't understand. The code as it is today happily marks such PVs as orphans. If LVM skips the VG metadata in this case (and marks that device as orphan), the pv-in-use repair code needs to know that, so it can skip the repair in this case. We need a new state for this besides "in VG" and "orphan" PV.

I'm still unclear about which auto-repair we are having trouble with. AFAIK, any PV 'auto-repair' should ONLY be running in a case where the PV was KNOWN to be empty (no allocated extents). (Note: the problem is also closely related to the 'raid' want-to-be transient-failure auto-repair, which is very fragile ATM.) In case the PV is only marked 'IN-USE' but we don't know any other data about it, we should not consider this PV to be empty, so it should not qualify for auto-repair. Running any repair on 'invalid' metadata is not going to work as long as the code cannot differentiate the reason for the metadata read failure, e.g. invalid metadata with a broken checksum, a failing parser hitting unsupported syntax, and many other sorts of failure.

(In reply to Zdenek Kabelac from comment #12)
> I'm still unclear about which auto-repair we are having trouble with.

This code, exactly, in the _vg_read fn:

    if (is_orphan_vg(vgname)) {
            if (use_precommitted) {
                    log_error(INTERNAL_ERROR "vg_read_internal requires vgname "
                              "with pre-commit.");
                    return NULL;
            }
            return _vg_read_orphans(cmd, warn_flags, vgname, consistent);
    }

Then the _vg_read_orphans fn:

    if (!(vginfo = lvmcache_vginfo_from_vgname(orphan_vgname, NULL)))
            return_NULL;
    if (!(fmt = lvmcache_fmt_from_vgname(cmd, orphan_vgname, NULL, 0)))
            return_NULL;
    vg = fmt->orphan_vg;
    ...
    if (!lvmcache_foreach_pv(vginfo, _vg_read_orphan_pv, &baton))
            return_NULL;

...and then _vg_read_orphan_pv calls _check_or_repair_pv_ext, which fixes the "PV in use" flag - if we find some PV has this flag set, we try to fix that, as we're processing "orphans" at this moment - so they're NOT in a VG.

Unfortunately, we're also mapping PVs where we can't read the VG metadata, or VGs where the CRC check failed, to "orphan PVs". That should be mapped onto a new entity like "PV with unknown VG metadata" so we can check properly for this and skip any "PV in use" repairs in this case.

The "PV in use" repair code only fires if:
- at least one metadata area is present
- AND at the same time the PV is marked as "used by a VG"
- AND at the same time it's considered "orphan" by LVM

In this case, the repair code tries to drop the "in use" flag. The problem is in the "at least one metadata area present" condition - if we know there's at least one metadata area and at the same time it's an orphan PV, it can also mean:
- the VG's CRC check failed (and hence we have no valid VG metadata)
- we simply can't read the VG metadata (because it comes from a newer LVM version?)

This was not taken into account when coding the "PV in use" repair part and needs to be fixed.
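A toy sketch of that "new entity" idea: record devices whose metadata area holds VG metadata the running LVM cannot use, the way duplicate PVs are tracked, and consult that record before any in-use repair or pvcreate. The names here (remember_unusable_metadata, allow_in_use_repair, allow_pvcreate) are invented for illustration; this is not the lvm2 implementation.

/* Toy sketch (not lvm2 code) of tracking PVs with unusable VG metadata. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_BAD 16

static const char *bad_md_devs[MAX_BAD];   /* devices with unusable VG metadata */
static int n_bad_md_devs;

static void remember_unusable_metadata(const char *dev)
{
        if (n_bad_md_devs < MAX_BAD)
                bad_md_devs[n_bad_md_devs++] = dev;
}

static bool has_unusable_metadata(const char *dev)
{
        for (int i = 0; i < n_bad_md_devs; i++)
                if (!strcmp(bad_md_devs[i], dev))
                        return true;
        return false;
}

static bool allow_in_use_repair(const char *dev)
{
        return !has_unusable_metadata(dev);    /* skip repair on listed PVs   */
}

static bool allow_pvcreate(const char *dev)
{
        return !has_unusable_metadata(dev);    /* refuse to reinitialise them */
}

int main(void)
{
        /* e.g. recorded when the scan hit an unknown status flag or a bad CRC */
        remember_unusable_metadata("/dev/mapper/mpathb1");

        printf("repair allowed:   %d\n", allow_in_use_repair("/dev/mapper/mpathb1"));
        printf("pvcreate allowed: %d\n", allow_pvcreate("/dev/mapper/mpathb1"));
        return 0;
}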
1) In the short term, we will disable the code that automatically repairs in-use flags.

2) We will consider how situations that are no longer repaired can be handled, e.g. make sure the messages/warnings are always appropriate and it's straightforward for someone to understand what they need to do to resolve them. This might involve command line extensions.

3) We will consider how to handle the low-level checksum/unrecognised metadata situations, perhaps with a new flag/internal state, so they can be distinguished.

4) We will consider whether metadata with a correct checksum but nevertheless unrecognised, either in whole or in part, needs to be handled and preserved in some way - just as we do when a segment type is unrecognised.

5) We will consider whether we can re-enable an automatic in-use repair given some or all of (2)-(4), possibly introducing a multi-step VG metadata commit mechanism (where the VG metadata records PVs that still require their in-use flag cleared).

6) We will audit all the on-disk metadata extensions (especially raid), identify similar changes that are incompatible with old versions of LVM, and introduce segment_type+flags for each of them.

7) We will audit all changes to in-kernel state and ensure the on-disk metadata adequately records the state required for recovery, in line with (6).

The first commit on this branch disables the auto repair: https://sourceware.org/git/?p=lvm2.git;a=shortlog;h=refs/heads/dev-dct-pv-invalid-metadata The second patch is a start of special handling for PVs with invalid metadata, keeping them in a special list of devices like duplicate PVs. This can be used to skip in-use repair on them, and to prevent pvcreate from being run on them.

An lvm2 upstream commit converted 'cache2' support into a new kind of flagging carried in the segtype name - this produces metadata that is readable by older lvm2 code yet seen as unusable. The same change needs to be made for raid. With this patch, part of the problem should be solved, since the user will not be able to 'easily' trigger the incompatibility problem with an unknown status flag. Patch sequence: https://www.redhat.com/archives/lvm-devel/2017-May/msg00085.html added this SEGTYPE_FLAG support upstream. So any new STATUS_FLAG needs to be converted to a SEGTYPE_FLAG, where we know that an unknown segtype works well.
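Why the segtype-based flagging degrades more gracefully can be shown with a small standalone model (invented names, not the real lvm2 segtype registry): an unrecognised segment type falls through to an existing "unknown segment" path and the VG stays readable, though unusable, whereas the old status-flag parser had no fallback and the whole VG read failed.

/* Toy illustration (not lvm2 code) of the SEGTYPE_FLAG vs STATUS_FLAG trade-off. */
#include <stdio.h>
#include <string.h>

static const char *known_segtypes[] = { "striped", "cache", "cache-pool", "thin-pool" };

static const char *lookup_segtype(const char *name)
{
        for (size_t i = 0; i < sizeof(known_segtypes) / sizeof(known_segtypes[0]); i++)
                if (!strcmp(known_segtypes[i], name))
                        return known_segtypes[i];
        /* Old lvm2 already has this path: keep the VG, mark the LV unusable. */
        printf("WARNING: Unrecognised segment type %s\n", name);
        return "unknown";
}

static int parse_status_flag(const char *flag)
{
        /* The 7.3 parser had no fallback here: an unknown flag meant the whole
         * VG could not be read ("Unknown status flag 'METADATA_FORMAT'").    */
        return strcmp(flag, "METADATA_FORMAT") ? 0 : -1;
}

int main(void)
{
        lookup_segtype("cache-pool+METADATA_FORMAT");   /* VG survives    */
        if (parse_status_flag("METADATA_FORMAT") < 0)   /* VG read failed */
                printf("Could not read status flags.\n");
        return 0;
}

The verification output below shows exactly this behaviour on 7.3/7.2: the VG is reported with "Unrecognised segment type cache-pool+METADATA_FORMAT" and refuses modification, instead of being clobbered.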
Fix verified in the latest rpms.

3.10.0-685.el7.x86_64
lvm2-2.02.171-6.el7    BUILT: Wed Jun 21 09:35:03 CDT 2017
lvm2-libs-2.02.171-6.el7    BUILT: Wed Jun 21 09:35:03 CDT 2017
lvm2-cluster-2.02.171-6.el7    BUILT: Wed Jun 21 09:35:03 CDT 2017
device-mapper-1.02.140-6.el7    BUILT: Wed Jun 21 09:35:03 CDT 2017
device-mapper-libs-1.02.140-6.el7    BUILT: Wed Jun 21 09:35:03 CDT 2017
device-mapper-event-1.02.140-6.el7    BUILT: Wed Jun 21 09:35:03 CDT 2017
device-mapper-event-libs-1.02.140-6.el7    BUILT: Wed Jun 21 09:35:03 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017

# format 2
# Created on a 7.4 machine and attempted to be activated on this 7.3 machine.
[root@host-130 ~]# pvscan --cache
  WARNING: Unrecognised segment type cache-pool+METADATA_FORMAT
[root@host-130 ~]# vgchange -ay activator1
  WARNING: Unrecognised segment type cache-pool+METADATA_FORMAT
  Internal error: _emit_target cannot handle segment type cache-pool+METADATA_FORMAT
  0 logical volume(s) in volume group "activator1" now active
[root@host-130 ~]# lvs -a -o +devices
  WARNING: Unrecognised segment type cache-pool+METADATA_FORMAT
  LV                  VG         Attr       LSize   Pool          Origin         Data%  Meta%  Cpy%Sync Devices
  cache1              activator1 Cwi---C--- 100.00m [cache1_fast] [cache1_corig]                        cache1_corig(0)
  [cache1_corig]      activator1 owi---C--- 100.00m                                                     /dev/sda2(0)
  [cache1_fast]       activator1 vwi---u---  52.00m
  [cache1_fast_cdata] activator1 -wi-------  52.00m                                                     /dev/sda3(0)
  [cache1_fast_cmeta] activator1 -wi-------   8.00m                                                     /dev/sda3(13)
  [lvol0_pmspare]     activator1 ewi-------   8.00m                                                     /dev/sda3(15)

# Unable to remove/alter the VG
[root@host-130 ~]# lvremove -f activator1
  WARNING: Unrecognised segment type cache-pool+METADATA_FORMAT
  Cannot change VG activator1 with unknown segments in it!
  Cannot process volume group activator1
[root@host-130 ~]# vgremove activator1
  WARNING: Unrecognised segment type cache-pool+METADATA_FORMAT
  Cannot change VG activator1 with unknown segments in it!
  Cannot process volume group activator1
[root@host-132 ~]# vgrename activator1 activator7
  WARNING: Unrecognised segment type cache-pool+METADATA_FORMAT
  Cannot change VG activator1 with unknown segments in it!
[root@host-132 ~]# lvcreate -n foo -L 100M activator1
  WARNING: Unrecognised segment type cache-pool+METADATA_FORMAT
  Cannot change VG activator1 with unknown segments in it!

# Same thing on 7.2 machine
[root@host-132 ~]# vgremove -f activator1
  WARNING: Unrecognised segment type cache-pool+METADATA_FORMAT
  Cannot change VG activator1 with unknown segments in it!
  Cannot process volume group activator1

# Still usable back on the 7.4 machine
[root@host-127 ~]# pvscan --cache
[root@host-127 ~]# lvchange -ay activator1
[root@host-127 ~]# lvs -a -o +devices
  LV                  VG         Attr       LSize   Pool          Origin         Data%  Meta%  Cpy%Sync Devices
  cache1              activator1 Cwi-a-C--- 100.00m [cache1_fast] [cache1_corig] 0.24   0.63   0.00     cache1_corig(0)
  [cache1_corig]      activator1 owi-aoC--- 100.00m                                                     /dev/sdb2(0)
  [cache1_fast]       activator1 Cwi---C---  52.00m                              0.24   0.63   0.00     cache1_fast_cdata(0)
  [cache1_fast_cdata] activator1 Cwi-ao----  52.00m                                                     /dev/sdb3(0)
  [cache1_fast_cmeta] activator1 ewi-ao----   8.00m                                                     /dev/sdb3(13)
  [lvol0_pmspare]     activator1 ewi-------   8.00m                                                     /dev/sdb3(15)

# format 1
# Created on a 7.4 machine and attempted to be activated on this 7.3 machine.
[root@host-130 ~]# lvchange -ay activator1
[root@host-130 ~]# lvs -a -o +devices
  LV                  VG         Attr       LSize   Pool          Origin         Data%  Meta%  Cpy%Sync Devices
  cache1              activator1 Cwi-a-C--- 100.00m [cache1_fast] [cache1_corig] 0.00   0.54   0.00     cache1_corig(0)
  [cache1_corig]      activator1 owi-aoC--- 100.00m                                                     /dev/sda2(0)
  [cache1_fast]       activator1 Cwi---C---  52.00m                              0.00   0.54   0.00     cache1_fast_cdata(0)
  [cache1_fast_cdata] activator1 Cwi-ao----  52.00m                                                     /dev/sda3(0)
  [cache1_fast_cmeta] activator1 ewi-ao----   8.00m                                                     /dev/sda3(13)
  [lvol0_pmspare]     activator1 ewi-------   8.00m                                                     /dev/sda3(15)

# Same on 7.2 node
[root@host-132 ~]# lvs -a -o +devices
  LV                  VG         Attr       LSize   Pool          Origin         Data%  Meta%  Cpy%Sync Devices
  cache1              activator1 Cwi-a-C--- 100.00m [cache1_fast] [cache1_corig] 0.00   0.54   100.00   cache1_corig(0)
  [cache1_corig]      activator1 owi-aoC--- 100.00m                                                     /dev/sda2(0)
  [cache1_fast]       activator1 Cwi---C---  52.00m                              0.00   0.54   100.00   cache1_fast_cdata(0)
  [cache1_fast_cdata] activator1 Cwi-ao----  52.00m                                                     /dev/sda3(0)
  [cache1_fast_cmeta] activator1 ewi-ao----   8.00m                                                     /dev/sda3(13)
  [lvol0_pmspare]     activator1 ewi-------   8.00m                                                     /dev/sda3(15)

# Alteration is allowed
[root@host-132 ~]# lvremove -f activator1
  Logical volume "cache1_fast" successfully removed
  Logical volume "cache1" successfully removed

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2222