Bug 1254393 - pvremove of a device under cloned vg fails
Summary: pvremove of a device under cloned vg fails
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: David Teigland
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-08-18 00:46 UTC by Shivananda
Modified: 2016-11-16 04:50 UTC (History)

Fixed In Version: lvm2-2.02.161-1.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-04 04:10:22 UTC
Target Upstream Version:


Attachments
Console o/p captured while cloning (17.98 KB, text/plain)
2015-08-19 17:28 UTC, Shivananda
dmesg o/p during cloning operation (13.61 KB, text/plain)
2015-08-19 17:28 UTC, Shivananda
lsof, lsblk and lvmdump -l -s command output (3.42 MB, application/zip)
2015-10-28 13:06 UTC, Shivananda


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2016:1445 normal SHIPPED_LIVE lvm2 bug fix and enhancement update 2016-11-03 13:46:41 UTC

Description Shivananda 2015-08-18 00:46:38 UTC
Description of problem:
pvremove fails to remove a cloned physical volume because the device is still held by the original (source) VG.

Version-Release number of selected component (if applicable):
# lvm version
  LVM version:     2.02.115(2)-RHEL7 (2015-01-28)
  Library version: 1.02.93-RHEL7 (2015-01-28)
  Driver version:  4.29.0

How reproducible:
Always

Steps to Reproduce:
1. On a Linux host, have a VG created from a single PV/LUN from a storage vendor (e.g. NetApp).
2. Create a hardware snapshot of it.
3. Create a clone of the LUN from the hardware snapshot.
4. Map the cloned LUN to the host and discover it.
5. Issue 'vgimportclone -n <clone vg name> -i <device path of cloned PV>'.
6. Issue 'vgchange -a n <clone vg name>'.
7. Issue 'vgremove -f <clone vg name>'.
8. Issue 'pvremove -ff <clone PV>'.


Actual results:

On the storage hardware, create the clone and map the LUN to the host:

#  rescan-scsi-bus.sh -a

#  pvscan
   PV /dev/mapper/3600a098054313968772b334e2f5a562f   VG vg2       lvm2 [5.00 GiB / 1020.00 MiB free]
  PV /dev/mapper/3600a098054313968772b334e2f643649   VG tstvg     lvm2 [1020.00 MiB / 520.00 MiB free]
  PV /dev/sda2                                       VG vg_root   lvm2 [271.44 GiB / 40.00 MiB free]
  PV /dev/mapper/3600a098054313968713f33483048396e   VG vg        lvm2 [2.00 GiB / 1.99 GiB free]
  Total: 4 [279.43 GiB] / in use: 4 [279.43 GiB] / in no VG: 0 [0   ]

#  vgscan
   Reading all physical volumes.  This may take a while...
  Found volume group "vg2" using metadata type lvm2
  Found volume group "tstvg" using metadata type lvm2
  Found volume group "vg_root" using metadata type lvm2
  Found volume group "vg" using metadata type lvm2

#  vgimportclone -n cloneoftstvg -i /dev/mapper/3600a098054313968772b334e2f643649

  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  WARNING: Activation disabled. No device-mapper interaction will be attempted.
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  WARNING: Activation disabled. No device-mapper interaction will be attempted.
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  WARNING: lvmetad is running but disabled. Restart lvmetad before enabling it!
  /dev/mapper/3600a098054313968772b334e2f637265: read failed after 0 of 4096 at 0: Input/output error
  /dev/mapper/3600a098054313968772b334e2f637265: read failed after 0 of 4096 at 1073676288: Input/output error
  /dev/mapper/3600a098054313968772b334e2f637265: read failed after 0 of 4096 at 1073733632: Input/output error
  /dev/mapper/3600a098054313968772b334e2f637265: read failed after 0 of 4096 at 4096: Input/output error
  /dev/mapper/3600a098054313968772b334e2f637266: read failed after 0 of 4096 at 0: Input/output error
  /dev/mapper/3600a098054313968772b334e2f637266: read failed after 0 of 4096 at 1073676288: Input/output error
  /dev/mapper/3600a098054313968772b334e2f637266: read failed after 0 of 4096 at 1073733632: Input/output error
  /dev/mapper/3600a098054313968772b334e2f637266: read failed after 0 of 4096 at 4096: Input/output error
  /dev/mapper/3600a098054313968772b334e2f643646: read failed after 0 of 4096 at 0: Input/output error
  /dev/mapper/3600a098054313968772b334e2f643646: read failed after 0 of 4096 at 1073676288: Input/output error
  /dev/mapper/3600a098054313968772b334e2f643646: read failed after 0 of 4096 at 1073733632: Input/output error
  /dev/mapper/3600a098054313968772b334e2f643646: read failed after 0 of 4096 at 4096: Input/output error
   Physical volume "/tmp/snap.VCXg6JnO/vgimport0" changed
  1 physical volume changed / 0 physical volumes not changed
  Volume group "tstvg" successfully changed
  Volume group "tstvg" successfully renamed to "cloneoftstvg"
Notifying lvmetad about changes since it was disabled temporarily.
(This resolves any WARNING message about restarting lvmetad that appears above.)
  Reading all physical volumes.  This may take a while...
  Found volume group "cloneoftstvg" using metadata type lvm2
  Found volume group "tstvg" using metadata type lvm2
  Found volume group "vg" using metadata type lvm2
  Found volume group "vg2" using metadata type lvm2
  Found volume group "vg_root" using metadata type lvm2

#  vgdisplay -v cloneoftstvg
    Using volume group(s) on command line.
   --- Volume group ---
  VG Name               cloneoftstvg
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  5
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               1020.00 MiB
  PE Size               4.00 MiB
  Total PE              255
  Alloc PE / Size       125 / 500.00 MiB
  Free  PE / Size       130 / 520.00 MiB
  VG UUID               ROo1q7-8qS2-dwdj-ZOJy-k4vn-25z1-2I2D27

  --- Logical volume ---
  LV Path                /dev/cloneoftstvg/tstlvol
  LV Name                tstlvol
  VG Name                cloneoftstvg
  LV UUID                Uz2Huo-QvOK-3LRq-qdCP-AN3U-tWZa-3IQPy2
  LV Write Access        read/write
  LV Creation host, time elias.gdl.englab.netapp.com, 2015-07-13 10:45:54 -0400
  LV Status              NOT available
  LV Size                500.00 MiB
  Current LE             125
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto

  --- Physical volumes ---
  PV Name               /dev/mapper/3600a098054313968772b334e2f643649
  PV UUID               gOXAn2-XILm-QVnh-FaBs-ltJm-De32-O8xwLa
  PV Status             allocatable
  Total PE / Free PE    255 / 130


#  vgchange -a y cloneoftstvg
  1 logical volume(s) in volume group "cloneoftstvg" now active

#  vgdisplay -v tstvg
    Using volume group(s) on command line.
   --- Volume group ---
  VG Name               tstvg
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  6
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               1020.00 MiB
  PE Size               4.00 MiB
  Total PE              255
  Alloc PE / Size       125 / 500.00 MiB
  Free  PE / Size       130 / 520.00 MiB
  VG UUID               6Q9k4H-Zw4d-O2Ur-a5E7-kvzi-xb3i-trlMSJ

  --- Logical volume ---
  LV Path                /dev/tstvg/tstlvol
  LV Name                tstlvol
  VG Name                tstvg
  LV UUID                Uz2Huo-QvOK-3LRq-qdCP-AN3U-tWZa-3IQPy2
  LV Write Access        read/write
  LV Creation host, time elias.gdl.englab.netapp.com, 2015-07-13 10:45:54 -0400
  LV Status              available
  # open                 0
  LV Size                500.00 MiB
  Current LE             125
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:10

  --- Physical volumes ---
  PV Name               /dev/mapper/3600a098054313968772b334e2f627648
  PV UUID               xmwZue-t1eO-fIcX-28jH-43E4-YHrn-2bMNWS
  PV Status             allocatable
  Total PE / Free PE    255 / 130


#  vgdisplay -v cloneoftstvg
    Using volume group(s) on command line.
   --- Volume group ---
  VG Name               cloneoftstvg
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  5
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               1020.00 MiB
  PE Size               4.00 MiB
  Total PE              255
  Alloc PE / Size       125 / 500.00 MiB
  Free  PE / Size       130 / 520.00 MiB
  VG UUID               ROo1q7-8qS2-dwdj-ZOJy-k4vn-25z1-2I2D27

  --- Logical volume ---
  LV Path                /dev/cloneoftstvg/tstlvol
  LV Name                tstlvol
  VG Name                cloneoftstvg
  LV UUID                Uz2Huo-QvOK-3LRq-qdCP-AN3U-tWZa-3IQPy2
  LV Write Access        read/write
  LV Creation host, time elias.gdl.englab.netapp.com, 2015-07-13 10:45:54 -0400
  LV Status              available
  # open                 0
  LV Size                500.00 MiB
  Current LE             125
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:13

  --- Physical volumes ---
  PV Name               /dev/mapper/3600a098054313968772b334e2f643649
  PV UUID               gOXAn2-XILm-QVnh-FaBs-ltJm-De32-O8xwLa
  PV Status             allocatable
  Total PE / Free PE    255 / 130



#  vgchange -a n cloneoftstvg
   0 logical volume(s) in volume group "cloneoftstvg" now active

#  vgremove -f cloneoftstvg
   Logical volume "tstlvol" successfully removed
  Volume group "cloneoftstvg" successfully removed

#  "echo y | pvremove -ff /dev/mapper/3600a098054313968772b334e2f643649"
  Can't open /dev/mapper/3600a098054313968772b334e2f643649 exclusively - not removing. Mounted filesystem?

#  multipath -f /dev/mapper/3600a098054313968772b334e2f643649

Aug 17 19:51:17 | /dev/mapper/3600a098054313968772b334e2f643649: map in use
Aug 17 19:51:17 | failed to remove multipath map /dev/mapper/3600a098054313968772b334e2f643649



Expected results:
pvremove should remove the PV on the clone device without any errors.


Additional info:
Here is a snippet of the 'dmsetup ls --tree' output after 'vgimportclone':

cloneoftstvg-tstlvol (253:13)
 +-3600a098054313968772b334e2f643649 (253:12)
    +- (65:224)
    +- (65:176)
    +- (65:192)
    +- (65:144)
    +- (66:0)
    +- (65:240)
    +- (65:208)
    +- (65:160)

tstvg-tstlvol (253:10)
 +-3600a098054313968772b334e2f643649 (253:12)
    +- (65:224)
    +- (65:176)
    +- (65:192)
    +- (65:144)
    +- (66:0)
    +- (65:240)
    +- (65:208)
    +- (65:160)

Note that both the source (original) VG and the clone VG show the same PV: the clone PV.
To work around this, we introduced 'multipath -F' and 'multipath -r' into our steps before 'vgimportclone' is issued, so that the source and cloned VGs each report their own PV.
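
The condition described above can be checked mechanically. The following is only a sketch: the function name shared_dm_children is hypothetical, and the parsing assumes the layout of the snippet pasted in this report (top-level device name at column 0, direct children indented by exactly one space). Real 'dmsetup ls --tree' output may use different branch characters, which is why the sketch strips any leading non-alphanumeric characters from the child name.

```shell
# shared_dm_children: read 'dmsetup ls --tree' style text on stdin and print
# every direct child that appears under more than one top-level device.
# Hypothetical helper, written for the layout shown in this report.
shared_dm_children() {
    awk '
        # A line starting at column 0 names a top-level device.
        /^[^ ]/ { parent = $1; next }
        # A line indented by exactly one space is a direct child.
        /^ [^ ]/ {
            child = $1
            sub(/^[^A-Za-z0-9]+/, "", child)   # drop branch characters
            if (child == "") next
            count[child]++
            parents[child] = (count[child] == 1) ? parent : (parents[child] "," parent)
        }
        END { for (c in count) if (count[c] > 1) print c ": " parents[c] }
    '
}

# Intended use (as root):
#   dmsetup ls --tree | shared_dm_children
```

Run against the tree snippet above, this should list 3600a098054313968772b334e2f643649 as shared by cloneoftstvg-tstlvol and tstvg-tstlvol. Empty output after the 'multipath -F' / 'multipath -r' step would suggest that each VG now sits on its own map.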

Comment 2 Marian Csontos 2015-08-18 13:47:39 UTC
Hello, could you please upload a journal/messages from the affected system?

I am surprised pvscan and vgscan did not complain about seeing duplicate UUIDs.

I understand the tstvg/tstlvol was active while rescanning the SCSI bus and it looks like the tstvg's PV /dev/mapper/3600a098054313968772b334e2f627648 was somehow silently replaced by the new block device /dev/mapper/3600a098054313968772b334e2f627649.

> 3. Create a clone of the LUN from the hardware snapshot

Was the original LUN unmapped here?

> 4. Map the cloned LUN to the host and discover it

What do `dmsetup status` and `multipath -ll` say before and after rescanning SCSI bus?
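
One possible way to collect the requested before/after state is sketched below. The helper name capture_state and the output directory are illustrative assumptions, not part of any LVM or multipath tooling. Only `dmsetup status` and `multipath -ll`, the two commands named above, are actually run.

```shell
# capture_state LABEL OUTDIR: save 'dmsetup status' and 'multipath -ll'
# output into OUTDIR, tagged with LABEL. Hypothetical helper.
capture_state() {
    label="$1"
    outdir="$2"
    mkdir -p "$outdir" || return 1
    # stderr goes to the same file, so a missing tool or a permission
    # error is recorded instead of being lost.
    dmsetup status > "$outdir/dmsetup-status.$label" 2>&1 ||
        echo "dmsetup status failed" >> "$outdir/dmsetup-status.$label"
    multipath -ll > "$outdir/multipath-ll.$label" 2>&1 ||
        echo "multipath -ll failed" >> "$outdir/multipath-ll.$label"
    return 0
}

# Intended use around the rescan (as root; /tmp/bz1254393 is an example path):
#   capture_state before /tmp/bz1254393
#   rescan-scsi-bus.sh -a
#   capture_state after /tmp/bz1254393
#   diff -u /tmp/bz1254393/multipath-ll.before /tmp/bz1254393/multipath-ll.after
```

Comparing the two multipath files with diff should make it easier to see whether an existing map picked up new paths during the rescan.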

Comment 3 Shivananda 2015-08-19 17:28:00 UTC
Created attachment 1064935
Console o/p captured while cloning

Comment 4 Shivananda 2015-08-19 17:28:49 UTC
Created attachment 1064936
dmesg o/p during cloning operation

Comment 5 Shivananda 2015-08-19 17:32:51 UTC
> could you please upload a journal/messages from the affected system?
I have attached the console and dmesg output from another attempt to reproduce this. Let me know if that is not sufficient and which specific logs/messages you would like to have.

> Was the original LUN unmapped here?
No, the original LUN was mapped and the VG/LV were active.

> What do `dmsetup status` and `multipath -ll` say before and after rescanning SCSI bus?
Refer to the attached console output from another attempt to reproduce this.

Comment 6 Peter Rajnoha 2015-09-22 13:40:38 UTC
When you hit the pvremove failure, please try to collect the output of the following commands:
  lsof
  lsblk
  lvmdump -l -s (mainly to collect relevant lvmetad and systemd unit info)

Is this reproducible if you set use_lvmetad=0 in lvm.conf?

Comment 7 Shivananda 2015-10-28 13:06:44 UTC
Created attachment 1087249
lsof, lsblk and lvmdump -l -s command output

Comment 8 Shivananda 2015-10-28 13:08:00 UTC
It is not reproducible with use_lvmetad=0 set in lvm.conf.

Comment 9 Jonathan Earl Brassow 2016-01-22 15:34:46 UTC
More duplicate PV signature issues... Dave, could you take a look?

Comment 10 Shivananda 2016-02-18 05:28:35 UTC
If multipath is not configured and the host sees only one path, we do not have a workaround (with multipath, we worked around the problem by issuing 'multipath -F' and 'multipath -r').
The only alternative we see now is to stop lvm2-lvmetad.service or set use_lvmetad=0 in lvm.conf.
Let us know if there is any other workaround that we can implement to overcome the issue.
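
For the use_lvmetad=0 alternative, a minimal sketch of the config edit follows. It assumes the setting appears on its own line in the usual `use_lvmetad = 1` form. The function name disable_lvmetad_in is hypothetical, and the file path is passed in so the sketch can be tried on a copy first.

```shell
# disable_lvmetad_in CONF: rewrite an lvm.conf-style file so that an
# uncommented "use_lvmetad = <0|1>" line ends up set to 0.
# Hypothetical helper; commented-out lines are left untouched.
disable_lvmetad_in() {
    conf="$1"
    rc=0
    tmp="$(mktemp)" || return 1
    # \1 keeps the original indentation and spacing; the trailing 0 is the new value.
    sed 's/^\([[:space:]]*use_lvmetad[[:space:]]*=[[:space:]]*\)[01]/\10/' "$conf" > "$tmp" &&
        cat "$tmp" > "$conf" || rc=1    # cat preserves the owner and mode of CONF
    rm -f "$tmp"
    return "$rc"
}

# Intended use (as root, after taking a backup; the path is the usual default):
#   cp /etc/lvm/lvm.conf /etc/lvm/lvm.conf.bak
#   disable_lvmetad_in /etc/lvm/lvm.conf
#   systemctl stop lvm2-lvmetad.service
```

On RHEL 7 the daemon is typically socket-activated as well, so lvm2-lvmetad.socket may also need to be stopped; treat that as something to confirm on the affected system rather than as a given.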

Comment 11 David Teigland 2016-06-10 20:54:03 UTC
Handling of duplicate pvs will be much improved in 7.3 and these issues will probably be solved.  (lvmetad will be automatically disabled while duplicates exist.)

Comment 12 Shivananda 2016-06-13 10:26:27 UTC
Thanks, David, for the update that 'lvmetad' will be automatically disabled.
However, I assume that 'lvmetad' will be enabled by 'vgimportclone' after the duplicate PVs are resolved.

Comment 13 Roman Bednář 2016-07-20 12:06:39 UTC
Adding QA ack for 7.3.

Comment 15 Roman Bednář 2016-09-19 14:27:44 UTC
Marking verified with the latest rpms.

From now on, when a duplicate PV is detected, LVM does not use lvmetad data until the duplicate is resolved.
It then automatically restores normal operation.


# pvs -o +uuid
  WARNING: Not using lvmetad because duplicate PVs were found.
  WARNING: Use multipath or vgimportclone to resolve duplicate PVs?
  WARNING: After duplicates are resolved, run "pvscan --cache" to enable lvmetad.
  WARNING: PV mGsQGs-y0Xr-xAEB-aKzU-CLKw-fpHy-RN64tu on /dev/sdj was already found on /dev/sdi.
  WARNING: PV mGsQGs-y0Xr-xAEB-aKzU-CLKw-fpHy-RN64tu prefers device /dev/sdi because device was seen first.
  PV         VG            Fmt  Attr PSize   PFree   PV UUID                               
  /dev/sdi   cloneofvg     lvm2 a--  972.00m 972.00m mGsQGs-y0Xr-xAEB-aKzU-CLKw-fpHy-RN64tu
  /dev/vda2  rhel_virt-283 lvm2 a--    7.79g  40.00m RgLVtC-zhsS-aSpi-Qf8F-HcKk-LzhJ-ViCb8D

# systemctl is-active lvm2-lvmetad
active
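
A script that drives these steps could test for the warning shown above before deciding whether `pvscan --cache` is still needed. The sketch below only matches the warning text quoted in this comment. The function name has_duplicate_pv_warning is hypothetical, and the exact wording may differ in other lvm2 versions.

```shell
# has_duplicate_pv_warning: succeed if stdin contains the duplicate-PV
# warning quoted in this comment. Hypothetical helper.
has_duplicate_pv_warning() {
    grep -q 'Not using lvmetad because duplicate PVs were found'
}

# Intended use (as root); the warning is assumed to arrive on stderr,
# hence the 2>&1:
#   if pvs 2>&1 | has_duplicate_pv_warning; then
#       echo "duplicate PVs still present; resolve them, then run: pvscan --cache"
#   fi
```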

3.10.0-505.el7.x86_64

lvm2-2.02.165-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
lvm2-libs-2.02.165-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
lvm2-cluster-2.02.165-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
device-mapper-1.02.134-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
device-mapper-libs-1.02.134-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
device-mapper-event-1.02.134-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
device-mapper-event-libs-1.02.134-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016
device-mapper-persistent-data-0.6.3-1.el7    BUILT: Fri Jul 22 12:29:13 CEST 2016
cmirror-2.02.165-2.el7    BUILT: Wed Sep 14 16:01:43 CEST 2016

Comment 17 errata-xmlrpc 2016-11-04 04:10:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1445.html

