Bug 1287972
Summary: | vgimportclone fails because of duplicate PV
---|---
Product: | Red Hat Enterprise Linux 6
Component: | lvm2
lvm2 sub component: | Scripts / lvmdump / vgimportclone (RHEL6)
Version: | 6.4
Hardware: | x86_64
OS: | Linux
Status: | CLOSED WONTFIX
Severity: | urgent
Priority: | unspecified
Target Milestone: | rc
Reporter: | Shivasharan <sharan8989>
Assignee: | David Teigland <teigland>
QA Contact: | cluster-qe <cluster-qe>
CC: | agk, heinzm, jbrassow, msnitzer, prajnoha, prockai, sharan8989, zkabelac
Last Closed: | 2017-12-06 10:59:58 UTC
Type: | Bug
Description
Shivasharan
2015-12-03 08:06:41 UTC
What's the version of the lvm2 package? Also, please attach the full output of the vgimportclone command with the "-d -vvvv" options added for debug output, and also the output of the lsblk command. Thanks.

Here is lvm2 info:

LVM version: 2.02.98(2)-RHEL6 (2012-10-15)
Library version: 1.02.77-RHEL6 (2012-10-15)
Driver version: 4.23.6

I will provide you output of two commands shortly. Thanks.

(In reply to Shivasharan from comment #3)
> Here is lvm2 info:
>
> LVM version: 2.02.98(2)-RHEL6 (2012-10-15)
> Library version: 1.02.77-RHEL6 (2012-10-15)
> Driver version: 4.23.6
>

So RHEL 6.4. The lvm2 version is quite old there - I think there were a few bugs in vgimportclone which we already fixed in 6.5 and higher. But I need to check what the bugs were about exactly, I don't remember now...

(In reply to Peter Rajnoha from comment #4)
> (In reply to Shivasharan from comment #3)
> > Here is lvm2 info:
> >
> > LVM version: 2.02.98(2)-RHEL6 (2012-10-15)
> > Library version: 1.02.77-RHEL6 (2012-10-15)
> > Driver version: 4.23.6
> >
>
> So RHEL 6.4. The lvm2 version is quite old there - I think there were a few
> bugs in vgimportclone which we already fixed in 6.5 and higher.

...nope, that was a bug in 6.6 which was then fixed in 6.6.z. So I'll wait for your debug info.

Created attachment 1103305 [details]
vgimportclone output with -d and -vvvv options
I am attaching the vgimportclone command output (executed with the -d and -vvvv options).
Created attachment 1103307 [details]
lsblk command output
Attached another file containing lsblk output.
Based on the logs, the PV you're referencing on the command line (mpathkp1) does not exist. The vgimportclone log shows no pvs command output for /dev/mapper/mpathkp1, and lsblk does not list mpathkp1 at all. Are you referencing the proper mpath device (the PV used in vgimportclone)?

Created attachment 1103831 [details]
output of vgscan command
Yes, mpathk is a valid MPIO device. It is visible on the host.
I have attached the output of the vgscan command, which complains that mpathkp1 has the same PV ID as mpathjp1. Unfortunately, I don't see mpathj in the lsblk output either. They are actually two copies of the same VG (the production VG resides on a different host and these are array-based backups).
Please, retry the vgscan with "vgscan -vvvv". Then try again with "devices/obtain_device_list_from_udev=0" in /etc/lvm/lvm.conf.

(In reply to Peter Rajnoha from comment #10)
> Please, retry the vgscan with "vgscan -vvvv". Then try again with
> "devices/obtain_device_list_from_udev=0" in /etc/lvm/lvm.conf.

(And attach the output here please.)

Created attachment 1106718 [details]
vgscan -vvvv output
Attaching vgscan -vvvv output. Sorry for the delay.
I am also going to attach new lsblk and vgimportclone output, as the error occurred with a new PV this time.
Created attachment 1106719 [details]
vgimportclone output for mpathj
mpathj is the new device on which the error is seen.
Created attachment 1106720 [details]
lsblk output when issue was seen on mpathj
Attaching lsblk output when the issue was seen for mpathj device.
Note: devices/obtain_device_list_from_udev=0 was already present in the /etc/lvm/lvm.conf file.

I think that these commits fixed the vgimportclone regression caused by the process_each_pv rework:

b64da4d8b521 toollib: search for duplicate PVs only when needed
57d74a45a05e toollib: override the PV device with duplicates

Also, given the large number of duplicates reported by vgscan, I suspect that lvm is scanning mpath subdevices, which is probably a different issue.

Okay. Do you need any additional info?

I have raised the Severity. Can this issue be resolved during next week? Let me know if more info is required.

I have a small query. Can I make one device (whose pvid is preceded by its duplicate) take precedence over its duplicate by executing some command? The goal is to make either PV (of the 2 duplicate PVs) active when I wish, using a command.

You should always be able to use filters to work around issues like this. Either accept only the devices you want to use, or reject the devices you don't want to use.

Do we have a command where I can include a custom filter and run it? Something analogous to:

#pvs --config 'devices {filter =[...]}'

I want the control to shift from one PV to its duplicate PV.

I tried pvscan with the --config option. It does not seem to work.

# pvscan --config 'devices { filter=["r|/dev/mapper/mpathbd|", "a|.*|" ]}'
Found duplicate PV G13Sc8dGbKy3RDA06cdaJ9nYuJkXYhTJ: using /dev/mapper/mpathbd not /dev/mapper/mpathbc
PV /dev/mapper/mpathbd VG sharan_vg lvm2 [4.00 GiB / 4.00 GiB free]
PV /dev/sda2 VG vg_lrmg054 lvm2 [67.88 GiB / 0 free]
PV /dev/mapper/mpathd lvm2 [4.00 GiB]
PV /dev/mapper/mpathc lvm2 [4.00 GiB]
PV /dev/mapper/mpathe lvm2 [4.00 GiB]
Total: 5 [83.87 GiB] / in use: 2 [71.87 GiB] / in no VG: 3 [12.00 GiB]

I still see mpathbd on the host even after filtering while scanning.

Thanks,
Sharan

Try removing the final "a|.*|" entry in the filter line.

I should use filter in every command. Only then it works.
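Because a filter given with --config applies only to the command that carries it, a small wrapper can save retyping it. The following is a minimal sketch, not an lvm2 feature: the function names reject_filter_config and lvm_with_filter are hypothetical, and the device path is the one from the pvscan attempt above. The final "a|.*|" entry is left out, following the advice given there.

```shell
# Hypothetical helper: build a --config string that rejects one device.
# No trailing "a|.*|" entry is added, following the advice in this thread.
reject_filter_config() {
    printf 'devices { filter=["r|%s|"] }' "$1"
}

# Hypothetical wrapper: run any LVM command with the same reject filter.
# $1 = device to reject, $2 = LVM command, remaining args are passed through.
lvm_with_filter() {
    rejected=$1
    shift
    cmd=$1
    shift
    "$cmd" --config "$(reject_filter_config "$rejected")" "$@"
}

# Example (run as root on the affected host):
#   lvm_with_filter /dev/mapper/mpathbd pvs
#   lvm_with_filter /dev/mapper/mpathbd vgs
```

To switch to the other copy, pass the other duplicate's path as the first argument instead.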
Do we know why vgimportclone fails from previous logs? Any lead?

Thanks,
Sharan

(In reply to Shivasharan from comment #23)
> I should use filter in every command. Only then it works.
>
> Do we know why vgimportclone fails from previous logs? Any lead?
>

There are inconsistencies in the logs:

- lsblk doesn't display lots of mpath devices (including mpathjp1 used for vgimportclone)
- vgimportclone is called with a device that it doesn't see either (just like lsblk)
- vgscan sees devices which lsblk and vgimportclone don't see

Note: there's no filtering in lsblk, so lsblk should be able to list all devices. Based on this, it's hard to determine what the exact setup is.

Also, looking at the vgscan log, it seems multipath component detection is not working correctly, for example:

(grep "Ignoring duplicate PV" vgscan.log)
#cache/lvmcache.c:1497 Ignoring duplicate PV PEdluaeBpyNmz7OQ6CE1A5lu7OLvQb21 on /dev/sdb1 - using dm /dev/mapper/mpathbp1
#cache/lvmcache.c:1497 Ignoring duplicate PV S0Ld0yAFsNZvMpVkiFnvQvMwJv1wXd9L on /dev/sdah1 - using dm /dev/mapper/mpathep1
..and lots of others which are similar.

In this case, it seems the /dev/sd* are multipath components while /dev/mapper/mpath* are multipath devices (of course, with the same content).

At the same time, some of the devices are identified as multipath components correctly, for example:

(grep "Skipping mpath component" vgscan.log)
#filters/filter-mpath.c:163 /dev/sda: Skipping mpath component device
#filters/filter-mpath.c:163 /dev/sdaw: Skipping mpath component device
#filters/filter-mpath.c:163 /dev/sdr: Skipping mpath component device
...

Were all the logs grabbed from exactly the same system run? If not, it seems the devices are changing very dynamically underneath...
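The two grep checks above can be combined into one pass over a saved log. This is only a sketch: summarize_vgscan_log is a hypothetical helper name, and it assumes the log lines have the same shape as the excerpts quoted above.

```shell
# Hypothetical helper: summarize a saved "vgscan -vvvv" log ($1 = log file).
summarize_vgscan_log() {
    log=$1
    dup=$(grep -c 'Ignoring duplicate PV' "$log")
    skipped=$(grep -c 'Skipping mpath component' "$log")
    echo "duplicate PVs ignored: $dup"
    echo "mpath components skipped: $skipped"
    # List the /dev/sd* paths that lost to a dm device. These look like
    # multipath components that the mpath filter did not catch.
    sed -n 's|.*Ignoring duplicate PV [^ ]* on \(/dev/sd[^ ]*\) - using dm .*|\1|p' "$log" | sort -u
}

# Example:
#   vgscan -vvvv > vgscan.log 2>&1
#   summarize_vgscan_log vgscan.log
```

If the list of /dev/sd* paths is long while the "skipped" count is small, that supports the suspicion that component detection is failing for most paths.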
Can you please try collecting the udev event log:

udevadm monitor --udev --env

(and saving the log to a file) and then, while the udevadm monitor is running, call:

lsblk
vgscan -vvvv
vgimportclone with -d and -vvvv
lvmdump -u

Also, please make sure nothing else is executed that could work with those devices in parallel.

Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.

The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com/
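For anyone reproducing this on a supported release, the data collection requested earlier in this thread (udev monitor running while lsblk, vgscan, vgimportclone and lvmdump are called) could be scripted roughly as follows. This is a sketch under assumptions: collect_lvm_debug, the output file names and the one-second startup delay are all hypothetical choices, and only the commands and options already named in the thread are used. Note that it runs vgimportclone for real, so it must be run as root against the cloned PV.

```shell
# Hypothetical helper: capture udev events while running the diagnostics.
# $1 = output directory, $2 = PV to pass to vgimportclone.
collect_lvm_debug() {
    outdir=$1
    pv=$2
    mkdir -p "$outdir" || return 1

    # Start the udev monitor first so events from every later command are logged.
    udevadm monitor --udev --env > "$outdir/udevadm-monitor.log" 2>&1 &
    monitor_pid=$!
    sleep 1   # assumption: give the monitor a moment to start listening

    lsblk > "$outdir/lsblk.out" 2>&1
    vgscan -vvvv > "$outdir/vgscan.log" 2>&1
    vgimportclone -d -vvvv "$pv" > "$outdir/vgimportclone.log" 2>&1
    lvmdump -u > "$outdir/lvmdump.out" 2>&1

    # Stop the monitor once all commands have finished.
    kill "$monitor_pid" 2>/dev/null
    wait "$monitor_pid" 2>/dev/null
    return 0
}

# Example:
#   collect_lvm_debug /tmp/lvm-debug /dev/mapper/mpathjp1
```

Collecting everything in one run keeps the logs consistent with each other, which addresses the question of whether the earlier logs came from the same system state.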