Bug 1106425
| Summary: | lvmetad doesn't report duplicate PVs UUID to the user | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Zdenek Kabelac <zkabelac> | ||||
| Component: | lvm2 | Assignee: | Petr Rockai <prockai> | ||||
| lvm2 sub component: | LVM Metadata / lvmetad | QA Contact: | cluster-qe <cluster-qe> | ||||
| Status: | CLOSED ERRATA | Docs Contact: | |||||
| Severity: | unspecified | ||||||
| Priority: | unspecified | CC: | agk, cmarthal, heinzm, jbrassow, msnitzer, nperic, prajnoha, prockai, rbednar, teigland, zkabelac | ||||
| Version: | 7.1 | ||||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | lvm2-2.02.125-1.el7 | Doc Type: | Bug Fix | ||||
| Doc Text: |
Cause: lvmetad was not keeping track of devices with duplicate PVs.
Consequence: LVM commands would not print warnings about duplicate PVs.
Fix: lvmetad now keeps track of duplicate PVs.
Result: LVM commands print warnings about duplicate PVs.
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2015-11-19 12:45:33 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Does vgck report this still? I don't think it's necessary to report this every time a command runs as that would defeat part of the purpose of lvmetad, but it ought to be made visible in the system log if detected while booting, for example, and - even with lvmetad running - there should be one basic command like vgck that does detect it.

1) Review vgck to ensure it detects and reports cases like these
2) Review boot messages to make sure this sort of problem is visible (logged by pvscan --cache perhaps?)

I think that even though it slightly defeats the purpose of lvmetad, it would add value as a nagging warning to the user that something is actually QUITE wrong. The system should not work with PVs with identical PV UUIDs, since there is no smart way for LVM to pick which device it would use. If the user/admin is not aware of the problem, it could just get worse in the long run. I think that nagging about it would prompt the user to solve the issue, rather than not notice it (or simply forget about it, leaving the system in a generally bad/unpredictable state).

Nenad, that may or may not be true -- there are completely harmless reasons for duplicate PVs, if the device is simply available multiple times even though it is in fact the same device. On the other hand, as you say, it could also indicate serious trouble. In case those are more important to us, it might be worth upgrading this to an error whenever it means that activation or metadata writes become ambiguous. One way to do that would be to add a PV flag like MISSING that would say the PV is ambiguous and cannot be used until fixed. I am a bit worried though that people that “know what they are doing” would complain if they are forced to only ever use unique PV IDs. Either way, I don't think a nagging warning is a good compromise: it's annoying if you want to ignore them but it won't stop you from doing something stupid if you don't know what's going on.
Having duplicate PV UUIDs is always wrong and it either needs to be properly filtered by "automatic" filters (md, mpath, ...) or "manual" filters (devices/global_filter). So I think an error message suits best here, so users know that something is not OK: either the automatic filter failed (and so people can report this to us) or there's a missing manual filter they need to add (in case they're copying PVs, e.g. a source VM image that is copied over to create other VMs). But definitely, the user needs to know this. So I vote for log_error here.

(Also, considering bug #1139216 comment #12, I would call for the log_error/log_warn even more now!)

Since more distros are enabling lvmetad (e.g. Gentoo), we are starting to get reports from users of e.g. KVM who use the same VG name inside the guest (vg0) and then try to access that VG from the host while the guest is shut down. The lvm2 system then behaves weirdly without telling the user where the problem actually is (the WARNING about the duplicate VG name is missing).

Maybe we could provide at least some 'temporary' hack solution until some 'correct' caching one is found. So I'd propose to extend the lvmetad protocol so that once an inconsistency in metadata is found (duplicate PVs, VG names, ...) which lvmetad cannot resolve, it would simply turn into an offline mode: it would not return cached metadata, but just some 'inconsistency error message', and the lvm2 command would then continue with its own scanning (which is a known usable way), reporting this error message with advice that a fix is needed.

Now the problem is how to 're-enable' an lvmetad which is returning the 'inconsistent data' error back to every lvm2 command, since what is consistent with the local filter could still be inconsistent with the global filter. So I think it's valid to print some 'WARNING' that the user needs to fix the consistency and execute pvscan --cache (to use just the global filter) once he thinks he has fixed all the trouble, either by setting proper filtering or by renaming.
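For the "manual" filter case mentioned above (devices/global_filter), a reject rule can be written in the same `devices{...}` form that the reproducer in the description passes to `--config`. The helper below is only a sketch: the function name `reject_filter_config` and the example device path are hypothetical, and it assumes the pattern contains neither `|` nor `"`.

```shell
# Hypothetical helper: build a devices/global_filter string that rejects
# devices matching one regex and accepts everything else.
# Assumption: the pattern contains neither "|" nor a double quote.
reject_filter_config() {
    printf 'devices{global_filter=["r|%s|","a|.*|"]}\n' "$1"
}

# Example (hypothetical path of an LV that a guest uses as a PV):
#   reject_filter_config '^/dev/vg_host/guest_disk$'
```

The resulting list is the kind of value an admin would place in the devices section of lvm.conf, followed by the pvscan --cache run discussed above. Whether an lvmetad-backed command honors such a filter when it is given only on the command line is not something this sketch verifies.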
Comment 7 is not related to this bug. The warnings about duplicate PV UUIDs are now issued with lvmetad, since 24352aff2b30ad23aebd444c752d8b6dd02c9f03; therefore, this bug is POST. For other issues, please open separate tickets.

Tested with:

3.10.0-320.el7.x86_64

lvm2-2.02.130-2.el7    BUILT: Tue Sep 15 14:15:40 CEST 2015
lvm2-libs-2.02.130-2.el7    BUILT: Tue Sep 15 14:15:40 CEST 2015
lvm2-cluster-2.02.130-2.el7    BUILT: Tue Sep 15 14:15:40 CEST 2015
device-mapper-1.02.107-2.el7    BUILT: Tue Sep 15 14:15:40 CEST 2015
device-mapper-libs-1.02.107-2.el7    BUILT: Tue Sep 15 14:15:40 CEST 2015
device-mapper-event-1.02.107-2.el7    BUILT: Tue Sep 15 14:15:40 CEST 2015
device-mapper-event-libs-1.02.107-2.el7    BUILT: Tue Sep 15 14:15:40 CEST 2015
device-mapper-persistent-data-0.5.5-1.el7    BUILT: Thu Aug 13 16:58:10 CEST 2015
cmirror-2.02.130-2.el7    BUILT: Tue Sep 15 14:15:40 CEST 2015

Marking as verified. Details of test results in attachment.

Created attachment 1082798
test results
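The "Fixed In Version" field names lvm2-2.02.125-1.el7, and verification was done on 2.02.130. To check whether a given host already has the fix, the version strings can be compared as below. This is a sketch only: `version_ge` is a hypothetical helper name, and it assumes GNU sort with the `-V` (version sort) option is available.

```shell
# Hypothetical helper: succeed if version $1 is >= version $2.
# Assumption: GNU sort with -V (version sort) is installed.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example use on an RPM-based host:
#   version_ge "$(rpm -q --qf '%{VERSION}' lvm2)" 2.02.125 && echo "fix present"
```

Only the upstream version part is compared here; the release suffix (-1.el7) is ignored.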
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2147.html
Description of problem:

When a system is using 'lvmetad' and there happen to be multiple disks with the same PV UUID, the user will not see an error from lvm2 commands, since duplicates are filtered inside lvmetad.

Here is example code (from the lvm2 test suite, shell/pv-duplicate-uuid.sh):

    pvcreate "$dev1"
    UUID1=$(get pv_field "$dev1" uuid)
    pvcreate --config "devices{filter=[\"a|$dev2|\",\"r|.*|\"]}" -u "$UUID1" --norestorefile "$dev2"
    pvcreate --config "devices{filter=[\"a|$dev3|\",\"r|.*|\"]}" -u "$UUID1" --norestorefile "$dev3"
    pvs -o+uuid

-- gives for the non-lvmetad case:

    Found duplicate PV 7yS60qutMr1Gdol1qBWsFhuOGFUn2fJl: using @TESTDIR@/dev/mapper/@PREFIX@pv2 not @TESTDIR@/dev/mapper/@PREFIX@pv1
    Found duplicate PV 7yS60qutMr1Gdol1qBWsFhuOGFUn2fJl: using @TESTDIR@/dev/mapper/@PREFIX@pv3 not @TESTDIR@/dev/mapper/@PREFIX@pv2
    PV                               VG   Fmt  Attr PSize  PFree  PV UUID
    @TESTDIR@/dev/mapper/@PREFIX@pv3      lvm2 a--  33.67m 33.67m 7yS60q-utMr-1Gdo-l1qB-WsFh-uOGF-Un2fJl

-- while the 'lvmetad' system responds this way:

    PV                               VG   Fmt  Attr PSize  PFree  PV UUID
    @TESTDIR@/dev/mapper/@PREFIX@pv3      lvm2 a--  33.67m 33.67m iQEH6m-1NmK-RXex-xLbb-Kcu3-wJHz-IkUIlm

-- So in the non-lvmetad case the user knows about the problem and may e.g. change the filter and fix the UUID, but with lvmetad the system 'silently' takes the last scanned PV and the user has no information that there is a major consistency problem in his system. For example, users often use LVs as PVs for guest machines and do not properly exclude such PVs from host access.

Version-Release number of selected component (if applicable):
2.02.106

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:
User is informed about duplicate UUIDs in his system.

Additional info:
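The expected result above amounts to grouping devices by PV UUID and flagging any UUID seen on more than one device. A minimal sketch of that grouping follows. It is not how lvm2 or lvmetad implement the check: the function name `find_duplicate_pv_uuids` and the "device uuid" input format are assumptions made for illustration, and the input must come from something that reads each device's label directly, since the `pvs` output shown above already lists only one of the duplicates.

```shell
# Hypothetical helper: report PV UUIDs that appear on more than one device.
# Input (stdin): one "<device> <uuid>" pair per line (assumed format).
# Output: one line per duplicated UUID, "<uuid>: <dev1> <dev2> ...", sorted.
find_duplicate_pv_uuids() {
    awk '
        NF >= 2 {
            count[$2]++
            # Append this device to the list kept for its UUID.
            devs[$2] = (devs[$2] == "" ? $1 : devs[$2] " " $1)
        }
        END {
            for (uuid in count)
                if (count[uuid] > 1)
                    print uuid ": " devs[uuid]
        }
    ' | sort
}

# Example with made-up device names:
#   printf '%s\n' '/dev/pv1 UUID-A' '/dev/pv2 UUID-B' '/dev/pv3 UUID-A' |
#       find_duplicate_pv_uuids
#   prints "UUID-A: /dev/pv1 /dev/pv3"
```

An empty output means no duplicates were found in the supplied list, so a wrapper could log a warning whenever the output is non-empty.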