Bug 1446309
| Summary: | lvmetad shouldn't be dumped unless repair cmd was actually completed | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | lvm2 | Assignee: | David Teigland <teigland> |
| lvm2 sub component: | LVM Metadata / lvmetad | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | low | | |
| Priority: | unspecified | CC: | agk, bugzilla, cfeller, coughlan, heinzm, jbrassow, loberman, mcsontos, msnitzer, prajnoha, teigland, zkabelac |
| Version: | 7.4 | | |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | lvm2-2.02.184-1.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-08-06 13:10:41 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1577173 | | |
Description
Corey Marthaler
2017-04-27 16:08:34 UTC
This is not easy to fix because the logic that disables lvmetad usage happens right at the beginning of the command, long before we begin processing things and know whether the LV exists. In a case like this, we don't want to start using lvmetad when we think it possibly has wrong data, so we really do want to disable lvmetad right from the start.

The solution to this is that the repair command should probably attempt to re-enable lvmetad at the end, before exiting. We already do this sort of thing, e.g. in vgcfgrestore, so it's feasible. But we'd need to consider the cases in which we don't want to re-enable lvmetad, e.g. if the repair failed because there is still bad device state. So this is an optimization that requires some careful work because of the potential problems involved. It's not suitable for 7.4.

Is it documented that --repair requires lvmetad to be disabled? Is there any benefit in prompting for this? It's tempting to make the current behaviour the defined behaviour and close this one.

I've documented that repair disables lvmetad. There is little benefit to prompting because lvconvert --repair is run automatically by dmeventd. We can leave this open for 7.5 to consider the other options in comment 2.

Another example I came across that can be used for testing if/when a solution arrives:

    [root@host-006 ~]# lvconvert --repair mirror_sanity/fs_to_mirror
      WARNING: Disabling lvmetad cache for repair command.
      WARNING: Not using lvmetad because of repair.
      Volume mirror_sanity/fs_to_mirror is consistent. Nothing to repair.
    [root@host-006 ~]# lvs
      WARNING: Not using lvmetad because a repair command was run.

We are having the same problem. Is there a workaround to re-enable lvmetad?

    ]# lvchange -ay vg0
      WARNING: Not using lvmetad because a repair command was run.
    ]# lvchange -an vg0
      WARNING: Not using lvmetad because a repair command was run.

"pvscan --cache" is the standard command to rebuild the lvmetad cache and resolves most issues related to lvmetad.

I think that the current method of re-enabling lvmetad (a manual pvscan --cache) is the best option. Repairing storage problems inherently involves some manual intervention and analysis. lvm doesn't know when everything has been properly resolved; the user has to decide this. We cannot safely re-enable lvmetad until everything is resolved, and since this is decided by the user, the step is manual, not automated. So I don't think there's anything we should change here.
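To spell out the workaround asked about above: once the underlying device problems are resolved and the repair work is judged complete, the cache is rebuilt by hand. A minimal sketch, reusing the vg0 name from the report above; the exact warnings vary by lvm2 version:

    # While the lvmetad cache is disabled, commands keep warning:
    #   WARNING: Not using lvmetad because a repair command was run.
    # Rescan all devices and repopulate the lvmetad cache:
    pvscan --cache

    # Subsequent commands should use the refreshed cache again, without the warning:
    pvs
    lvchange -ay vg0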
https://sourceware.org/git/?p=lvm2.git;a=commit;h=322d4ed05e348c6d88f3cb880485e5777298c361

This change addresses the original description in this bug, which is that lvconvert --repair doesn't need to disable lvmetad if nothing is done.

    # pvs
      PV         VG           Fmt  Attr PSize    PFree
      /dev/sda2  rhel_null-05 lvm2 a--  <464.76g    4.00m
      /dev/sdf   gg           lvm2 a--  <931.01g  930.42g
      /dev/sdg   gg           lvm2 a--  <931.01g <930.88g
    # lvconvert --repair gg/mm
      WARNING: Not using lvmetad because of repair.
      Volume gg/mm is consistent. Nothing to repair.
    # pvs
      PV         VG           Fmt  Attr PSize    PFree
      /dev/sda2  rhel_null-05 lvm2 a--  <464.76g    4.00m
      /dev/sdf   gg           lvm2 a--  <931.01g  930.42g
      /dev/sdg   gg           lvm2 a--  <931.01g <930.88g
    # lvconvert --repair gg/tp0
      WARNING: Not using lvmetad because of repair.
      WARNING: Disabling lvmetad cache for repair command.
      WARNING: LV gg/tp0_meta7 holds a backup of the unrepaired metadata. Use lvremove when no longer required.
      WARNING: New metadata LV gg/tp0_tmeta might use different PVs. Move it with pvmove if required.
    # pvs
      WARNING: Not using lvmetad because a repair command was run.
      PV         VG           Fmt  Attr PSize    PFree
      /dev/sda2  rhel_null-05 lvm2 a--  <464.76g    4.00m
      /dev/sdf   gg           lvm2 a--  <931.01g <930.42g
      /dev/sdg   gg           lvm2 a--  <931.01g <930.88g
    # pvscan --cache
    # pvs
      PV         VG           Fmt  Attr PSize    PFree
      /dev/sda2  rhel_null-05 lvm2 a--  <464.76g    4.00m
      /dev/sdf   gg           lvm2 a--  <931.01g <930.42g
      /dev/sdg   gg           lvm2 a--  <931.01g <930.88g
    # lvconvert --repair gg/rr
      WARNING: Not using lvmetad because of repair.
    Attempt to replace failed RAID images (requires full device resync)? [y/n]: y
      WARNING: Disabling lvmetad cache for repair command.
      gg/rr does not contain devices specified to replace.
      Faulty devices in gg/rr successfully replaced.
    # pvs
      WARNING: Not using lvmetad because a repair command was run.
      PV         VG           Fmt  Attr PSize    PFree
      /dev/sda2  rhel_null-05 lvm2 a--  <464.76g    4.00m
      /dev/sdf   gg           lvm2 a--  <931.01g <930.42g
      /dev/sdg   gg           lvm2 a--  <931.01g <930.88g
    # pvscan --cache
    # pvs
      PV         VG           Fmt  Attr PSize    PFree
      /dev/sda2  rhel_null-05 lvm2 a--  <464.76g    4.00m
      /dev/sdf   gg           lvm2 a--  <931.01g <930.42g
      /dev/sdg   gg           lvm2 a--  <931.01g <930.88g
    # lvconvert --repair aa/rr
      WARNING: Not using lvmetad because of repair.
      Volume group "aa" not found
      Cannot process volume group aa
    # pvs
      PV         VG           Fmt  Attr PSize    PFree
      /dev/sda2  rhel_null-05 lvm2 a--  <464.76g    4.00m
      /dev/sdf   gg           lvm2 a--  <931.01g <930.42g
      /dev/sdg   gg           lvm2 a--  <931.01g <930.88g

These examples now work without dumping lvmetad. Marking verified in the latest rpms.

    3.10.0-1057.el7.x86_64
    lvm2-2.02.185-2.el7                         BUILT: Fri Jun 21 04:18:48 CDT 2019
    lvm2-libs-2.02.185-2.el7                    BUILT: Fri Jun 21 04:18:48 CDT 2019
    lvm2-cluster-2.02.185-2.el7                 BUILT: Fri Jun 21 04:18:48 CDT 2019
    lvm2-lockd-2.02.185-2.el7                   BUILT: Fri Jun 21 04:18:48 CDT 2019
    lvm2-python-boom-0.9-18.el7                 BUILT: Fri Jun 21 04:18:58 CDT 2019
    cmirror-2.02.185-2.el7                      BUILT: Fri Jun 21 04:18:48 CDT 2019
    device-mapper-1.02.158-2.el7                BUILT: Fri Jun 21 04:18:48 CDT 2019
    device-mapper-libs-1.02.158-2.el7           BUILT: Fri Jun 21 04:18:48 CDT 2019
    device-mapper-event-1.02.158-2.el7          BUILT: Fri Jun 21 04:18:48 CDT 2019
    device-mapper-event-libs-1.02.158-2.el7     BUILT: Fri Jun 21 04:18:48 CDT 2019
    device-mapper-persistent-data-0.8.5-1.el7   BUILT: Mon Jun 10 03:58:20 CDT 2019

    [root@hayes-01 ~]# vgcreate VG /dev/sd[bcd]1
      Physical volume "/dev/sdb1" successfully created.
      Physical volume "/dev/sdc1" successfully created.
      Physical volume "/dev/sdd1" successfully created.
      Volume group "VG" successfully created
    [root@hayes-01 ~]# lvs
    [root@hayes-01 ~]# lvconvert --repair VG/doesnt_exist
      WARNING: Not using lvmetad because of repair.
      Failed to find logical volume "VG/doesnt_exist"
    [root@hayes-01 ~]# lvs
    [root@hayes-01 ~]# lvcreate --thinpool pool -L 1G --poolmetadatasize 4M VG
      Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
      Logical volume "pool" created.
    [root@hayes-01 ~]# lvcreate --virtualsize 250M -T VG/pool -n V1
      Rounding up size to full physical extent 252.00 MiB
      Logical volume "V1" created.
    [root@hayes-01 ~]# lvs -a -o +devices
      LV              VG Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
      V1              VG Vwi-a-tz-- 252.00m pool        0.00
      [lvol0_pmspare] VG ewi-------   4.00m                                                     /dev/sdb1(0)
      pool            VG twi-aotz--   1.00g             0.00   11.04                            pool_tdata(0)
      [pool_tdata]    VG Twi-ao----   1.00g                                                     /dev/sdb1(1)
      [pool_tmeta]    VG ewi-ao----   4.00m                                                     /dev/sdd1(0)
    [root@hayes-01 ~]# lvconvert --repair VG/pool
      WARNING: Not using lvmetad because of repair.
      Active pools cannot be repaired. Use lvchange -an first.
    [root@hayes-01 ~]# lvs -a -o +devices
      LV              VG Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
      V1              VG Vwi-a-tz-- 252.00m pool        0.00
      [lvol0_pmspare] VG ewi-------   4.00m                                                     /dev/sdb1(0)
      pool            VG twi-aotz--   1.00g             0.00   11.04                            pool_tdata(0)
      [pool_tdata]    VG Twi-ao----   1.00g                                                     /dev/sdb1(1)
      [pool_tmeta]    VG ewi-ao----   4.00m                                                     /dev/sdd1(0)
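For reference, the deactivate/repair path that the "Use lvchange -an first" message points to would look roughly like the following. This is a sketch using the VG, pool, and V1 names from the transcript above, not output captured during verification; the name of the metadata backup LV left behind varies (gg/tp0_meta7 in the earlier example).

    # Deactivate the thin volume and then the pool; an active pool cannot be repaired.
    lvchange -an VG/V1
    lvchange -an VG/pool

    # Repair the pool metadata. A repair that actually does work still disables the
    # lvmetad cache and leaves the unrepaired metadata in a backup LV (remove it
    # with lvremove once it is no longer needed).
    lvconvert --repair VG/pool

    # Reactivate and, once satisfied that everything is resolved, rebuild the cache.
    lvchange -ay VG/pool
    lvchange -ay VG/V1
    pvscan --cache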
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2253