Bug 1102919
Summary: | vgspliting volume groups without lvmetad running can produce "Checksum error" for each PV split | | |
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Corey Marthaler <cmarthal> |
Component: | lvm2 | Assignee: | Alasdair Kergon <agk> |
lvm2 sub component: | Changing Logical Volumes | QA Contact: | cluster-qe <cluster-qe> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | medium | | |
Priority: | low | CC: | agk, heinzm, jbrassow, lmiksik, msnitzer, nperic, prajnoha, prockai, rbednar, zkabelac |
Version: | 7.0 | Keywords: | Triaged |
Target Milestone: | rc | | |
Target Release: | --- | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | lvm2-2.02.175-1.el7 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 1133116 | Environment: | |
Last Closed: | 2018-04-10 15:16:02 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1133116, 1469559 | | |
Attachments: | | | |
Description
Corey Marthaler
2014-05-29 20:10:32 UTC
Created attachment 900505
-vvvv of the vgsplit
# From -vvvv output
#metadata/vg.c:60 Allocated VG ten at 0x7f7dc216f770.
#label/label.c:155 /dev/sdb4: lvm2 label detected at sector 1
#format_text/text_label.c:421 /dev/sdb4: PV header extension version 1 found
#config/config.c:411 /dev/sdb4: Checksum error
#format_text/import.c:55 <backtrace>
#format_text/format-text.c:1178 <backtrace>

(I reproduced a simpler version of this but got diverted onto something else before I found the cause.)

[root@bp-01 ~]# pvs
  PV         VG         Fmt  Attr PSize    PFree
  /dev/sda2  rhel_bp-01 lvm2 a--  <464.76g    4.00m
  /dev/sdb1             lvm2 ---  <836.69g <836.69g
  /dev/sdc1             lvm2 ---  <836.69g <836.69g
  /dev/sdd1             lvm2 ---  <836.69g <836.69g
  /dev/sde1             lvm2 ---  <836.69g <836.69g
  /dev/sdf1             lvm2 ---  <836.69g <836.69g
  /dev/sdg1             lvm2 ---  <836.69g <836.69g
  /dev/sdh1             lvm2 ---  <836.69g <836.69g
  /dev/sdi1             lvm2 ---  <836.69g <836.69g
[root@bp-01 ~]# pvscan --cache
[root@bp-01 ~]# vgcreate vg /dev/sd[bcdefghi]1
  Volume group "vg" successfully created
[root@bp-01 ~]# pvscan --cache
[root@bp-01 ~]# lvmconfig global/use_lvmetad
use_lvmetad=0
[root@bp-01 ~]# vgsplit vg new /dev/sd[cdefg]1
  /dev/sdc1: Checksum error
  Couldn't read volume group metadata.
  /dev/sdd1: Checksum error
  Couldn't read volume group metadata.
  /dev/sde1: Checksum error
  Couldn't read volume group metadata.
  /dev/sdf1: Checksum error
  Couldn't read volume group metadata.
  /dev/sdg1: Checksum error
  Couldn't read volume group metadata.
  New volume group "new" successfully split from "vg"

still appears to be a problem -

# lvm version
  LVM version:     2.02.172(2)-git (2017-05-03)
  Library version: 1.02.141-git (2017-05-03)
  Driver version:  4.35.0
  Configuration:   ./configure --enable-lvm1_fallback --enable-fsadm --with-pool=internal --with-user= --with-group= --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 --enable-pkgconfig --enable-units-compat --with-optimisation=-g --enable-cmdlib --enable-dmeventd --libdir=/usr/lib64 --with-usrlibdir=/usr/lib64 --with-pool=internal --enable-applib --enable-python2-bindings --enable-udev_sync --with-thin=internal --enable-lvmetad --with-cache=internal

This seems to be a very old problem that occurs when there are no LVs in the VG.

There are a number of problems with vgsplit. The main one is that the change is not atomic and recovery may be awkward if the process gets interrupted. With a bit of thought, this could be improved considerably.

The bug itself comes about because it sometimes overwrites part of the old on-disk metadata with the new and then tries to read the old metadata back again and finds it got corrupted - the checksum error. When I added an LV, this shifted the metadata within the buffer and it didn't happen.

The vgrename mechanism gets used for the new VG, but it incorrectly sets the 'old' VG name to the same as the 'new' one (because the VG structure got created afresh with the new name). By setting it instead to the correct old name the errors disappear.
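The failure mode described above (part of the old metadata is overwritten, then read back and fails its checksum) can be illustrated with a toy model. This is only a sketch under stated assumptions: a temporary file stands in for the on-disk metadata area, `cksum` stands in for whatever checksum lvm2 uses, and `simulate_overwrite` plus the two metadata strings are hypothetical names invented for the illustration. It does not reproduce lvm2's on-disk format or code paths.

```shell
#!/bin/sh
# Toy model only: NOT lvm2's metadata format or checksum routine.

# simulate_overwrite NEW_OFFSET
# 1. writes "old" metadata at offset 0 of a scratch area and records its checksum
# 2. writes "new" metadata at byte offset NEW_OFFSET of the same area
# 3. re-reads the old region and compares its checksum with the recorded one
simulate_overwrite() {
    new_offset=$1
    area=$(mktemp) || return 2

    # Hypothetical metadata texts, invented for this sketch.
    old='vg { id = "old" seqno = 1 }'
    new='new { id = "new" seqno = 1 }'
    old_len=${#old}

    # Zero-filled scratch area standing in for the metadata area.
    head -c 128 /dev/zero > "$area"

    # Store the old metadata and remember the checksum of that region.
    printf '%s' "$old" | dd of="$area" conv=notrunc 2>/dev/null
    old_sum=$(head -c "$old_len" "$area" | cksum)

    # Write the new metadata at the requested offset without truncating.
    printf '%s' "$new" | dd of="$area" bs=1 seek="$new_offset" conv=notrunc 2>/dev/null

    # Read the old region back and verify it against the recorded checksum.
    now_sum=$(head -c "$old_len" "$area" | cksum)
    rm -f "$area"

    if [ "$old_sum" = "$now_sum" ]; then
        echo "ok"
    else
        echo "Checksum error"
    fi
}
```

In this model, `simulate_overwrite 10` places the new text inside the old region, so the re-read fails verification, while `simulate_overwrite 64` places it past the old region and the re-read still verifies. That loosely mirrors the observation that shifting the metadata within the buffer made the error disappear.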
--- a/tools/vgsplit.c
+++ b/tools/vgsplit.c
@@ -705,6 +705,9 @@ int vgsplit(struct cmd_context *cmd, int argc, char **argv)
 	if (!vg_rename(cmd, vg_to, vg_name_to))
 		goto_bad;
 
+	/* Set old VG name so the metadata operations recognise that the PVs are in an existing VG */
+	vg_to->old_name = vg_from->name;
+
 	/* store it on disks */
 	log_verbose("Writing out updated volume groups");

https://sourceware.org/git/?p=lvm2.git;a=commitdiff;h=8146548d25e9104f0d530d943290d448c1994c0a
https://www.redhat.com/archives/lvm-devel/2017-September/msg00046.html

Marking verified with latest rpms. Checksum error no longer appears when splitting a pv from vg while lvmetad is not running. Adding regression check to seven_ten test suite to have this covered.

(note: pvscan has to be run prior to splitting in order to trigger this bug)

BEFORE PATCH:

# pvs
  PV         VG            Fmt  Attr PSize  PFree
  /dev/sda1  vg            lvm2 a--  29.98g 29.98g
  /dev/sdb1  vg            lvm2 a--  29.98g 29.98g
  /dev/vda2  rhel_virt-366 lvm2 a--   7.51g 40.00m
# systemctl is-active lvm2-lvmetad
inactive
# pvscan --cache
# vgsplit vg vg2 /dev/sdb1
  /dev/sdb1: Checksum error
  Couldn't read volume group metadata.
  New volume group "vg2" successfully split from "vg"

===================================================

AFTER PATCH:

# pvs
  PV         VG            Fmt  Attr PSize   PFree
  /dev/sda1  vg            lvm2 a--  <29.99g <29.99g
  /dev/sdb1  vg            lvm2 a--  <29.99g <29.99g
  /dev/sdc1                lvm2 ---  <30.00g <30.00g
  /dev/sdd1                lvm2 ---  <30.00g <30.00g
  /dev/sde1                lvm2 ---  <30.00g <30.00g
  /dev/sdf1                lvm2 ---  <30.00g <30.00g
  /dev/sdg1                lvm2 ---  <30.00g <30.00g
  /dev/sdh1                lvm2 ---  <30.00g <30.00g
  /dev/sdi1                lvm2 ---  <30.00g <30.00g
  /dev/sdj1                lvm2 ---  <30.00g <30.00g
  /dev/vda2  rhel_virt-371 lvm2 a--   <7.00g      0
# systemctl is-active lvm2-lvmetad
inactive
# pvscan --cache
# vgsplit vg vg2 /dev/sdb1
  New volume group "vg2" successfully split from "vg"

===================================================

3.10.0-727.el7.x86_64

lvm2-2.02.175-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
lvm2-libs-2.02.175-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
lvm2-cluster-2.02.175-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
device-mapper-1.02.144-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
device-mapper-libs-1.02.144-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
device-mapper-event-1.02.144-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
device-mapper-event-libs-1.02.144-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
device-mapper-persistent-data-0.7.3-2.el7    BUILT: Tue Oct 10 11:00:07 CEST 2017
cmirror-2.02.175-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0853
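The before/after verification above comes down to checking whether the vgsplit output contains "Checksum error". A minimal sketch of such a check follows. It is not the actual regression test that was added to the test suite, and the helper name `check_vgsplit_output` is a hypothetical one chosen for this illustration. The helper only inspects text on stdin, so it can be tried without touching any devices.

```shell
#!/bin/sh
# Sketch only: not the real test-suite check.

# check_vgsplit_output: reads command output on stdin and reports whether
# it contains the "Checksum error" message seen before the patch.
check_vgsplit_output() {
    if grep -q 'Checksum error'; then
        echo "FAIL: checksum error reported"
        return 1
    fi
    echo "PASS"
}

# Intended use, following the reproduction steps in this report
# (needs root and scratch PVs; the VG and device names are the ones
# from the verification run and will differ on other systems):
#
#   systemctl is-active lvm2-lvmetad   # expected: inactive
#   pvscan --cache                     # needed to trigger the bug
#   vgsplit vg vg2 /dev/sdb1 2>&1 | check_vgsplit_output
```

Feeding it the BEFORE PATCH output would report a failure, and the AFTER PATCH output would pass.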