Bug 1102919 - vgsplitting volume groups without lvmetad running can produce "Checksum error" for each PV split
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lvm2
Version: 7.0
Hardware: x86_64
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Alasdair Kergon
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1133116 1469559
Reported: 2014-05-29 20:10 UTC by Corey Marthaler
Modified: 2023-03-08 07:26 UTC

Fixed In Version: lvm2-2.02.175-1.el7
Doc Text:
Clone Of:
: 1133116 (view as bug list)
Environment:
Last Closed: 2018-04-10 15:16:02 UTC
Target Upstream Version:
Embargoed:


Attachments
-vvvv of the vgsplit (83.55 KB, text/plain)
2014-05-29 20:16 UTC, Corey Marthaler


Links
System ID                              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata RHEA-2018:0853  0        None      None    None     2018-04-10 15:17:48 UTC

Description Corey Marthaler 2014-05-29 20:10:32 UTC
Description of problem:
Without lvmetad running, vgsplit reports a "Checksum error" for each PV being split. With lvmetad running, there are no errors.

[root@harding-03 ~]# pvscan
  PV /dev/sdc2   VG seven             lvm2 [23.29 GiB / 23.29 GiB free]
  PV /dev/sdc3   VG seven             lvm2 [23.29 GiB / 23.29 GiB free]
  PV /dev/sdc4   VG seven             lvm2 [23.29 GiB / 23.29 GiB free]
  PV /dev/sdc1   VG seven             lvm2 [23.29 GiB / 23.29 GiB free]
  PV /dev/sdb4   VG seven             lvm2 [23.29 GiB / 23.29 GiB free]
  PV /dev/sdb3   VG seven             lvm2 [23.29 GiB / 23.29 GiB free]
  PV /dev/sdb2   VG seven             lvm2 [23.29 GiB / 23.29 GiB free]

[root@harding-03 ~]# vgsplit seven ten /dev/sdc2 /dev/sdc3 /dev/sdc4 /dev/sdc1 /dev/sdb4 /dev/sdb3 /dev/sdb2
  /dev/sdc2: Checksum error
  /dev/sdc3: Checksum error
  /dev/sdc4: Checksum error
  /dev/sdc1: Checksum error
  /dev/sdb4: Checksum error
  /dev/sdb3: Checksum error
  /dev/sdb2: Checksum error
  New volume group "ten" successfully split from "seven"


Version-Release number of selected component (if applicable):
3.10.0-110.el7.x86_64
lvm2-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
lvm2-libs-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
lvm2-cluster-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-libs-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-event-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-event-libs-1.02.84-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014
device-mapper-persistent-data-0.2.8-4.el7    BUILT: Fri Jan 24 14:28:55 CST 2014
cmirror-2.02.105-14.el7    BUILT: Wed Mar 26 08:29:41 CDT 2014

Comment 1 Corey Marthaler 2014-05-29 20:16:43 UTC
Created attachment 900505
-vvvv of the vgsplit

Comment 2 Corey Marthaler 2014-05-29 20:17:44 UTC
# From -vvvv output

#metadata/vg.c:60         Allocated VG ten at 0x7f7dc216f770.
#label/label.c:155       /dev/sdb4: lvm2 label detected at sector 1
#format_text/text_label.c:421         /dev/sdb4: PV header extension version 1 found
#config/config.c:411   /dev/sdb4: Checksum error
#format_text/import.c:55         <backtrace>
#format_text/format-text.c:1178         <backtrace>

Comment 4 Alasdair Kergon 2014-07-16 19:07:02 UTC
(I reproduced a simpler version of this but got diverted onto something else before I found the cause.)

Comment 6 Jonathan Earl Brassow 2017-07-26 21:05:52 UTC
[root@bp-01 ~]# pvs
  PV         VG         Fmt  Attr PSize    PFree
  /dev/sda2  rhel_bp-01 lvm2 a--  <464.76g    4.00m
  /dev/sdb1             lvm2 ---  <836.69g <836.69g
  /dev/sdc1             lvm2 ---  <836.69g <836.69g
  /dev/sdd1             lvm2 ---  <836.69g <836.69g
  /dev/sde1             lvm2 ---  <836.69g <836.69g
  /dev/sdf1             lvm2 ---  <836.69g <836.69g
  /dev/sdg1             lvm2 ---  <836.69g <836.69g
  /dev/sdh1             lvm2 ---  <836.69g <836.69g
  /dev/sdi1             lvm2 ---  <836.69g <836.69g
[root@bp-01 ~]# pvscan --cache
[root@bp-01 ~]# vgcreate vg /dev/sd[bcdefghi]1
  Volume group "vg" successfully created
[root@bp-01 ~]# pvscan --cache
[root@bp-01 ~]# lvmconfig global/use_lvmetad
use_lvmetad=0
[root@bp-01 ~]# vgsplit vg new /dev/sd[cdefg]1
  /dev/sdc1: Checksum error
  Couldn't read volume group metadata.
  /dev/sdd1: Checksum error
  Couldn't read volume group metadata.
  /dev/sde1: Checksum error
  Couldn't read volume group metadata.
  /dev/sdf1: Checksum error
  Couldn't read volume group metadata.
  /dev/sdg1: Checksum error
  Couldn't read volume group metadata.
  New volume group "new" successfully split from "vg"

This still appears to be a problem with the following version:

# lvm version
  LVM version:     2.02.172(2)-git (2017-05-03)
  Library version: 1.02.141-git (2017-05-03)
  Driver version:  4.35.0
  Configuration:   ./configure --enable-lvm1_fallback --enable-fsadm --with-pool=internal --with-user= --with-group= --with-device-uid=0 --with-device-gid=6 --with-device-mode=0660 --enable-pkgconfig --enable-units-compat --with-optimisation=-g --enable-cmdlib --enable-dmeventd --libdir=/usr/lib64 --with-usrlibdir=/usr/lib64 --with-pool=internal --enable-applib --enable-python2-bindings --enable-udev_sync --with-thin=internal --enable-lvmetad --with-cache=internal

Comment 8 Alasdair Kergon 2017-09-21 16:07:40 UTC
This seems to be a very old problem that occurs when there are no LVs in the VG.

Comment 9 Alasdair Kergon 2017-09-22 01:43:12 UTC
There are a number of problems with vgsplit.  The main one is that the change is not atomic and recovery may be awkward if the process gets interrupted.  With a bit of thought, this could be improved considerably.

The bug itself comes about because it sometimes overwrites part of the old on-disk metadata with the new and then tries to read the old metadata back again and finds it got corrupted - the checksum error.  When I added an LV, this shifted the metadata within the buffer and it didn't happen.

The vgrename mechanism gets used for the new VG, but it incorrectly sets the 'old' VG name to the same as the 'new' one (because the VG structure got created afresh with the new name).  By setting it instead to the correct old name the errors disappear.

--- a/tools/vgsplit.c
+++ b/tools/vgsplit.c
@@ -705,6 +705,9 @@ int vgsplit(struct cmd_context *cmd, int argc, char **argv)
        if (!vg_rename(cmd, vg_to, vg_name_to))
                goto_bad;
 
+       /* Set old VG name so the metadata operations recognise that the PVs are in an existing VG */
+       vg_to->old_name = vg_from->name;
+
        /* store it on disks */
        log_verbose("Writing out updated volume groups");
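
The failure mode described above (part of the old on-disk metadata is overwritten, then the old metadata is read back and no longer matches its recorded checksum) can be illustrated with a small shell sketch. This is an illustration only, not LVM code: POSIX `cksum` stands in for LVM's own metadata checksum, a temporary file stands in for the on-disk metadata area, and the function names `crc_of`, `overwrite_start`, `verify_crc`, and `demo` are hypothetical.

```shell
#!/usr/bin/env bash
# Illustration only: `cksum` stands in for LVM's metadata checksum,
# and a temp file stands in for the on-disk metadata area.

# Print the CRC of file $1 (first field of cksum output).
crc_of() {
    cksum < "$1" | cut -d' ' -f1
}

# Overwrite the start of file $1 in place with string $2, without truncating.
overwrite_start() {
    printf '%s' "$2" | dd of="$1" conv=notrunc 2>/dev/null
}

# Return 0 if file $1 still matches the recorded CRC $2;
# otherwise print a message in the style of the bug and return 1.
verify_crc() {
    if [ "$(crc_of "$1")" = "$2" ]; then
        return 0
    fi
    echo "$1: Checksum error"
    return 1
}

# Walk through the sequence: record CRC, partially overwrite, re-verify.
demo() {
    local f recorded
    f=$(mktemp)
    printf 'vg { id = "old" seqno = 1 }\n' > "$f"   # made-up "old metadata"
    recorded=$(crc_of "$f")
    verify_crc "$f" "$recorded" && echo "before overwrite: ok"
    overwrite_start "$f" 'new'                      # partial overwrite
    verify_crc "$f" "$recorded" || echo "after overwrite: mismatch"
    rm -f "$f"
}
```

Calling `demo` verifies the file once before and once after the partial overwrite. The second check fails because the recorded checksum still describes the old contents.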

Comment 12 Roman Bednář 2017-10-16 13:41:02 UTC
Marking verified with the latest RPMs. The checksum error no longer appears when splitting a PV from a VG while lvmetad is not running. Adding a regression check to the seven_ten test suite to cover this.

(note: pvscan has to be run prior to splitting in order to trigger this bug)
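
A regression check along these lines could look like the sketch below. It is not the actual seven_ten test: the function names `vgsplit_output_clean` and `check_vgsplit` are hypothetical, and the wrapper assumes root privileges, real PVs, and lvmetad stopped. The check inspects the command output because, in the transcripts in this bug, vgsplit printed the errors and still reported a successful split.

```shell
#!/usr/bin/env bash
# Sketch of a regression check; function names are hypothetical.

# Return 0 if captured vgsplit output $1 is free of the messages
# seen in this bug, 1 otherwise.
vgsplit_output_clean() {
    ! printf '%s\n' "$1" |
        grep -q -e 'Checksum error' -e "Couldn't read volume group metadata"
}

# Run the reproduction steps: pvscan first, then split the given PVs.
# Usage: check_vgsplit <vg_from> <vg_to> <pv>...
# Needs root, real devices, and lvmetad not running.
check_vgsplit() {
    local vg_from=$1 vg_to=$2 out
    shift 2
    pvscan --cache >/dev/null 2>&1
    out=$(vgsplit "$vg_from" "$vg_to" "$@" 2>&1) || {
        printf '%s\n' "$out"
        return 2                      # vgsplit itself failed
    }
    if vgsplit_output_clean "$out"; then
        echo PASS
    else
        printf '%s\n' "$out"
        echo FAIL
        return 1
    fi
}
```

With the devices from the transcripts below, this would be invoked as `check_vgsplit vg vg2 /dev/sdb1`.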


BEFORE PATCH:

# pvs
  PV         VG            Fmt  Attr PSize  PFree 
  /dev/sda1  vg            lvm2 a--  29.98g 29.98g
  /dev/sdb1  vg            lvm2 a--  29.98g 29.98g
  /dev/vda2  rhel_virt-366 lvm2 a--   7.51g 40.00m

# systemctl is-active lvm2-lvmetad
inactive

# pvscan --cache

# vgsplit vg vg2 /dev/sdb1
  /dev/sdb1: Checksum error
  Couldn't read volume group metadata.
  New volume group "vg2" successfully split from "vg"

===================================================
AFTER PATCH:

# pvs
  PV         VG            Fmt  Attr PSize   PFree  
  /dev/sda1  vg            lvm2 a--  <29.99g <29.99g
  /dev/sdb1  vg            lvm2 a--  <29.99g <29.99g
  /dev/sdc1                lvm2 ---  <30.00g <30.00g
  /dev/sdd1                lvm2 ---  <30.00g <30.00g
  /dev/sde1                lvm2 ---  <30.00g <30.00g
  /dev/sdf1                lvm2 ---  <30.00g <30.00g
  /dev/sdg1                lvm2 ---  <30.00g <30.00g
  /dev/sdh1                lvm2 ---  <30.00g <30.00g
  /dev/sdi1                lvm2 ---  <30.00g <30.00g
  /dev/sdj1                lvm2 ---  <30.00g <30.00g
  /dev/vda2  rhel_virt-371 lvm2 a--   <7.00g      0 

# systemctl is-active lvm2-lvmetad
inactive

# pvscan --cache

# vgsplit vg vg2 /dev/sdb1
  New volume group "vg2" successfully split from "vg"



===================================================


3.10.0-727.el7.x86_64

lvm2-2.02.175-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
lvm2-libs-2.02.175-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
lvm2-cluster-2.02.175-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
device-mapper-1.02.144-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
device-mapper-libs-1.02.144-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
device-mapper-event-1.02.144-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
device-mapper-event-libs-1.02.144-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017
device-mapper-persistent-data-0.7.3-2.el7    BUILT: Tue Oct 10 11:00:07 CEST 2017
cmirror-2.02.175-2.el7    BUILT: Fri Oct 13 13:31:22 CEST 2017

Comment 15 errata-xmlrpc 2018-04-10 15:16:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0853

