Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1787071

Summary: pvresize overwrites wrong PV headers after failing to update old extension headers because of scsi reservations
Product: Red Hat Enterprise Linux 7
Component: lvm2
lvm2 sub component: Command-line tools
Version: 7.7
Status: CLOSED ERRATA
Severity: high
Priority: urgent
Reporter: Alexandros Panagiotou <apanagio>
Assignee: David Teigland <teigland>
QA Contact: cluster-qe <cluster-qe>
CC: agk, cmarthal, heinzm, jbrassow, lmiksik, mcsontos, msnitzer, prajnoha, rbednar, rhandlin, rmadhuso, teigland, thornber, zkabelac
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Fixed In Version: lvm2-2.02.186-6.el7
Last Closed: 2020-03-31 20:04:51 UTC
Type: Bug
Attachments:
  verbose output and strace of a failed pvresize and scripts (flags: none)
  Test results including pvresize verbose output, strace and a funcgraph of the reproducing run of pvresize (flags: none)

Description Alexandros Panagiotou 2019-12-30 17:34:06 UTC
Created attachment 1648656 [details]
verbose output and strace of a failed pvresize and scripts

Description of problem:
The problem appeared in a multi-node SAP HANA cluster and is triggered by factors related to the cluster architecture (mostly scsi reservations preventing write operations on all shared LUNs).

The result is that pvresize overwrites PV headers of some PVs with headers from other PVs. As an example:

  "pvresize /dev/mapper/mpathd" results in the PV header of mpathf being written on mpathd and of mpathc being written on sda2.

The sequence of events we can see in verbose logs is:

  1) pvresize detects that mpathf has an "old extension header" that needs to be updated.

  2) There is a scsi reservation on mpathf that is blocking writes from the node that pvresize runs on. As a result, pvresize fails to write the updated header.

  3) Later in the pvresize execution, mpathd is updated. An strace of pvresize shows the header of mpathf being written to mpathd.


Version-Release number of selected component (if applicable):
  Reproduced with lvm2-2.02.185-2.el7.x86_64 & lvm2-2.02.185-2.el7_7.2.x86_64

How reproducible:
  So far 100% reproducible.


Steps to Reproduce:
1. Create 2 systems (node1 and node2) with 7 shared LUNs (mpatha-mpathg). This also needs a 3rd system acting as the "storage array". I have been using targetcli/LIO on RHEL7 as my storage.
2. On each shared PV, create a VG (vg_a - vg_g) and in each VG an LV (lv_a - lv_g). This needs to be done with an old lvm version (I have used lvm2-2.02.130-5.el7.x86_64 from RHEL 7.2).
3. Add a "Write Exclusive, registrants only" scsi reservation from node2 on LUNs: mpatha mpathb mpathc mpathe mpathf
   Add a "Write Exclusive, registrants only" scsi reservation from node1 on LUNs: mpathd mpathg

   Now both nodes have read access on all LUNs, but they can only write on the LUNs each of them has reserved.
4. Update LVM to one of the affected versions listed above.
5. On the shared storage, increase the size of LUN mpathd. Rescan the scsi bus and multipath (e.g. rescan-scsi-bus.sh -m -s) to detect the change.
6. On node1 (that owns mpathd) run pvresize /dev/mapper/mpathd
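
The reservation layout from step 3 can be written down as a small lookup table. This is only a toy model of the access rules described in the steps, not a SCSI implementation; it assumes each node is registered solely on the LUNs it reserved, and the function names are made up for illustration.

```python
# Toy model of the reservation layout in step 3 (not a SCSI implementation).
# Assumption: each node is registered only on the LUNs it reserved, so with
# "Write Exclusive, registrants only" it is the only node that may write there.
RESERVED_BY = {
    "mpatha": "node2", "mpathb": "node2", "mpathc": "node2",
    "mpathd": "node1",
    "mpathe": "node2", "mpathf": "node2",
    "mpathg": "node1",
}


def can_read(node: str, lun: str) -> bool:
    """Write Exclusive reservations do not restrict reads."""
    return lun in RESERVED_BY


def can_write(node: str, lun: str) -> bool:
    """Only the node holding the reservation may write in this model."""
    return RESERVED_BY.get(lun) == node


def writable_luns(node: str) -> list:
    """Return the sorted list of LUNs the given node may write to."""
    return sorted(lun for lun in RESERVED_BY if can_write(node, lun))


if __name__ == "__main__":
    for node in ("node1", "node2"):
        print(node, "can write:", writable_luns(node))
```

In this model node1 can read mpathf but cannot write to it, which matches the write failure seen when pvresize tries to update the header on mpathf from node1.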


Actual results:
pvresize fails. After the failure pvs -v -a looks similar to:

# pvs -v -a
  WARNING: Not using device /dev/mapper/mpathc for PV Rvhf2V-peNB-WLx9-yrP0-x8mi-5S9l-9AIA6D.
  WARNING: Not using device /dev/mapper/mpathd for PV XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz.
  WARNING: PV Rvhf2V-peNB-WLx9-yrP0-x8mi-5S9l-9AIA6D prefers device /dev/sda2 because device is used by LV.
  WARNING: PV XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz prefers device /dev/mapper/mpathf because device size is correct.
  PV                 VG        Fmt  Attr PSize   PFree  DevSize PV UUID
  /dev/mapper/mpatha vg_a      lvm2 a--  <50,00g 25,00g  50,00g OWPpmm-FiyJ-CvaO-eVnd-Tgej-SkkK-DkroeM
  /dev/mapper/mpathb vg_b      lvm2 a--  <50,00g 25,00g  50,00g 19qYTb-GNSN-DclK-85eW-zJm3-xdf9-MHcezg
  /dev/mapper/mpathc [unknown] lvm2 d--       0      0   50,00g Rvhf2V-peNB-WLx9-yrP0-x8mi-5S9l-9AIA6D <---+
  /dev/mapper/mpathd vg_f      lvm2 d--       0      0   54,00g XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz <-+ |
  /dev/mapper/mpathe vg_e      lvm2 a--  <50,00g 25,00g  50,00g Z6TOqr-n9mU-Fy8e-Rmt2-0Iat-lHCt-rVQbbS   | |
  /dev/mapper/mpathf vg_f      lvm2 a--  <50,00g 25,00g  50,00g XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz <-+ |
  /dev/mapper/mpathg vg_g      lvm2 a--  <50,00g 25,00g  50,00g Ydv7Lq-1WrC-O9Sc-578S-qNkt-jXpJ-V3tGlB     |
  /dev/sda1                         ---       0      0  500,00m                                            |
  /dev/sda2                    lvm2 ---   50,00g 50,00g   5,51g Rvhf2V-peNB-WLx9-yrP0-x8mi-5S9l-9AIA6D <---+

The header of mpathf is cloned onto mpathd.
The header of mpathc is cloned onto sda2.
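
A quick way to spot this kind of damage is to look for PV UUIDs that show up on more than one device in the pvs output. The following is a minimal sketch of such a check; the function name and the abbreviated sample are illustrative, and it assumes the UUID is printed in the usual 6-4-4-4-4-4-6 hyphenated form seen above.

```python
import re
from collections import defaultdict

# PV UUIDs as printed by pvs: groups of 6-4-4-4-4-4-6 alphanumeric characters.
PV_UUID = re.compile(r"[A-Za-z0-9]{6}(?:-[A-Za-z0-9]{4}){5}-[A-Za-z0-9]{6}")


def duplicate_pv_uuids(pvs_output: str) -> dict:
    """Map each PV UUID seen on more than one device to the list of devices."""
    by_uuid = defaultdict(list)
    for line in pvs_output.splitlines():
        fields = line.split()
        # Skip blank lines, the column header and WARNING lines.
        if not fields or not fields[0].startswith("/dev/"):
            continue
        match = PV_UUID.search(line)
        if match:
            by_uuid[match.group(0)].append(fields[0])
    return {uuid: devs for uuid, devs in by_uuid.items() if len(devs) > 1}


# Abbreviated copy of the "pvs -v -a" output from this report.
SAMPLE = """\
  WARNING: Not using device /dev/mapper/mpathd for PV XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz.
  PV                 VG        Fmt  Attr PSize   PFree  DevSize PV UUID
  /dev/mapper/mpathc [unknown] lvm2 d--       0      0   50,00g Rvhf2V-peNB-WLx9-yrP0-x8mi-5S9l-9AIA6D
  /dev/mapper/mpathd vg_f      lvm2 d--       0      0   54,00g XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz
  /dev/mapper/mpathe vg_e      lvm2 a--  <50,00g 25,00g  50,00g Z6TOqr-n9mU-Fy8e-Rmt2-0Iat-lHCt-rVQbbS
  /dev/mapper/mpathf vg_f      lvm2 a--  <50,00g 25,00g  50,00g XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz
  /dev/sda1                         ---       0      0  500,00m
  /dev/sda2                    lvm2 ---   50,00g 50,00g   5,51g Rvhf2V-peNB-WLx9-yrP0-x8mi-5S9l-9AIA6D
"""

if __name__ == "__main__":
    for uuid, devices in duplicate_pv_uuids(SAMPLE).items():
        print(uuid, "appears on", ", ".join(devices))
```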


Expected results:
pvresize succeeds and does not write wrong PV headers on other devices.


Additional info:
1) Using the global_filter provides a functional workaround, by masking all other devices except mpathd. e.g.:

   pvresize --config 'devices{ global_filter =  [ "a|/dev/mapper/mpathd|" , "r|.*|" ] }' /dev/mapper/mpathd

   This both succeeds and does not overwrite any other device.

2) I am attaching an archive containing verbose pvresize output and an strace collected in parallel from a failed run. It also contains the scripts I am using to setup/clear and view the scsi reservations.

3) I have been trying to get a simpler reproducer by limiting write access to VM disks, but have failed so far.

Comment 8 Alexandros Panagiotou 2020-01-10 19:00:13 UTC
Created attachment 1651358 [details]
Test results including pvresize verbose output, strace and a funcgraph of the reproducing run of pvresize

Hello,
A few more notes along with the results of one more test (test5). This time I have excluded sda2 via filtering, to keep root from getting overwritten and to simplify recovery:

[1] The initial good state (before pvresize runs) is:

    # pvs -a -v
      PV                 VG   Fmt  Attr PSize   PFree  DevSize PV UUID                               
      /dev/mapper/mpatha vg_a lvm2 a--  <50,00g 25,00g  50,00g OWPpmm-FiyJ-CvaO-eVnd-Tgej-SkkK-DkroeM
      /dev/mapper/mpathb vg_b lvm2 a--  <50,00g 25,00g  50,00g 19qYTb-GNSN-DclK-85eW-zJm3-xdf9-MHcezg
      /dev/mapper/mpathc vg_c lvm2 a--  <50,00g 25,00g  50,00g Rvhf2V-peNB-WLx9-yrP0-x8mi-5S9l-9AIA6D
      /dev/mapper/mpathd vg_d lvm2 a--  <54,00g 29,00g  56,00g AFNrSH-mwMc-QQz0-jBSQ-MoQI-X2gj-VKq5Xe
      /dev/mapper/mpathe vg_e lvm2 a--  <50,00g 25,00g  50,00g Z6TOqr-n9mU-Fy8e-Rmt2-0Iat-lHCt-rVQbbS
      /dev/mapper/mpathf vg_f lvm2 a--  <50,00g 25,00g  50,00g XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz
      /dev/mapper/mpathg vg_g lvm2 a--  <50,00g 25,00g  50,00g Ydv7Lq-1WrC-O9Sc-578S-qNkt-jXpJ-V3tGlB
      /dev/sda1                    ---       0      0  500,00m                                       
      /dev/sda2          r7vg lvm2 a--   <9,51g <2,38g   9,51g xW57i5-rfdb-30UH-fnLG-GbAm-DDPL-eLz8d2

[2] After the test, pvs looks like (included in 1787071.test5.tar.gz):

      WARNING: Not using device /dev/mapper/mpathd for PV XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz.
      WARNING: PV XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz prefers device /dev/mapper/mpathf because device size is correct.
      PV                 VG   Fmt  Attr PSize   PFree  DevSize PV UUID                               
      /dev/mapper/mpatha vg_a lvm2 a--  <50,00g 25,00g  50,00g OWPpmm-FiyJ-CvaO-eVnd-Tgej-SkkK-DkroeM
      /dev/mapper/mpathb vg_b lvm2 a--  <50,00g 25,00g  50,00g 19qYTb-GNSN-DclK-85eW-zJm3-xdf9-MHcezg
      /dev/mapper/mpathc vg_c lvm2 a--  <50,00g 25,00g  50,00g Rvhf2V-peNB-WLx9-yrP0-x8mi-5S9l-9AIA6D
      /dev/mapper/mpathd vg_f lvm2 d--       0      0   54,00g XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz <-+
      /dev/mapper/mpathe vg_e lvm2 a--  <50,00g 25,00g  50,00g Z6TOqr-n9mU-Fy8e-Rmt2-0Iat-lHCt-rVQbbS   |
      /dev/mapper/mpathf vg_f lvm2 a--  <50,00g 25,00g  50,00g XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz <-+
      /dev/mapper/mpathg vg_g lvm2 a--  <50,00g 25,00g  50,00g Ydv7Lq-1WrC-O9Sc-578S-qNkt-jXpJ-V3tGlB
      /dev/sda1                    ---       0      0  500,00m                                       
      /dev/sda2          r7vg lvm2 a--   <9,51g <2,38g   9,51g xW57i5-rfdb-30UH-fnLG-GbAm-DDPL-eLz8d2

[3] The strace of pvresize shows that pvresize tries to write the PV UUID XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz on mpathf, mpathe, mpathc and mpathd. As mpathf, mpathe and mpathc have a write reservation and do not allow writing, the result of io_submit (as seen in io_getevents) is -5 (EIO). In practice, it appears that once pvresize fails on mpathf (due to the scsi reservation and the subsequent IO error), it tries to write the header on each device it detects until it succeeds (which happens on mpathd).

    $ grep -A1 -B10 -e  "x58.x41.x4a.x5a.x45.x4f.x68.x78.x76.x33.x63" pvresize_vvvv_mpathd.fastvm-rhel-7-7-181.2019-12-31-16:26:59.strace | sed -E -e "s/(\\\x00)+/.../g" 
    21233 16:28:18.639606 write(2<pipe:[40663]>, "Unlock: Memlock counters: prioritized:0 locked:0 critical:0 daemon:0 suspended:0", 80) = 80 <0.000048>
    21233 16:28:18.639714 write(2<pipe:[40663]>, "\n", 1) = 1 <0.000050>
    21233 16:28:18.640227 write(2<pipe:[40663]>, "#format_text/format-text.c:1474          ", 41) = 41 <0.000055>
    21233 16:28:18.640347 write(2<pipe:[40663]>, "Creating metadata area on /dev/mapper/mpathf at sector 8 size 2040 sectors", 74) = 74 <0.000044>
    21233 16:28:18.640446 write(2<pipe:[40663]>, "\n", 1) = 1 <0.000055>
    21233 16:28:18.640935 ioctl(3</dev/dm-7>, BLKPBSZGET, [512]) = 0 <0.000023>
    21233 16:28:18.641034 ioctl(3</dev/dm-7>, BLKSSZGET, [512]) = 0 <0.000019>
    21233 16:28:18.641542 write(2<pipe:[40663]>, "#device/bcache.c:242           ", 31) = 31 <0.000060>
    21233 16:28:18.641674 write(2<pipe:[40663]>, "Limit write at 0 len 131072 to len 4608", 39) = 39 <0.000059>
    21233 16:28:18.641804 write(2<pipe:[40663]>, "\n", 1) = 1 <0.000041>              L   A   B   E   L   O   N   E                                                uuid of mpathf:  X   A   J ...
    21233 16:28:18.641936 io_submit(140468775841792, 1, [{pwrite, fildes=3, str="...\x4c\x41\x42\x45\x4c\x4f\x4e\x45\x01...\xc0\x7f\xea\x64\x20...\x4c\x56\x4d\x32\x20\x30\x30\x31\x58\x41\x4a\x5a\x45\x4f\x68\x78\x76\x33\x63\x37\x6a\x7a\x4b\x66\x4c\x59\x56\x67\x73\x54\x48\x57\x77\x45\x32\x43\x34\x64\x76\x7a...\x80\x0c...\x10...\x10...\xf0\x0f...\x01..."..., nbytes=4608, offset=0}]) = 1 <0.000070>
    21233 16:28:18.642140 io_getevents(140468775841792, 1, 64, [{data=0, obj=0x5574f87ef748, res=-5, res2=0}], NULL) = 1 <0.009916>
    --
    21233 16:28:18.675008 open("/dev/mapper/mpathe", O_RDWR|O_DIRECT|O_NOATIME) = 3</dev/dm-6> <0.000032>
    21233 16:28:18.675271 write(2<pipe:[40663]>, "#label/label.c:668           ", 29) = 29 <0.000051>
    21233 16:28:18.675386 write(2<pipe:[40663]>, "Scanning submitted 1 reads", 26) = 26 <0.000042>
    21233 16:28:18.675514 write(2<pipe:[40663]>, "\n", 1) = 1 <0.000047>
    21233 16:28:18.675867 write(2<pipe:[40663]>, "#label/label.c:677           ", 29) = 29 <0.000043>
    21233 16:28:18.676038 write(2<pipe:[40663]>, "Scan failed to read /dev/mapper/mpathe error 0.", 47) = 47 <0.000044>
    21233 16:28:18.676145 write(2<pipe:[40663]>, "\n", 1) = 1 <0.000040>
    21233 16:28:18.676450 write(2<pipe:[40663]>, "#device/bcache.c:242           ", 31) = 31 <0.000050>
    21233 16:28:18.676594 write(2<pipe:[40663]>, "Limit write at 0 len 131072 to len 4608", 39) = 39 <0.000041>
    21233 16:28:18.676690 write(2<pipe:[40663]>, "\n", 1) = 1 <0.000039>
    21233 16:28:18.676815 io_submit(140468775841792, 1, [{pwrite, fildes=3, str="...\x4c\x41\x42\x45\x4c\x4f\x4e\x45\x01...\xc0\x7f\xea\x64\x20...\x4c\x56\x4d\x32\x20\x30\x30\x31\x58\x41\x4a\x5a\x45\x4f\x68\x78\x76\x33\x63\x37\x6a\x7a\x4b\x66\x4c\x59\x56\x67\x73\x54\x48\x57\x77\x45\x32\x43\x34\x64\x76\x7a...\x80\x0c...\x10...\x10...\xf0\x0f...\x01..."..., nbytes=4608, offset=0}]) = 1 <0.000075>
    21233 16:28:18.677355 io_getevents(140468775841792, 1, 64, [{data=0, obj=0x5574f87ef748, res=-5, res2=0}], NULL) = 1 <0.009953>
    --
    21233 16:28:18.711564 open("/dev/mapper/mpathc", O_RDWR|O_DIRECT|O_NOATIME) = 3</dev/dm-5> <0.000042>
    21233 16:28:18.711826 write(2<pipe:[40663]>, "#label/label.c:668           ", 29) = 29 <0.000064>
    21233 16:28:18.712004 write(2<pipe:[40663]>, "Scanning submitted 1 reads", 26) = 26 <0.000058>
    21233 16:28:18.712166 write(2<pipe:[40663]>, "\n", 1) = 1 <0.000099>
    21233 16:28:18.712644 write(2<pipe:[40663]>, "#label/label.c:677           ", 29) = 29 <0.000070>
    21233 16:28:18.712847 write(2<pipe:[40663]>, "Scan failed to read /dev/mapper/mpathc error 0.", 47) = 47 <0.000069>
    21233 16:28:18.713028 write(2<pipe:[40663]>, "\n", 1) = 1 <0.000058>
    21233 16:28:18.713438 write(2<pipe:[40663]>, "#device/bcache.c:242           ", 31) = 31 <0.000083>
    21233 16:28:18.713696 write(2<pipe:[40663]>, "Limit write at 0 len 131072 to len 4608", 39) = 39 <0.000063>
    21233 16:28:18.713877 write(2<pipe:[40663]>, "\n", 1) = 1 <0.000061>
    21233 16:28:18.714088 io_submit(140468775841792, 1, [{pwrite, fildes=3, str="...\x4c\x41\x42\x45\x4c\x4f\x4e\x45\x01...\xc0\x7f\xea\x64\x20...\x4c\x56\x4d\x32\x20\x30\x30\x31\x58\x41\x4a\x5a\x45\x4f\x68\x78\x76\x33\x63\x37\x6a\x7a\x4b\x66\x4c\x59\x56\x67\x73\x54\x48\x57\x77\x45\x32\x43\x34\x64\x76\x7a...\x80\x0c...\x10...\x10...\xf0\x0f...\x01..."..., nbytes=4608, offset=0}]) = 1 <0.000082>
    21233 16:28:18.714300 io_getevents(140468775841792, 1, 64, [{data=0, obj=0x5574f87ef748, res=-5, res2=0}], NULL) = 1 <0.009549>
    --
    21233 16:28:18.746946 open("/dev/mapper/mpathd", O_RDWR|O_DIRECT|O_NOATIME) = 3</dev/dm-4> <0.000034>
    21233 16:28:18.747216 write(2<pipe:[40663]>, "#label/label.c:668           ", 29) = 29 <0.000028>
    21233 16:28:18.747309 write(2<pipe:[40663]>, "Scanning submitted 1 reads", 26) = 26 <0.000027>
    21233 16:28:18.747390 write(2<pipe:[40663]>, "\n", 1) = 1 <0.000026>
    21233 16:28:18.747739 write(2<pipe:[40663]>, "#label/label.c:677           ", 29) = 29 <0.000035>
    21233 16:28:18.747845 write(2<pipe:[40663]>, "Scan failed to read /dev/mapper/mpathd error 0.", 47) = 47 <0.000027>
    21233 16:28:18.747930 write(2<pipe:[40663]>, "\n", 1) = 1 <0.000026>
    21233 16:28:18.748219 write(2<pipe:[40663]>, "#device/bcache.c:242           ", 31) = 31 <0.000030>
    21233 16:28:18.748317 write(2<pipe:[40663]>, "Limit write at 0 len 131072 to len 4608", 39) = 39 <0.000026>
    21233 16:28:18.748398 write(2<pipe:[40663]>, "\n", 1) = 1 <0.000026>
    21233 16:28:18.748540 io_submit(140468775841792, 1, [{pwrite, fildes=3, str="...\x4c\x41\x42\x45\x4c\x4f\x4e\x45\x01...\xc0\x7f\xea\x64\x20...\x4c\x56\x4d\x32\x20\x30\x30\x31\x58\x41\x4a\x5a\x45\x4f\x68\x78\x76\x33\x63\x37\x6a\x7a\x4b\x66\x4c\x59\x56\x67\x73\x54\x48\x57\x77\x45\x32\x43\x34\x64\x76\x7a...\x80\x0c...\x10...\x10...\xf0\x0f...\x01..."..., nbytes=4608, offset=0}]) = 1 <0.000079>
    21233 16:28:18.748762 io_getevents(140468775841792, 1, 64, [{data=0, obj=0x5574f87ef748, res=4608, res2=0}], NULL) = 1 <0.000516>
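
The grep pattern and the hex payloads above are easier to follow with a few helpers. This is a small sketch (the function names are illustrative, not part of any tool) that builds the same grep pattern from a PV UUID, decodes the `\x..` escapes strace prints, and regroups the raw 32-character id into the hyphenated form shown by pvs.

```python
import errno


def strace_grep_pattern(pv_uuid: str, prefix_len: int = 11) -> str:
    """Build the grep pattern for the first characters of a PV UUID.

    strace prints each byte as \\xNN; the "." between bytes matches the
    backslash, as in the grep command used above.
    """
    raw = pv_uuid.replace("-", "")
    return ".".join(f"x{ord(ch):02x}" for ch in raw[:prefix_len])


def decode_strace_escapes(escaped: str) -> str:
    """Turn a run of \\xNN escapes (ASCII bytes only) back into text."""
    return bytes.fromhex(escaped.replace("\\x", "")).decode("ascii")


def format_pv_uuid(raw: str) -> str:
    """Regroup a raw 32-character PV id as 6-4-4-4-4-4-6 with hyphens."""
    if len(raw) != 32:
        raise ValueError("expected 32 characters")
    parts, pos = [], 0
    for size in (6, 4, 4, 4, 4, 4, 6):
        parts.append(raw[pos:pos + size])
        pos += size
    return "-".join(parts)


if __name__ == "__main__":
    print(strace_grep_pattern("XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz"))
    print(decode_strace_escapes(r"\x4c\x41\x42\x45\x4c\x4f\x4e\x45"))
    raw_id = decode_strace_escapes(
        r"\x58\x41\x4a\x5a\x45\x4f\x68\x78\x76\x33\x63\x37\x6a\x7a\x4b\x66"
        r"\x4c\x59\x56\x67\x73\x54\x48\x57\x77\x45\x32\x43\x34\x64\x76\x7a"
    )
    print(format_pv_uuid(raw_id))
    # res=-5 in io_getevents is a negated errno value.
    print(errno.errorcode[5])
```

Decoding the payload this way shows that the identical label (LABELONE, "LVM2 001" and the UUID of mpathf) is submitted to every device in turn, and that only the last submission, on mpathd, returns 4608 bytes written instead of an error.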


[4] Based on the verbose pvresize output, it all starts as a result of pvresize trying to update the old header extension it has found on mpathf (as a result of mpathf having been created with an old lvm version). It is worth noting that mpathf does not belong to the same VG as mpathd (which was getting resized).

    #metadata/metadata.c:2842          PV /dev/mapper/mpathf has old extension header, updating to newest version.
    #metadata/pv_manip.c:417           /dev/mapper/mpathf 0:      0   6399: lv_f(0:0)
    #metadata/pv_manip.c:417           /dev/mapper/mpathf 1:   6399   6400: NULL(0:0)
    #locking/locking.c:367           Dropping cache for vg_f.
    #mm/memlock.c:594           Unlock: Memlock counters: prioritized:0 locked:0 critical:0 daemon:0 suspended:0
    #format_text/format-text.c:1474          Creating metadata area on /dev/mapper/mpathf at sector 8 size 2040 sectors
    #device/bcache.c:242           Limit write at 0 len 131072 to len 4608
    #label/label.c:1383    Error writing device /dev/mapper/mpathf at 4096 length 512.
    #format_text/format-text.c:407     Failed to write mda header to /dev/mapper/mpathf fd -1
    #format_text/format-text.c:1411          <backtrace>
    #cache/lvmcache.c:2788          <backtrace>
    #format_text/format-text.c:1509          <backtrace>
    #metadata/metadata.c:5005          <backtrace>
    #metadata/metadata.c:2997          <backtrace>
    #metadata/metadata.c:2848    Failed to update old PV extension headers in VG vg_f.
    #metadata/vg.c:89            Freeing VG vg_f at 0x5574f8818b20.

[5] After a discussion with zkabelac, we have tried updating lvm to lvm2-2.02.186-4.el7 (as found on brew) to make sure that we don't hit a known fixed problem. Unfortunately, it still reproduces with lvm2-2.02.186-4.el7. I'll give it a try with RHEL 8 and Fedora on Monday to see if newer lvm versions still hit this.

Regards,
Alexandros

Comment 9 David Teigland 2020-01-13 15:44:58 UTC
This looks like the bug that was reported and fixed by Heming Zhao on linux-lvm.

From: Heming Zhao <heming zhao suse com>
To: LVM general discussion and development <linux-lvm redhat com>, Gang He <GHe suse com>
Subject: Re: [linux-lvm] pvresize will cause a meta-data corruption with error message "Error writing device at 4096 length 512"
Date: Fri, 11 Oct 2019 08:11:29 +0000

https://www.redhat.com/archives/linux-lvm/2019-October/msg00004.html

The fixes (from Joe and Heming) on the master branch are:

13c254fc0538 fix dev_unset_last_byte after write error
25e7bf021a4e [bcache] bcache_invalidate_fd, only remove prefixes on success.
7e8296f4788d [bcache] reverse earlier patch.
2b3c39e402b9 [bcache] pass up the error from io_submit rather than using generic -EIO
5fdebf9bbf68 [bcache] add unit test
6b0d969b2a85 [label] Use bcache_abort_fd() to ensure blocks are no longer in the cache.
2938b4dcca0a [bcache] add bcache_abort()

We weren't able to trivially backport these to the stable branch because they touch on the bcache radix tree which does not exist in stable.
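
One way to read the strace in comment 8 together with the commit titles above (in particular the bcache_abort_fd() one) is that a block whose write failed can stay dirty in the cache under its file descriptor number, and is later flushed to whatever device reuses that descriptor (the strace shows fd 3 for every device). The sketch below is only a toy model of that reading, written under that assumption; it is not lvm2's bcache code, and every class and function name in it is hypothetical.

```python
class ToyWriteBackCache:
    """Toy write-back cache keyed by (fd, block). NOT lvm2's bcache."""

    def __init__(self, abort_on_error: bool):
        # abort_on_error=True models discarding blocks after a failed write.
        self.abort_on_error = abort_on_error
        self.dirty = {}  # (fd, block index) -> data waiting to be written

    def write(self, fd: int, block: int, data: str) -> None:
        """Queue data for a block; nothing reaches the device yet."""
        self.dirty[(fd, block)] = data

    def flush(self, fd: int, device: dict) -> bool:
        """Write all dirty blocks queued under fd to device."""
        ok = True
        for key in [k for k in self.dirty if k[0] == fd]:
            if device["write_protected"]:
                ok = False
                if self.abort_on_error:
                    del self.dirty[key]  # drop the block, as an abort would
                # otherwise the stale block stays dirty under this fd number
            else:
                device["blocks"][key[1]] = self.dirty.pop(key)
        return ok


def header_on_mpathd_after_run(abort_on_error: bool) -> str:
    """Replay the sequence from the report and return block 0 of mpathd."""
    mpathf = {"write_protected": True, "blocks": {0: "header-of-mpathf"}}
    mpathd = {"write_protected": False, "blocks": {0: "header-of-mpathd"}}
    cache = ToyWriteBackCache(abort_on_error)
    fd = 3  # same descriptor number for each device, as in the strace
    cache.write(fd, 0, "header-of-mpathf")  # header update queued for mpathf
    cache.flush(fd, mpathf)                 # fails: reservation blocks writes
    cache.flush(fd, mpathd)                 # fd 3 now refers to mpathd
    return mpathd["blocks"][0]


if __name__ == "__main__":
    print("without abort:", header_on_mpathd_after_run(False))
    print("with abort:   ", header_on_mpathd_after_run(True))
```

Without the abort step the toy model ends with the header of mpathf on mpathd, which is the symptom reported here; with it, mpathd keeps its own header.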

Comment 11 Joe Thornber 2020-01-16 15:07:17 UTC
I've backported the radix-tree and btree changes.  Cherry picking the patch list in comment #9 goes ok apart from the last patch (13c254fc0538) which fails in format-text.c.

The tree is here:

http://sourceware.org/git/?p=lvm2.git;a=shortlog;h=refs/heads/2020-01-16-back-port-bcache-changes

Comment 12 David Teigland 2020-01-16 15:38:26 UTC
The final list of commits from the stable-2.02 branch for this bz:

d20490f76dd5 [radix-tree] Add missing test case
019fa6f8eec7 [bcache] bcache_invalidate_fd, only remove prefixes on success.
1e2e12f19c58 [bcache] reverse earlier patch.
6370c20d392f [bcache] pass up the error from io_submit rather than using generic -EIO
056eb0a8809a [bcache] add unit test
babde3da5530 [label] Use bcache_abort_fd() to ensure blocks are no longer in the cache.
232f779db4a4 [bcache] add bcache_abort()
b6e6ea2d6578 [bcache] Bring bcache into sync with master branch
e55210302787 [radix-tree] Bring radix-tree up to date with the master branch
245d7fcd5905 fix dev_unset_last_byte after write error

Comment 14 Alexandros Panagiotou 2020-02-03 14:16:24 UTC
Hello,
I have been running tests on the systems that I have been using for reproducing this. 

With lvm2-2.02.185-2.el7_7.2.x86_64/device-mapper-1.02.158-2.el7_7.2.x86_64 the overwriting happens and pvresize fails to resize the PV on mpathd:

  pvs -a -v before the test:
    PV                 VG   Fmt  Attr PSize   PFree   DevSize PV UUID                               
    /dev/mapper/mpatha vg_a lvm2 a--  <50,00g <25,00g  50,00g OWPpmm-FiyJ-CvaO-eVnd-Tgej-SkkK-DkroeM
    /dev/mapper/mpathb vg_b lvm2 a--  <50,00g <25,00g  50,00g 19qYTb-GNSN-DclK-85eW-zJm3-xdf9-MHcezg
    /dev/mapper/mpathc vg_c lvm2 a--  <50,00g <25,00g  50,00g Rvhf2V-peNB-WLx9-yrP0-x8mi-5S9l-9AIA6D
    /dev/mapper/mpathd vg_d lvm2 a--  <50,00g <25,00g  56,00g AFNrSH-mwMc-QQz0-jBSQ-MoQI-X2gj-VKq5Xe
    /dev/mapper/mpathe vg_e lvm2 a--  <50,00g <25,00g  50,00g Z6TOqr-n9mU-Fy8e-Rmt2-0Iat-lHCt-rVQbbS
    /dev/mapper/mpathf vg_f lvm2 a--  <50,00g <25,00g  50,00g XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz
    /dev/mapper/mpathg vg_g lvm2 a--  <50,00g <25,00g  50,00g Ydv7Lq-1WrC-O9Sc-578S-qNkt-jXpJ-V3tGlB
    /dev/sda1                    ---       0       0  500,00m                                       
    /dev/sda2          r7vg lvm2 a--   <9,51g  <2,38g   9,51g xW57i5-rfdb-30UH-fnLG-GbAm-DDPL-eLz8d2
  
  pvs -a -v after the test:
    WARNING: Not using device /dev/mapper/mpathd for PV XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz.
    WARNING: PV XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz prefers device /dev/mapper/mpathf because device size is correct.
    PV                 VG   Fmt  Attr PSize   PFree   DevSize PV UUID                               
    /dev/mapper/mpatha vg_a lvm2 a--  <50,00g <25,00g  50,00g OWPpmm-FiyJ-CvaO-eVnd-Tgej-SkkK-DkroeM
    /dev/mapper/mpathb vg_b lvm2 a--  <50,00g <25,00g  50,00g 19qYTb-GNSN-DclK-85eW-zJm3-xdf9-MHcezg
    /dev/mapper/mpathc vg_c lvm2 a--  <50,00g <25,00g  50,00g Rvhf2V-peNB-WLx9-yrP0-x8mi-5S9l-9AIA6D
    /dev/mapper/mpathd vg_f lvm2 d--       0       0   56,00g XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz
    /dev/mapper/mpathe vg_e lvm2 a--  <50,00g <25,00g  50,00g Z6TOqr-n9mU-Fy8e-Rmt2-0Iat-lHCt-rVQbbS
    /dev/mapper/mpathf vg_f lvm2 a--  <50,00g <25,00g  50,00g XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz
    /dev/mapper/mpathg vg_g lvm2 a--  <50,00g <25,00g  50,00g Ydv7Lq-1WrC-O9Sc-578S-qNkt-jXpJ-V3tGlB
    /dev/sda1                    ---       0       0  500,00m                                       
    /dev/sda2          r7vg lvm2 a--   <9,51g  <2,38g   9,51g xW57i5-rfdb-30UH-fnLG-GbAm-DDPL-eLz8d2


With lvm2-libs-2.02.186-6.el7.x86_64/device-mapper-1.02.164-6.el7.x86_64 the overwriting does not happen and pvresize succeeds in resizing the PV on mpathd:

  pvs -a -v before the test:
  
    PV                 VG   Fmt  Attr PSize   PFree   DevSize PV UUID                               
    /dev/mapper/mpatha vg_a lvm2 a--  <50,00g <25,00g  50,00g OWPpmm-FiyJ-CvaO-eVnd-Tgej-SkkK-DkroeM
    /dev/mapper/mpathb vg_b lvm2 a--  <50,00g <25,00g  50,00g 19qYTb-GNSN-DclK-85eW-zJm3-xdf9-MHcezg
    /dev/mapper/mpathc vg_c lvm2 a--  <50,00g <25,00g  50,00g Rvhf2V-peNB-WLx9-yrP0-x8mi-5S9l-9AIA6D
    /dev/mapper/mpathd vg_d lvm2 a--  <50,00g <25,00g  56,00g AFNrSH-mwMc-QQz0-jBSQ-MoQI-X2gj-VKq5Xe
    /dev/mapper/mpathe vg_e lvm2 a--  <50,00g <25,00g  50,00g Z6TOqr-n9mU-Fy8e-Rmt2-0Iat-lHCt-rVQbbS
    /dev/mapper/mpathf vg_f lvm2 a--  <50,00g <25,00g  50,00g XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz
    /dev/mapper/mpathg vg_g lvm2 a--  <50,00g <25,00g  50,00g Ydv7Lq-1WrC-O9Sc-578S-qNkt-jXpJ-V3tGlB
    /dev/sda1                    ---       0       0  500,00m                                       
    /dev/sda2          r7vg lvm2 a--   <9,51g  <2,38g   9,51g xW57i5-rfdb-30UH-fnLG-GbAm-DDPL-eLz8d2
  
  pvs -a -v after the test:
    PV                 VG   Fmt  Attr PSize   PFree   DevSize PV UUID                               
    /dev/mapper/mpatha vg_a lvm2 a--  <50,00g <25,00g  50,00g OWPpmm-FiyJ-CvaO-eVnd-Tgej-SkkK-DkroeM
    /dev/mapper/mpathb vg_b lvm2 a--  <50,00g <25,00g  50,00g 19qYTb-GNSN-DclK-85eW-zJm3-xdf9-MHcezg
    /dev/mapper/mpathc vg_c lvm2 a--  <50,00g <25,00g  50,00g Rvhf2V-peNB-WLx9-yrP0-x8mi-5S9l-9AIA6D
    /dev/mapper/mpathd vg_d lvm2 a--  <56,00g <31,00g  56,00g AFNrSH-mwMc-QQz0-jBSQ-MoQI-X2gj-VKq5Xe
    /dev/mapper/mpathe vg_e lvm2 a--  <50,00g <25,00g  50,00g Z6TOqr-n9mU-Fy8e-Rmt2-0Iat-lHCt-rVQbbS
    /dev/mapper/mpathf vg_f lvm2 a--  <50,00g <25,00g  50,00g XAJZEO-hxv3-c7jz-KfLY-VgsT-HWwE-2C4dvz
    /dev/mapper/mpathg vg_g lvm2 a--  <50,00g <25,00g  50,00g Ydv7Lq-1WrC-O9Sc-578S-qNkt-jXpJ-V3tGlB
    /dev/sda1                    ---       0       0  500,00m                                       
    /dev/sda2          r7vg lvm2 a--   <9,51g  <2,38g   9,51g xW57i5-rfdb-30UH-fnLG-GbAm-DDPL-eLz8d2


The test command I use is:

pvresize -vvvv --config 'devices{ global_filter =  [ "a|/dev/mapper/mpath|", "r|.*|" ] }' /dev/mapper/mpathd

(the global_filter is added to save sda2 which contains root)
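
The effect of the two global_filter settings used in this report can be illustrated with a small model of accept/reject lists. This is a simplified sketch, not LVM's filter code: it assumes patterns are tried in order, the first match decides, unmatched devices are accepted, and the patterns are plain regular expressions delimited by the character after "a" or "r". The function names are made up for illustration.

```python
import re


def parse_filter(entries):
    """Parse entries such as 'a|/dev/mapper/mpathd|' into (accept, regex)."""
    rules = []
    for entry in entries:
        if len(entry) < 3 or entry[0] not in ("a", "r"):
            raise ValueError(f"bad filter entry: {entry!r}")
        delimiter = entry[1]
        if not entry.endswith(delimiter):
            raise ValueError(f"unterminated filter entry: {entry!r}")
        rules.append((entry[0] == "a", re.compile(entry[2:-1])))
    return rules


def accepted(path: str, rules) -> bool:
    """First matching pattern decides; no match means accepted (assumption)."""
    for accept, pattern in rules:
        if pattern.search(path):
            return accept
    return True


DEVICES = ["/dev/mapper/mpathd", "/dev/mapper/mpathf", "/dev/sda2"]

# Workaround from the description: only mpathd is visible.
WORKAROUND = parse_filter(["a|/dev/mapper/mpathd|", "r|.*|"])
# Test command from comment 14: all mpath devices visible, sda2 hidden.
TEST_RUN = parse_filter(["a|/dev/mapper/mpath|", "r|.*|"])

if __name__ == "__main__":
    for name, rules in (("workaround", WORKAROUND), ("test run", TEST_RUN)):
        visible = [dev for dev in DEVICES if accepted(dev, rules)]
        print(name, "sees:", visible)
```

Under these assumptions the workaround filter hides mpathf entirely, so pvresize never attempts the header update that fails, while the test filter keeps mpathf visible (so the bug can still trigger) and only protects sda2.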

Thanks and Regards,
Alexandros

Comment 15 Corey Marthaler 2020-02-03 14:41:24 UTC
Thanks Alexandros!

Moving to verified based on test results shown in comment #14.

Comment 16 Alexandros Panagiotou 2020-02-03 15:14:00 UTC
(In reply to Corey Marthaler from comment #15)
> Thanks Alexandros!
> 
> Moving to verified based on test results shown in comment #14.

Just to make sure: I have only been testing for the problem mentioned in the BZ description (i.e. running pvresize). If there are any other tests you normally run with new packages, then I have not run these.

Regards,
Alexandros

Comment 18 errata-xmlrpc 2020-03-31 20:04:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1129

Comment 20 Red Hat Bugzilla 2024-01-06 04:27:28 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days