+++ This bug was initially created as a clone of Bug #1518121 +++

Description of problem:
When creating a RAID1 LV with 2 legs spanning multiple disks, the allocator
uses one of the disks for both legs.

Version-Release number of selected component (if applicable):
2.02.176
Affected versions: el7.2, el7.5; others not checked.

How reproducible:
100%

Steps to Reproduce:
- having 3 8GB disks /dev/sd[abc]
vgcreate vg /dev/sd[abc]
lvcreate -n t1 -L 4G vg /dev/sda
lvcreate -n t2 -L 4G vg /dev/sdb
lvcreate -n r1 -m 1 -L 6G vg
lvs -aoname,devices

Actual results:
# lvs -aoname,devices
  LV            Devices
  r1            r1_rimage_0(0),r1_rimage_1(0)
  [r1_rimage_0] /dev/sdc(1)      <---- sdc is used for both _rimage_0...
  [r1_rimage_0] /dev/sdb(1024)
  [r1_rimage_1] /dev/sda(1025)
  [r1_rimage_1] /dev/sdc(1023)   <---- ...as well as for _rimage_1
  [r1_rmeta_0]  /dev/sdc(0)
  [r1_rmeta_1]  /dev/sda(1024)
  t1            /dev/sda(0)
  t2            /dev/sdb(0)

Expected results:
Each leg (rimage/rmeta pair) allocated on PVs disjoint from those of the
other leg.

Additional info:

--- Additional comment from Marian Csontos on 2017-11-28 12:53:58 UTC ---

In case of failure such a device cannot be repaired:

# lvconvert --repair vg/r1
  WARNING: Disabling lvmetad cache for repair command.
  WARNING: Not using lvmetad because of repair.
  /dev/vg/r1: read failed after 0 of 4096 at 6442385408: Input/output error
  /dev/vg/r1: read failed after 0 of 4096 at 6442442752: Input/output error
  Couldn't find device with uuid UFK7K0-nGPE-76Rq-F5WC-xGig-UzXP-MnFuDz.
Attempt to replace failed RAID images (requires full device resync)? [y/n]: y
  Unable to replace all PVs from vg/r1 at once.
  Failed to replace faulty devices in vg/r1.

TODO: Test with devices in sync.

Workaround: `lvcreate -n r1 -m 1 -L 6G vg /dev/sd[cab]`

--- Additional comment from Marian Csontos on 2017-11-28 14:23:30 UTC ---

Waited for sync. 100% in sync, still unable to repair.

Also, only the first 4 GiB can be read; reading from the second segment fails.
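The bad allocation above can be detected mechanically. Below is an illustrative helper (not part of lvm2, names are my own) that parses `lvs -a --noheadings -o name,devices` output like the "Actual results" listing and reports any PV that backs more than one leg index of a RAID LV:

```python
import re
from collections import defaultdict

def collocated_pvs(lvs_output):
    """Return PVs that back more than one rimage/rmeta leg index."""
    leg_pvs = defaultdict(set)          # leg index -> PVs backing it
    for line in lvs_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        name, devices = fields
        m = re.match(r"\[?.*_r(?:image|meta)_(\d+)\]?$", name)
        if not m:
            continue
        leg = int(m.group(1))
        for dev in devices.split(","):
            leg_pvs[leg].add(dev.split("(")[0])   # strip "(extent)" suffix
    pv_legs = defaultdict(set)          # PV -> leg indices using it
    for leg, pvs in leg_pvs.items():
        for pv in pvs:
            pv_legs[pv].add(leg)
    return sorted(pv for pv, legs in pv_legs.items() if len(legs) > 1)

# The "Actual results" listing from the report:
ACTUAL = """\
r1            r1_rimage_0(0),r1_rimage_1(0)
[r1_rimage_0] /dev/sdc(1)
[r1_rimage_0] /dev/sdb(1024)
[r1_rimage_1] /dev/sda(1025)
[r1_rimage_1] /dev/sdc(1023)
[r1_rmeta_0]  /dev/sdc(0)
[r1_rmeta_1]  /dev/sda(1024)
t1            /dev/sda(0)
t2            /dev/sdb(0)
"""
print(collocated_pvs(ACTUAL))   # -> ['/dev/sdc']
```

An empty result means the legs are on disjoint PVs, i.e. losing a single PV cannot damage both mirrors at once.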
Read from first segment:

# dd if=/dev/vg/r1 of=/dev/null skip=4000 count=1 bs=1M
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00195199 s, 537 MB/s

Read from second segment fails:

# dd if=/dev/vg/r1 of=/dev/null skip=5000 count=1 bs=1M
dd: error reading ‘/dev/vg/r1’: Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00201181 s, 0.0 kB/s

Surprisingly, the RAID1 device's health characters report "DA" (one leg dead,
one alive) while the array still claims to be fully in sync:

# dmsetup status
vg-r1_rmeta_1: 0 8192 linear
vg-r1_rimage_1: 0 8372224 linear
vg-r1_rimage_1: 8372224 4210688 linear
vg-t2: 0 8388608 linear
vg-r1_rmeta_0: 0 8192 linear
vg-r1_rimage_0: 0 8372224 linear
vg-r1_rimage_0: 8372224 4210688 linear
vg-t1: 0 8388608 linear
vg_stacker_OTzj-root: 0 12582912 linear
vg-r1: 0 12582912 raid raid1 2 DA 12582912/12582912 idle 0 0 -

And for reference only, the segments:

# lvs --segments -aolv_name,pe_ranges,le_ranges
  WARNING: Not using lvmetad because a repair command was run.
  /dev/vg/r1: read failed after 0 of 4096 at 6442385408: Input/output error
  /dev/vg/r1: read failed after 0 of 4096 at 6442442752: Input/output error
  Couldn't find device with uuid jDRXQI-jGSW-BAOG-LB3h-aLhd-8fPb-kHbPZf.
  LV            PE Ranges                              LE Ranges
  r1            r1_rimage_0:0-1535 r1_rimage_1:0-1535  [r1_rimage_0]:0-1535,[r1_rimage_1]:0-1535
  [r1_rimage_0] [unknown]:1-1022                       [unknown]:1-1022
  [r1_rimage_0] /dev/sdb:1024-1537                     /dev/sdb:1024-1537
  [r1_rimage_1] /dev/sda:1025-2046                     /dev/sda:1025-2046
  [r1_rimage_1] [unknown]:1023-1536                    [unknown]:1023-1536
  [r1_rmeta_0]  [unknown]:0-0                          [unknown]:0-0
  [r1_rmeta_1]  /dev/sda:1024-1024                     /dev/sda:1024-1024

--- Additional comment from Steve D on 2018-10-18 12:12:26 UTC ---

I've just been bitten by this for the second time, though I swore I specified
PVs manually. 2.02.176 (-4.1ubuntu3) on Ubuntu 18.04.

I also hit a whole load of scrub errors last night - the data stored in the
fs seems fine; it looks like when I extended the RAID1 LV in question, the
extensions didn't get synced.
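The point at which dd starts failing follows directly from the dmsetup table above: the intact first linear segment of vg-r1_rimage_1 is 8372224 sectors, which (assuming the usual 512-byte sectors) is 4088 MiB, so `skip=4000` with `bs=1M` still reads the surviving segment while `skip=5000` lands on the part that was backed by the removed PV. A quick check of that arithmetic:

```python
# Segment sizes taken from the `dmsetup status` output above.
SECTOR = 512                    # assumed sector size in bytes
MIB = 1024 * 1024
first_seg_sectors = 8372224     # intact segment of vg-r1_rimage_1 (on /dev/sda)
second_seg_sectors = 4210688    # segment that was on the removed PV

boundary_mib = first_seg_sectors * SECTOR // MIB
total_mib = (first_seg_sectors + second_seg_sectors) * SECTOR // MIB
print(boundary_mib, total_mib)  # -> 4088 6144   (i.e. the 6G LV)

# dd with bs=1M seeks in 1 MiB units:
assert 4000 < boundary_mib      # skip=4000 reads the intact segment
assert 5000 >= boundary_mib     # skip=5000 hits the missing PV -> I/O error
```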
Still investigating that one - may have been triggered by me trying to work
around this bug. Any thoughts / progress?

--- Additional comment from Heinz Mauelshagen on 2019-09-05 11:29:03 UTC ---

This behaves as if allocation policy 'anywhere' was applied. Reworking the
allocator gains importance!

Creating with the 'cling' allocation policy avoids the problem with existing
t[12] LVs allocated on sd[ab] and free sdc:

# lvcreate -y -nr1 -l251 -m1 vg
  Logical volume "r1" created.
# lvs --noh -aoname,segperanges vg
  r1            r1_rimage_0:0-250 r1_rimage_1:0-250
  [r1_rimage_0] /dev/sdc:1-126    <---
  [r1_rimage_0] /dev/sdb:128-252
  [r1_rimage_1] /dev/sda:129-254
  [r1_rimage_1] /dev/sdc:127-251  <--- bogus collocation with rimage_0
  [r1_rmeta_0]  /dev/sdc:0-0
  [r1_rmeta_1]  /dev/sda:128-128
  t1            /dev/sda:0-127
  t2            /dev/sdb:0-127
# lvremove -y vg/r1
  Logical volume "r1" successfully removed
# lvcreate -y --alloc cling -nr1 -l100%FREE -m1 vg   # could alternatively give the extent count, same result
  Logical volume "r1" created.
# lvs --noh -aoname,segperanges vg
  r1            r1_rimage_0:0-125 r1_rimage_1:0-125
  [r1_rimage_0] /dev/sdc:1-126
  [r1_rimage_1] /dev/sda:129-254
  [r1_rmeta_0]  /dev/sdc:0-0
  [r1_rmeta_1]  /dev/sda:128-128
  t1            /dev/sda:0-127
  t2            /dev/sdb:0-127
# lvextend -l+1 vg/r1
  Extending 2 mirror images.
  Insufficient suitable allocatable extents for logical volume r1: 2 more required

Now '--alloc anywhere' is needed to make use of the existing space, listing
sdc, sdb in that sequence (sdb, sdc does not work) to avoid collocation:

# lvextend -l+1 vg/r1
  Extending 2 mirror images.
  Insufficient suitable allocatable extents for logical volume r1: 2 more required
# lvextend -y --alloc anywhere -l+1 vg/r1
  Extending 2 mirror images.
  Size of logical volume vg/r1 changed from 504.00 MiB (126 extents) to 508.00 MiB (127 extents).
  Logical volume vg/r1 successfully resized.
# lvs --noh -aoname,segperanges vg
  r1            r1_rimage_0:0-126 r1_rimage_1:0-126
  [r1_rimage_0] /dev/sdc:1-126    <---
  [r1_rimage_0] /dev/sdb:128-128
  [r1_rimage_1] /dev/sda:129-254
  [r1_rimage_1] /dev/sdc:127-127  <--- Bogus collocation again
  [r1_rmeta_0]  /dev/sdc:0-0
  [r1_rmeta_1]  /dev/sda:128-128
  t1            /dev/sda:0-127

In this case, reducing the raid1 in size or pvmove'ing the collocated extents
off to another unrelated PV are options.

# lvreduce -fy -l-1 vg/r1
  WARNING: Reducing active logical volume to 504.00 MiB.
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
  Size of logical volume vg/r1 changed from 508.00 MiB (127 extents) to 504.00 MiB (126 extents).
  Logical volume vg/r1 successfully resized.
# lvextend -y -l+1 --alloc anywhere vg/r1 /dev/sdc /dev/sdb
  Extending 2 mirror images.
  Size of logical volume vg/r1 changed from 504.00 MiB (126 extents) to 508.00 MiB (127 extents).
  Logical volume vg/r1 successfully resized.
# lvs --noh -aoname,segperanges vg
  r1            r1_rimage_0:0-126 r1_rimage_1:0-126
  [r1_rimage_0] /dev/sdc:1-127
  [r1_rimage_1] /dev/sda:129-254
  [r1_rimage_1] /dev/sdb:128-128
  [r1_rmeta_0]  /dev/sdc:0-0
  [r1_rmeta_1]  /dev/sda:128-128
  t1            /dev/sda:0-127
  t2            /dev/sdb:0-127

--- Additional comment from Heinz Mauelshagen on 2023-05-10 16:36:36 UTC ---

Fixed in commit 05c2b10c5d0a99993430ffbcef684a099ba810ad
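For the pvmove option mentioned in the workaround above, pvmove accepts a source of the form PV:first[-last] to relocate specific physical extents. A small helper (hypothetical, not part of lvm2) that builds the command for a collocated extent range, here the /dev/sdc:127-127 extent under r1_rimage_1:

```python
def pvmove_cmd(src_pv, pe_first, pe_last, dst_pv):
    """Build a pvmove command relocating physical extents
    pe_first..pe_last (inclusive, as lvm prints them) off src_pv."""
    pe_range = str(pe_first) if pe_first == pe_last else f"{pe_first}-{pe_last}"
    return f"pvmove {src_pv}:{pe_range} {dst_pv}"

# Move the single collocated extent off /dev/sdc onto /dev/sdb:
print(pvmove_cmd("/dev/sdc", 127, 127, "/dev/sdb"))
# -> pvmove /dev/sdc:127 /dev/sdb
```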
Marking Verified:Tested in the latest rpms.

WITHOUT FIX:
lvm2-2.03.21-2.el9    BUILT: Thu May 25 12:03:04 AM CEST 2023

SCENARIO (raid1) - [create_raid_on_limited_contig_space_on_both_legs]
Create a raid with limited but contiguous PV space on both leg devices, and
verify the creation took place on the proper devices.

Recreating PVs/VG with smaller 8G size
virt-004.cluster-qe.lab.eng.brq.redhat.com: pvcreate --yes --setphysicalvolumesize 8G /dev/sdc /dev/sdd /dev/sdb
virt-004.cluster-qe.lab.eng.brq.redhat.com: vgcreate raid_sanity /dev/sdc /dev/sdd /dev/sdb

Placing a spacer linear on two of the devices so that the raid creation will be constrained
lvcreate --yes --type linear -n spacer1 -L 4G raid_sanity /dev/sdc
lvcreate --yes --type linear -n spacer2 -L 4G raid_sanity /dev/sdd

Create the constrained allocation raid volume
lvcreate --yes --type raid1 -n proper_alloc -L 6G raid_sanity

/dev/sdb shows up as a device in *both* _rimage_0 and _rimage_1.
Possible regression of bugs 1518121/2204467.

[root@virt-004 ~]# lvs -a -o +devices
  LV                      VG          Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices
  proper_alloc            raid_sanity rwi-a-r--- 6.00g                                    100.00           proper_alloc_rimage_0(0),proper_alloc_rimage_1(0)
  [proper_alloc_rimage_0] raid_sanity iwi-aor--- 6.00g                                                     /dev/sdb(1)
  [proper_alloc_rimage_0] raid_sanity iwi-aor--- 6.00g                                                     /dev/sdd(1024)
  [proper_alloc_rimage_1] raid_sanity iwi-aor--- 6.00g                                                     /dev/sdc(1025)
  [proper_alloc_rimage_1] raid_sanity iwi-aor--- 6.00g                                                     /dev/sdb(1023)
  [proper_alloc_rmeta_0]  raid_sanity ewi-aor--- 4.00m                                                     /dev/sdb(0)
  [proper_alloc_rmeta_1]  raid_sanity ewi-aor--- 4.00m                                                     /dev/sdc(1024)
  spacer1                 raid_sanity -wi-a----- 4.00g                                                     /dev/sdc(0)
  spacer2                 raid_sanity -wi-a----- 4.00g                                                     /dev/sdd(0)

WITH FIX:
lvm2-2.03.21-3.el9    BUILT: Thu Jul 13 08:50:26 PM CEST 2023

SCENARIO (raid1) - [create_raid_on_limited_contig_space_on_both_legs]
Create a raid with limited but contiguous PV space on both leg devices, and
verify the creation took place on the proper devices.

Recreating PVs/VG with
smaller 8G size
virt-009.cluster-qe.lab.eng.brq.redhat.com: pvcreate --yes --setphysicalvolumesize 8G /dev/sdb /dev/sde /dev/sdf
virt-009.cluster-qe.lab.eng.brq.redhat.com: vgcreate raid_sanity /dev/sdb /dev/sde /dev/sdf

Placing a spacer linear on two of the devices so that the raid creation will be constrained
lvcreate --yes --type linear -n spacer1 -L 4G raid_sanity /dev/sdb
lvcreate --yes --type linear -n spacer2 -L 4G raid_sanity /dev/sde

Create the constrained allocation raid volume
lvcreate --yes --type raid1 -n proper_alloc -L 6G raid_sanity
  LV raid_sanity/proper_alloc_rimage_1 using PV /dev/sdf is not redundant.
  Insufficient suitable allocatable extents for logical volume raid_sanity/proper_alloc