Bug 2204467

Summary: multisegment RAID1, allocator uses one disk for both legs

Product: Red Hat Enterprise Linux 9
Component: lvm2
Sub component: Mirroring and RAID
Version: 9.3
Reporter: Marian Csontos <mcsontos>
Assignee: LVM Team <lvm-team>
QA Contact: cluster-qe <cluster-qe>
CC: agk, cmarthal, heinzm, jbrassow, msnitzer, prajnoha, steved424, zkabelac
Status: CLOSED ERRATA
Severity: high
Priority: high
Keywords: Triaged
Target Milestone: rc
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Fixed In Version: lvm2-2.03.21-3.el9
Doc Type: If docs needed, set a value
Clone Of: 1518121
Bug Depends On: 1518121, 2204480
Last Closed: 2023-11-07 08:53:33 UTC

Description Marian Csontos 2023-05-15 15:46:12 UTC
+++ This bug was initially created as a clone of Bug #1518121 +++

Description of problem:
When creating a RAID1 LV with 2 legs that must span multiple disks, the allocator uses one of the disks for both legs.

Version-Release number of selected component (if applicable):
2.02.176

Affected versions: el7.2, el7.5; other releases not checked.

How reproducible:
100%

Steps to Reproduce:
- Given three 8 GB disks /dev/sd[abc]:

    vgcreate vg /dev/sd[abc]
    lvcreate -n t1 -L 4G vg /dev/sda
    lvcreate -n t2 -L 4G vg /dev/sdb
    lvcreate -n r1 -m 1 -L 6G vg
    lvs -aoname,devices


Actual results:

# lvs -aoname,devices
  LV            Devices                      
  r1            r1_rimage_0(0),r1_rimage_1(0)
  [r1_rimage_0] /dev/sdc(1)                  <---- sdc is used for both _rimage_0...
  [r1_rimage_0] /dev/sdb(1024)               
  [r1_rimage_1] /dev/sda(1025)               
  [r1_rimage_1] /dev/sdc(1023)               <---- ...as well as for _rimage_1
  [r1_rmeta_0]  /dev/sdc(0)                  
  [r1_rmeta_1]  /dev/sda(1024)               
  t1            /dev/sda(0)                  
  t2            /dev/sdb(0) 

Expected results:

Each RAID1 leg (rimage_0 and rimage_1) is allocated on a disjoint set of PVs, so that no single disk holds both copies of the data.

Additional info:

--- Additional comment from Marian Csontos on 2017-11-28 12:53:58 UTC ---

In case of a device failure, such an LV cannot be repaired:

# lvconvert --repair vg/r1
  WARNING: Disabling lvmetad cache for repair command.
  WARNING: Not using lvmetad because of repair.
  /dev/vg/r1: read failed after 0 of 4096 at 6442385408: Input/output error
  /dev/vg/r1: read failed after 0 of 4096 at 6442442752: Input/output error
  Couldn't find device with uuid UFK7K0-nGPE-76Rq-F5WC-xGig-UzXP-MnFuDz.
Attempt to replace failed RAID images (requires full device resync)? [y/n]: y
  Unable to replace all PVs from vg/r1 at once.
  Failed to replace faulty devices in vg/r1.

TODO: Test with devices in sync.

Workaround: `lvcreate -n r1 -m 1 -L 6G vg /dev/sd[cab]`
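
To confirm that the legs really ended up on disjoint PVs (with the workaround or a fixed allocator), a minimal check, not part of the original report; the awk filter below is only one possible way to spot a PV shared between data images, and it does not look at the rmeta sub-LVs:

    lvs -a --noheadings -o lv_name,devices vg | awk '$1 ~ /_rimage_/ {
        pv = $2; sub(/\(.*/, "", pv)       # strip the "(extent)" suffix, keep only the PV name
        if (pv in leg && leg[pv] != $1) print "PV " pv " backs both " leg[pv] " and " $1
        leg[pv] = $1 }'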

--- Additional comment from Marian Csontos on 2017-11-28 14:23:30 UTC ---

Waited for sync. Even at 100% in sync it cannot be repaired. Also, only the first 4 GB can be read; reading from the second segment fails.

Read from first segment:

# dd if=/dev/vg/r1 of=/dev/null skip=4000 count=1 bs=1M
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00195199 s, 537 MB/s

Read from second segment fails:

# dd if=/dev/vg/r1 of=/dev/null skip=5000 count=1 bs=1M
dd: error reading ‘/dev/vg/r1’: Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00201181 s, 0.0 kB/s

Surprisingly, the RAID1 device's dmsetup status still reports the array as fully in sync, with health 'DA' (see the note after the status output below):

# dmsetup status
vg-r1_rmeta_1: 0 8192 linear 
vg-r1_rimage_1: 0 8372224 linear 
vg-r1_rimage_1: 8372224 4210688 linear 
vg-t2: 0 8388608 linear 
vg-r1_rmeta_0: 0 8192 linear 
vg-r1_rimage_0: 0 8372224 linear 
vg-r1_rimage_0: 8372224 4210688 linear 
vg-t1: 0 8388608 linear 
vg_stacker_OTzj-root: 0 12582912 linear 
vg-r1: 0 12582912 raid raid1 2 DA 12582912/12582912 idle 0 0 -
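
For context, in the raid status line the string right after the device count ('DA' above) holds one health character per leg ('A' = alive and in-sync, 'a' = alive but not in-sync, 'D' = dead/failed), followed by the in-sync/total sector ratio and the current sync action. A quick way to pull just those fields for a single device, a sketch not taken from the original report (the field positions assume the status line format shown above):

    dmsetup status vg-r1 | awk '{ print "health:", $6, "sync:", $7, "action:", $8 }'

which, for the listing above, would report 'DA', '12582912/12582912' and 'idle'.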

And, for reference, the segment listing:

# lvs --segments -aolv_name,pe_ranges,le_ranges
  WARNING: Not using lvmetad because a repair command was run.
  /dev/vg/r1: read failed after 0 of 4096 at 6442385408: Input/output error
  /dev/vg/r1: read failed after 0 of 4096 at 6442442752: Input/output error
  Couldn't find device with uuid jDRXQI-jGSW-BAOG-LB3h-aLhd-8fPb-kHbPZf.
  LV            PE Ranges                             LE Ranges                                
  r1            r1_rimage_0:0-1535 r1_rimage_1:0-1535 [r1_rimage_0]:0-1535,[r1_rimage_1]:0-1535
  [r1_rimage_0] [unknown]:1-1022                      [unknown]:1-1022                         
  [r1_rimage_0] /dev/sdb:1024-1537                    /dev/sdb:1024-1537                       
  [r1_rimage_1] /dev/sda:1025-2046                    /dev/sda:1025-2046                       
  [r1_rimage_1] [unknown]:1023-1536                   [unknown]:1023-1536                      
  [r1_rmeta_0]  [unknown]:0-0                         [unknown]:0-0                            
  [r1_rmeta_1]  /dev/sda:1024-1024                    /dev/sda:1024-1024

--- Additional comment from Steve D on 2018-10-18 12:12:26 UTC ---

I've just been bitten by this for the second time, though I swore I had specified the PVs manually. 2.02.176 (-4.1ubuntu3) on Ubuntu 18.04.

I also hit a whole load of scrub errors last night. The data stored in the fs seems fine; it looks like when I extended the RAID1 LV in question, the extensions didn't get synced. Still investigating that one - it may have been triggered by me trying to work around this bug.

Any thoughts / progress?

--- Additional comment from Heinz Mauelshagen on 2019-09-05 11:29:03 UTC ---

This behaves as if allocation policy 'anywhere' had been applied.
Reworking the allocator gains importance!

With the existing t[12] LVs allocated on sd[ab] and sdc free, the default allocation collocates the legs (first listing below), whereas creating with the 'cling' allocation policy avoids the problem (second listing):

# lvcreate -y  -nr1 -l251 -m1 vg
  Logical volume "r1" created.

# lvs --noh -aoname,segperanges vg
  r1            r1_rimage_0:0-250 r1_rimage_1:0-250
  [r1_rimage_0] /dev/sdc:1-126    <---
  [r1_rimage_0] /dev/sdb:128-252
  [r1_rimage_1] /dev/sda:129-254
  [r1_rimage_1] /dev/sdc:127-251  <--- bogus collocation with rimage_0
  [r1_rmeta_0]  /dev/sdc:0-0
  [r1_rmeta_1]  /dev/sda:128-128
  t1            /dev/sda:0-127
  t2            /dev/sdb:0-127

# lvremove -y vg/r1
  Logical volume "r1" successfully removed

# lvcreate -y --alloc cling -nr1 -l100%FREE -m1 vg # could alternatively give the extent count, same result
  Logical volume "r1" created.

# lvs --noh -aoname,segperanges vg
  r1            r1_rimage_0:0-125 r1_rimage_1:0-125
  [r1_rimage_0] /dev/sdc:1-126
  [r1_rimage_1] /dev/sda:129-254
  [r1_rmeta_0]  /dev/sdc:0-0
  [r1_rmeta_1]  /dev/sda:128-128
  t1            /dev/sda:0-127
  t2            /dev/sdb:0-127

# lvextend -l+1 vg/r1
  Extending 2 mirror images.
  Insufficient suitable allocatable extents for logical volume r1: 2 more required


Now '--alloc anywhere' is needed to make use of the remaining space, and the PVs must be listed as sdc, sdb in that sequence (sdb, sdc does not work) to avoid collocation:

# lvextend -y --alloc anywhere -l+1 vg/r1 
  Extending 2 mirror images.
  Size of logical volume vg/r1 changed from 504.00 MiB (126 extents) to 508.00 MiB (127 extents).
  Logical volume vg/r1 successfully resized.

# lvs --noh -aoname,segperanges vg
  r1            r1_rimage_0:0-126 r1_rimage_1:0-126
  [r1_rimage_0] /dev/sdc:1-126    <---
  [r1_rimage_0] /dev/sdb:128-128
  [r1_rimage_1] /dev/sda:129-254
  [r1_rimage_1] /dev/sdc:127-127  <--- Bogus collocation again
  [r1_rmeta_0]  /dev/sdc:0-0
  [r1_rmeta_1]  /dev/sda:128-128
  t1            /dev/sda:0-127

In this case, the options are to reduce the raid1 in size (shown next) or to pvmove the collocated extent off to another, unrelated PV (a sketch of which follows immediately below).
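
A minimal sketch of the pvmove route, not from the original comment: it assumes a spare PV can be added to the VG (/dev/sdd below is hypothetical) and relies on pvmove's PV:PE-range source syntax; per the listing above, the collocated extent of r1_rimage_1 sits at /dev/sdc:127. Depending on the lvm2 version, moving extents of RAID sub-LVs this way may or may not be permitted.

    vgextend vg /dev/sdd                # hypothetical spare disk added to the VG
    pvmove /dev/sdc:127-127 /dev/sdd    # relocate only physical extent 127 of /dev/sdc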

# lvreduce -fy -l-1 vg/r1
  WARNING: Reducing active logical volume to 504.00 MiB.
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
  Size of logical volume vg/r1 changed from 508.00 MiB (127 extents) to 504.00 MiB (126 extents).
  Logical volume vg/r1 successfully resized.

# lvextend -y -l+1 --alloc anywhere vg/r1 /dev/sdc /dev/sdb
  Extending 2 mirror images.
  Size of logical volume vg/r1 changed from 504.00 MiB (126 extents) to 508.00 MiB (127 extents).
  Logical volume vg/r1 successfully resized.

# lvs --noh -aoname,segperanges vg
  r1            r1_rimage_0:0-126 r1_rimage_1:0-126
  [r1_rimage_0] /dev/sdc:1-127                     
  [r1_rimage_1] /dev/sda:129-254                   
  [r1_rimage_1] /dev/sdb:128-128                   
  [r1_rmeta_0]  /dev/sdc:0-0                       
  [r1_rmeta_1]  /dev/sda:128-128                   
  t1            /dev/sda:0-127                     
  t2            /dev/sdb:0-127

--- Additional comment from Heinz Mauelshagen on 2023-05-10 16:36:36 UTC ---

Fixed in commit 05c2b10c5d0a99993430ffbcef684a099ba810ad

Comment 8 Corey Marthaler 2023-07-24 20:46:34 UTC
Marking Verified:Tested in the latest rpms.



WITHOUT FIX: lvm2-2.03.21-2.el9    BUILT: Thu May 25 12:03:04 AM CEST 2023

SCENARIO (raid1) - [create_raid_on_limited_contig_space_on_both_legs]
Create a raid with limited but contiguous PV space on both leg devices, and verify the creation took place on the proper devices
Recreating PVs/VG with smaller 8G size
virt-004.cluster-qe.lab.eng.brq.redhat.com: pvcreate --yes --setphysicalvolumesize 8G /dev/sdc /dev/sdd /dev/sdb
virt-004.cluster-qe.lab.eng.brq.redhat.com: vgcreate  raid_sanity /dev/sdc /dev/sdd /dev/sdb
Placing a spacer linear on the two devices so that the raid creation will be constrained
lvcreate --yes  --type linear -n spacer1 -L 4G raid_sanity /dev/sdc
lvcreate --yes  --type linear -n spacer2 -L 4G raid_sanity /dev/sdd
Create the constrained allocation raid volume
lvcreate --yes  --type raid1 -n proper_alloc -L 6G raid_sanity
/dev/sdb shows up as a device in *both* _image0 and _image1
Possible regression of bugs 1518121/2204467


[root@virt-004 ~]# lvs -a -o +devices
  LV                      VG            Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices                                          
  proper_alloc            raid_sanity   rwi-a-r---   6.00g                                    100.00           proper_alloc_rimage_0(0),proper_alloc_rimage_1(0)
  [proper_alloc_rimage_0] raid_sanity   iwi-aor---   6.00g                                                     /dev/sdb(1)                                      
  [proper_alloc_rimage_0] raid_sanity   iwi-aor---   6.00g                                                     /dev/sdd(1024)                                   
  [proper_alloc_rimage_1] raid_sanity   iwi-aor---   6.00g                                                     /dev/sdc(1025)                                   
  [proper_alloc_rimage_1] raid_sanity   iwi-aor---   6.00g                                                     /dev/sdb(1023)                                   
  [proper_alloc_rmeta_0]  raid_sanity   ewi-aor---   4.00m                                                     /dev/sdb(0)                                      
  [proper_alloc_rmeta_1]  raid_sanity   ewi-aor---   4.00m                                                     /dev/sdc(1024)                                   
  spacer1                 raid_sanity   -wi-a-----   4.00g                                                     /dev/sdc(0)                                      
  spacer2                 raid_sanity   -wi-a-----   4.00g                                                     /dev/sdd(0)                                      




WITH FIX: lvm2-2.03.21-3.el9    BUILT: Thu Jul 13 08:50:26 PM CEST 2023

SCENARIO (raid1) - [create_raid_on_limited_contig_space_on_both_legs]
Create a raid with limited but contiguous PV space on both leg devices, and verify the creation took place on the proper devices
Recreating PVs/VG with smaller 8G size
virt-009.cluster-qe.lab.eng.brq.redhat.com: pvcreate --yes --setphysicalvolumesize 8G /dev/sdb /dev/sde /dev/sdf
virt-009.cluster-qe.lab.eng.brq.redhat.com: vgcreate  raid_sanity /dev/sdb /dev/sde /dev/sdf
Placing a spacer linear on the two devices so that the raid creation will be constrained
lvcreate --yes  --type linear -n spacer1 -L 4G raid_sanity /dev/sdb
lvcreate --yes  --type linear -n spacer2 -L 4G raid_sanity /dev/sde
Create the constrained allocation raid volume
lvcreate --yes  --type raid1 -n proper_alloc -L 6G raid_sanity
  LV raid_sanity/proper_alloc_rimage_1 using PV /dev/sdf is not redundant.
  Insufficient suitable allocatable extents for logical volume raid_sanity/proper_alloc

Comment 13 errata-xmlrpc 2023-11-07 08:53:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (lvm2 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6633