Bug 2204467

Summary: multisegment RAID1, allocator uses one disk for both legs
Product: Red Hat Enterprise Linux 9
Component: lvm2
Sub component: Mirroring and RAID
Version: 9.3
Reporter: Marian Csontos <mcsontos>
Assignee: LVM Team <lvm-team>
QA Contact: cluster-qe <cluster-qe>
CC: agk, cmarthal, heinzm, jbrassow, msnitzer, prajnoha, steved424, zkabelac
Status: VERIFIED
Severity: high
Priority: high
Keywords: Triaged
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Type: Bug
Fixed In Version: lvm2-2.03.21-3.el9
Clone Of: 1518121
Bug Depends On: 1518121, 2204480

Description Marian Csontos 2023-05-15 15:46:12 UTC
+++ This bug was initially created as a clone of Bug #1518121 +++

Description of problem:
When creating a RAID1 LV whose legs span multiple disks, the allocator uses one of the disks for both legs.

Version-Release number of selected component (if applicable):
2.02.176

Affected versions: el7.2, el7.5; others not checked.

How reproducible:
100%

Steps to Reproduce:
- Given three 8 GB disks /dev/sd[abc]:

    vgcreate vg /dev/sd[abc]
    lvcreate -n t1 -L 4G vg /dev/sda
    lvcreate -n t2 -L 4G vg /dev/sdb
    lvcreate -n r1 -m 1 -L 6G vg
    lvs -aoname,devices


Actual results:

# lvs -aoname,devices
  LV            Devices                      
  r1            r1_rimage_0(0),r1_rimage_1(0)
  [r1_rimage_0] /dev/sdc(1)                  <---- sdc is used for both _rimage_0...
  [r1_rimage_0] /dev/sdb(1024)               
  [r1_rimage_1] /dev/sda(1025)               
  [r1_rimage_1] /dev/sdc(1023)               <---- ...as well as for _rimage_1
  [r1_rmeta_0]  /dev/sdc(0)                  
  [r1_rmeta_1]  /dev/sda(1024)               
  t1            /dev/sda(0)                  
  t2            /dev/sdb(0) 

Expected results:

Each leg of the RAID1 LV is allocated on PVs not used by the other leg, so the failure of a single disk cannot affect both legs.

Additional info:

--- Additional comment from Marian Csontos on 2017-11-28 12:53:58 UTC ---

If a disk fails, such a device cannot be repaired:

# lvconvert --repair vg/r1
  WARNING: Disabling lvmetad cache for repair command.
  WARNING: Not using lvmetad because of repair.
  /dev/vg/r1: read failed after 0 of 4096 at 6442385408: Input/output error
  /dev/vg/r1: read failed after 0 of 4096 at 6442442752: Input/output error
  Couldn't find device with uuid UFK7K0-nGPE-76Rq-F5WC-xGig-UzXP-MnFuDz.
Attempt to replace failed RAID images (requires full device resync)? [y/n]: y
  Unable to replace all PVs from vg/r1 at once.
  Failed to replace faulty devices in vg/r1.

TODO: Test with devices in sync.

Workaround: `lvcreate -n r1 -m 1 -L 6G vg /dev/sd[cab]`
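
A quick way to confirm the workaround really separated the legs is to check that no PV backs more than one rimage sub-LV. A minimal sketch, assuming the per-segment "lv_name devices" layout shown in the lvs output above (e.g. "[r1_rimage_0] /dev/sdc(1)"):

    lvs -a --noheadings -o lv_name,devices vg |
    awk '/_rimage_/ {
        pv = $2; sub(/\(.*/, "", pv)            # strip the "(extent)" suffix from the PV path
        if ((pv in leg) && leg[pv] != $1) {
            printf "collision: %s backs both %s and %s\n", pv, leg[pv], $1
            bad = 1
        } else
            leg[pv] = $1
    }
    END { exit bad }'

It exits non-zero when a PV is shared between different rimage sub-LVs, which is the condition this bug is about.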

--- Additional comment from Marian Csontos on 2017-11-28 14:23:30 UTC ---

Waited for sync. Even at 100% in sync it cannot be repaired. Also, only the first 4 GB can be read; reading from the second segment fails.

Read from first segment:

# dd if=/dev/vg/r1 of=/dev/null skip=4000 count=1 bs=1M
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00195199 s, 537 MB/s

Read from second segment fails:

# dd if=/dev/vg/r1 of=/dev/null skip=5000 count=1 bs=1M
dd: error reading ‘/dev/vg/r1’: Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.00201181 s, 0.0 kB/s

Surprisingly, the RAID1 device status reports DA:

# dmsetup status
vg-r1_rmeta_1: 0 8192 linear 
vg-r1_rimage_1: 0 8372224 linear 
vg-r1_rimage_1: 8372224 4210688 linear 
vg-t2: 0 8388608 linear 
vg-r1_rmeta_0: 0 8192 linear 
vg-r1_rimage_0: 0 8372224 linear 
vg-r1_rimage_0: 8372224 4210688 linear 
vg-t1: 0 8388608 linear 
vg_stacker_OTzj-root: 0 12582912 linear 
vg-r1: 0 12582912 raid raid1 2 DA 12582912/12582912 idle 0 0 -
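
For reference, the sixth field of the dm-raid status line holds one health character per leg ('A' = alive and in-sync, 'a' = alive but not in-sync, 'D' = dead/failed), so the "DA" above means leg 0 is dead and leg 1 is alive. A minimal sketch to pull out just that field, assuming the raid status layout shown above:

    dmsetup status vg-r1 | awk '$3 == "raid" { print "health chars:", $6 }'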

And for reference only the segments:

# lvs --segments -aolv_name,pe_ranges,le_ranges
  WARNING: Not using lvmetad because a repair command was run.
  /dev/vg/r1: read failed after 0 of 4096 at 6442385408: Input/output error
  /dev/vg/r1: read failed after 0 of 4096 at 6442442752: Input/output error
  Couldn't find device with uuid jDRXQI-jGSW-BAOG-LB3h-aLhd-8fPb-kHbPZf.
  LV            PE Ranges                             LE Ranges                                
  r1            r1_rimage_0:0-1535 r1_rimage_1:0-1535 [r1_rimage_0]:0-1535,[r1_rimage_1]:0-1535
  [r1_rimage_0] [unknown]:1-1022                      [unknown]:1-1022                         
  [r1_rimage_0] /dev/sdb:1024-1537                    /dev/sdb:1024-1537                       
  [r1_rimage_1] /dev/sda:1025-2046                    /dev/sda:1025-2046                       
  [r1_rimage_1] [unknown]:1023-1536                   [unknown]:1023-1536                      
  [r1_rmeta_0]  [unknown]:0-0                         [unknown]:0-0                            
  [r1_rmeta_1]  /dev/sda:1024-1024                    /dev/sda:1024-1024

--- Additional comment from Steve D on 2018-10-18 12:12:26 UTC ---

I've just been bitten by this for the second time, though I could have sworn I had specified the PVs manually. 2.02.176 (-4.1ubuntu3) on Ubuntu 18.04.

I also hit a whole load of scrub errors last night. The data stored in the fs seems fine; it looks like when I extended the RAID1 LV in question, the extensions didn't get synced. Still investigating that one; it may have been triggered by my attempts to work around this bug.

Any thoughts / progress?

--- Additional comment from Heinz Mauelshagen on 2019-09-05 11:29:03 UTC ---

This behaves as if the 'anywhere' allocation policy had been applied.
Reworking the allocator gains importance!

Creating with the 'cling' allocation policy avoids the problem when the existing t[12] LVs are allocated on sd[ab] and sdc is free:

# lvcreate -y  -nr1 -l251 -m1 vg
  Logical volume "r1" created.

# lvs --noh -aoname,segperanges vg
  r1            r1_rimage_0:0-250 r1_rimage_1:0-250
  [r1_rimage_0] /dev/sdc:1-126    <---
  [r1_rimage_0] /dev/sdb:128-252
  [r1_rimage_1] /dev/sda:129-254
  [r1_rimage_1] /dev/sdc:127-251  <--- bogus collocation with rimage_0
  [r1_rmeta_0]  /dev/sdc:0-0
  [r1_rmeta_1]  /dev/sda:128-128
  t1            /dev/sda:0-127
  t2            /dev/sdb:0-127

# lvremove -y vg/r1
  Logical volume "r1" successfully removed

# lvcreate -y --alloc cling -nr1 -l100%FREE -m1 vg # could alternatively give the extent count, same result
  Logical volume "r1" created.

# lvs --noh -aoname,segperanges vg
  r1            r1_rimage_0:0-125 r1_rimage_1:0-125
  [r1_rimage_0] /dev/sdc:1-126
  [r1_rimage_1] /dev/sda:129-254
  [r1_rmeta_0]  /dev/sdc:0-0
  [r1_rmeta_1]  /dev/sda:128-128
  t1            /dev/sda:0-127
  t2            /dev/sdb:0-127

# lvextend -l+1 vg/r1
  Extending 2 mirror images.
  Insufficient suitable allocatable extents for logical volume r1: 2 more required


Now '--alloc anywhere' is needed to make use of the remaining space, and the PVs must be listed as sdc, sdb in that order (sdb, sdc does not work) to avoid collocation:

# lvextend -l+1 vg/r1 
  Extending 2 mirror images.
  Insufficient suitable allocatable extents for logical volume r1: 2 more required

# lvextend -y --alloc anywhere -l+1 vg/r1 
  Extending 2 mirror images.
  Size of logical volume vg/r1 changed from 504.00 MiB (126 extents) to 508.00 MiB (127 extents).
  Logical volume vg/r1 successfully resized.

# lvs --noh -aoname,segperanges vg
  r1            r1_rimage_0:0-126 r1_rimage_1:0-126
  [r1_rimage_0] /dev/sdc:1-126    <---
  [r1_rimage_0] /dev/sdb:128-128
  [r1_rimage_1] /dev/sda:129-254
  [r1_rimage_1] /dev/sdc:127-127  <--- Bogus collocation again
  [r1_rmeta_0]  /dev/sdc:0-0
  [r1_rmeta_1]  /dev/sda:128-128
  t1            /dev/sda:0-127

In this case, the options are to reduce the raid1 in size or to pvmove the collocated extents off to another, unrelated PV.
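
A pvmove of just the collocated extent is sketched below, assuming pvmove is allowed to relocate extents of a raid sub-LV here; /dev/sdd is a hypothetical extra PV added to the VG for this purpose (moving the extent to sdb would merely recreate the collocation with rimage_0), and the extent number is taken from the lvs output above:

# vgextend vg /dev/sdd
# pvmove /dev/sdc:127-127 /dev/sdd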

# lvreduce -fy -l-1 vg/r1
  WARNING: Reducing active logical volume to 504.00 MiB.
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
  Size of logical volume vg/r1 changed from 508.00 MiB (127 extents) to 504.00 MiB (126 extents).
  Logical volume vg/r1 successfully resized.

# lvextend -y -l+1 --alloc anywhere vg/r1 /dev/sdc /dev/sdb
  Extending 2 mirror images.
  Size of logical volume vg/r1 changed from 504.00 MiB (126 extents) to 508.00 MiB (127 extents).
  Logical volume vg/r1 successfully resized.

# lvs --noh -aoname,segperanges vg
  r1            r1_rimage_0:0-126 r1_rimage_1:0-126
  [r1_rimage_0] /dev/sdc:1-127                     
  [r1_rimage_1] /dev/sda:129-254                   
  [r1_rimage_1] /dev/sdb:128-128                   
  [r1_rmeta_0]  /dev/sdc:0-0                       
  [r1_rmeta_1]  /dev/sda:128-128                   
  t1            /dev/sda:0-127                     
  t2            /dev/sdb:0-127

--- Additional comment from Heinz Mauelshagen on 2023-05-10 16:36:36 UTC ---

Fixed in commit 05c2b10c5d0a99993430ffbcef684a099ba810ad

Comment 8 Corey Marthaler 2023-07-24 20:46:34 UTC
Marking Verified:Tested in the latest rpms.



WITHOUT FIX: lvm2-2.03.21-2.el9    BUILT: Thu May 25 12:03:04 AM CEST 2023

SCENARIO (raid1) - [create_raid_on_limited_contig_space_on_both_legs]
Create a raid with limited but contiguous PV space on both leg devices, and verify the creation took place on the proper devices
Recreating PVs/VG with smaller 8G size
virt-004.cluster-qe.lab.eng.brq.redhat.com: pvcreate --yes --setphysicalvolumesize 8G /dev/sdc /dev/sdd /dev/sdb
virt-004.cluster-qe.lab.eng.brq.redhat.com: vgcreate  raid_sanity /dev/sdc /dev/sdd /dev/sdb
Placing a spacer linear on the two devices so that the raid creation will be constrained
lvcreate --yes  --type linear -n spacer1 -L 4G raid_sanity /dev/sdc
lvcreate --yes  --type linear -n spacer2 -L 4G raid_sanity /dev/sdd
Create the constrained allocation raid volume
lvcreate --yes  --type raid1 -n proper_alloc -L 6G raid_sanity
/dev/sdb shows up as a device in *both* _image0 and _image1
Possible regression of bugs 1518121/2204467


[root@virt-004 ~]# lvs -a -o +devices
  LV                      VG            Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices                                          
  proper_alloc            raid_sanity   rwi-a-r---   6.00g                                    100.00           proper_alloc_rimage_0(0),proper_alloc_rimage_1(0)
  [proper_alloc_rimage_0] raid_sanity   iwi-aor---   6.00g                                                     /dev/sdb(1)                                      
  [proper_alloc_rimage_0] raid_sanity   iwi-aor---   6.00g                                                     /dev/sdd(1024)                                   
  [proper_alloc_rimage_1] raid_sanity   iwi-aor---   6.00g                                                     /dev/sdc(1025)                                   
  [proper_alloc_rimage_1] raid_sanity   iwi-aor---   6.00g                                                     /dev/sdb(1023)                                   
  [proper_alloc_rmeta_0]  raid_sanity   ewi-aor---   4.00m                                                     /dev/sdb(0)                                      
  [proper_alloc_rmeta_1]  raid_sanity   ewi-aor---   4.00m                                                     /dev/sdc(1024)                                   
  spacer1                 raid_sanity   -wi-a-----   4.00g                                                     /dev/sdc(0)                                      
  spacer2                 raid_sanity   -wi-a-----   4.00g                                                     /dev/sdd(0)                                      




WITH FIX: lvm2-2.03.21-3.el9    BUILT: Thu Jul 13 08:50:26 PM CEST 2023

SCENARIO (raid1) - [create_raid_on_limited_contig_space_on_both_legs]
Create a raid with limited but contiguous PV space on both leg devices, and verify the creation took place on the proper devices
Recreating PVs/VG with smaller 8G size
virt-009.cluster-qe.lab.eng.brq.redhat.com: pvcreate --yes --setphysicalvolumesize 8G /dev/sdb /dev/sde /dev/sdf
virt-009.cluster-qe.lab.eng.brq.redhat.com: vgcreate  raid_sanity /dev/sdb /dev/sde /dev/sdf
Placing a spacer linear on the two devices so that the raid creation will be constrained
lvcreate --yes  --type linear -n spacer1 -L 4G raid_sanity /dev/sdb
lvcreate --yes  --type linear -n spacer2 -L 4G raid_sanity /dev/sde
Create the constrained allocation raid volume
lvcreate --yes  --type raid1 -n proper_alloc -L 6G raid_sanity
  LV raid_sanity/proper_alloc_rimage_1 using PV /dev/sdf is not redundant.
  Insufficient suitable allocatable extents for logical volume raid_sanity/proper_alloc
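
With the fix the collocated layout is no longer produced silently. If a non-redundant raid1 on the remaining space were really wanted, it would presumably have to be requested explicitly, along the lines of the sketch below (not part of the test scenario; '--alloc anywhere' permits placing parallel areas on the same PV, and the LV name forced_alloc is made up for illustration):

lvcreate --yes  --type raid1 --alloc anywhere -n forced_alloc -L 6G raid_sanity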