Bug 1447812

Summary: RAID RESHAPE: potential for data corruption when adding striped raid images
Product: Red Hat Enterprise Linux 7
Reporter: Corey Marthaler <cmarthal>
Component: lvm2
Assignee: Heinz Mauelshagen <heinzm>
lvm2 sub component: Mirroring and RAID
QA Contact: cluster-qe <cluster-qe>
Status: CLOSED ERRATA
Docs Contact:
Severity: urgent
Priority: unspecified
CC: agk, cmarthal, heinzm, jbrassow, msnitzer, prajnoha, prockai, zkabelac
Version: 7.4
Keywords: TestBlocker
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: lvm2-2.02.171-6.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1463705 (view as bug list)
Environment:
Last Closed: 2017-08-01 21:52:19 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1463705
Attachments:
  Description                                             Flags
  output of new corrupt test run                          none
  lvchange -vvvv                                          none
  output from different systems experiencing this issue   none

Description Corey Marthaler 2017-05-03 23:36:01 UTC
Description of problem:
===============================================================================                                                   
Iteration 0.1 started at Wed May  3 18:16:37 CDT 2017                                                                                                                                                                   
===============================================================================                                                                                                                                        
Scenario raid6_ra_6: Convert Striped raid6_ra_6 volume                                                                                                                                                                     

********* Take over hash info for this scenario *********                                                                                                                                                                         
* from type:    raid6_ra_6                                                                                                                                                                                                        
* to type:      raid6_ls_6                                                                                                                                                                                                        
* from legs:    3                                                                                                                                                                                                                     
* to legs:      5                                                                                                                                                                                                                     
* from region:  256.00k                                                                                                                                                                                                               
* to region:    256.00k                                                                                                                                                                                                                   
* contiguous:   0                                                                                                                                                                                                                         
* snapshot:     1                                                                                                                                                                                                                          
******************************************************

Creating original volume on host-126...
host-126: lvcreate --type raid6_ra_6 -R 256.00k -i 3 -n takeover -L 4G centipede2
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 18.60% )
   0/1 mirror(s) are fully synced: ( 36.51% )
   0/1 mirror(s) are fully synced: ( 59.20% )
   0/1 mirror(s) are fully synced: ( 83.70% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec

Placing a spacer on all raid image PVs so that expansion will have to be placed beyond
Extending raid beyond spacer
        lvextend -L +50M centipede2/takeover

Current volume device structure:
  LV                  Attr       LSize   Cpy%Sync Devices                                                                                                 
  lvol0               -wi-a-----  20.00m          /dev/sda1(343)                                                                                          
  lvol1               -wi-a-----  20.00m          /dev/sda1(348)                                                                                          
  lvol2               -wi-a-----  20.00m          /dev/sdb1(343)                                                                                          
  lvol3               -wi-a-----  20.00m          /dev/sdb1(348)                                                                                          
  lvol4               -wi-a-----  20.00m          /dev/sde1(343)                                                                                          
  lvol5               -wi-a-----  20.00m          /dev/sde1(348)                                                                                          
  lvol6               -wi-a-----  20.00m          /dev/sdf1(343)                                                                                          
  lvol7               -wi-a-----  20.00m          /dev/sdf1(348)                                                                                          
  lvol8               -wi-a-----  20.00m          /dev/sdg1(343)                                                                                          
  lvol9               -wi-a-----  20.00m          /dev/sdg1(348)                                                                                          
  takeover            rwi-a-r---  <4.07g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0)
  [takeover_rimage_0] iwi-aor---  <1.36g          /dev/sde1(1)                                                                                            
  [takeover_rimage_0] iwi-aor---  <1.36g          /dev/sde1(353)                                                                                          
  [takeover_rimage_1] iwi-aor---  <1.36g          /dev/sdg1(1)                                                                                            
  [takeover_rimage_1] iwi-aor---  <1.36g          /dev/sdg1(353)                                                                                          
  [takeover_rimage_2] iwi-aor---  <1.36g          /dev/sdf1(1)                                                                                            
  [takeover_rimage_2] iwi-aor---  <1.36g          /dev/sdf1(353)                                                                                          
  [takeover_rimage_3] iwi-aor---  <1.36g          /dev/sdb1(1)                                                                                            
  [takeover_rimage_3] iwi-aor---  <1.36g          /dev/sdb1(353)                                                                                          
  [takeover_rimage_4] iwi-aor---  <1.36g          /dev/sda1(1)                                                                                            
  [takeover_rimage_4] iwi-aor---  <1.36g          /dev/sda1(353)                                                                                          
  [takeover_rmeta_0]  ewi-aor---   4.00m          /dev/sde1(0)                                                                                            
  [takeover_rmeta_1]  ewi-aor---   4.00m          /dev/sdg1(0)                                                                                            
  [takeover_rmeta_2]  ewi-aor---   4.00m          /dev/sdf1(0)                                                                                            
  [takeover_rmeta_3]  ewi-aor---   4.00m          /dev/sdb1(0)                                                                                            
  [takeover_rmeta_4]  ewi-aor---   4.00m          /dev/sda1(0)                                                                                            


Creating xfs on top of mirror(s) on host-126...
Mounting mirrored xfs filesystems on host-126...

Writing verification files (checkit) to mirror(s) on...
        ---- host-126 ----

Sleeping 15 seconds to get some outsanding I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- host-126 ----

TAKEOVER: lvconvert --yes   --type raid6_ls_6 centipede2/takeover
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 19.57% )
   0/1 mirror(s) are fully synced: ( 36.31% )
   0/1 mirror(s) are fully synced: ( 53.05% )
   0/1 mirror(s) are fully synced: ( 69.50% )
   0/1 mirror(s) are fully synced: ( 85.96% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec

Current volume device structure:
  LV                  Attr       LSize   Cpy%Sync Devices                                                                                                 
  lvol0               -wi-a-----  20.00m          /dev/sda1(343)                                                                                          
  lvol1               -wi-a-----  20.00m          /dev/sda1(348)                                                                                          
  lvol2               -wi-a-----  20.00m          /dev/sdb1(343)                                                                                          
  lvol3               -wi-a-----  20.00m          /dev/sdb1(348)                                                                                          
  lvol4               -wi-a-----  20.00m          /dev/sde1(343)                                                                                          
  lvol5               -wi-a-----  20.00m          /dev/sde1(348)                                                                                          
  lvol6               -wi-a-----  20.00m          /dev/sdf1(343)                                                                                          
  lvol7               -wi-a-----  20.00m          /dev/sdf1(348)                                                                                          
  lvol8               -wi-a-----  20.00m          /dev/sdg1(343)                                                                                          
  lvol9               -wi-a-----  20.00m          /dev/sdg1(348)                                                                                          
  takeover            rwi-aor---  <4.07g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0)
  [takeover_rimage_0] iwi-aor---  <1.36g          /dev/sde1(1)                                                                                            
  [takeover_rimage_0] iwi-aor---  <1.36g          /dev/sde1(353)                                                                                          
  [takeover_rimage_1] iwi-aor---  <1.36g          /dev/sdg1(1)                                                                                            
  [takeover_rimage_1] iwi-aor---  <1.36g          /dev/sdg1(353)                                                                                          
  [takeover_rimage_2] iwi-aor---  <1.36g          /dev/sdf1(1)                                                                                            
  [takeover_rimage_2] iwi-aor---  <1.36g          /dev/sdf1(353)                                                                                          
  [takeover_rimage_3] iwi-aor---  <1.36g          /dev/sdb1(1)                                                                                            
  [takeover_rimage_3] iwi-aor---  <1.36g          /dev/sdb1(353)                                                                                          
  [takeover_rimage_4] iwi-aor---  <1.36g          /dev/sda1(1)                                                                                            
  [takeover_rimage_4] iwi-aor---  <1.36g          /dev/sda1(353)                                                                                          
  [takeover_rmeta_0]  ewi-aor---   4.00m          /dev/sde1(0)                                                                                            
  [takeover_rmeta_1]  ewi-aor---   4.00m          /dev/sdg1(0)                                                                                            
  [takeover_rmeta_2]  ewi-aor---   4.00m          /dev/sdf1(0)                                                                                            
  [takeover_rmeta_3]  ewi-aor---   4.00m          /dev/sdb1(0)                                                                                            
  [takeover_rmeta_4]  ewi-aor---   4.00m          /dev/sda1(0)                                                                                            


RESHAPE: lvconvert --yes --stripes 5 centipede2/takeover
  WARNING: Adding stripes to active and open logical volume centipede2/takeover will grow it from 1041 to 1735 extents!

Current volume device structure:
  LV                  Attr       LSize   Cpy%Sync Devices                                                                                                                                           
  lvol0               -wi-a-----  20.00m          /dev/sda1(343)                                                                                                                                    
  lvol1               -wi-a-----  20.00m          /dev/sda1(348)                                                                                                                                    
  lvol2               -wi-a-----  20.00m          /dev/sdb1(343)                                                                                                                                    
  lvol3               -wi-a-----  20.00m          /dev/sdb1(348)                                                                                                                                    
  lvol4               -wi-a-----  20.00m          /dev/sde1(343)                                                                                                                                    
  lvol5               -wi-a-----  20.00m          /dev/sde1(348)                                                                                                                                    
  lvol6               -wi-a-----  20.00m          /dev/sdf1(343)                                                                                                                                    
  lvol7               -wi-a-----  20.00m          /dev/sdf1(348)                                                                                                                                    
  lvol8               -wi-a-----  20.00m          /dev/sdg1(343)                                                                                                                                    
  lvol9               -wi-a-----  20.00m          /dev/sdg1(348)                                                                                                                                    
  takeover            rwi-aor-s-  <6.78g 1.44     takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0),takeover_rimage_5(0),takeover_rimage_6(0)
  [takeover_rimage_0] Iwi-aor---  <1.36g          /dev/sde1(1)                                                                                                                                      
  [takeover_rimage_0] Iwi-aor---  <1.36g          /dev/sde1(353)                                                                                                                                    
  [takeover_rimage_1] Iwi-aor---  <1.36g          /dev/sdg1(1)                                                                                                                                      
  [takeover_rimage_1] Iwi-aor---  <1.36g          /dev/sdg1(353)                                                                                                                                    
  [takeover_rimage_2] Iwi-aor---  <1.36g          /dev/sdf1(1)                                                                                                                                      
  [takeover_rimage_2] Iwi-aor---  <1.36g          /dev/sdf1(353)                                                                                                                                    
  [takeover_rimage_3] Iwi-aor---  <1.36g          /dev/sdb1(1)                                                                                                                                      
  [takeover_rimage_3] Iwi-aor---  <1.36g          /dev/sdb1(353)                                                                                                                                    
  [takeover_rimage_4] Iwi-aor---  <1.36g          /dev/sda1(1)                                                                                                                                      
  [takeover_rimage_4] Iwi-aor---  <1.36g          /dev/sda1(353)                                                                                                                                    
  [takeover_rimage_5] Iwi-aor---  <1.36g          /dev/sdd1(1)                                                                                                                                      
  [takeover_rimage_6] Iwi-aor---  <1.36g          /dev/sdc1(1)                                                                                                                                      
  [takeover_rmeta_0]  ewi-aor---   4.00m          /dev/sde1(0)                                                                                                                                      
  [takeover_rmeta_1]  ewi-aor---   4.00m          /dev/sdg1(0)                                                                                                                                      
  [takeover_rmeta_2]  ewi-aor---   4.00m          /dev/sdf1(0)                                                                                                                                      
  [takeover_rmeta_3]  ewi-aor---   4.00m          /dev/sdb1(0)                                                                                                                                      
  [takeover_rmeta_4]  ewi-aor---   4.00m          /dev/sda1(0)                                                                                                                                      
  [takeover_rmeta_5]  ewi-aor---   4.00m          /dev/sdd1(0)                                                                                                                                      
  [takeover_rmeta_6]  ewi-aor---   4.00m          /dev/sdc1(0)                                                                                                                                      


Verifying files (checkit) on mirror(s) on...
        ---- host-126 ----
*** DATA COMPARISON ERROR [file:pmmovitmbhjnjnugownsvgnythykthqj] ***
Corrupt regions follow - unprintable chars are represented as '.'
-----------------------------------------------------------------
corrupt bytes starting at file offset 196608
    1st 32 expected bytes:  88888888888888888888888888888888
    1st 32 actual bytes:    ................................
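
The extent counts in the RESHAPE warning above (1041 to 1735) match the LV growing in proportion to its data stripe count, which goes from 3 to 5 per the "from legs"/"to legs" values. The sketch below only illustrates that arithmetic. It assumes each image keeps its per-image extent count; the function name is hypothetical and this is not lvm2 code.

```python
def extents_after_adding_stripes(current_extents: int,
                                 from_stripes: int,
                                 to_stripes: int) -> int:
    """Scale an LV's extent count in proportion to its data stripe count.

    Assumption: every image keeps the same number of extents, so the
    usable size scales with the number of data stripes.
    """
    per_stripe, remainder = divmod(current_extents, from_stripes)
    if remainder:
        raise ValueError("extent count is not a multiple of the stripe count")
    return per_stripe * to_stripes


# Values taken from the WARNING line of this run: 3 -> 5 data stripes.
print(extents_after_adding_stripes(1041, 3, 5))  # 1735
```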


Version-Release number of selected component (if applicable):
3.10.0-660.el7.x86_64

lvm2-2.02.171-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
lvm2-libs-2.02.171-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
lvm2-cluster-2.02.171-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-libs-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-event-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-event-libs-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017

Comment 2 Corey Marthaler 2017-05-04 22:35:02 UTC
This is reproducible.

================================================================================
Iteration 0.1 started at Thu May  4 17:10:36 CDT 2017
================================================================================
Scenario raid6_ra_6: Convert Striped raid6_ra_6 volume
********* Take over hash info for this scenario *********
* from type:    raid6_ra_6
* to type:      raid6_n_6
* from legs:    3
* to legs:      5
* from region:  1024.00k
* to region:    4096.00k
* contiguous:   1
* snapshot:     1
******************************************************

Creating original volume on host-073...
host-073: lvcreate -vvvv  --type raid6_ra_6 -R 1024.00k -i 3 -n takeover -L 4G centipede2 > /tmp/lvcreate 2>&1
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 21.12% )
   0/1 mirror(s) are fully synced: ( 41.80% )
   0/1 mirror(s) are fully synced: ( 60.03% )
   0/1 mirror(s) are fully synced: ( 75.81% )
   0/1 mirror(s) are fully synced: ( 91.71% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec


Current volume device structure:
  LV                  Attr       LSize   Cpy%Sync Devices
  takeover            rwi-a-r---  <4.01g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0)
  [takeover_rimage_0] iwi-aor---  <1.34g          /dev/sdb1(1)
  [takeover_rimage_1] iwi-aor---  <1.34g          /dev/sdf1(1)
  [takeover_rimage_2] iwi-aor---  <1.34g          /dev/sda1(1)
  [takeover_rimage_3] iwi-aor---  <1.34g          /dev/sdd1(1)
  [takeover_rimage_4] iwi-aor---  <1.34g          /dev/sde1(1)
  [takeover_rmeta_0]  ewi-aor---   4.00m          /dev/sdb1(0)
  [takeover_rmeta_1]  ewi-aor---   4.00m          /dev/sdf1(0)
  [takeover_rmeta_2]  ewi-aor---   4.00m          /dev/sda1(0)
  [takeover_rmeta_3]  ewi-aor---   4.00m          /dev/sdd1(0)
  [takeover_rmeta_4]  ewi-aor---   4.00m          /dev/sde1(0)

Creating xfs on top of mirror(s) on host-073...
Mounting mirrored xfs filesystems on host-073...

Writing verification files (checkit) to mirror(s) on...
        ---- host-073 ----

Verifying files (checkit) on mirror(s) on...
        ---- host-073 ----

TAKEOVER: lvconvert --yes -R 4096.00k  --type raid6_n_6 centipede2/takeover
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 12.38% )
   0/1 mirror(s) are fully synced: ( 22.16% )
   0/1 mirror(s) are fully synced: ( 31.66% )
   0/1 mirror(s) are fully synced: ( 41.16% )
   0/1 mirror(s) are fully synced: ( 49.22% )
   0/1 mirror(s) are fully synced: ( 59.58% )
   0/1 mirror(s) are fully synced: ( 72.25% )
   0/1 mirror(s) are fully synced: ( 84.91% )
   0/1 mirror(s) are fully synced: ( 97.00% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec

Current volume device structure:
  LV                  Attr       LSize   Cpy%Sync Devices
  takeover            rwi-aor---  <4.01g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0)
  [takeover_rimage_0] iwi-aor---  <1.34g          /dev/sdb1(1)
  [takeover_rimage_1] iwi-aor---  <1.34g          /dev/sdf1(1)
  [takeover_rimage_2] iwi-aor---  <1.34g          /dev/sda1(1)
  [takeover_rimage_3] iwi-aor---  <1.34g          /dev/sdd1(1)
  [takeover_rimage_4] iwi-aor---  <1.34g          /dev/sde1(1)
  [takeover_rmeta_0]  ewi-aor---   4.00m          /dev/sdb1(0)
  [takeover_rmeta_1]  ewi-aor---   4.00m          /dev/sdf1(0)
  [takeover_rmeta_2]  ewi-aor---   4.00m          /dev/sda1(0)
  [takeover_rmeta_3]  ewi-aor---   4.00m          /dev/sdd1(0)
  [takeover_rmeta_4]  ewi-aor---   4.00m          /dev/sde1(0)

RESHAPE: lvconvert --yes --stripes 5 centipede2/takeover
  WARNING: Adding stripes to active and open logical volume centipede2/takeover will grow it from 1026 to 1710 extents!


May  4 17:14:35 host-073 qarshd[6743]: Running cmdline: lvconvert --yes --stripes 5 centipede2/takeover
May  4 17:14:36 host-073 multipathd: dm-13: remove map (uevent)
May  4 17:14:36 host-073 multipathd: dm-13: devmap not registered, can't remove
May  4 17:14:36 host-073 multipathd: dm-13: remove map (uevent)
May  4 17:14:36 host-073 multipathd: dm-14: remove map (uevent)
May  4 17:14:36 host-073 multipathd: dm-14: devmap not registered, can't remove
May  4 17:14:36 host-073 multipathd: dm-14: remove map (uevent)
May  4 17:14:36 host-073 kernel: md/raid:mdX: device dm-3 operational as raid disk 0
May  4 17:14:36 host-073 kernel: md/raid:mdX: device dm-5 operational as raid disk 1
May  4 17:14:36 host-073 kernel: md/raid:mdX: device dm-7 operational as raid disk 2
May  4 17:14:36 host-073 kernel: md/raid:mdX: device dm-9 operational as raid disk 3
May  4 17:14:36 host-073 kernel: md/raid:mdX: device dm-11 operational as raid disk 4
May  4 17:14:36 host-073 kernel: md/raid:mdX: raid level 6 active with 5 out of 5 devices, algorithm 5
May  4 17:14:36 host-073 dmeventd[1300]: No longer monitoring RAID device centipede2-takeover for events.
May  4 17:14:36 host-073 kernel: dm-12: detected capacity change from 7172259840 to 4303355904
May  4 17:14:36 host-073 kernel: VFS: busy inodes on changed media or resized disk dm-12
May  4 17:14:36 host-073 kernel: md: reshape of RAID array mdX
May  4 17:14:37 host-073 lvm[1300]: Monitoring RAID device centipede2-takeover for events.
May  4 17:14:38 host-073 kernel: md/raid:mdX: device dm-3 operational as raid disk 0
May  4 17:14:38 host-073 kernel: md/raid:mdX: device dm-5 operational as raid disk 1
May  4 17:14:38 host-073 kernel: md/raid:mdX: device dm-7 operational as raid disk 2
May  4 17:14:38 host-073 kernel: md/raid:mdX: device dm-9 operational as raid disk 3
May  4 17:14:38 host-073 kernel: md/raid:mdX: device dm-11 operational as raid disk 4
May  4 17:14:38 host-073 kernel: md/raid:mdX: device dm-14 operational as raid disk 5
May  4 17:14:38 host-073 kernel: md/raid:mdX: device dm-16 operational as raid disk 6
May  4 17:14:38 host-073 kernel: md/raid:mdX: raid level 6 active with 7 out of 7 devices, algorithm 5
May  4 17:14:38 host-073 dmeventd[1300]: No longer monitoring RAID device centipede2-takeover for events.
May  4 17:14:38 host-073 kernel: md: mdX: reshape interrupted.
May  4 17:14:38 host-073 kernel: dm-12: detected capacity change from 7172259840 to 4303355904
May  4 17:14:38 host-073 kernel: VFS: busy inodes on changed media or resized disk dm-12
May  4 17:14:38 host-073 kernel: md: reshape of RAID array mdX
May  4 17:14:38 host-073 lvm[1300]: Monitoring RAID device centipede2-takeover for events.
May  4 17:16:27 host-073 kernel: md: mdX: reshape done.
May  4 17:16:27 host-073 kernel: dm-12: detected capacity change from 4303355904 to 7172259840
May  4 17:16:27 host-073 kernel: VFS: busy inodes on changed media or resized disk dm-12
May  4 17:16:27 host-073 lvm[1300]: raid6_n_6 array, centipede2-takeover, is now in-sync.
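
The two dm-12 capacities in the kernel log can be cross-checked against the extent counts in the lvconvert warning (1026 and 1710). This is a rough sketch that assumes a 4 MiB extent size. The listings suggest that size, since each 4.00m rmeta sub-LV occupies the single extent 0 of its PV, but the report does not state it explicitly.

```python
EXTENT_BYTES = 4 * 1024 * 1024  # assumed extent size: 4 MiB


def bytes_to_extents(capacity_bytes: int, extent_bytes: int = EXTENT_BYTES) -> int:
    """Convert a device capacity in bytes to a whole number of extents."""
    extents, remainder = divmod(capacity_bytes, extent_bytes)
    if remainder:
        raise ValueError("capacity is not a whole number of extents")
    return extents


# Capacities reported by "detected capacity change" for dm-12.
for capacity in (4303355904, 7172259840):
    print(capacity, "bytes =", bytes_to_extents(capacity), "extents")
```

Under that assumption the smaller capacity corresponds to 1026 extents and the larger to 1710, the same two values as in the warning.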




[root@host-073 ~]# lvs -a -o +devices
  LV                  Attr       LSize   Cpy%Sync Devices
  takeover            rwi-aor---  <6.68g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0),takeover_rimage_5(0),takeover_rimage_6(0)
  [takeover_rimage_0] iwi-aor---  <1.34g          /dev/sdb1(1)
  [takeover_rimage_1] iwi-aor---  <1.34g          /dev/sdf1(1)
  [takeover_rimage_2] iwi-aor---  <1.34g          /dev/sda1(1)
  [takeover_rimage_3] iwi-aor---  <1.34g          /dev/sdd1(1)
  [takeover_rimage_4] iwi-aor---  <1.34g          /dev/sde1(1)
  [takeover_rimage_5] iwi-aor---  <1.34g          /dev/sdh1(1)
  [takeover_rimage_6] iwi-aor---  <1.34g          /dev/sdc1(1)
  [takeover_rmeta_0]  ewi-aor---   4.00m          /dev/sdb1(0)
  [takeover_rmeta_1]  ewi-aor---   4.00m          /dev/sdf1(0)
  [takeover_rmeta_2]  ewi-aor---   4.00m          /dev/sda1(0)
  [takeover_rmeta_3]  ewi-aor---   4.00m          /dev/sdd1(0)
  [takeover_rmeta_4]  ewi-aor---   4.00m          /dev/sde1(0)
  [takeover_rmeta_5]  ewi-aor---   4.00m          /dev/sdh1(0)
  [takeover_rmeta_6]  ewi-aor---   4.00m          /dev/sdc1(0)


Verifying files (checkit) on mirror(s) on...
        ---- host-073 ----

*** DATA COMPARISON ERROR [file:lsemmhcsvgwerfchcvknkatoxsickwuvrakhhmcdvy] ***
Corrupt regions follow - unprintable chars are represented as '.'
-----------------------------------------------------------------
corrupt bytes starting at file offset 0
    1st 32 expected bytes:  AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    1st 32 actual bytes:    88888888888888888888888888888888

checkit write verify failed
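
For readers unfamiliar with the failure report format above, the following hypothetical sketch shows one way such a comparison could be produced: find the first differing offset, then print 32 expected and 32 actual bytes with unprintable characters shown as '.'. It is not the checkit implementation, and all names in it are illustrative.

```python
from typing import Optional


def first_mismatch(expected: bytes, actual: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if equal."""
    for offset, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            return offset
    if len(expected) != len(actual):
        # One buffer is a strict prefix of the other.
        return min(len(expected), len(actual))
    return None


def printable(chunk: bytes) -> str:
    """Render bytes as ASCII, replacing unprintable characters with '.'."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)


# Illustrative data only, mimicking the pattern reported in this comment.
expected = b"A" * 64
actual = b"8" * 64
offset = first_mismatch(expected, actual)
if offset is not None:
    print(f"corrupt bytes starting at file offset {offset}")
    print("    1st 32 expected bytes: ", printable(expected[offset:offset + 32]))
    print("    1st 32 actual bytes:   ", printable(actual[offset:offset + 32]))
```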

Comment 3 Corey Marthaler 2017-05-04 22:53:01 UTC
Another repro showing that file verification passes before and after the takeover, but then fails after the reshape.


Verifying files (checkit) on mirror(s) on...
        ---- host-126 ----

TAKEOVER: lvconvert --yes -R 256.00k  --type raid6_n_6 centipede2/takeover
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 34.36% )
   0/1 mirror(s) are fully synced: ( 72.59% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec

Current volume device structure:
  LV                  Attr       LSize   Cpy%Sync Devices                                                                                                 
  takeover            rwi-aor---  <4.01g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0)
  [takeover_rimage_0] iwi-aor---  <1.34g          /dev/sde1(0)                                                                                            
  [takeover_rimage_1] iwi-aor---  <1.34g          /dev/sdg1(0)                                                                                            
  [takeover_rimage_2] iwi-aor---  <1.34g          /dev/sdf1(0)                                                                                            
  [takeover_rimage_3] iwi-aor---  <1.34g          /dev/sdb1(1)                                                                                            
  [takeover_rimage_4] iwi-aor---  <1.34g          /dev/sda1(1)                                                                                            
  [takeover_rmeta_0]  ewi-aor---   4.00m          /dev/sde1(342)                                                                                          
  [takeover_rmeta_1]  ewi-aor---   4.00m          /dev/sdg1(342)                                                                                          
  [takeover_rmeta_2]  ewi-aor---   4.00m          /dev/sdf1(342)                                                                                          
  [takeover_rmeta_3]  ewi-aor---   4.00m          /dev/sdb1(0)                                                                                            
  [takeover_rmeta_4]  ewi-aor---   4.00m          /dev/sda1(0)                                                                                            


Verifying files (checkit) on mirror(s) on...
        ---- host-126 ----

RESHAPE: lvconvert --yes --stripes 4 centipede2/takeover
  WARNING: Adding stripes to active and open logical volume centipede2/takeover will grow it from 1026 to 1368 extents!
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 19.58% )
   0/1 mirror(s) are fully synced: ( 35.47% )
   0/1 mirror(s) are fully synced: ( 63.73% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec

Current volume device structure:
  LV                  Attr       LSize   Cpy%Sync Devices                                                                                                                      
  takeover            rwi-aor---   5.34g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0),takeover_rimage_5(0)
  [takeover_rimage_0] iwi-aor---  <1.34g          /dev/sde1(343)                                                                                                               
  [takeover_rimage_0] iwi-aor---  <1.34g          /dev/sde1(0)                                                                                                                 
  [takeover_rimage_1] iwi-aor---  <1.34g          /dev/sdg1(343)                                                                                                               
  [takeover_rimage_1] iwi-aor---  <1.34g          /dev/sdg1(0)                                                                                                                 
  [takeover_rimage_2] iwi-aor---  <1.34g          /dev/sdf1(343)                                                                                                               
  [takeover_rimage_2] iwi-aor---  <1.34g          /dev/sdf1(0)                                                                                                                 
  [takeover_rimage_3] iwi-aor---  <1.34g          /dev/sdb1(343)                                                                                                               
  [takeover_rimage_3] iwi-aor---  <1.34g          /dev/sdb1(1)                                                                                                                 
  [takeover_rimage_4] iwi-aor---  <1.34g          /dev/sda1(343)                                                                                                               
  [takeover_rimage_4] iwi-aor---  <1.34g          /dev/sda1(1)                                                                                                                 
  [takeover_rimage_5] iwi-aor---  <1.34g          /dev/sdd1(343)                                                                                                               
  [takeover_rimage_5] iwi-aor---  <1.34g          /dev/sdd1(1)                                                                                                                 
  [takeover_rmeta_0]  ewi-aor---   4.00m          /dev/sde1(342)                                                                                                               
  [takeover_rmeta_1]  ewi-aor---   4.00m          /dev/sdg1(342)                                                                                                               
  [takeover_rmeta_2]  ewi-aor---   4.00m          /dev/sdf1(342)                                                                                                               
  [takeover_rmeta_3]  ewi-aor---   4.00m          /dev/sdb1(0)                                                                                                                 
  [takeover_rmeta_4]  ewi-aor---   4.00m          /dev/sda1(0)                                                                                                                 
  [takeover_rmeta_5]  ewi-aor---   4.00m          /dev/sdd1(0)                                                                                                                 


Verifying files (checkit) on mirror(s) on...
        ---- host-126 ----
*** DATA COMPARISON ERROR [file:fixblocifkojakxyxedvvbnatetx] ***
Corrupt regions follow - unprintable chars are represented as '.'
-----------------------------------------------------------------
corrupt bytes starting at file offset 196608
    1st 32 expected bytes:  EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
    1st 32 actual bytes:    11111111111111111111111111111111
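
The extent counts in the RESHAPE warning above can be checked with simple arithmetic. The following is a minimal sketch, not taken from the lvm2 sources: it assumes that each data stripe contributes an equal share of the LV, that the raid6 volume with 5 images has 3 data stripes, and that the extent size is 4 MiB (which is consistent with the lvs sizes shown). The function name is hypothetical.

```python
def extents_after_adding_stripes(current_extents, data_stripes_before, data_stripes_after):
    """Return the LV size in extents after a stripe-adding reshape.

    Assumes every data stripe holds the same number of extents and that
    parity images add no usable capacity.
    """
    per_stripe, remainder = divmod(current_extents, data_stripes_before)
    if remainder:
        raise ValueError("LV size is not a whole number of extents per stripe")
    return per_stripe * data_stripes_after


EXTENT_MIB = 4  # assumed extent size, not stated explicitly in the logs

# raid6_n_6 with 5 images (3 data stripes) reshaped with --stripes 4
new_extents = extents_after_adding_stripes(1026, 3, 4)
print(new_extents, "extents =", round(new_extents * EXTENT_MIB / 1024, 2), "GiB")
```

Under these assumptions the result is 1368 extents, matching the warning and the 5.34g LSize reported after the reshape.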

Comment 4 Heinz Mauelshagen 2017-05-08 11:55:44 UTC
Corey,
on how many cores is this reshape corruption happening?

Comment 5 Heinz Mauelshagen 2017-05-08 12:23:11 UTC
I was able to reproduce this on single core (possibly related to https://bugzilla.redhat.com/show_bug.cgi?id=1443999) but not on multi core (yet).

Comment 6 Corey Marthaler 2017-05-08 14:00:49 UTC
So far I've also only seen this issue on single-core machines. I'll attempt to reproduce it on multi-core machines today.

[root@host-073 ~]# nproc
1

Comment 7 Corey Marthaler 2017-05-09 18:47:39 UTC
Continued testing has yet to reproduce this on multi-core machines. Side note: these multi-core machines were also using multipath, whereas the single-core machines were not.

[root@mckinley-01 ~]# /usr/tests/sts-rhel7.4/lvm2/bin/lvm_rpms 
3.10.0-657.el7.x86_64

lvm2-2.02.171-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
lvm2-libs-2.02.171-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
lvm2-cluster-2.02.171-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-libs-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-event-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-event-libs-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017

Comment 8 Corey Marthaler 2017-05-10 16:17:58 UTC
Continued testing *has* now reproduced this issue on multiple core machines. Please disregard comment #7.


[root@harding-02 ~]# nproc
32


================================================================================
Iteration 9.6 started at Wed May 10 03:34:34 CDT 2017
================================================================================
Scenario raid5: Convert Striped raid5 volume
********* Take over hash info for this scenario *********
* from type:    raid5
* to type:      raid6_ls_6
* from legs:    3
* to legs:      5
* from region:  4096.00k
* to region:    4096.00k
* contiguous:   0
* snapshot:     0
******************************************************

Creating original volume on harding-02...
harding-02: lvcreate -vvvv  --type raid5 -R 4096.00k -i 3 -n takeover -L 4G centipede2 > /tmp/lvcreate 2>&1
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 33.87% )
   0/1 mirror(s) are fully synced: ( 93.31% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec

Placing a spacer on all raid image PVs so that expansion will have to be placed beyond
Extending raid beyond spacer
        lvextend -L +50M centipede2/takeover

Current volume device structure:
  LV                  Attr        LSize  Cpy%Sync Devices
  lvol0               -wi-a-----  20.00m          /dev/mapper/mpatha1(343)
  lvol1               -wi-a-----  20.00m          /dev/mapper/mpatha1(348)
  lvol2               -wi-a-----  20.00m          /dev/mapper/mpathb1(343)
  lvol3               -wi-a-----  20.00m          /dev/mapper/mpathb1(348)
  lvol4               -wi-a-----  20.00m          /dev/mapper/mpathc1(343)
  lvol5               -wi-a-----  20.00m          /dev/mapper/mpathc1(348)
  lvol6               -wi-a-----  20.00m          /dev/mapper/mpathd1(343)
  lvol7               -wi-a-----  20.00m          /dev/mapper/mpathd1(348)
  takeover            rwi-a-r---  <4.07g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0)
  [takeover_rimage_0] iwi-aor---  <1.36g          /dev/mapper/mpatha1(1)
  [takeover_rimage_0] iwi-aor---  <1.36g          /dev/mapper/mpatha1(353)
  [takeover_rimage_1] iwi-aor---  <1.36g          /dev/mapper/mpathb1(1)
  [takeover_rimage_1] iwi-aor---  <1.36g          /dev/mapper/mpathb1(353)
  [takeover_rimage_2] iwi-aor---  <1.36g          /dev/mapper/mpathc1(1)
  [takeover_rimage_2] iwi-aor---  <1.36g          /dev/mapper/mpathc1(353)
  [takeover_rimage_3] iwi-aor---  <1.36g          /dev/mapper/mpathd1(1)
  [takeover_rimage_3] iwi-aor---  <1.36g          /dev/mapper/mpathd1(353)
  [takeover_rmeta_0]  ewi-aor---   4.00m          /dev/mapper/mpatha1(0)
  [takeover_rmeta_1]  ewi-aor---   4.00m          /dev/mapper/mpathb1(0)
  [takeover_rmeta_2]  ewi-aor---   4.00m          /dev/mapper/mpathc1(0)
  [takeover_rmeta_3]  ewi-aor---   4.00m          /dev/mapper/mpathd1(0)


Creating xfs on top of mirror(s) on harding-02...
warning: device is not properly aligned /dev/centipede2/takeover
Mounting mirrored xfs filesystems on harding-02...

Writing verification files (checkit) to mirror(s) on...
        ---- harding-02 ----

Sleeping 15 seconds to get some outsanding I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- harding-02 ----


TAKEOVER: lvconvert --yes   --type raid6_ls_6 centipede2/takeover
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 27.57% )
   0/1 mirror(s) are fully synced: ( 52.87% )
   0/1 mirror(s) are fully synced: ( 69.87% )
   0/1 mirror(s) are fully synced: ( 92.89% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec

Current volume device structure:
  LV                  Attr       LSize  Cpy%Sync Devices
  lvol0               -wi-a----- 20.00m          /dev/mapper/mpatha1(343)
  lvol1               -wi-a----- 20.00m          /dev/mapper/mpatha1(348)
  lvol2               -wi-a----- 20.00m          /dev/mapper/mpathb1(343)
  lvol3               -wi-a----- 20.00m          /dev/mapper/mpathb1(348)
  lvol4               -wi-a----- 20.00m          /dev/mapper/mpathc1(343)
  lvol5               -wi-a----- 20.00m          /dev/mapper/mpathc1(348)
  lvol6               -wi-a----- 20.00m          /dev/mapper/mpathd1(343)
  lvol7               -wi-a----- 20.00m          /dev/mapper/mpathd1(348)
  takeover            rwi-aor--- <4.07g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0)
  [takeover_rimage_0] iwi-aor--- <1.36g          /dev/mapper/mpatha1(1)
  [takeover_rimage_0] iwi-aor--- <1.36g          /dev/mapper/mpatha1(353)
  [takeover_rimage_1] iwi-aor--- <1.36g          /dev/mapper/mpathb1(1)
  [takeover_rimage_1] iwi-aor--- <1.36g          /dev/mapper/mpathb1(353)
  [takeover_rimage_2] iwi-aor--- <1.36g          /dev/mapper/mpathc1(1)
  [takeover_rimage_2] iwi-aor--- <1.36g          /dev/mapper/mpathc1(353)
  [takeover_rimage_3] iwi-aor--- <1.36g          /dev/mapper/mpathd1(1)
  [takeover_rimage_3] iwi-aor--- <1.36g          /dev/mapper/mpathd1(353)
  [takeover_rimage_4] iwi-aor--- <1.36g          /dev/mapper/mpathe1(1)
  [takeover_rmeta_0]  ewi-aor---  4.00m          /dev/mapper/mpatha1(0)
  [takeover_rmeta_1]  ewi-aor---  4.00m          /dev/mapper/mpathb1(0)
  [takeover_rmeta_2]  ewi-aor---  4.00m          /dev/mapper/mpathc1(0)
  [takeover_rmeta_3]  ewi-aor---  4.00m          /dev/mapper/mpathd1(0)
  [takeover_rmeta_4]  ewi-aor---  4.00m          /dev/mapper/mpathe1(0)


Verifying files (checkit) on mirror(s) on...
        ---- harding-02 ----

RESHAPE: lvconvert --yes --stripes 5 centipede2/takeover
  WARNING: Adding stripes to active and open logical volume centipede2/takeover will grow it from 1041 to 1735 extents!
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 8.71% )
   0/1 mirror(s) are fully synced: ( 24.94% )
   0/1 mirror(s) are fully synced: ( 37.99% )
   0/1 mirror(s) are fully synced: ( 49.77% )
   0/1 mirror(s) are fully synced: ( 63.32% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec

Current volume device structure:
  LV                  Attr       LSize  Cpy%Sync Devices
  lvol0               -wi-a----- 20.00m          /dev/mapper/mpatha1(343)
  lvol1               -wi-a----- 20.00m          /dev/mapper/mpatha1(348)
  lvol2               -wi-a----- 20.00m          /dev/mapper/mpathb1(343)
  lvol3               -wi-a----- 20.00m          /dev/mapper/mpathb1(348)
  lvol4               -wi-a----- 20.00m          /dev/mapper/mpathc1(343)
  lvol5               -wi-a----- 20.00m          /dev/mapper/mpathc1(348)
  lvol6               -wi-a----- 20.00m          /dev/mapper/mpathd1(343)
  lvol7               -wi-a----- 20.00m          /dev/mapper/mpathd1(348)
  takeover            rwi-aor--- <6.78g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0),takeover_rimage_5(0),takeover_rimage_6(0)
  [takeover_rimage_0] iwi-aor--- <1.36g          /dev/mapper/mpatha1(358)
  [takeover_rimage_0] iwi-aor--- <1.36g          /dev/mapper/mpatha1(1)
  [takeover_rimage_0] iwi-aor--- <1.36g          /dev/mapper/mpatha1(353)
  [takeover_rimage_1] iwi-aor--- <1.36g          /dev/mapper/mpathb1(358)
  [takeover_rimage_1] iwi-aor--- <1.36g          /dev/mapper/mpathb1(1)
  [takeover_rimage_1] iwi-aor--- <1.36g          /dev/mapper/mpathb1(353)
  [takeover_rimage_2] iwi-aor--- <1.36g          /dev/mapper/mpathc1(358)
  [takeover_rimage_2] iwi-aor--- <1.36g          /dev/mapper/mpathc1(1)
  [takeover_rimage_2] iwi-aor--- <1.36g          /dev/mapper/mpathc1(353)
  [takeover_rimage_3] iwi-aor--- <1.36g          /dev/mapper/mpathd1(358)
  [takeover_rimage_3] iwi-aor--- <1.36g          /dev/mapper/mpathd1(1)
  [takeover_rimage_3] iwi-aor--- <1.36g          /dev/mapper/mpathd1(353)
  [takeover_rimage_4] iwi-aor--- <1.36g          /dev/mapper/mpathe1(348)
  [takeover_rimage_4] iwi-aor--- <1.36g          /dev/mapper/mpathe1(1)
  [takeover_rimage_5] iwi-aor--- <1.36g          /dev/mapper/mpathf1(348)
  [takeover_rimage_5] iwi-aor--- <1.36g          /dev/mapper/mpathf1(1)
  [takeover_rimage_6] iwi-aor--- <1.36g          /dev/mapper/mpathg1(348)
  [takeover_rimage_6] iwi-aor--- <1.36g          /dev/mapper/mpathg1(1)
  [takeover_rmeta_0]  ewi-aor---  4.00m          /dev/mapper/mpatha1(0)
  [takeover_rmeta_1]  ewi-aor---  4.00m          /dev/mapper/mpathb1(0)
  [takeover_rmeta_2]  ewi-aor---  4.00m          /dev/mapper/mpathc1(0)
  [takeover_rmeta_3]  ewi-aor---  4.00m          /dev/mapper/mpathd1(0)
  [takeover_rmeta_4]  ewi-aor---  4.00m          /dev/mapper/mpathe1(0)
  [takeover_rmeta_5]  ewi-aor---  4.00m          /dev/mapper/mpathf1(0)
  [takeover_rmeta_6]  ewi-aor---  4.00m          /dev/mapper/mpathg1(0)


Verifying files (checkit) on mirror(s) on...
        ---- harding-02 ----
*** DATA COMPARISON ERROR [file:sumxbivgugbquklblo] ***
Corrupt regions follow - unprintable chars are represented as '.'
-----------------------------------------------------------------
corrupt bytes starting at file offset 131072
    1st 32 expected bytes:  11111111111111111111111111111111
    1st 32 actual bytes:    B:21384:writev*B:21384:writev*B:

checkit write verify failed



3.10.0-643.el7.x86_64
lvm2-2.02.171-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
lvm2-libs-2.02.171-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
lvm2-cluster-2.02.171-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-libs-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-event-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-event-libs-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017

Comment 9 Alasdair Kergon 2017-05-19 15:52:44 UTC
Please could you attach a -vvvv trace from the problem lvconvert command?  Thanks.

Comment 11 Corey Marthaler 2017-05-26 14:10:02 UTC
Created attachment 1282579 [details]
output of new corrupt test run

Comment 12 Corey Marthaler 2017-05-26 14:11:40 UTC
Created attachment 1282580 [details]
lvchange -vvvv

Comment 13 Heinz Mauelshagen 2017-06-08 14:18:07 UTC
Related to bug 1443999, which causes the superblocks reflecting the reshape state not to be written when they should be.

Comment 14 Corey Marthaler 2017-06-12 20:36:45 UTC
This issue appears fixed with the test kernel for bug 1443999. Multiple iterations of the test case listed in comment #0 now pass.

Comment 15 Corey Marthaler 2017-06-13 18:44:59 UTC
Continued testing has shown that this issue may still exist, although the timing window for hitting it may be smaller now.

3.10.0-679.el7.bz1443999.x86_64

lvm2-2.02.171-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
lvm2-libs-2.02.171-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
lvm2-cluster-2.02.171-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-1.02.140-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-libs-1.02.140-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-event-1.02.140-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-event-libs-1.02.140-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017


================================================================================
Iteration 0.23 started at Mon Jun 12 18:44:17 CDT 2017
================================================================================
Scenario striped: Convert Striped volume

********* Take over hash info for this scenario *********
* from type:    striped
* to type:      raid4
* from legs:    2
* to legs:      6
* from region:  0
* to region:    512.00k
* contiguous:   0
* snapshot:     0
******************************************************

Creating original volume on mckinley-04...
mckinley-04: lvcreate  --type striped  -i 2 -n takeover -L 4G centipede2

Placing a spacer on all raid image PVs so that expansion will have to be placed beyond
Extending raid beyond spacer
        lvextend -L +50M centipede2/takeover

Current volume device structure:
  LV       Attr       LSize   Cpy%Sync Devices
  lvol0    -wi-a-----  20.00m          /dev/mapper/mpatha1(512)
  lvol1    -wi-a-----  20.00m          /dev/nvme0n1p1(512)
  takeover -wi-a-----   4.05g          /dev/nvme0n1p1(0),/dev/mapper/mpatha1(0)
  takeover -wi-a-----   4.05g          /dev/nvme0n1p1(517),/dev/mapper/mpatha1(517)


Creating xfs on top of mirror(s) on mckinley-04...
warning: device is not properly aligned /dev/centipede2/takeover
Mounting mirrored xfs filesystems on mckinley-04...

Writing verification files (checkit) to mirror(s) on...
        ---- mckinley-04 ----

<start name="mckinley-04_takeover"  pid="32524" time="Mon Jun 12 18:44:22 2017 -0500" type="cmd" />
Sleeping 15 seconds to get some outsanding I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- mckinley-04 ----

TAKEOVER: lvconvert --yes -R 512.00k  --type raid4 centipede2/takeover
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 32.92% )
   0/1 mirror(s) are fully synced: ( 56.94% )
   0/1 mirror(s) are fully synced: ( 79.12% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec

Current volume device structure:
  LV                  Attr       LSize   Cpy%Sync Devices
  lvol0               -wi-a-----  20.00m          /dev/mapper/mpatha1(512)
  lvol1               -wi-a-----  20.00m          /dev/nvme0n1p1(512)
  takeover            rwi-aor---   4.05g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0)
  [takeover_rimage_0] iwi-aor---  <2.03g          /dev/mapper/mpathb1(1)
  [takeover_rimage_1] iwi-aor---  <2.03g          /dev/nvme0n1p1(0)
  [takeover_rimage_1] iwi-aor---  <2.03g          /dev/nvme0n1p1(517)
  [takeover_rimage_2] iwi-aor---  <2.03g          /dev/mapper/mpatha1(0)
  [takeover_rimage_2] iwi-aor---  <2.03g          /dev/mapper/mpatha1(517)
  [takeover_rmeta_0]  ewi-aor---   4.00m          /dev/mapper/mpathb1(0)
  [takeover_rmeta_1]  ewi-aor---   4.00m          /dev/nvme0n1p1(524)
  [takeover_rmeta_2]  ewi-aor---   4.00m          /dev/mapper/mpatha1(524)

Verifying files (checkit) on mirror(s) on...
        ---- mckinley-04 ----

RESHAPE: lvconvert --yes  --stripes 6 centipede2/takeover
  WARNING: Adding stripes to active and open logical volume centipede2/takeover will grow it from 1038 to 3114 extents!
Waiting until all mirror|raid volumes become fully syncd...
   0/1 mirror(s) are fully synced: ( 7.67% )
   0/1 mirror(s) are fully synced: ( 17.33% )
   0/1 mirror(s) are fully synced: ( 26.39% )
   0/1 mirror(s) are fully synced: ( 43.77% )
   1/1 mirror(s) are fully synced: ( 100.00% )
Sleeping 15 sec

Current volume device structure:
  LV                  Attr       LSize   Cpy%Sync Devices
  lvol0               -wi-a-----  20.00m          /dev/mapper/mpatha1(512)
  lvol1               -wi-a-----  20.00m          /dev/nvme0n1p1(512)
  takeover            rwi-aor---  12.16g 100.00   takeover_rimage_0(0),takeover_rimage_1(0),takeover_rimage_2(0),takeover_rimage_3(0),takeover_rimage_4(0),takeover_rimage_5(0),takeover_rimage_6(0)
  [takeover_rimage_0] iwi-aor---   2.03g          /dev/mapper/mpathb1(520)
  [takeover_rimage_0] iwi-aor---   2.03g          /dev/mapper/mpathb1(1)
  [takeover_rimage_1] iwi-aor---   2.03g          /dev/nvme0n1p1(525)
  [takeover_rimage_1] iwi-aor---   2.03g          /dev/nvme0n1p1(0)
  [takeover_rimage_1] iwi-aor---   2.03g          /dev/nvme0n1p1(517)
  [takeover_rimage_2] iwi-aor---   2.03g          /dev/mapper/mpatha1(525)
  [takeover_rimage_2] iwi-aor---   2.03g          /dev/mapper/mpatha1(0)
  [takeover_rimage_2] iwi-aor---   2.03g          /dev/mapper/mpatha1(517)
  [takeover_rimage_3] iwi-aor---   2.03g          /dev/mapper/mpathc1(520)
  [takeover_rimage_3] iwi-aor---   2.03g          /dev/mapper/mpathc1(1)
  [takeover_rimage_4] iwi-aor---   2.03g          /dev/mapper/mpathd1(520)
  [takeover_rimage_4] iwi-aor---   2.03g          /dev/mapper/mpathd1(1)
  [takeover_rimage_5] iwi-aor---   2.03g          /dev/mapper/mpathe1(520)
  [takeover_rimage_5] iwi-aor---   2.03g          /dev/mapper/mpathe1(1)
  [takeover_rimage_6] iwi-aor---   2.03g          /dev/mapper/mpathf1(520)
  [takeover_rimage_6] iwi-aor---   2.03g          /dev/mapper/mpathf1(1)
  [takeover_rmeta_0]  ewi-aor---   4.00m          /dev/mapper/mpathb1(0)
  [takeover_rmeta_1]  ewi-aor---   4.00m          /dev/nvme0n1p1(524)
  [takeover_rmeta_2]  ewi-aor---   4.00m          /dev/mapper/mpatha1(524)
  [takeover_rmeta_3]  ewi-aor---   4.00m          /dev/mapper/mpathc1(0)
  [takeover_rmeta_4]  ewi-aor---   4.00m          /dev/mapper/mpathd1(0)
  [takeover_rmeta_5]  ewi-aor---   4.00m          /dev/mapper/mpathe1(0)
  [takeover_rmeta_6]  ewi-aor---   4.00m          /dev/mapper/mpathf1(0)

Verifying files (checkit) on mirror(s) on...
        ---- mckinley-04 ----
*** DATA COMPARISON ERROR [file:dudkcqyhgxujbsmy] ***
Corrupt regions follow - unprintable chars are represented as '.'
-----------------------------------------------------------------
corrupt bytes starting at file offset 0
    1st 32 expected bytes:  PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP
    1st 32 actual bytes:    Q:75216:writev*Q:75216:writev*Q:

checkit write verify failed

Comment 18 Corey Marthaler 2017-06-15 18:32:48 UTC
Created attachment 1288155 [details]
output from different systems experiencing this issue

This bug was hit on multiple systems running the latest test kernel overnight.

3.10.0-681.el7.bz1443999a.x86_64

lvm2-2.02.171-5.el7    BUILT: Wed Jun 14 10:33:32 CDT 2017
lvm2-libs-2.02.171-5.el7    BUILT: Wed Jun 14 10:33:32 CDT 2017
lvm2-cluster-2.02.171-5.el7    BUILT: Wed Jun 14 10:33:32 CDT 2017
device-mapper-1.02.140-5.el7    BUILT: Wed Jun 14 10:33:32 CDT 2017
device-mapper-libs-1.02.140-5.el7    BUILT: Wed Jun 14 10:33:32 CDT 2017
device-mapper-event-1.02.140-5.el7    BUILT: Wed Jun 14 10:33:32 CDT 2017
device-mapper-event-libs-1.02.140-5.el7    BUILT: Wed Jun 14 10:33:32 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017

Comment 19 Jonathan Earl Brassow 2017-06-19 14:29:49 UTC
This fix is not yet in that kernel (or 682).  It will land with the fix for bug 1443999.

Comment 20 Heinz Mauelshagen 2017-06-19 15:40:40 UTC
(In reply to Jonathan Earl Brassow from comment #19)
> This fix is not yet in that kernel (or 682).  It will land with the fix for
> bug 1443999.

Note that the test kernel 3.10.0-681.el7.bz1443999a.x86_64 already has the patches in.

Comment 21 Heinz Mauelshagen 2017-06-19 16:09:30 UTC
Growing the rimages to allocate reshape space and moving that reshape space to the beginning of the rimage LVs must not be done in one step. It requires two steps, so that the active mapping can never write to wrong offsets.

We have to restrict reshaping to inactive LVs for the time being until this fix is properly designed, implemented and tested.
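
The offset problem can be illustrated with a toy model. This is only a sketch under stated assumptions and not LVM or dm-raid code: an rimage is modeled as an ordered list of (pv, start, length) segments, the extent numbers are taken from the lvs output in comment #3, the segment lengths (342 data extents plus 1 reshape extent) are inferred from the sizes there, and `data_offset` stands in for whatever offset the active mapping applies. All names are hypothetical.

```python
# Toy model only: illustrative names, not LVM or dm-raid internals.

def resolve(segments, logical_extent):
    """Map an image-relative extent to (pv, physical_extent)."""
    for pv, start, length in segments:
        if logical_extent < length:
            return pv, start + logical_extent
        logical_extent -= length
    raise IndexError("extent beyond end of image")


def data_location(segments, data_offset, extent):
    """Where the mapping puts data extent `extent` for a given data offset."""
    return resolve(segments, data_offset + extent)


DATA = ("sde1", 0, 342)     # existing data extents (length assumed)
RESHAPE = ("sde1", 343, 1)  # newly allocated reshape space (length assumed)

before = [DATA]
grown = [DATA, RESHAPE]      # step 1: grow only, data stays where it was
reordered = [RESHAPE, DATA]  # step 2: reshape space moved to the front

# Step 1 is harmless for a mapping that still uses data offset 0.
assert data_location(grown, 0, 0) == data_location(before, 0, 0)

# After reordering, a mapping still using offset 0 hits the reshape space;
# only a mapping that has switched to offset 1 reaches the real data.
stale = data_location(reordered, 0, 0)
updated = data_location(reordered, 1, 0)
print("stale mapping writes to", stale, "updated mapping writes to", updated)
```

In this model, doing the grow and the reorder as one change leaves a window in which the active mapping resolves data extent 0 to the reshape space, which is the kind of wrong offset described above.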

Comment 22 Heinz Mauelshagen 2017-06-19 20:57:22 UTC
(In reply to Heinz Mauelshagen from comment #21)
> Growing the size of the rimages to allocate reshape space and reordering the
> reshape space to the beginning of the rimage LVs is not allowed in one step
> but requires 2 to avoid the active mapping to be able to write to false
> offsets.
> 
> We have to restrict reshaping to inactive LVs for the time being until this
> fix is properly designed, implemented and tested.

Upstream commit 9e9163618ab36a6b046b6380d6ef58429b219ef8 disables reshaping until we have this fix.

Comment 23 Heinz Mauelshagen 2017-06-19 21:04:11 UTC
(In reply to Heinz Mauelshagen from comment #22)
> (In reply to Heinz Mauelshagen from comment #21)
> > Growing the size of the rimages to allocate reshape space and reordering the
> > reshape space to the beginning of the rimage LVs is not allowed in one step
> > but requires 2 to avoid the active mapping to be able to write to false
> > offsets.
> > 
> > We have to restrict reshaping to inactive LVs for the time being until this
> > fix is properly designed, implemented and tested.
> 
> Upstream commit 9e9163618ab36a6b046b6380d6ef58429b219ef8 disables reshaping
> until we got this fix.

The commit still allows reshapes on closed LVs (e.g. unmounted ones).

Comment 25 Corey Marthaler 2017-06-23 21:37:18 UTC
Marking "verified" with the caveat that this bug still exists; it is just no longer possible to reshape while the raid device is open.

3.10.0-685.el7.x86_64
lvm2-2.02.171-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
lvm2-libs-2.02.171-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
lvm2-cluster-2.02.171-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-1.02.140-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-libs-1.02.140-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-event-1.02.140-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-event-libs-1.02.140-7.el7    BUILT: Thu Jun 22 08:35:15 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017



host-073: lvcreate  --type raid5_n -R 2048.00k -i 4 -n takeover -L 4G centipede2

Creating xfs on top of mirror(s) on host-073...
Mounting mirrored xfs filesystems on host-073...

[root@host-073 ~]# lvs -o +segtype
  LV       VG            Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Type
  takeover centipede2    rwi-aor---   4.06g                                    100.00           raid5_n

[root@host-073 ~]#  lvconvert --yes --type raid5_ls centipede2/takeover
  Using default stripesize 64.00 KiB.
  Reshape is only supported when centipede2/takeover is not in use (e.g. unmount filesystem).

[root@host-073 ~]#  lvconvert --yes --stripes 5 centipede2/takeover
  Using default stripesize 64.00 KiB.
  Reshape is only supported when centipede2/takeover is not in use (e.g. unmount filesystem).
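
For test scripts that need to predict this refusal, the open state can be read from the Attr column. A minimal sketch, assuming the documented lvs attribute layout in which the sixth lv_attr character is "o" when the device is open; the function names are hypothetical and the check only mirrors the restriction described above, it does not call lvm2.

```python
def lv_is_open(lv_attr):
    """Return True if the sixth lv_attr character marks the device as open."""
    if len(lv_attr) < 6:
        raise ValueError("lv_attr string too short: %r" % lv_attr)
    return lv_attr[5] == "o"


def reshape_expected_to_be_refused(lv_attr):
    """Under the current restriction, reshaping an open LV is refused."""
    return lv_is_open(lv_attr)


# Attr values copied from the lvs output in this bug
for attr in ("rwi-aor---", "rwi-a-r---"):
    print(attr, "open" if lv_is_open(attr) else "closed")
```

The first value is the mounted volume shown above; the second is the same volume from comment #8 before the filesystem was created.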

Comment 26 errata-xmlrpc 2017-08-01 21:52:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2222