Bug 1400301

Summary: lvm reports bogus sync percentage on large raid1/4/5/6/10 LVs >= 1PiB
Product: Red Hat Enterprise Linux 7
Component: lvm2
lvm2 sub component: Mirroring and RAID
Version: 7.4
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Reporter: Heinz Mauelshagen <heinzm>
Assignee: LVM and device-mapper development team <lvm-team>
QA Contact: cluster-qe <cluster-qe>
CC: agk, cmarthal, heinzm, jbrassow, msnitzer, prajnoha, prockai, zkabelac
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Type: Bug
Fixed In Version: lvm2-2.02.169-1.el7
Last Closed: 2017-08-01 21:49:49 UTC

Description Heinz Mauelshagen 2016-11-30 20:41:57 UTC
Description of problem:
On RaidLVs >= 1PiB created with --nosync, lvs erroneously reports
a sync percentage such as 15.63 instead of 100.00.

Version-Release number of selected component (if applicable):
lvm2 2.02.168(2)

How reproducible:
Always with >= 1PiB size

Steps to Reproduce:
1. lvcreate -L1P -nfoo --nosync -y vg
2. lvs -oname,vg_name,size,sync_percent vg/foo

Actual results:
lvs shows: "foo vg 1.00p 15.63"

Expected results:
lvs should show: "foo vg 1.00p 100.00"
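
For cross-checking, the raw counters behind this percentage can be read from the kernel's raid status line, which is where lvs gets the in-sync/total sector fraction that feeds Cpy%Sync. A sketch (vg-foo is the usual VG-LV device-mapper name for vg/foo, not output taken from this report):

  # raid target status; the "cur/total" field is the in-sync/total sector count
  dmsetup status vg-foo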

Additional info:

Comment 1 Heinz Mauelshagen 2016-11-30 23:03:31 UTC
Upstream commit 745250073cf9

Comment 4 Corey Marthaler 2017-05-23 18:32:17 UTC
Nothing has changed here since the QA ack was granted. Please provide unit test results.


[root@harding-03 ~]# lvcreate --type raid1 -m 1 -n large_raid -L 1P --nosync vg
  WARNING: Device /dev/mapper/mpatha1 has size of 524281212 sectors which is smaller than corresponding PV size of 2199023255552 sectors. Was device resized?
  WARNING: Device /dev/mapper/mpathb1 has size of 524281212 sectors which is smaller than corresponding PV size of 2199023255552 sectors. Was device resized?
  WARNING: Device /dev/mapper/mpathc1 has size of 524281212 sectors which is smaller than corresponding PV size of 2199023255552 sectors. Was device resized?
  WARNING: Device /dev/mapper/mpathd1 has size of 524281212 sectors which is smaller than corresponding PV size of 2199023255552 sectors. Was device resized?
  WARNING: Device /dev/mapper/mpathe1 has size of 524281212 sectors which is smaller than corresponding PV size of 2199023255552 sectors. Was device resized?
  WARNING: Device /dev/mapper/mpathf1 has size of 524281212 sectors which is smaller than corresponding PV size of 2199023255552 sectors. Was device resized?
  WARNING: Device /dev/mapper/mpathg1 has size of 524281212 sectors which is smaller than corresponding PV size of 2199023255552 sectors. Was device resized?
  WARNING: Device /dev/mapper/mpathh1 has size of 524281212 sectors which is smaller than corresponding PV size of 2199023255552 sectors. Was device resized?
  One or more devices used as PVs in VG vg have changed sizes.
  WARNING: New raid1 won't be synchronised. Don't read what you didn't write!
  device-mapper: resume ioctl on  (253:19) failed: Invalid argument
  Unable to resume vg-large_raid_rimage_0 (253:19)
  device-mapper: resume ioctl on  (253:20) failed: Invalid argument
  Unable to resume vg-large_raid_rimage_1 (253:20)
  Failed to activate new LV.


[529761.757381] device-mapper: table: 253:19: dm-3 too small for target: start=73728, len=2199023181824, dev_size=524281212
[529761.790419] device-mapper: table: 253:20: dm-7 too small for target: start=73728, len=2199023181824, dev_size=524281212


3.10.0-651.el7.x86_64
lvm2-2.02.171-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
lvm2-libs-2.02.171-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
lvm2-cluster-2.02.171-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-libs-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-event-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-event-libs-1.02.140-1.el7    BUILT: Wed May  3 07:05:13 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017

Comment 5 Alasdair Kergon 2017-06-07 18:14:09 UTC
Works for me.

Use sparse devices e.g.

lvcreate -V1PB vg1 -L100M

   - an LV that pretends to be 1PB in size but can hold at most 100MB of written content

Make PVs on the sparse LVs
pvcreate /dev/vg1/lvol0

Put them into a VG and make the LV --nosync.
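
Pulling those steps together, a minimal sketch (VG names vg1/vg2 and the 1.5P sparse size are illustrative; the sparse PVs are made a little larger than 1P so the raid legs plus rmeta fit):

  # sparse LVs that present 1.5PiB each, backed by small thin pools
  lvcreate -V1.5P -L100M -n sparse1 vg1
  lvcreate -V1.5P -L100M -n sparse2 vg1

  # use them as PVs for a stacked VG
  pvcreate /dev/vg1/sparse1 /dev/vg1/sparse2
  vgcreate vg2 /dev/vg1/sparse1 /dev/vg1/sparse2

  # create the 1PiB raid1 LV with --nosync and check the reported percentage
  lvcreate --type raid1 -m 1 --nosync -L1P -n lvol0 vg2
  lvs -oname,vg_name,size,sync_percent vg2/lvol0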

  LV    VG  Attr       LSize  Pool Origin          Data%  Meta%  Move Log Cpy%Sync Convert
                 
Before:
  lvol0 vg2  Rwa-a-r---  1.00p                                             0.00            

After:                 
  lvol0 vg2 Rwa-a-r---  1.00p                                             100.00

Comment 6 Alasdair Kergon 2017-06-07 18:19:28 UTC
With 1G instead of 1P we get the same output before and after:

  lvol0 vg2 Rwa-a-r---  1.00g                                             100.00

Comment 7 Corey Marthaler 2017-06-07 21:56:53 UTC
The --nosync case has now been verified to work with the latest rpms. That said, the title doesn't mention that this is a --nosync-specific bug; for a regular raid, the reported sync percent still looks bogus (0.00).


# NOSYNC

[root@host-073 ~]# lvs -oname,vg_name,size,sync_percent raid_sanity_stack/nosync
  LV     VG                LSize Cpy%Sync
  nosync raid_sanity_stack 1.00p 100.00

raid_sanity_stack-nosync: 0 2199023255552 raid raid1 2 AA 2199023255552/2199023255552 idle 0 0 -


# REGULAR

[root@host-073 ~]# lvs -oname,vg_name,size,sync_percent raid_sanity_stack/regular
  LV      VG                LSize Cpy%Sync
  regular raid_sanity_stack 1.00p 0.00

raid_sanity_stack-regular: 0 2199023255552 raid raid1 2 aa 38604352/2199023255552 resync 0 0 -



# I used 20g real lvs to back the 1P virt volumes so that I can watch the sync process for a while before the pools eventually fill up due to the syncing:

lvcreate  -n pv1 -V1PB -L2G raid_sanity
  WARNING: Sum of all thin volume sizes (1.00 PiB) exceeds the size of thin pool raid_sanity/lvol1 and the size of whole volume group (104.96 GiB)!
pvcreate /dev/raid_sanity/pv1
lvcreate  -n pv2 -V1PB -L2G raid_sanity
  WARNING: Sum of all thin volume sizes (2.00 PiB) exceeds the size of thin pools and the size of whole volume group (104.96 GiB)!
pvcreate /dev/raid_sanity/pv2
lvcreate  -n pv3 -V1PB -L2G raid_sanity
  WARNING: Sum of all thin volume sizes (3.00 PiB) exceeds the size of thin pools and the size of whole volume group (104.96 GiB)!
pvcreate /dev/raid_sanity/pv3
lvcreate  -n pv4 -V1PB -L2G raid_sanity
  WARNING: Sum of all thin volume sizes (4.00 PiB) exceeds the size of thin pools and the size of whole volume group (104.96 GiB)!
pvcreate /dev/raid_sanity/pv4
lvcreate  -n pv5 -V1PB -L2G raid_sanity
  WARNING: Sum of all thin volume sizes (5.00 PiB) exceeds the size of thin pools and the size of whole volume group (104.96 GiB)!
pvcreate /dev/raid_sanity/pv5
vgcreate raid_sanity_stack /dev/raid_sanity/pv1 /dev/raid_sanity/pv2 /dev/raid_sanity/pv3 /dev/raid_sanity/pv4 /dev/raid_sanity/pv5
Creating nosync...
lvcreate --type raid1 --nosync -L 1P -n nosync raid_sanity_stack
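
The "regular" (resyncing) LV compared against above would be created the same way, just without --nosync (a sketch; the exact command is not in the transcript):

  lvcreate --type raid1 -L 1P -n regular raid_sanity_stack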


3.10.0-672.el7.x86_64

lvm2-2.02.171-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
lvm2-libs-2.02.171-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
lvm2-cluster-2.02.171-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-1.02.140-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-libs-1.02.140-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-event-1.02.140-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-event-libs-1.02.140-4.el7    BUILT: Wed Jun  7 09:16:17 CDT 2017
device-mapper-persistent-data-0.7.0-0.1.rc6.el7    BUILT: Mon Mar 27 10:15:46 CDT 2017

Comment 8 Corey Marthaler 2017-06-07 22:01:41 UTC
scale=10
38604352/2199023255552
.0000175552

That tiny fraction is most likely why the regular raid's sync percent still shows zero. Marking verified.
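
Spelled out as a percentage (a quick bc check using the numbers from the "regular" status line above):

  echo 'scale=6; 100 * 38604352 / 2199023255552' | bc
  # prints .001755, i.e. roughly 0.0018%, which rounds to 0.00 at two decimal places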

Comment 9 errata-xmlrpc 2017-08-01 21:49:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2222