Bug 1607774

Summary: Target files for 'qemu-img convert' do not support thin provisioning with iscsi/nfs backend
Product: Red Hat Enterprise Linux 7
Reporter: Tingting Mao <timao>
Component: qemu-kvm-rhev
Assignee: Fam Zheng <famz>
Status: CLOSED ERRATA
QA Contact: Tingting Mao <timao>
Severity: medium
Docs Contact:
Priority: high
Version: 7.6
CC: chaoyang, coli, famz, juzhang, michen, mrezanin, ngu, pingl, timao, virt-maint
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.12.0-10.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1623082
Environment:
Last Closed: 2018-11-01 11:13:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1623082, 1661411

Description Tingting Mao 2018-07-24 09:01:22 UTC
Description of problem:
Target files for 'qemu-img convert' do not support thin provisioning with the iSCSI backend.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.12.0-7.el7
kernel-3.10.0-918.el7

How reproducible:
100%

Steps to Reproduce:
1. Log in to the iSCSI target and create a PV and a VG.
# vi /etc/iscsi/initiatorname.iscsi   (update the value of InitiatorName)
# systemctl restart iscsid
# iscsiadm -m discovery -t st -p $iscsi_server_ip
# iscsiadm -m node -T iqn.2018-05.com.example.xxx -p $iscsi_server_ip:3260 -l
# pvcreate /dev/sdb
# vgcreate vgtest /dev/sdb
2. Create thin-provisioned LVs
# lvcreate -L 5G -T vgtest/my_thin
# lvcreate -T /dev/vgtest/my_thin -V 1G -n thin_test01
  Logical volume "thin_test01" created.
# lvcreate -T /dev/vgtest/my_thin -V 1G -n thin_test02
  Logical volume "thin_test02" created.
# lvcreate -T /dev/vgtest/my_thin -V 1G -n thin_test.qcow2
  Logical volume "thin_test.qcow2" created.
3. Check the LV info
[root@lenovo-sr630-01 test]# lvs
  LV              VG                   Attr       LSize    Pool    Origin Data%  Meta%  Move Log Cpy%Sync Convert
  thin_test.qcow2 vgtest               Vwi-a-tz--    1.00g my_thin        0.00  ------------> no data                                  
  thin_test01     vgtest               Vwi-a-tz--    1.00g my_thin        0.00  ------------> no data                                  
  thin_test02     vgtest               Vwi-a-tz--    1.00g my_thin        0.00  ------------> no data            
4. Create a raw image on thin_test01
# qemu-img create -f raw /dev/vgtest/thin_test01 1G
Formatting '/dev/vgtest/thin_test01', fmt=raw size=1073741824
# lvs
  LV              VG                   Attr       LSize    Pool    Origin Data%  Meta%  Move Log Cpy%Sync Convert
  thin_test.qcow2 vgtest               Vwi-a-tz--    1.00g my_thin        0.00                                  
  thin_test01     vgtest               Vwi-a-tz--    1.00g my_thin        0.01                                  
  thin_test02     vgtest               Vwi-a-tz--    1.00g my_thin        0.00        
5. Convert to raw and qcow2 targets
# qemu-img convert -f raw -O raw /dev/vgtest/thin_test01 /dev/vgtest/thin_test02 -n
[root@lenovo-sr630-01 test]# lvs
  LV              VG                   Attr       LSize    Pool    Origin Data%  Meta%  Move Log Cpy%Sync Convert
  thin_test.qcow2 vgtest               Vwi-a-tz--    1.00g my_thin        0.00                                  
  thin_test01     vgtest               Vwi-a-tz--    1.00g my_thin        0.01                                  
  thin_test02     vgtest               Vwi-a-tz--    1.00g my_thin        100.00 ---------------> written data                        
# qemu-img convert -f raw -O qcow2 /dev/vgtest/thin_test01 /dev/vgtest/thin_test.qcow2 -p
    (100.00/100%)
# lvs
  LV              VG                   Attr       LSize    Pool    Origin Data%  Meta%  Move Log Cpy%Sync Convert
  thin_test.qcow2 vgtest               Vwi-a-tz--    1.00g my_thin        100.00 --------------> written data                                
  thin_test01     vgtest               Vwi-a-tz--    1.00g my_thin        0.01                                  
  thin_test02     vgtest               Vwi-a-tz--    1.00g my_thin        100.00



Actual results:
Data was written to the target files.

Expected results:
QEMU should detect that the source blocks are unallocated and write no data to the target files.
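
A minimal sketch of how the Data% check above could be scripted, assuming the input comes from `lvs --noheadings -o lv_name,data_percent vgtest`. The helper names and the 1% threshold are hypothetical and not part of the original report.

```shell
# Print the Data% value for one LV.
# Expects `lvs --noheadings -o lv_name,data_percent <vg>` output on stdin.
lv_data_percent() {
    awk -v lv="$1" '$1 == lv { print $2 }'
}

# Succeed only if the LV's Data% is below a limit.
# The default limit of 1 (percent) is an assumption made for this sketch.
lv_is_sparse() {
    lv="$1"
    limit="${2:-1}"
    pct=$(lv_data_percent "$lv")
    [ -n "$pct" ] && awk -v p="$pct" -v l="$limit" 'BEGIN { exit !(p + 0 < l + 0) }'
}
```

For example, `lvs --noheadings -o lv_name,data_percent vgtest | lv_is_sparse thin_test02` would fail on the affected build, where the target reports 100.00.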

Additional info:
On RHEL 7.5, a raw target behaves the same as on RHEL 7.6 above. With a qcow2 target, however, QEMU writes only metadata, as shown below.
# qemu-img --version
qemu-img version 2.10.0(qemu-kvm-rhev-2.10.0-21.el7_5.1)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
[root@hp-dl385g8-02 ~]# qemu-img convert -f raw -O qcow2 /dev/vg/thin_test01 /dev/vg/thin_test.qcow2 -p
    (100.00/100%)
[root@hp-dl385g8-02 ~]# lvs
  LV              VG                 Attr       LSize   Pool    Origin Data%  Meta%  Move Log Cpy%Sync Convert
                        
  thin_test.qcow2 vg                 Vwi-a-tz--   1.00g my_thin        0.02 -----> just metadata                                   
  thin_test01     vg                 Vwi-a-tz--   1.00g my_thin        0.00

Comment 4 Fam Zheng 2018-08-01 06:31:49 UTC
After more discussion of this matter upstream, we have to leave the decision of whether to enable copy offloading to the user or upper layers. I'll backport the patch:

commit e11ce12f5eb26438419e486a3ae2c9bb58a23c1f
Author: Fam Zheng <famz>
Date:   Fri Jul 27 11:34:01 2018 +0800

    qemu-img: Add -C option for convert with copy offloading

    Signed-off-by: Fam Zheng <famz>
    Signed-off-by: Kevin Wolf <kwolf>
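
A minimal sketch of what leaving the decision to the user could look like in a wrapper script: copy offloading is requested with `-C` only when the caller opts in. The function name and the USE_COPY_OFFLOAD variable are hypothetical; only the `-C` option itself comes from the patch above. The function prints the command line instead of running it, so it can be inspected first.

```shell
# Build a `qemu-img convert` command line for a raw source image.
# Copy offloading (-C) is opt-in via USE_COPY_OFFLOAD=1, a variable
# invented for this sketch.
convert_cmd() {
    src="$1"
    dst="$2"
    fmt="${3:-qcow2}"
    set -- qemu-img convert -f raw -O "$fmt"
    if [ "${USE_COPY_OFFLOAD:-0}" = "1" ]; then
        set -- "$@" -C
    fi
    set -- "$@" "$src" "$dst"
    printf '%s\n' "$*"
}
```

Without the variable set, the generated command omits `-C`, so thin-provisioned targets stay sparse by default.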

Comment 6 Miroslav Rezanina 2018-08-10 10:47:00 UTC
Fix included in qemu-kvm-rhev-2.12.0-10.el7

Comment 8 Tingting Mao 2018-08-17 03:25:11 UTC
Scenario 1 (NFS backend)

NFS mount info:
# nfsstat -m | grep 10.66.11.19
/home/nfs_share from 10.66.11.19:/home/nfs_share
 Flags:    rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.66.11.19,mountvers=3,mountport=20048,mountproto=udp,local_lock=none,addr=10.66.11.19

1. Convert to raw file
# qemu-img create -f raw test.img 1G
Formatting 'test.img', fmt=raw size=1073741824
# qemu-io -c 'write -P 1 0 512M' test.img
WARNING: Image format was not specified for 'test.img' and probing guessed raw.
         Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
         Specify the 'raw' format explicitly to remove the restrictions.
wrote 536870912/536870912 bytes at offset 0
512 MiB, 1 ops; 0:00:11.82 (43.308 MiB/sec and 0.0846 ops/sec)
# qemu-img info test.img
image: test.img
file format: raw
virtual size: 1.0G (1073741824 bytes)
disk size: 512M
# time strace -f -o convert_without_offload.log qemu-img convert -f raw -O raw test.img convert.img -p
    (100.00/100%)

real    0m19.711s
user    0m0.247s
sys    0m2.107s
# qemu-img info convert.img
image: convert.img
file format: raw
virtual size: 1.0G (1073741824 bytes)
disk size: 512M -------------------------> expected

2. Convert to qcow2 file
# time strace -f -o convert_without_offload.log qemu-img convert -f raw -O qcow2 test.img convert.qcow2 -p
    (100.00/100%)

real    0m20.244s
user    0m0.279s
sys    0m2.275s
# qemu-img info convert.qcow2
image: convert.qcow2
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 512M  -------------------------> expected
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false


Scenario 2 (iSCSI backend)
1. Log in to the iSCSI target and create a PV and a VG.
# vi /etc/iscsi/initiatorname.iscsi   (update the value of InitiatorName)
# systemctl restart iscsid
# iscsiadm -m discovery -t st -p $iscsi_server_ip
# iscsiadm -m node -T iqn.2018-05.com.example.xxx -p $iscsi_server_ip:3260 -l
# pvcreate /dev/sdb
# vgcreate vgtest /dev/sdb

2. Create thin-provisioned LVs
# lvcreate -L 20G -T vgtest/my_thin
# lvcreate -T /dev/vgtest/my_thin -V 1G -n thin_lv01
# lvcreate -T /dev/vgtest/my_thin -V 1G -n thin_lv02
# lvcreate -T /dev/vgtest/my_thin -V 1G -n thin_lv03

3. Create raw file on thin_lv01 and write data to it
# qemu-img create -f raw /dev/vgtest/thin_lv01 1G
# qemu-io -c 'write -P 1 0 512M' /dev/vgtest/thin_lv01
# lvs
  thin_lv01 vgtest               Vwi-a-tz--    1.00g my_thin        50.00                                  
  thin_lv02 vgtest               Vwi-a-tz--    1.00g my_thin        0.00                                   
  thin_lv03 vgtest               Vwi-a-tz--    1.00g my_thin        0.00 

4. Convert to raw and qcow2 files
# qemu-img convert -f raw -O raw /dev/vgtest/thin_lv01 /dev/vgtest/thin_lv02 -p
    (100.00/100%)
# qemu-img convert -f raw -O qcow2 /dev/vgtest/thin_lv01 /dev/vgtest/thin_lv03 -p
    (100.00/100%)

5. Check the lvs info
  thin_lv01 vgtest               Vwi-a-tz--    1.00g my_thin        50.00                                  
  thin_lv02 vgtest               Vwi-a-tz--    1.00g my_thin        100.00                                 
  thin_lv03 vgtest               Vwi-a-tz--    1.00g my_thin        50.03 -------> expected


Scenario 3 (compare convert with and without copy offloading)
1. For NFS backend

1.1 Create raw file and write data to it
# qemu-img create -f raw test.img 1G
Formatting 'test.img', fmt=raw size=1073741824
# qemu-io -c 'write -P 1 0 128M' test.img -f raw
wrote 134217728/134217728 bytes at offset 0
128 MiB, 1 ops; 0:00:02.94 (43.471 MiB/sec and 0.3396 ops/sec)

1.2 Convert without copy offload
# time qemu-img convert -f raw -O qcow2 test.img target.qcow2 -p
    (100.00/100%)

real    0m12.505s
user    0m0.163s
sys    0m1.277s

1.3 Convert with copy offload
# time strace -e trace=copy_file_range -f qemu-img convert -f raw -O qcow2 test.img target.qcow2 -C -p
strace: Process 74764 attached
[pid 74764] copy_file_range(10, [0], 12, [327680], 2097152, 0) = 2097152
[pid 74764] copy_file_range(10, [2097152], 12, [2424832], 2097152, 0) = 2097152
…...
…...
[pid 74764] copy_file_range(10, [1067450368], 12, [1067843584], 2097152, 0) = 2097152
[pid 74764] copy_file_range(10, [1069547520], 12, [1069940736], 2097152, 0) = 2097152
[pid 74764] copy_file_range(10, [1071644672], 12, [1072037888], 2097152, 0) = 2097152
    (100.00/100%)
[pid 74764] +++ exited with 0 +++
strace: Process 74786 attached
[pid 74786] +++ exited with 0 +++
+++ exited with 0 +++

real    0m25.565s
user    0m0.122s
sys    0m2.754s
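
A minimal sketch for summarizing an strace log like the one above, assuming the output was captured to a file or pipe. It counts the successful copy_file_range() calls and sums the bytes they returned; the function name is hypothetical.

```shell
# Read strace output on stdin and report how many copy_file_range()
# calls completed and how many bytes they copied in total.
# Only lines ending in " = <number>" are counted, so interrupted or
# failed calls are ignored.
sum_copy_file_range() {
    awk '/copy_file_range\(/ && / = [0-9]+$/ { total += $NF; calls++ }
         END { printf "%d calls, %d bytes\n", calls, total }'
}
```

One possible use is `strace -e trace=copy_file_range -f -o offload.log qemu-img convert ...` followed by `sum_copy_file_range < offload.log`, where offload.log is a placeholder file name.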

2. For iSCSI backend (local + LVs)

2.1 Create raw file and write data to it
# qemu-img create -f raw /dev/vgtest/my_thin 1G
Formatting '/dev/vgtest/my_thin', fmt=raw size=1073741824
# qemu-io -c 'write -P 1 0 128M' /dev/vgtest/thin_lv01 -f raw
wrote 134217728/134217728 bytes at offset 0
128 MiB, 1 ops; 0:00:07.82 (16.351 MiB/sec and 0.1277 ops/sec)

2.2 Convert without copy offload
# time qemu-img convert -f raw -O qcow2 /dev/vgtest/thin_lv01 /dev/vgtest/thin_lv02 -p
    (100.00/100%)

real    0m12.048s
user    0m0.142s
sys    0m1.240s

2.3 Convert with copy offload
# time strace -e trace=copy_file_range -f qemu-img convert -f raw -O qcow2 /dev/vgtest/thin_lv01 /dev/vgtest/thin_lv03 -C -p
strace: Process 76023 attached
[pid 76023] copy_file_range(10, [0], 12, [327680], 2097152, 0) = 2097152
[pid 76023] copy_file_range(10, [2097152], 12, [2424832], 2097152, 0) = 2097152
[pid 76023] copy_file_range(10, [4194304], 12, [4521984], 2097152, 0) = 2097152
[pid 76023] copy_file_range(10, [6291456], 12, [6619136], 2097152, 0) = 2097152
[pid 76023] copy_file_range(10, [8388608], 12, [8716288], 2097152, 0) = 2097152
    (2.34/100%)
[pid 76023] copy_file_range(10, [10485760], 12, [10813440], 2097152, 0) = 2097152
……
……
[pid 76023] copy_file_range(10, [1071644672], 12, [1072037888], 2097152, 0) = 2097152
    (100.00/100%)
[pid 76023] +++ exited with 0 +++
+++ exited with 0 +++

real    0m46.274s
user    0m0.105s
sys    0m2.348s

3. For libiscsi backend (iscsi://)

3.1 Create raw file and write data to it
# qemu-img create -f raw iscsi://10.66.11.19/iqn.2018-08.com.example:t1/0 1G
Formatting 'iscsi://10.66.11.19/iqn.2018-08.com.example:t1/0', fmt=raw size=1073741824
# qemu-io -c 'write 0 128M' iscsi://10.66.11.19/iqn.2018-08.com.example:t1/0
WARNING: Image format was not specified for 'json:{"lun": "0", "portal": "10.66.11.19", "driver": "iscsi", "transport": "tcp", "target": "iqn.2018-08.com.example:t1"}' and probing guessed raw.
         Automatically detecting the format is dangerous for raw images, write operations on block 0 will be restricted.
         Specify the 'raw' format explicitly to remove the restrictions.
wrote 134217728/134217728 bytes at offset 0
128 MiB, 1 ops; 0:00:04.63 (27.634 MiB/sec and 0.2159 ops/sec)

3.2 Convert without copy offload
# time qemu-img convert -f raw -O qcow2 iscsi://10.66.11.19/iqn.2018-08.com.example:t1/0 iscsi://10.66.11.19/iqn.2018-08.com.example:t1/2 -p
    (100.00/100%)

real    0m15.338s
user    0m0.735s
sys    0m2.160s

3.3 Convert with copy offload
# time strace -e trace=copy_file_range -f qemu-img convert -f raw -O qcow2 iscsi://10.66.11.19/iqn.2018-08.com.example:t1/0 iscsi://10.66.11.19/iqn.2018-08.com.example:t1/1 -C -p
    (100.00/100%)
+++ exited with 0 +++

real    0m2.846s
user    0m0.107s
sys    0m0.318s
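
A minimal sketch for comparing the `real` times recorded in this scenario (for example 0m15.338s without offload and 0m2.846s with offload for libiscsi). It converts the `time` output format to seconds and prints the ratio. Both function names are hypothetical, and the sketch assumes the `NmS.SSSs` format shown above and a non-zero second argument.

```shell
# Convert a bash `time` value such as 0m25.565s into seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print baseline time divided by offload time.
# A value above 1 means the run with copy offload was faster.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}
```

Calling `speedup 0m12.505s 0m25.565s` for the NFS case yields a value below 1, which matches the slower offloaded run seen above.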


Scenario 4 (qemu-iotests case)
1. Get the source code
# brew download-build --rpm --arch=x86_64 qemu-kvm-rhev-2.12.0-10.el7.src.rpm
# rpm -ivhf qemu-kvm-rhev-2.12.0-10.el7.src.rpm
# rpmbuild -bp /root/rpmbuild/SPECS/qemu-kvm.spec --nodeps 

2. Configure the environment
# cd /root/rpmbuild/BUILD/qemu-2.12.0/
# ./configure 
# export QEMU_PROG=/usr/libexec/qemu-kvm
# export QEMU_IMG_PROG=/usr/bin/qemu-img
# export QEMU_IO_PROG=/usr/bin/qemu-io
# export QEMU_NBD_PROG=/usr/bin/qemu-nbd

3. Run test case 082
# cd tests/qemu-iotests/
# ./check -qcow2 082
QEMU          -- "/usr/libexec/qemu-kvm" -nodefaults -machine accel=qtest
QEMU_IMG      -- "/usr/bin/qemu-img"
QEMU_IO       -- "/usr/bin/qemu-io"  --cache writeback -f qcow2
QEMU_NBD      -- "/usr/bin/qemu-nbd"
IMGFMT        -- qcow2 (compat=1.1)
IMGPROTO      -- file
PLATFORM      -- Linux/x86_64 lenovo-sr630-01 3.10.0-931.el7.x86_64
TEST_DIR      -- /root/rpmbuild/BUILD/qemu-2.12.0/tests/qemu-iotests/scratch
SOCKET_SCM_HELPER -- /root/rpmbuild/BUILD/qemu-2.12.0/tests/qemu-iotests/socket_scm_helper

082        
Passed all 1 tests

Comment 9 Tingting Mao 2018-08-17 03:30:23 UTC
Hi Fam,

I verified the bug with the scenarios in comment 8. Could you please check whether they are sufficient?
For "Scenario 3 (compare convert with and without copy offloading)" in particular, could you please check whether the results are okay?

Thanks in advance.

Comment 10 Fam Zheng 2018-08-17 06:35:54 UTC
Yes. For iscsi:// you won't see copy_file_range calls since the operation is instead implemented in libiscsi w/o any special host syscall.

This bug can be verified.
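
A minimal sketch of the point above for test scripts: only expect copy_file_range() in strace output when the image is not accessed through an iscsi:// URL. The function name is hypothetical, and treating every other path as a host file or block device is an assumption of this sketch, not a statement about other QEMU protocol drivers.

```shell
# Succeed if strace is expected to show copy_file_range() calls for
# this image path. iscsi:// URLs are handled inside libiscsi, so no
# host syscall is visible for them. All other paths are assumed to be
# host files or block devices (an assumption of this sketch).
expect_copy_file_range() {
    case "$1" in
        iscsi://*) return 1 ;;
        *) return 0 ;;
    esac
}
```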

Comment 11 Tingting Mao 2018-08-17 08:12:49 UTC
Thanks for the information, Fam. Setting the bug as verified.

Comment 12 errata-xmlrpc 2018-11-01 11:13:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443