Bug 1336705

Summary: Drive mirror with option granularity fail
Product: Red Hat Enterprise Linux 7 Reporter: Yang Yang <yanyang>
Component: qemu-kvm-rhev Assignee: John Snow <jsnow>
Status: CLOSED ERRATA QA Contact: Qianqian Zhu <qizhu>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.3 CC: chayang, huding, juzhang, knoel, mrezanin, pezhang, qizhu, virt-maint, yalzhang
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.6.0-13.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-07 21:10:07 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Yang Yang 2016-05-17 09:10:03 UTC
Description of problem:
Drive mirror with the granularity option fails on qemu-2.6. It cannot be reproduced on qemu-kvm-rhev-2.5.0-4.el7.x86_64.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.6.0-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a mirror with the granularity option:
{ "execute": "drive-mirror", "arguments": { "device": "drive-virtio-blk0","target": "mirror0", "sync": "full","format": "raw", "mode": "absolute-paths","granularity":8192} }
{"return": {}}

{"timestamp": {"seconds": 1463468322, "microseconds": 946118}, "event": "BLOCK_JOB_ERROR", "data": {"device": "drive-virtio-blk0", "operation": "write", "action": "report"}}

{"timestamp": {"seconds": 1463468323, "microseconds": 75441}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-blk0", "len": 3408723968, "offset": 10551296, "speed": 0, "type": "mirror", "error": "Invalid argument"}}

{"timestamp": {"seconds": 1463468423, "microseconds": 303307}, "event": "SHUTDOWN"}
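The failure above is visible only in the event stream: the command itself returns {"return": {}}, and the error arrives later in BLOCK_JOB_COMPLETED. As a rough sketch (the helper name mirror_outcome is hypothetical and not part of QEMU), a test harness could classify a captured QMP log like this, relying on the "error" field being present only when the job fails:

```python
import json


def mirror_outcome(lines):
    """Classify the first finished mirror job found in a QMP log.

    Returns ("failed", message), ("ok", None), or ("pending", None)
    when no BLOCK_JOB_COMPLETED event for a mirror job is present.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        # Command replies such as {"return": {}} carry no "event" key.
        if msg.get("event") != "BLOCK_JOB_COMPLETED":
            continue
        data = msg.get("data", {})
        if data.get("type") != "mirror":
            continue
        if "error" in data:
            return ("failed", data["error"])
        return ("ok", None)
    return ("pending", None)


# Log lines copied from the reproduction steps in this report.
log = [
    '{"return": {}}',
    '{"timestamp": {"seconds": 1463468322, "microseconds": 946118}, '
    '"event": "BLOCK_JOB_ERROR", "data": {"device": "drive-virtio-blk0", '
    '"operation": "write", "action": "report"}}',
    '{"timestamp": {"seconds": 1463468323, "microseconds": 75441}, '
    '"event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-blk0", '
    '"len": 3408723968, "offset": 10551296, "speed": 0, "type": "mirror", '
    '"error": "Invalid argument"}}',
]

print(mirror_outcome(log))
```

Note that "offset" is far smaller than "len" in the failing event, so the job stopped early rather than finishing the copy.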

Actual results:
Mirroring with the granularity option fails.

Expected results:
Mirroring with the granularity option passes.

Additional info:

Comment 2 John Snow 2016-06-15 23:50:33 UTC
Identified as a regression caused by:

commit e5b43573e28b226621ac6ed9ad71e1a72d71922d
Author: Fam Zheng <famz>
Date:   Fri Feb 5 10:00:29 2016 +0800

    mirror: Rewrite mirror_iteration


...Which is a fairly large change. I will investigate a fix, but I am still not fully clear on the root problem. Something beneath paio_submit is returning -EINVAL at some point, so we're probably passing something incorrect to bdrv_aio_readv / blk_aio_preadv, I'd guess.

Odd to me that it gets so far down the chain before EINVAL gets bubbled back up, though.

Comment 3 Qianqian Zhu 2016-06-16 07:21:54 UTC
I hit the same issue on qemu-kvm-rhev-2.6.0-6.el7.x86_64.
In my testing, mirroring completes successfully when 'buf-size' is specified together with 'granularity', while mirroring with only 'granularity' makes qemu dump core 100% of the time.

(gdb) run -name avocado-vt-vm1 -sandbox off -machine pc -nodefaults -vga cirrus -device ich9-usb-ehci1,id=usb1,addr=1d.7,multifunction=on,bus=pci.0 -device ich9-usb-uhci1,id=usb1.0,multifunction=on,masterbus=usb1.0,addr=1d.0,firstport=0,bus=pci.0 -device ich9-usb-uhci2,id=usb1.1,multifunction=on,masterbus=usb1.0,addr=1d.2,firstport=2,bus=pci.0 -device ich9-usb-uhci3,id=usb1.2,multifunction=on,masterbus=usb1.0,addr=1d.4,firstport=4,bus=pci.0 -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file=/usr/share/avocado/data/avocado-vt/images/RHEL-Server-7.3-64-virtio.qcow2 -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=03,disable-legacy=off,disable-modern=on -device virtio-net-pci,mac=9a:54:55:56:57:58,id=idcDKlyP,vectors=4,netdev=idePy7IS,bus=pci.0,addr=04,disable-legacy=off,disable-modern=on -netdev tap,id=idePy7IS,vhost=on -m 4096 -smp 4,maxcpus=4,cores=2,threads=1,sockets=2 -cpu SandyBridge,+kvm_pv_unhalt -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 -vnc :1 -rtc base=utc,clock=host,driftfix=slew -boot order=cdn,once=c,menu=off,strict=off -monitor stdio -qmp tcp:0:5555,server,nowait

[New Thread 0x7fffe7526700 (LWP 3929)]
Formatting 'target1.qcow2', fmt=qcow2 size=21474836480 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

Program received signal SIGABRT, Aborted.
0x00007fffefe385f7 in raise () from /lib64/libc.so.6
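
For reference, the 'buf-size' variant mentioned in this comment could be generated as sketched below. The helper drive_mirror_cmd is hypothetical, and the 1 MiB buffer size is only an illustrative value, since the comment does not say which value was used. The range check reflects the documented constraint that granularity must be a power of two between 512 and 64M; treat it as an assumption if your QEMU version differs.

```python
import json


def drive_mirror_cmd(device, target, granularity=None, buf_size=None):
    """Build a drive-mirror QMP command as a JSON string."""
    args = {
        "device": device,
        "target": target,
        "sync": "full",
        "format": "raw",
        "mode": "absolute-paths",
    }
    if granularity is not None:
        # Assumed constraint: power of two in [512, 64 MiB].
        in_range = 512 <= granularity <= 64 * 1024 * 1024
        power_of_two = granularity & (granularity - 1) == 0
        if not (in_range and power_of_two):
            raise ValueError("granularity must be a power of 2 in [512, 64M]")
        args["granularity"] = granularity
    if buf_size is not None:
        args["buf-size"] = buf_size
    return json.dumps({"execute": "drive-mirror", "arguments": args})


# Illustrative buf-size only; not taken from the report.
print(drive_mirror_cmd("drive_image1", "target1.qcow2",
                       granularity=8192, buf_size=1024 * 1024))
```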

Comment 4 John Snow 2016-06-22 19:18:53 UTC
Max helped identify that this is a problem with us exceeding MAX_IOV, which has happened to us before: cae98cb87d269c33d23b2bccd79bb8d99a60d811

Fixes en route.
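
A back-of-the-envelope check of why a small granularity can exceed the vector limit. This is a sketch under stated assumptions, not a description of the actual fix: it assumes the mirror job's default buffer is 10 MiB, that a single request can span the whole buffer, that each granularity-sized chunk takes one I/O vector entry, and that the limit is 1024 entries (the Linux IOV_MAX value). The names below are hypothetical.

```python
IOV_LIMIT = 1024                     # assumed limit (IOV_MAX on Linux)
DEFAULT_BUF_SIZE = 10 * 1024 * 1024  # assumed default mirror buffer, 10 MiB


def iov_entries(buf_size, granularity):
    """Number of granularity-sized chunks needed to cover buf_size."""
    return -(-buf_size // granularity)  # ceiling division


def exceeds_iov_limit(granularity, buf_size=DEFAULT_BUF_SIZE):
    """True if one full-buffer request would need too many vector entries."""
    return iov_entries(buf_size, granularity) > IOV_LIMIT


for gran in (8192, 65536):
    print(gran, iov_entries(DEFAULT_BUF_SIZE, gran), exceeds_iov_limit(gran))
```

Under these assumptions, granularity 8192 needs 1280 entries (over the limit), while a 64 KiB granularity needs only 160. A smaller buffer lowers the count in the same way, which is one possible reading of the 'buf-size' observation in comment 3.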

Comment 6 Miroslav Rezanina 2016-07-12 08:49:03 UTC
Fix included in qemu-kvm-rhev-2.6.0-13.el7

Comment 8 Qianqian Zhu 2016-07-28 10:14:04 UTC
Verified with:

Steps:
1. Launch guest:
/usr/libexec/qemu-kvm -name linux -cpu SandyBridge -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid 7bef3814-631a-48bb-bae8-2b1de75f7a13 -nodefaults -monitor stdio -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot order=c,menu=on -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/nfs/RHEL-Server-7.3-64-virtio.qcow2,if=none,cache=writeback,id=drive-virtio-disk0,format=qcow2 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -msg timestamp=on -spice port=5901,disable-ticketing -vga qxl -global qxl-vga.revision=3 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=3C:D9:2B:09:AB:44,bus=pci.0,addr=0x3 -qmp tcp:0:5555,server,nowait

2. Mirror with granularity:
{ "execute": "drive-mirror", "arguments": { "device": "drive-virtio-disk0","target": "mirror0", "sync": "full","format": "raw", "mode": "absolute-paths","granularity":8192}}

3. Reopen to the new mirror image:
{"execute": "block-job-complete", "arguments": { "device": "drive-virtio-disk0"}}

Results:
Block mirror succeeds without any error.

Step 2:
{"timestamp": {"seconds": 1469700262, "microseconds": 460010}, "event": "BLOCK_JOB_READY", "data": {"device": "drive-virtio-disk0", "len": 8180924416, "offset": 8180924416, "speed": 0, "type": "mirror"}}

Step 3:
{"timestamp": {"seconds": 1469700464, "microseconds": 509785}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len": 8180924416, "offset": 8180924416, "speed": 0, "type": "mirror"}}

(qemu) info block
drive-virtio-disk0 (#block312): mirror0 (raw)
    Cache mode:       writeback
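
The verification flow above (wait for BLOCK_JOB_READY, send block-job-complete, then check BLOCK_JOB_COMPLETED) could be automated roughly as follows. This is only a sketch: mirror_and_pivot, send, and events are hypothetical names, and the transport (for example a socket connected to the -qmp port) is left to the caller.

```python
def mirror_and_pivot(send, events, device):
    """Drive a mirror job to completion from a stream of decoded QMP events.

    send:   callable that transmits one QMP command (a dict).
    events: iterable of decoded QMP event dicts.
    Returns True if the job completed with offset == len.
    """
    for ev in events:
        data = ev.get("data", {})
        if data.get("device") != device:
            continue
        name = ev.get("event")
        if name == "BLOCK_JOB_ERROR":
            raise RuntimeError("block job error during %s" % data.get("operation"))
        if name == "BLOCK_JOB_READY":
            # Source and target are in sync; ask QEMU to pivot to the target.
            send({"execute": "block-job-complete",
                  "arguments": {"device": device}})
        elif name == "BLOCK_JOB_COMPLETED":
            if "error" in data:
                raise RuntimeError(data["error"])
            return data["offset"] == data["len"]
    return False


# Events from steps 2 and 3 of this comment (timestamps omitted).
events = [
    {"event": "BLOCK_JOB_READY",
     "data": {"device": "drive-virtio-disk0", "len": 8180924416,
              "offset": 8180924416, "speed": 0, "type": "mirror"}},
    {"event": "BLOCK_JOB_COMPLETED",
     "data": {"device": "drive-virtio-disk0", "len": 8180924416,
              "offset": 8180924416, "speed": 0, "type": "mirror"}},
]
sent = []
print(mirror_and_pivot(sent.append, events, "drive-virtio-disk0"), sent)
```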

Comment 9 Qianqian Zhu 2016-07-28 10:15:20 UTC
(In reply to qianqianzhu from comment #8)
> Verified with:
qemu-kvm-rhev-2.6.0-15.el7.x86_64
kernel-3.10.0-475.el7.x86_64

Comment 10 Qianqian Zhu 2016-08-15 07:23:46 UTC
Moving to verified as per Comment 9

Comment 12 errata-xmlrpc 2016-11-07 21:10:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html