Bug 1147358
Summary: | qemu-img compare error after drive-mirror with 'sync=full' | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | ShupingCui <scui> | |
Component: | qemu-kvm-rhev | Assignee: | Jeff Cody <jcody> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs> | |
Severity: | medium | Docs Contact: | ||
Priority: | medium | |||
Version: | 7.1 | CC: | chayang, coli, hhuang, jcody, juzhang, meyang, michen, ngu, qizhu, scui, shuang, virt-maint, xuhan, xutian | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1200350 | Environment: | |
Last Closed: | 2016-03-15 20:10:13 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1200350 |
Description
ShupingCui
2014-09-29 06:08:46 UTC
Hi Jeff,

This test has always failed in our qemu-kvm-rhev acceptance testing. Can you help look at it ASAP?

Thanks very much!!
Xu

Hi Shuping,

I tried to reproduce this with qemu-kvm-rhev-2.1.2-18.el7.x86_64 and failed to reproduce it in 5 attempts:

/usr/libexec/qemu-kvm -enable-kvm -M pc-i440fx-rhel7.0.0 -smp 4 -m 4G \
    -name rhel6.3-64 -uuid 3f2ea5cd-3d29-48ff-aab2-23df1b6ae213 \
    -drive file=/root/RHEL-Server-7.1-64-virtio.qcow2,cache=none,if=none,rerror=stop,werror=stop,id=drive-virtio-disk0,format=qcow2,aio=native \
    -device virtio-blk-pci,drive=drive-virtio-disk0,id=device-virtio-disk0,bootindex=1 \
    -boot order=cd -monitor stdio -readconfig nfs/ich9-ehci-uhci.cfg \
    -device usb-tablet,id=input0 \
    -chardev socket,id=s1,path=/tmp/s1,server,nowait -device isa-serial,chardev=s1 \
    -monitor tcp::1235,server,nowait \
    -vga qxl -global qxl-vga.revision=3 -spice port=5930,disable-ticketing \
    -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
    -vnc :10 -qmp tcp:0:5556,server,nowait -sandbox on -cpu host \
    -netdev tap,script=/etc/qemu-ifup,id=netdev0 \
    -device virtio-net-pci,netdev=netdev0,id=device-net0,mac=aa:54:00:11:22:33

Steps:
1. Start mirroring: drive_mirror drive-virtio-disk0 /root/sn1 qcow2
2. After steady state, on the host:
   sync
   echo 3 > /proc/sys/vm/drop_caches
   sync
   # qemu-img compare RHEL-Server-7.1-64-virtio.qcow2 sn1 -p
   Images are identical.

Could you try with the latest qemu to see whether this issue can still be reproduced?

Bests,
Shaolong

(In reply to xu from comment #4)
> Hi Jeff,
>
> This test always failed in our qemu-kvm-rhev acceptance testing, can you
> help to looks it ASAP.
>
> Thanks very much!!
> Xu

Looking at the description, it appears that the disks are being compared after the BLOCK_JOB_READY response has been received. However, the user/tester did not enter a BLOCK_JOB_COMPLETE after the BLOCK_JOB_READY response. If there is any guest i/o to that drive after the BLOCK_JOB_READY, then the images may differ at that point. In order to finish the mirror, a BLOCK_JOB_COMPLETE must be issued.

Can you confirm that after issuing a BLOCK_JOB_COMPLETE, there is no issue?

(In reply to Jeff Cody from comment #6)
> Looking at the description, it appears that the disks are being compared
> after the BLOCK_JOB_READY response has been received. However, the
> user/tester did not enter a BLOCK_JOB_COMPLETE after the BLOCK_JOB_READY
> response. If there is any guest i/o to that drive after the
> BLOCK_JOB_READY, then the images may differ at that point. In order to
> finish the mirror, a BLOCK_JOB_COMPLETE must be issued.
>
> Can you confirm that after issuing a BLOCK_JOB_COMPLETE, there is no issue?

Hi Jeff,

As I understand it, after BLOCK_JOB_COMPLETE is issued, the target image will be reopened by qemu, and qemu will no longer touch the source image. So I think that if there is any guest i/o to the drive, the source image and target image will differ. But I will give it a try according to your comment.

Thanks,
Xu

Hi all,

Here is a heads-up:
1. This bug was caught by autotest.
2. The correct test steps are: after reaching steady state, stop the guest, do a sync, then compare the source and target images.
3. At first the autotest steps differed slightly from the manual test case. I debugged this with the reporter, ShupingCui, who is our autotest colleague; after fixing the autotest steps, qemu-img compare still fails. Finally, I used the same steps as in comment 5, on the same host that the autotest scripts run on and with the same guest image, and ran the test manually with ShupingCui present. We could not hit the problem, which is very weird.

btw, "echo 3 > /proc/sys/vm/drop_caches" is not necessary.

Closing this as NOTABUG. The mirror is not complete until the BLOCK_JOB_COMPLETE command is issued; up until then, there may be discrepancies due to guest filesystem caching, writes, etc.
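
The ordering discussed above (wait for BLOCK_JOB_READY, issue block-job-complete, and compare only after BLOCK_JOB_COMPLETED) can be sketched in Python. This is a minimal illustration under stated assumptions, not the autotest code: the helper names drive_mirror_cmd, block_job_complete_cmd, encode and next_action are hypothetical, and only the QMP command names, event names and argument keys are taken from this thread. Opening the QMP socket and reading messages from it is deliberately left out.

```python
import json


def drive_mirror_cmd(device, target, fmt="qcow2", sync="full"):
    """Build a QMP drive-mirror command (hypothetical helper)."""
    return {
        "execute": "drive-mirror",
        "arguments": {
            "device": device,
            "target": target,
            "format": fmt,
            "sync": sync,
            "mode": "absolute-paths",
        },
    }


def block_job_complete_cmd(device):
    """Build the QMP block-job-complete command (hypothetical helper)."""
    return {"execute": "block-job-complete", "arguments": {"device": device}}


def encode(cmd):
    """Serialize a command dict to the bytes that would be written to the QMP socket."""
    return json.dumps(cmd).encode("utf-8")


def next_action(message, device):
    """Decide the next step from one decoded QMP message.

    Returns "complete" on BLOCK_JOB_READY for our device, "compare" on a
    clean BLOCK_JOB_COMPLETED, "failed" if that event carries an error,
    and "wait" for anything else (command replies, other devices' events).
    """
    data = message.get("data", {})
    if data.get("device") != device:
        return "wait"
    event = message.get("event")
    if event == "BLOCK_JOB_READY":
        return "complete"
    if event == "BLOCK_JOB_COMPLETED":
        return "failed" if "error" in data else "compare"
    return "wait"
```

A caller would send encode(drive_mirror_cmd(...)), feed each received message to next_action, send encode(block_job_complete_cmd(...)) when it returns "complete", and run qemu-img compare only once it returns "compare". As noted in the thread, guest i/o after completion goes to the target only, so the guest should be stopped before comparing.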
kernel-3.10.0-229.35.1.el7.x86_64
qemu-kvm-rhev-2.1.2-23.el7_1.12.x86_64

I hit this problem on the versions above.

Steps:
1. Boot up the guest:

MALLOC_PERTURB_=1 /usr/libexec/qemu-kvm \
    -S \
    -name 'avocado-vt-vm1' \
    -sandbox off \
    -machine pc \
    -nodefaults \
    -vga qxl \
    -device intel-hda,bus=pci.0,addr=03 \
    -device hda-duplex \
    -chardev socket,id=qmp_id_qmp1,path=/var/tmp/monitor-qmp1-20160519-043259-jrcQKSGq,server,nowait \
    -mon chardev=qmp_id_qmp1,mode=control \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/monitor-catch_monitor-20160519-043259-jrcQKSGq,server,nowait \
    -mon chardev=qmp_id_catch_monitor,mode=control \
    -device pvpanic,ioport=0x505,id=id4Vcvh8 \
    -chardev socket,id=serial_id_serial0,path=/var/tmp/serial-serial0-20160519-043259-jrcQKSGq,server,nowait \
    -device isa-serial,chardev=serial_id_serial0 \
    -device virtio-serial-pci,id=virtio_serial_pci0,bus=pci.0,addr=04 \
    -chardev socket,id=devvs,path=/var/tmp/virtio_port-vs-20160519-043259-jrcQKSGq,server,nowait \
    -device virtserialport,chardev=devvs,name=vs,id=vs,bus=virtio_serial_pci0.0 \
    -chardev socket,id=seabioslog_id_20160519-043259-jrcQKSGq,path=/var/tmp/seabios-20160519-043259-jrcQKSGq,server,nowait \
    -device isa-debugcon,chardev=seabioslog_id_20160519-043259-jrcQKSGq,iobase=0x402 \
    -device nec-usb-xhci,id=usb1,bus=pci.0,addr=05 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pci.0,addr=06 \
    -drive id=drive_image1,if=none,cache=none,snapshot=off,aio=native,format=qcow2,file=/usr/share/avocado/data/avocado-vt/images/win10-64-virtio-scsi.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device virtio-net-pci,mac=9a:9e:9f:a0:a1:a2,id=idukLEkU,vectors=4,netdev=idLhNMOQ,bus=pci.0,addr=07 \
    -netdev tap,id=idLhNMOQ \
    -m 16384 \
    -smp 8,maxcpus=8,cores=4,threads=1,sockets=2 \
    -cpu 'Opteron_G3',+kvm_pv_unhalt,hv_spinlocks=0x1fff,hv_vapic,hv_time \
    -drive id=drive_cd1,if=none,cache=none,snapshot=off,aio=native,media=cdrom,file=/usr/share/avocado/data/avocado-vt/isos/windows/winutils.iso \
    -device ide-cd,id=cd1,drive=drive_cd1,bus=ide.0,unit=0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -spice port=3000,password=123456 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -boot order=cdn,once=c,menu=off,strict=off \
    -enable-kvm \
    -monitor stdio

2. Continue the guest.

3. Start the mirror:
{'execute': 'drive-mirror', 'arguments': {'device': u'drive_image1', 'mode': 'absolute-paths', 'format': 'qcow2', 'target': '/usr/share/avocado/data/avocado-vt/images/target1.qcow2', 'sync': 'full'}, 'id': '46JUBq5D'}

4. Check the job status until offset reaches len:
{'execute': 'query-block-jobs', 'id': 'ZU9k0BEr'}
{"return": [{"io-status": "ok", "device": "drive_image1", "busy": false, "len": 32212254720, "offset": 32212254720, "paused": false, "speed": 0, "type": "mirror"}], "id": "ZU9k0BEr"}

5. Complete the job:
{'execute': 'block-job-complete', 'arguments': {'device': 'drive_image1'}, 'id': 'B5EBtNzs'}
{"return": {}, "id": "B5EBtNzs"}
{"timestamp": {"seconds": 1463726380, "microseconds": 249193}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "drive_image1", "len": 32212254720, "offset": 32212254720, "speed": 0, "type": "mirror"}}

6. Compare:
[root@intel-e52650-16-4 ~]# /bin/qemu-img compare /usr/share/avocado/data/avocado-vt/images/win10-64-virtio-scsi.qcow2 /usr/share/avocado/data/avocado-vt/images/target2.qcow2
Content mismatch at offset 317201920!

7. I tried several times and always got the mismatch. Do you have any suggestions? Thanks.

(In reply to Yang Meng from comment #13)
> kernel-3.10.0-229.35.1.el7.x86_64
> qemu-kvm-rhev-2.1.2-23.el7_1.12.x86_64
>
> i met this problem on the version above

Jeff,

After adding the block-job-complete step, we still hit the problem, both manually and in automation, on the latest rhel7.1z qemu-kvm-rhev version with a 100% reproduction rate; however, we failed to reproduce the bug with the latest qemu-kvm-rhev versions of both rhel7.2z and rhel7.3.
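
Steps 4 and 6 above can be automated along the following lines. This is a hedged sketch, not the avocado-vt implementation: the function names mirror_reached_steady_state, interpret_compare and run_compare are hypothetical. It relies on two things: the query-block-jobs reply fields shown in step 4, and the exit codes the qemu-img man page documents for compare (0 for identical images, 1 for differing images, higher values for errors).

```python
import re
import subprocess


def mirror_reached_steady_state(reply, device):
    """Return True if a query-block-jobs reply shows the mirror job for
    `device` with offset equal to len, the condition used in step 4."""
    for job in reply.get("return", []):
        if job.get("device") == device and job.get("type") == "mirror":
            return job["offset"] == job["len"]
    return False  # no mirror job found for this device


def interpret_compare(returncode, output):
    """Map a qemu-img compare exit code and its output to (status, offset).

    offset is the first mismatching byte offset when qemu-img reports one,
    otherwise None.
    """
    if returncode == 0:
        return ("identical", None)
    if returncode == 1:
        match = re.search(r"Content mismatch at offset (\d+)", output)
        return ("different", int(match.group(1)) if match else None)
    return ("error", None)


def run_compare(source, target, qemu_img="qemu-img"):
    """Run qemu-img compare on two image paths and interpret the result."""
    proc = subprocess.run(
        [qemu_img, "compare", source, target],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return interpret_compare(proc.returncode, proc.stdout)
```

Keeping interpret_compare separate from run_compare lets the parsing be checked against captured output, such as the mismatch line in step 6, without needing qemu-img or the images on the machine running the check.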
Tried on:
qemu-kvm-rhev-2.1.2-23.el7.x86_64
libvirt-daemon-driver-qemu-1.2.8-16.el7_1.5.x86_64
qemu-img-rhev-2.1.2-23.el7.x86_64
qemu-kvm-rhev-debuginfo-2.1.2-23.el7.x86_64
qemu-kvm-common-rhev-2.1.2-23.el7.x86_64
qemu-kvm-tools-rhev-2.1.2-23.el7.x86_64

Also hit the problem. Could you help check? Thanks.

(In reply to Gu Nini from comment #14)
> After add the block-job-complete step, we still met the problem both
> manually and in auto on the latest rhel7.1z qemu-kvm-rhev version with 100%
> reproduce rate; however, we failed to reproduce the bug in the latest
> qemu-kvm-rhev versions of both rhel7.2z and rhel7.3.

So given 7.2.z and 7.3 work, I'm changing this BZ to CURRENTRELEASE. AFAIK there are no plans or requests to release a fix to 7.1.z.