Bug 1200350

Summary: qemu-img compare error after drive-mirror with 'sync=full'
Product: Red Hat Enterprise Linux 6 Reporter: xiagao
Component: qemu-kvm Assignee: Jeff Cody <jcody>
Status: CLOSED NOTABUG QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.7 CC: ailan, chayang, cloud, coli, hhuang, jcody, juzhang, michen, mkenneth, ngu, pezhang, qzhang, rbalakri, rpacheco, SButkeev, scui, shuang, virt-bugs, virt-maint, xuhan, xutian, yfu
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1147358 Environment:
Last Closed: 2016-02-03 20:34:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1147358    
Bug Blocks:    

Comment 2 xiagao 2015-03-10 11:46:07 UTC
Version-Release number of selected component (if applicable):
kernel-2.6.32-541.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.457.el6.x86_64

Comment 4 CROC Cloud 2015-08-27 20:26:57 UTC
Hello, support
We are mirroring from a qcow2 image (a sparse file) to a raw device, and after reopening, the image comparison reports errors. The problem is with zero blocks: blocks that are unallocated in the qcow2 source are not written out as zeros on the raw target, so old garbage remains on the new device.

Comment 6 xiagao 2016-01-05 07:03:56 UTC
Hi developer,

I have tried to reproduce this issue on a RHEL 6.8 host, and it still exists. The details are:

pkg version:
qemu-kvm-rhev-0.12.1.2-2.482.el6.x86_64
kernel-2.6.32-573.el6.x86_64

reproduce steps:
1. start mirroring:
{ "execute": "__com.redhat_drive-mirror", "arguments": { "device": "drive_image1", "target": 
"/mnt/nfs/target1.qcow2", "format": "qcow2", "mode": "absolute-paths", "full": true } }

2. wait for steady state
{'execute': 'query-block-jobs', 'id': 'eFwUcXhn'}
{"return": [{"device": "drive_image1", "len": 32212254720, "offset": 32212254720, "speed": 0, "type": "mirror"}], "id": "eFwUcXhn"}


3. do sync on host:
#sync

4. compare two images


actual result:
# /usr/bin/qemu-img compare /home/my_auto/autotest/client/tests/virt/shared/data/images/win2012-64r2-virtio-scsi.qcow2 /mnt/nfs/target1.qcow2
Content mismatch at offset 72876032!
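
For scripting step 4 in an acceptance test, the compare result can be checked programmatically. The following is a minimal sketch, not part of the original report: the helper names are hypothetical, and the exit-code mapping (0 = identical, 1 = different, anything else = error) follows the upstream qemu-img documentation and is assumed to hold for this build.

```python
import re
import subprocess

# Exit codes per upstream qemu-img compare documentation
# (assumption: the RHEL 6 build behaves the same way).
_STATUS = {0: "identical", 1: "different"}


def interpret_compare(returncode, output):
    """Map a qemu-img compare result to (status, mismatch_offset_or_None)."""
    status = _STATUS.get(returncode, "error")
    match = re.search(r"Content mismatch at offset (\d+)", output)
    offset = int(match.group(1)) if match else None
    return status, offset


def compare_images(source, target):
    """Run qemu-img compare on two image paths (hypothetical helper)."""
    proc = subprocess.run(
        ["qemu-img", "compare", source, target],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return interpret_compare(proc.returncode, proc.stdout)
```

With the output reported above, `interpret_compare(1, "Content mismatch at offset 72876032!")` yields the status "different" together with the byte offset of the first mismatch.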


Could you please check this as soon as possible? It often happens in acceptance testing.

thanks,
xiaoling

Comment 8 Jeff Cody 2016-02-03 17:48:36 UTC
We do not have the final bdrv_drain_all() call occurring - if block-job-cancel is called after completion, then the target is drained, and the image compare should be correct.  Upstream and in RHEL7, we have block-job-complete to perform this action without "cancel".

Comment 9 Jeff Cody 2016-02-03 20:34:35 UTC
Closing this as NOTABUG, because after a drive-mirror, one of the following commands must be issued in RHEL6 for a successful mirror:

1.) block-job-cancel (this will stop mirroring, and not pivot to the new target), or

2.) drive-reopen (this will stop mirroring, and pivot to the new target)

You need to use one of these commands, but not both.

Without either of those steps, the mirror is not technically complete, and there may be outstanding data due to caching or ongoing dirty bitmaps.
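
The RHEL6 sequence described above (wait for steady state, then issue block-job-cancel before comparing) can be sketched as follows. This is an illustration only: the function names are hypothetical, the steady-state test simply checks that "offset" equals "len" in the query-block-jobs reply as in the reproduction steps, and the "device" argument to block-job-cancel is assumed from the upstream QMP schema. The drive-reopen alternative is not shown.

```python
import json


def mirror_is_steady(reply, device):
    """Return True if the query-block-jobs reply lists a mirror job on
    `device` whose offset has caught up with its len."""
    for job in reply.get("return", []):
        if job.get("device") == device and job.get("type") == "mirror":
            return job["offset"] == job["len"]
    return False


def cancel_command(device):
    """Build the QMP message that stops mirroring without pivoting
    to the new target (argument name assumed from upstream QMP)."""
    return json.dumps(
        {"execute": "block-job-cancel", "arguments": {"device": device}}
    )
```

A test script would send the string returned by `cancel_command("drive_image1")` over the QMP socket once `mirror_is_steady` returns True, and only then run qemu-img compare.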

Comment 10 Pei Zhang 2016-03-14 03:25:50 UTC
(In reply to Jeff Cody from comment #9)
> Closing this as NOTABUG, because after a drive-mirror, one of the following
> commands must be issued in RHEL6 for a successful mirror:
> 
> 1.) block-job-cancel (this will stop mirroring, and not pivot to the new
> target), or
> 
> 2.) drive-reopen (this will stop mirroring, and pivot to the new target)
> 
> You need to use one of these commands, but not both.
> 
> Without either of those steps, the mirror is not technically complete, and
> the there may be outstanding data due to caching, or ongoing dirty bitmaps.

Hello Jeff,

I tested on RHEL 6 following your comments, and the compare works (the original image and the new image are the same).

My question is: are these steps needed in rhel7? The status of Bug 1147358 is still New.

Comment 11 Jeff Cody 2016-03-15 20:05:03 UTC
(In reply to Pei Zhang from comment #10)
> (In reply to Jeff Cody from comment #9)
> > Closing this as NOTABUG, because after a drive-mirror, one of the following
> > commands must be issued in RHEL6 for a successful mirror:
> > 
> > 1.) block-job-cancel (this will stop mirroring, and not pivot to the new
> > target), or
> > 
> > 2.) drive-reopen (this will stop mirroring, and pivot to the new target)
> > 
> > You need to use one of these commands, but not both.
> > 
> > Without either of those steps, the mirror is not technically complete, and
> > the there may be outstanding data due to caching, or ongoing dirty bitmaps.
> 
> Hello Jeff,
> 
> I test with rhel6 according to your comments, the compare works(original
> image and new image are same).
> 
> My question is: are these steps needed in rhel7? The status of Bug 1147358
> is still New.

As mentioned in Comment #6 in BZ 1147358, for RHEL 7 the command needed is block-job-complete.  That bug should likely also be closed as NOTABUG.
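
For the RHEL7 case, a rough sketch of the equivalent step is shown below. It assumes the upstream QMP behaviour in which a mirror job emits a BLOCK_JOB_READY event once it reaches steady state and is then finished with block-job-complete (which, upstream, also pivots the device to the target). The function names and the sample event are hypothetical and not taken from this bug.

```python
def ready_devices(events):
    """Collect devices for which a BLOCK_JOB_READY event was seen
    in a list of already-parsed QMP event messages."""
    return {
        event["data"]["device"]
        for event in events
        if event.get("event") == "BLOCK_JOB_READY"
    }


def complete_command(device):
    """Build the QMP message that finishes the mirror job
    (argument name assumed from the upstream QMP schema)."""
    return {"execute": "block-job-complete", "arguments": {"device": device}}


def commands_to_send(events):
    """Return one block-job-complete message per ready device."""
    return [complete_command(dev) for dev in sorted(ready_devices(events))]
```

The image compare would then be run only after the job reports completion.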