Bug 511072 - KVM - qemu-img fails to copy a RAW format image over FCP storage
Summary: KVM - qemu-img fails to copy a RAW format image over FCP storage
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Kevin Wolf
QA Contact: Lawrence Lim
URL:
Whiteboard:
Duplicates: 591037
Depends On: 537655
Blocks:
 
Reported: 2009-07-13 14:58 UTC by yeylon@redhat.com
Modified: 2016-04-18 06:22 UTC (History)

Fixed In Version: kvm-83-120.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 07:56:32 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2010:0271 0 normal SHIPPED_LIVE Important: kvm security, bug fix and enhancement update 2010-03-29 13:19:48 UTC

Comment 1 Kevin Wolf 2009-07-14 08:08:18 UTC
Sorry, I don't even understand the problem. You would have helped me a lot if you had actually filled in the suggested bug report sections. Please provide at least the shell commands you're using, what happens and what you're expecting to happen.

How are you trying to copy an image with qemu-img create? As its name says, this is not for copying, but for creating images.

Comment 2 Shahar Frank 2009-09-15 13:31:19 UTC
There is a small error in the bug description: Yaniv meant "qemu-img convert", specifically from raw to raw on block devices. The bug is that the resulting block device is not identical to the source block device. In RHEV we use this to create a template out of a regular image, and the above bug causes the template to be corrupted. A simple qemu-img convert followed by cmp can be used to verify it.

Comment 3 Kevin Wolf 2009-09-15 15:51:37 UTC
Ok, this makes more sense. Can you provide me a simple test case, preferably a shell script that I can run? You suggest that it can't be reproduced using files, right? Would it be enough to use losetup with an image to get the wrong behaviour, or do I need to take a real partition or even something more specific?

Comment 4 Shahar Frank 2009-09-16 08:35:59 UTC
Ok, I am not sure about the files. We found this bug long ago on an FCP system. It may not be reproducible on files or other systems due to timing issues.

I will try to provide more information and scripts.

Comment 5 Ayal Baron 2009-09-17 12:58:03 UTC
(In reply to comment #3)
> Ok, this makes more sense. Can you provide me a simple test case, preferrably a
> shell script that I can run? You suggest that it can't be reproduced using
> files, right? Would it be enough to use losetup with an image to get the wrong
> behaviour, do I need to take a real partition or even something more specific?  

We easily recreated the bug today:
1. create 2 LVs (block devices) - source and target
2. "dirty" source (e.g. dd from /dev/urandom)
3. zero out a section of the source (e.g. dd if=/dev/zero of=SOURCE bs=1M skip=250 count=100)
4. qemu-img convert -f raw SOURCE -O raw TARGET
5. cmp SOURCE TARGET

We believe the problem is that qemu-img always assumes that the underlying storage is sparse and therefore may omit copying zero blocks.  This is of course the wrong behaviour when the target is preallocated and the format is raw.
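
To illustrate the suspected failure mode without qemu-img, here is a minimal Python sketch. It is not qemu-img's actual code: `copy_skipping_zeros` is a hypothetical helper, `BLOCK` is an arbitrary illustrative block size, and in-memory byte arrays stand in for the source and target LVs. It shows that skipping all-zero blocks is only safe when the target already reads as zeros.

```python
BLOCK = 512  # illustrative block size, not qemu-img's real granularity


def copy_skipping_zeros(src: bytes, dst: bytearray, block: int = BLOCK) -> None:
    """Copy src into dst, leaving dst untouched wherever src is all zeros."""
    for off in range(0, len(src), block):
        chunk = src[off:off + block]
        if chunk == bytes(len(chunk)):
            # Assumption baked into this shortcut: dst already reads as
            # zeros here, which holds for a fresh sparse file only.
            continue
        dst[off:off + len(chunk)] = chunk


# Source: "dirty" data with one zeroed-out section.
source = bytearray(b"\xaa" * (4 * BLOCK))
source[BLOCK:2 * BLOCK] = bytes(BLOCK)

fresh_sparse = bytearray(4 * BLOCK)              # like a newly created sparse file
stale_device = bytearray(b"\x55" * (4 * BLOCK))  # like a reused, preallocated LV

copy_skipping_zeros(source, fresh_sparse)
copy_skipping_zeros(source, stale_device)

print(fresh_sparse == source)  # True
print(stale_device == source)  # False: old data survives in the zeroed section
```

On the stale target, the bytes left over from earlier use remain wherever the source held zeros, which matches the mismatch that cmp reports in step 5.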

Comment 6 Kevin Wolf 2009-09-17 14:31:22 UTC
Thanks, this helps a lot. I can reproduce the behaviour you described and I think your analysis is right, too. Actually, converting to a host device isn't even meant to be supported in this version of the code, it was only added later. I'm not sure why qemu-img doesn't abort the conversion with an error message.

Upstream qemu, on the other hand, should have the feature, but still complains about an "Error while formatting '/dev/loop1'". I need to investigate more on both.

Comment 7 Kevin Wolf 2009-09-17 16:52:20 UTC
I can backport the upstream change introducing the feature. You still need to specify -O host_device then (I didn't do so at first, this is why I got the error message).

Comment 14 Miya Chen 2009-12-31 02:29:33 UTC
Tested in kvm-83-142, this problem does not exist.
1. create 2 lvs:
#lvs
test1 vg1 wi-a 1.00G
test2 vg1 wi-a 1.00G
2. dirty source:
dd if=/dev/urandom of=/dev/vg1/test1 bs=1M count=100
3. zero out a section of the source:
dd if=/dev/zero of=/dev/vg1/test1 bs=1M skip=250 count=100
4. qemu-img convert -f raw /dev/vg1/test1 -O host_device /dev/vg1/test2

actual result:
cmp /dev/vg1/test1 /dev/vg1/test2
cmp is silent, so no difference was found.
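
When the offset of the first mismatch is of interest, the same comparison can be scripted. The following is a minimal Python sketch, assuming both paths can be read as plain byte streams; `first_difference` is a hypothetical helper and not part of cmp or qemu-img, and the temporary files merely stand in for the two LVs.

```python
import os
import tempfile
from typing import Optional


def first_difference(path_a: str, path_b: str, chunk: int = 1 << 20) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            buf_a = a.read(chunk)
            buf_b = b.read(chunk)
            if buf_a != buf_b:
                # Locate the exact byte within this chunk.
                for i, (x, y) in enumerate(zip(buf_a, buf_b)):
                    if x != y:
                        return offset + i
                # Common prefix matches, so one input is shorter.
                return offset + min(len(buf_a), len(buf_b))
            if not buf_a:
                return None  # both inputs ended without a mismatch
            offset += len(buf_a)


# Demonstration on regular files standing in for the two LVs.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.img")
    dst = os.path.join(tmp, "dst.img")
    data = bytearray(b"\xaa" * 4096)
    with open(src, "wb") as f:
        f.write(data)
    with open(dst, "wb") as f:
        f.write(data)
    same = first_difference(src, dst)

    data[1000] = 0x55  # corrupt one byte of the copy
    with open(dst, "wb") as f:
        f.write(data)
    changed = first_difference(src, dst)

print(same)     # None
print(changed)  # 1000
```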

Comment 15 Ayal Baron 2009-12-31 07:45:11 UTC
Actually this should be tested with "-O raw" and not "-O host_device"
(under the hood it's supposed to do the same thing).

Comment 16 Miya Chen 2009-12-31 08:01:28 UTC
(In reply to comment #15)
> Actually this should be tested with "-O raw" and not "-O host_device"
> (under the hood it's supposed to do the same thing).  

Yes, I can reproduce this problem in kvm-83-105 with "-O raw", but in kvm-83-142, using "-O raw" gives an error like "Error while formatting ..".
Comment #6 and comment #7 are about this problem.

Comment 17 Ayal Baron 2009-12-31 08:12:00 UTC
Those comments are outdated; since then, Kevin has fixed the "raw" format to identify whether the underlying storage is a "host_device" or a file and act accordingly.
If you get the error using "raw", that means that some of the flows have not been fixed.
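
The kind of check described here can be sketched as follows. This is an illustration only, not the actual qemu patch: `is_block_device` and `may_skip_zero_blocks` are hypothetical names, and the rule "skip zero blocks only on regular files" is an assumption drawn from the analysis in comment #5.

```python
import os
import stat
import tempfile


def is_block_device(path: str) -> bool:
    """True if path refers to a block device such as an LV or a loop device."""
    # os.stat follows symlinks, so LVM-style /dev/<vg>/<lv> links resolve
    # to the underlying device node.
    return stat.S_ISBLK(os.stat(path).st_mode)


def may_skip_zero_blocks(target: str) -> bool:
    """Assumed policy: omit zero blocks only when the target is not a block device."""
    # A block device keeps whatever data it held before, so zeros in the
    # source must be written out explicitly.
    return not is_block_device(target)


# A regular temporary file is not a block device.
with tempfile.NamedTemporaryFile() as f:
    regular_is_block = is_block_device(f.name)
    regular_may_skip = may_skip_zero_blocks(f.name)

print(regular_is_block)  # False
print(regular_may_skip)  # True
```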

Comment 20 Kevin Wolf 2010-01-08 13:48:34 UTC
(In reply to comment #17)
> Those comments are outdated, since then, Kevin has fixed the "raw" format to
> identify whether the underlying storage is a "host_device" or a file and act
> accordingly.
> If you get the error using "raw" that means that some of the flows have not
> been fixed.  

The comments are not outdated. There is a patch to allow using "raw" again, but it's not yet acked and therefore not included. It's the fix for 537655, which is a different issue. Please handle each fix in its own bug report, not in other reports which look slightly related.

Moving back to ON_QA.

Comment 21 Ayal Baron 2010-01-10 13:20:13 UTC
(In reply to comment #20)
> (In reply to comment #17)
> > Those comments are outdated, since then, Kevin has fixed the "raw" format to
> > identify whether the underlying storage is a "host_device" or a file and act
> > accordingly.
> > If you get the error using "raw" that means that some of the flows have not
> > been fixed.  
> 
> The comments are not outdated. There is a patch to allow using "raw" again, but
> it's not yet acked and therefore not included. It's the fix for 537655, which
> is a different issue. Please handle each fix in its own bug report, not in
> other reports which look slightly related.
> 
> Moving back to ON_QA.    
Sorry, I was under the impression that it was already in. In any event, we would use "raw" rather than "host_device", which is how this bug was opened (see comment #5). Seeing as we agreed on using "raw" when creating the volume, I think it is rather obvious we are going to do the same thing across the board (use only "raw", not "host_device"). So at the least I would make this bug dependent on 537655 and check again after it is accepted.

Comment 23 Ayal Baron 2010-01-12 13:22:47 UTC
Kevin, please review comment #21.
The test in comment #14 is not the scenario for which the bug was opened.
It could be that once bug 537655 is solved this will work but it cannot be tested properly before that.

Comment 27 Miya Chen 2010-01-18 03:59:07 UTC
Tested in kvm-83-147, this problem does not exist.
1. create 2 lvs:
#lvs
test1 vg1 wi-a 1.00G
test2 vg1 wi-a 1.00G
2. dirty source:
dd if=/dev/urandom of=/dev/vg1/test1 bs=1M count=100
3. zero out a section of the source:
dd if=/dev/zero of=/dev/vg1/test1 bs=1M skip=250 count=100
4. qemu-img convert -f raw /dev/vg1/test1 -O raw /dev/vg1/test2

actual result:
cmp /dev/vg1/test1 /dev/vg1/test2
cmp is silent, so no difference was found.

Comment 30 errata-xmlrpc 2010-03-30 07:56:32 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0271.html

Comment 31 Bill Burns 2010-06-30 14:32:11 UTC
*** Bug 591037 has been marked as a duplicate of this bug. ***

