Bug 522720 - cloning a sparse file can create a non-sparse file
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: libvirt
Version: rawhide
Hardware: All Linux
Priority: high, Severity: medium
Assigned To: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
Duplicates: 564462 667509
Depends On:
Blocks: F12VirtTarget
Reported: 2009-09-11 03:42 EDT by Kris Buytaert
Modified: 2012-04-12 07:59 EDT

Doc Type: Bug Fix
Last Closed: 2012-04-12 07:59:11 EDT
Attachments
screenshot of the error. (82.62 KB, image/png)
2009-09-11 14:29 EDT, Kris Buytaert
virt-manager.log (9.74 KB, application/x-gzip)
2009-09-11 14:30 EDT, Kris Buytaert
virt-manager.log 2009-12-22 (2.50 KB, application/x-gzip-compressed)
2009-12-22 14:41 EST, Kris Buytaert
Screenshot of the faulty size assumption (65.21 KB, image/png)
2009-12-22 14:49 EST, Kris Buytaert

Description Kris Buytaert 2009-09-11 03:42:58 EDT
Description of problem:

When trying to clone an existing VM, I saw the progress bar grow
beyond 100% (cf. the screenshot, taken at 249%).



Version-Release number of selected component (if applicable):

[root@mine vm]# rpm -qa | grep virt-manager
virt-manager-0.8.0-1.fc11.noarch


How reproducible:

Tried twice; the progress bar grew above 130% both times.

  
Actual results:

I had to abort the clone due to lack of disk space.

Expected results:

Copy 100% then stop.

Additional info:
Comment 1 Kris Buytaert 2009-09-11 03:50:27 EDT
A potential cause seems to be a problem with the VM image itself:


[root@mine vm]# ls -alh Drupal.img 
-rw------- 1 root root 23G 2009-09-11 09:42 Drupal.img
[root@mine vm]# du -sh Drupal.img 
1.9G	Drupal.img


This might also explain other problems I have (segfaults within the VM, etc., that disappear after a reboot and come back after a while; I still have to figure out more about those, however).
Comment 2 Mark McLoughlin 2009-09-11 10:02:07 EDT
Could you include ~/.virt-manager/virt-manager.log

Also, you were going to attach a screenshot?

(In reply to comment #1)

> [root@mine vm]# ls -alh Drupal.img 
> -rw------- 1 root root 23G 2009-09-11 09:42 Drupal.img
> [root@mine vm]# du -sh Drupal.img 
> 1.9G Drupal.img

This is the original image? The difference in sizes just means that it's a sparse image.

For example, try "dd if=/dev/zero of=foo.img bs=1M count=0 seek=10K" - it's a 10 GiB image which takes up no disk space. (With bs=1M, seek is counted in 1 MiB blocks, so seek=10M would ask for a 10 TiB file instead.)
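
The same distinction between apparent size (what ls shows) and allocated size (what du shows) can be checked from Python. This is only a minimal sketch and not part of the original report: the helper names, the file name and the 1 GiB size are illustrative assumptions, and whether the file really ends up sparse depends on the host filesystem.

```python
import os
import tempfile


def make_sparse(path, size):
    """Create a file with the given apparent size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)


def sizes(path):
    """Return (apparent_bytes, allocated_bytes), roughly what ls and du report."""
    st = os.stat(path)
    # On Linux, st_blocks is counted in 512-byte units.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "foo.img")  # hypothetical file name
        make_sparse(path, 1024 ** 3)         # illustrative size: 1 GiB
        apparent, allocated = sizes(path)
        print("apparent bytes: ", apparent)
        print("allocated bytes:", allocated)
```

On a filesystem that supports sparse files, the allocated figure stays far below the apparent one, which is the same pattern as the 23G versus 1.9G reported for Drupal.img.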
Comment 3 Kris Buytaert 2009-09-11 14:29:26 EDT
Created attachment 360713
screenshot of the error.
Comment 4 Kris Buytaert 2009-09-11 14:30:55 EDT
Created attachment 360714
virt-manager.log
Comment 5 Kris Buytaert 2009-09-11 14:38:00 EDT
The 23 GB struck me as gigantic for a VM disk (even a sparse one) that I remember creating with a maximum disk size of 3-4 GB.

Checking the log, am I wrong to conclude that "Filesize: 3.0" means 3.0 GB?


(Fri, 21 Aug 2009 07:48:40 virt-manager 4103] DEBUG (create:1444) Creating a VM Drupal
  Type: kvm,hvm
  UUID: dc55fe15-129c-52b8-d73e-775bafee37f5
  Install Source: /dev/sr0
  OS: linux:rhel5
  Kernel args: None
  Memory: 512
  Max Memory: 512
  # VCPUs: 1
  Filesize: 3.0
)
Comment 6 Vladimir Benes 2009-09-17 07:48:53 EDT
Maybe I found the problem: I have a 4 GiB pre-allocated image and I cloned it. The progress bar went beyond 100% once approx. 1.5 GiB had been cloned.
See this part of virt-manager.log:
 </source>
  <capacity>4294967296</capacity>
  <allocation>1594740736</allocation>
  <target>
  
Maybe progress should be computed from the capacity and not from the allocation size?
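
To make the arithmetic concrete, here is a small sketch using the two numbers from the log excerpt above. The function name is hypothetical and this is not virt-manager's actual code; it only shows what happens when the bytes copied are divided by the allocation rather than by the capacity.

```python
CAPACITY = 4294967296    # <capacity> from the log excerpt (4 GiB)
ALLOCATION = 1594740736  # <allocation> from the log excerpt (roughly 1.5 GiB)


def progress_percent(copied_bytes, total_bytes):
    """Percentage of total_bytes that has been copied so far."""
    return 100.0 * copied_bytes / total_bytes


if __name__ == "__main__":
    for copied in (ALLOCATION, CAPACITY):
        print(
            "copied=%d  vs allocation: %.1f%%  vs capacity: %.1f%%"
            % (
                copied,
                progress_percent(copied, ALLOCATION),
                progress_percent(copied, CAPACITY),
            )
        )
```

With the allocation as denominator, the bar reaches 100% after roughly 1.5 GiB and keeps climbing while the rest of the 4 GiB is copied; with the capacity as denominator it ends at exactly 100%.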
Comment 7 Mark McLoughlin 2009-09-17 08:15:36 EDT
This part of the log is interesting:

(CloneManager:370) Validating original guest parameters
(CloneManager:382) Original paths: ['/var/lib/libvirt/images/Drupal.img']
(CloneManager:383) Original sizes: [3.0]
(clone:357) Original path: /var/lib/libvirt/images/Drupal.img
Generated clone path: /var/lib/libvirt/images/Drupal-clone.img
(VirtualDisk:589) Path '/var/lib/libvirt/images' is target for pool 'default'. Creating volume 'Drupal-clone.img'.
(VirtualDisk:638) Overwriting 'size' with value from StorageVolume object
(CloneManager:370) Validating original guest parameters
(CloneManager:382) Original paths: [None]
(CloneManager:383) Original sizes: [None]
(VirtualDisk:589) Path '/var/lib/libvirt/images' is target for pool 'default'. Creating volume 'Drupal-clone.img'.
(VirtualDisk:638) Overwriting 'size' with value from StorageVolume object
(CloneManager:370) Validating original guest parameters
(CloneManager:382) Original paths: ['/var/lib/libvirt/images/Drupal.img']
(CloneManager:383) Original sizes: [22.0359787940979]
(VirtualDisk:589) Path '/var/lib/libvirt/images' is target for pool 'default'. Creating volume 'Drupal-clone.img'.
(VirtualDisk:638) Overwriting 'size' with value from StorageVolume object
(CloneManager:399) Validating clone parameters.
(CloneManager:405) Clone paths: ['/var/lib/libvirt/images/Drupal-clone.img']
(VirtualDisk:638) Overwriting 'size' with value from StorageVolume object
G (Storage:845) The requested volume capacity will exceed the available pool space when the volume is fully allocated. (22564 M requested capacity > 22320 M available)
(VirtualDisk:638) Overwriting 'size' with value from StorageVolume object
G (VirtualDisk:763) The requested volume capacity will exceed the available pool space when the volume is fully allocated. (22564 M requested capacity > 22320 M available)
(clone:742) Threading off connection to clone VM.
(CloneManager:370) Validating original guest parameters
(CloneManager:382) Original paths: ['/var/lib/libvirt/images/Drupal.img']
(CloneManager:383) Original sizes: [22.0359787940979]

The original image was:

<volume>
  <name>Drupal.img</name>
  <capacity>3221225472</capacity>
  <allocation>0</allocation>
  <target>
    <format type='raw'/>
  </target>
</volume>

and the cloned volume is:

<volume>
  <name>Drupal-clone.img</name>
  <key>/var/lib/libvirt/images/Drupal.img</key>
  <source>
  </source>
  <capacity>23660952064</capacity>
  <allocation>2033123328</allocation>
  <target>
    <path>/var/lib/libvirt/images/Drupal.img</path>
    <format type="raw"/>
    <permissions>
      <mode>0600</mode>
      <owner>0</owner>
      <group>0</group>
    </permissions>
  </target>
</volume>

It more or less just looks like virt-manager is getting confused about the size of the original image.
Comment 8 Cole Robinson 2009-09-17 12:49:52 EDT
I can't reproduce this, and the log snippet doesn't indicate what is going on.

Is this reliably reproducible for anyone? Are you doing anything besides launching the clone wizard and clicking 'Clone Virtual Machine'?
Comment 9 Kris Buytaert 2009-09-17 13:37:03 EDT
I can reproduce this on my system. However, I think the underlying issue is something that went wrong with the sparse file, and that is something I don't know how to reproduce.
Comment 10 Joachim Schröder 2009-12-08 04:15:43 EST
I can confirm that cloning sparse images with virt-manager doesn't work, the cloned images will not be sparse but preallocated.
Is virt-manager using dd to clone images?
Comment 11 Cole Robinson 2009-12-22 13:03:07 EST
(In reply to comment #10)
> I can confirm that cloning sparse images with virt-manager doesn't work, the
> cloned images will not be sparse but preallocated.
> Is virt-manager using dd to clone images?  

We either use libvirt APIs (which implement the copy in C code) or virtinst (which implements the copy in Python) :) Both should have sparse detection. I've just tried that now and it doesn't seem to be working, so it could be an issue in virt-manager or libvirt. Please file a separate bug against libvirt, since the originally reported issue seems to be quite different.
Comment 12 Cole Robinson 2009-12-22 13:12:34 EST
I've tried reproducing again, and still can't. The above log snippet really confuses me:

<here is where the clone dialog is launched>

[Thu, 10 Sep 2009 21:35:12 virt-manager 3295] DEBUG (CloneManager:370) Validating original guest parameters
[Thu, 10 Sep 2009 21:35:12 virt-manager 3295] DEBUG (CloneManager:382) Original paths: ['/var/lib/libvirt/images/Drupal.img']
[Thu, 10 Sep 2009 21:35:12 virt-manager 3295] DEBUG (CloneManager:383) Original sizes: [3.0]
[Thu, 10 Sep 2009 21:35:12 virt-manager 3295] DEBUG (clone:357) Original path: /var/lib/libvirt/images/Drupal.img
Generated clone path: /var/lib/libvirt/images/Drupal-clone.img
[Thu, 10 Sep 2009 21:35:12 virt-manager 3295] DEBUG (VirtualDisk:589) Path '/var/lib/libvirt/images' is target for pool 'default'. Creating volume 'Drupal-clone.img'.
[Thu, 10 Sep 2009 21:35:12 virt-manager 3295] DEBUG (VirtualDisk:638) Overwriting 'size' with value from StorageVolume object
[Thu, 10 Sep 2009 21:35:12 virt-manager 3295] DEBUG (CloneManager:370) Validating original guest parameters
[Thu, 10 Sep 2009 21:35:12 virt-manager 3295] DEBUG (CloneManager:382) Original paths: [None]
[Thu, 10 Sep 2009 21:35:12 virt-manager 3295] DEBUG (CloneManager:383) Original sizes: [None]
[Thu, 10 Sep 2009 21:35:12 virt-manager 3295] DEBUG (VirtualDisk:589) Path '/var/lib/libvirt/images' is target for pool 'default'. Creating volume 'Drupal-clone.img'.
[Thu, 10 Sep 2009 21:35:12 virt-manager 3295] DEBUG (VirtualDisk:638) Overwriting 'size' with value from StorageVolume object

<Notice that we actually correctly determine the capacity as 3.0 gb up above.
 Now, 4 seconds later, user clicks the 'clone' button>

[Thu, 10 Sep 2009 21:35:16 virt-manager 3295] DEBUG (CloneManager:370) Validating original guest parameters
[Thu, 10 Sep 2009 21:35:16 virt-manager 3295] DEBUG (CloneManager:382) Original paths: ['/var/lib/libvirt/images/Drupal.img']
[Thu, 10 Sep 2009 21:35:16 virt-manager 3295] DEBUG (CloneManager:383) Original sizes: [22.0359787940979]

WTF? Size now detected as 22gb.

After looking over the code and trying to reproduce, I have no clue. That value should be coming straight from libvirt storage volume XML.

Kris, can you still reproduce? If so, do repeated 'clone' attempts of the same guest yield the same result (trying to create a 23gb disk?). Can you reproduce by creating a new guest from scratch and then cloning it? Does the guest you are cloning have actual data written to the disk? What filesystem is your host machine using?

Please also attach the output of 'virsh pool-refresh default; virsh vol-dumpxml /var/lib/libvirt/images/Drupal.img'

Thanks!
Comment 13 Kris Buytaert 2009-12-22 14:40:03 EST
I'm on virt-manager-0.8.0-7.fc11.noarch now.
Upon trying to clone the instance it tells me it is about to clone Drupal.img,
which it now detects as 22.0 GB.

When starting the clone, however, it tells me it is at e.g. 41% and 915 MB, which makes me think it realizes it's only using 2.2 GB (the correct size).

And yes, it still goes over 100% (up to 205% before I aborted it).



[root@mine log]# virsh vol-dumpxml /var/lib/libvirt/images/Drupal.img  
<volume>
  <name>Drupal.img</name>
  <key>/var/lib/libvirt/images/Drupal.img</key>
  <source>
  </source>
  <capacity>23660952064</capacity>
  <allocation>2273181696</allocation>
  <target>
    <path>/var/lib/libvirt/images/Drupal.img</path>
    <format type='raw'/>
    <permissions>
      <mode>0600</mode>
      <owner>0</owner>
      <group>0</group>
    </permissions>
  </target>
</volume>
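
For anyone who wants to compare these two fields programmatically, the following is a hedged sketch that parses a trimmed copy of the XML above with Python's standard library. The function name and the trimmed XML constant are illustrative; nothing here is taken from virt-manager or libvirt source.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the vol-dumpxml output above; only the size fields are kept.
VOLUME_XML = """<volume>
  <name>Drupal.img</name>
  <capacity>23660952064</capacity>
  <allocation>2273181696</allocation>
</volume>"""


def volume_sizes(xml_text):
    """Return (capacity, allocation) in bytes from a storage volume XML string."""
    root = ET.fromstring(xml_text)
    capacity = int(root.findtext("capacity"))
    allocation = int(root.findtext("allocation"))
    return capacity, allocation


if __name__ == "__main__":
    capacity, allocation = volume_sizes(VOLUME_XML)
    print("capacity:  ", capacity)
    print("allocation:", allocation)
    print("allocated fraction: %.3f" % (allocation / capacity))
```

The large gap between the two values is what identifies the volume as sparse.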

I'll attach today's log too.
Comment 14 Kris Buytaert 2009-12-22 14:41:17 EST
Created attachment 379904
virt-manager.log 2009-12-22
Comment 15 Kris Buytaert 2009-12-22 14:49:06 EST
Created attachment 379906
Screenshot of the faulty size assumption
Comment 16 Daniel Berrange 2009-12-22 14:53:26 EST
What do the following show?


 virsh vol-info /var/lib/libvirt/images/Drupal.img
 qemu-img info /var/lib/libvirt/images/Drupal.img
Comment 17 Cole Robinson 2009-12-22 15:05:33 EST
Just to clarify:

$ python -c "print 23660952064 / 1024 / 1024 / 1024"
22

So libvirt is actually reporting the disk as 22 gb.
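
The one-liner above uses Python 2 integer division, which is why it prints a bare 22. A small sketch with true division (the helper name is an assumption, not virt-manager code) gives the fractional value that matches the "Original sizes" entry in the log, and confirms that the earlier 3221225472-byte capacity is exactly 3.0.

```python
GIB = 1024 ** 3  # bytes per GiB


def to_gib(n_bytes):
    """Convert a byte count to GiB using true (floating point) division."""
    return n_bytes / GIB


if __name__ == "__main__":
    # Capacities in bytes, taken from the volume XML quoted in this bug.
    for n_bytes in (3221225472, 23660952064):
        print(n_bytes, "bytes =", to_gib(n_bytes), "GiB")
```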

Kris, in addition to what Dan requested in comment #16, please also provide:

du --apparent-size -h /var/lib/libvirt/images/Drupal.img
du -h /var/lib/libvirt/images/Drupal.img
ls -alh /var/lib/libvirt/images/Drupal.img

(I know some of that was provided above, but just to make sure everything is up to date).
Comment 18 Kris Buytaert 2009-12-22 15:07:00 EST
[root@mine ~]# virsh vol-info /var/lib/libvirt/images/Drupal.img
Name:           Drupal.img
Type:           file
Capacity:       22.04 GB
Allocation:     2.12 GB

[root@mine ~]#  qemu-img info /var/lib/libvirt/images/Drupal.img
image: /var/lib/libvirt/images/Drupal.img
file format: raw
virtual size: 22G (23660952064 bytes)
disk size: 2.1G
Comment 19 Kris Buytaert 2009-12-22 15:07:33 EST
[root@mine ~]# du --apparent-size -h /var/lib/libvirt/images/Drupal.img
23G	/var/lib/libvirt/images/Drupal.img
[root@mine ~]# du -h /var/lib/libvirt/images/Drupal.img
2.2G	/var/lib/libvirt/images/Drupal.img
[root@mine ~]# ls -alh /var/lib/libvirt/images/Drupal.img
-rw------- 1 root root 23G 2009-12-13 22:59 /var/lib/libvirt/images/Drupal.img
Comment 20 Daniel Berrange 2009-12-22 15:26:43 EST
I think there are two bugs here

 - libvirt is not correctly creating a sparse image when cloning

 - virt-manager/virt-install are incorrectly using the 'allocation' of the original volume as the upper bound for the % progress display. They should use 'capacity', since there is no guarantee that the cloned volume will be able to honour sparseness, e.g. cloning from a sparse file to an LVM volume will require copying the full capacity, not merely the allocation.
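
As an illustration of the first point, here is a rough sketch of one common way a file copy can preserve sparseness: blocks that contain only zero bytes are skipped with a seek instead of being written. This is an assumed technique for illustration only, with a hypothetical function name and block size; it is not the code used by libvirt or virtinst. Note that the loop still reads the full capacity of the source, which is why capacity is the right total for a progress display.

```python
import os


def copy_sparse(src_path, dst_path, block_size=64 * 1024):
    """Copy src to dst, leaving holes in dst wherever src holds only zeros."""
    zero_block = bytes(block_size)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if block == zero_block[:len(block)]:
                # All zeros: move the write position forward without writing,
                # so the filesystem can leave this range unallocated.
                dst.seek(len(block), os.SEEK_CUR)
            else:
                dst.write(block)
        # A trailing hole is not materialised by seek alone, so set the
        # final length explicitly to match the source.
        dst.truncate(src.tell())
```

Whether the destination actually ends up sparse still depends on the target: a regular file on a filesystem with hole support will, a block device such as an LVM volume will not.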
Comment 21 Cole Robinson 2010-02-26 21:24:37 EST
Part 2 is tracked by bug 550870 (and is now fixed upstream).

As far as libvirt goes, sparse detection is a best effort attempt anyways. Fully allocating the disk image definitely seems broken though. Can anyone still easily reproduce? What is the host filesystem the disk image is on? What are the size differences between libvirt's clone, and just doing a straight 'cp' of the original file?

Reassigning to libvirt.
Comment 22 Cole Robinson 2010-02-28 23:13:24 EST
*** Bug 564462 has been marked as a duplicate of this bug. ***
Comment 23 Cole Robinson 2010-03-01 11:18:02 EST
*** Bug 564462 has been marked as a duplicate of this bug. ***
Comment 24 Bug Zapper 2010-03-15 08:49:18 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 25 Bug Zapper 2011-06-02 13:45:46 EDT
This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 26 Cole Robinson 2011-06-10 13:58:17 EDT
Pretty sure people have still reproduced this with the latest Fedora, so assigning to rawhide.
Comment 27 Cole Robinson 2011-07-11 13:15:47 EDT
*** Bug 667509 has been marked as a duplicate of this bug. ***
Comment 28 Michael Hines 2011-07-20 16:13:21 EDT
I can confirm additional details about this bug. It is definitely a libvirt bug. Here's what I did to reproduce the problem:

1. I have a RHEL 6.0 machine updated to use libvirt version 0.8.7. On it, I have a 700GB sparse file (20GB on disk) defined as a libvirt volume. I then tried to clone the volume using the virsh "vol-clone" command, at which point libvirtd created a 700GB *on disk* file instead of a 20GB on-disk file (confirmed with the du -h/ls -h commands).

2. Second step: I downgraded the RPMs from libvirt 0.8.7 to 0.8.1 and repeated the above steps. Sparse files were handled *correctly*. No problems at all with the vol-clone libvirt command.

3. Third step: I downloaded libvirt 0.9 from the internet to see whether this regression was still happening in more recent versions of libvirt. The bug still existed in the open source version: sparse files were still being cloned as non-sparse files.

So, it doesn't look like this is a Red Hat bug, but instead a libvirt bug.
Comment 29 Michael Hines 2011-09-15 11:56:46 EDT
Additional testing, running qemu-img manually (which is what libvirt uses), shows that it's the qemu-img command that's not handling the sparseness correctly.

There appears to be a regression in the qemu utility, which would make this a QEMU bug, not a libvirt bug.
Comment 30 Fedora Admin XMLRPC Client 2011-09-22 13:51:11 EDT
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 33 Richard W.M. Jones 2011-10-06 13:21:14 EDT
I'll just throw this in here ...

http://libguestfs.org/virt-sparsify.1.html
Comment 34 Fedora Admin XMLRPC Client 2011-11-30 14:31:58 EST
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.
Comment 39 Eric Blake 2012-04-12 07:59:11 EDT
Given comment 38, this has been fixed since 0.9.10; closing out this bug.
