861425 – qemu-img should discard empty blocks on block devices



Summary: qemu-img should discard empty blocks on block devices

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	qemu-kvm-rhev
Sub Component:
Version:	7.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	rc
Target Release:	---
Assignee:	Kevin Wolf
QA Contact:	Virtualization Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2012-09-28 15:09 UTC by Neil Wilson
Modified:	2016-09-06 09:10 UTC (History)
CC List:	11 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2016-08-31 08:36:01 UTC
Target Upstream Version:
Embargoed:


Description Neil Wilson 2012-09-28 15:09:23 UTC

Description of problem:

When qemu-img copies a sparse image file (qcow, raw, etc.) onto a thinly provisioned LVM block device, it allocates every sector of the disk's virtual size instead of only the sectors that actually hold data.


Version-Release number of selected component (if applicable):

Version     : 0.12.1.2
Release     : 2.295.el6_3.2

How reproducible:




Steps to Reproduce:
1. Create a sparse qcow or raw virtual machine image on the normal filesystem.
2. Create a thinly provisioned LVM partition, e.g. lvcreate --thin servers/mysql_thin --virtualsize 50G --name srv-msql3-vol
3. Note the block usage of the VM image on the normal filesystem with ls -lsh (in my case 2.3G allocated for a 10G virtual size).
4. Copy the image to the partition with qemu-img: qemu-img convert -p /var/lib/libvirt/images/srv-testb-vol.raw -O host_device /dev/servers/srv-msql3-vol
5. Run 'lvs' and note the Origin Data size of the LVM partition.
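
The comparison in step 3 (apparent size versus blocks actually allocated) can also be read programmatically. The following is a minimal sketch, assuming GNU coreutils stat; the helper names are hypothetical and not part of the original report.

```shell
# Hypothetical helpers (not from the report); assume GNU coreutils stat.

# Apparent size in bytes: the size column that "ls -l" shows.
apparent_bytes() {
    stat -c '%s' "$1"
}

# Bytes actually allocated on disk: the first column of "ls -s".
# %b is the number of allocated blocks, %B the size of each block in bytes.
allocated_bytes() {
    # Split stat's "blocks blocksize" output into $1 and $2.
    set -- $(stat -c '%b %B' "$1")
    echo $(( $1 * $2 ))
}
```

For a sparse image, allocated_bytes is expected to be well below apparent_bytes; for the image in this report that would correspond to roughly 2.3G against 10G.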
  
Actual results:

The Origin Data size is 20% of a 50G thin pool, i.e. 10G.


Expected results:

The Origin Data size should be nearer to 2.3G (i.e. 4.6% of a 50G thin pool).


Additional info:

You get the same effect with 'cp --sparse=always'.

Comment 1 Neil Wilson 2012-09-28 15:12:01 UTC

You get the same effect using '-O raw' on the qemu-img command line.

There is an incomplete patch on the QEMU mailing list that implements BLKDISCARD support on host devices:

http://lists.gnu.org/archive/html/qemu-devel/2011-11/msg01659.html

Comment 3 Neil Wilson 2012-09-28 15:30:48 UTC

Note that this requires a kernel with dm-thin discard support enabled.

Comment 4 Neil Wilson 2012-09-28 15:34:40 UTC

See also: https://bugzilla.redhat.com/show_bug.cgi?id=835622

Comment 6 Ademar Reis 2013-05-17 15:25:32 UTC

Neil: Thanks for taking the time to enter a bug report with us. We appreciate
the feedback and look to use reports such as this to guide our efforts at
improving our products. That being said, we're not able to guarantee the
timeliness or suitability of a resolution for issues entered here because this
is not a mechanism for requesting support.

If this issue is critical or in any way time sensitive, please raise a ticket
through your regular Red Hat support channels to make certain it receives the
proper attention and prioritization to assure a timely resolution.

For information on how to contact the Red Hat production support team, please
visit: https://www.redhat.com/support/process/production/#howto

We'll target a fix upstream, which will probably be included in RHEL7.

Comment 9 Kevin Wolf 2014-07-17 13:44:58 UTC

I believe this has worked since upstream QEMU 2.0, provided the option
'-t none' is used for 'qemu-img convert'. The next rebase will take care of it
then. It looks as if the BLKDISCARD ioctl doesn't work reliably without
O_DIRECT, so we can't use it for other cache modes.

Neil, can you please check with a current upstream QEMU whether it behaves as
you expected when you reported this?

Comment 11 huiqingding 2014-08-13 02:26:02 UTC

I tested qemu-img-rhev-2.1.0-1.el7.x86_64 using the steps of comment 4.

Before running "qemu-img convert":
1. Check the raw image: the virtual size is 10G and the disk size is 2.5G:
# qemu-img info test.img 
image: test.img
file format: raw
virtual size: 10G (10737418240 bytes)
disk size: 2.5G
# ls -lsh test.img 
2.6G -rw-r--r--. 1 root root 10G Aug 13 10:10 test.img

2. Check the thinly provisioned LVM partition: Origin Data% is 0.00:
# lvs
  LV            VG              Attr       LSize   Pool       Origin Data%  Move Log Cpy%Sync Convert
  home          rhel_dhcp-8-248 -wi-ao---- 407.50g                                                   
  root          rhel_dhcp-8-248 -wi-ao----  50.00g                                                   
  swap          rhel_dhcp-8-248 -wi-ao----   7.77g                                                   
  mysql_thin    servers         twi-a-tz--  30.00g                     0.00                          
  srv-msql3-vol servers         Vwi-a-tz--  30.00g mysql_thin          0.00   

3. Copy the image to the thinly provisioned LVM partition with "-t none":
# qemu-img convert -p test.img -O raw -t none /dev/servers/srv-msql3-vol
    (100.00/100%)

4. Check the thinly provisioned LVM partition again:
# lvs
  LV            VG              Attr       LSize   Pool       Origin Data%  Move Log Cpy%Sync Convert
  home          rhel_dhcp-8-248 -wi-ao---- 407.50g                                                   
  root          rhel_dhcp-8-248 -wi-ao----  50.00g                                                   
  swap          rhel_dhcp-8-248 -wi-ao----   7.77g                                                   
  mysql_thin    servers         twi-a-tz--  30.00g                    33.33                          
  srv-msql3-vol servers         Vwi-a-tz--  30.00g mysql_thin         33.33       

After step 4, the Origin Data% is 33.33% of a 30G thin pool, which is about 10G, but it should be about 2.5G.
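
The conversion from the Data% column to an absolute size can be sketched as follows. The helper name is hypothetical; the figures are the ones from the lvs output in this comment.

```shell
# Hypothetical helper: convert an lvs Data% value and an LV size (in GiB)
# into the amount of space allocated in the thin pool (in GiB).
data_pct_to_gib() {
    awk -v pct="$1" -v size="$2" 'BEGIN { printf "%.2f\n", pct * size / 100 }'
}

# Figures from the lvs output after the conversion: 33.33% of a 30G volume.
data_pct_to_gib 33.33 30   # prints 10.00
```

By the same arithmetic, 2.5G of allocated data on a 30G volume would show up as a Data% of roughly 8.3. On a live system the percentage can probably be read directly with "lvs --noheadings -o data_percent servers/srv-msql3-vol", though that invocation was not part of the test above.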

Comment 12 huiqingding 2014-08-13 02:27:40 UTC

Based on the result of comment 11, I have changed the status to "ASSIGNED". If I am wrong, please correct me.

Comment 15 Jeff Nelson 2015-11-05 15:34:33 UTC

Comment 9 suggests that the problem was fixed in upstream QEMU 2.0, but the test results in comment 11 show that the problem still exists.

Meanwhile, rebasing to QEMU 2.0 has occurred, so the value in Fixed In Version is no longer relevant. Therefore, I'm clearing the Fixed In Version field. Please let me know if you have any questions.

Comment 16 Kevin Wolf 2016-08-05 16:41:57 UTC

I tried to reproduce the failure on a current QEMU version, and I could indeed
see that the full space was taken. On a closer look I saw that QEMU issued a
BLKDISCARDZEROES ioctl, i.e. it asked the kernel whether the device supported
zeroing data with BLKDISCARD, and the answer was no. So that's why it had to
write all of the data explicitly.

I'm not sure what needs to be done so that a thin LV actually advertises that
it supports this (if it's possible at all), but can you please check "blockdev
--getdiscardzeroes" for your LV to see if that's the same problem for you?

If so, then this isn't a QEMU problem because QEMU can only make use of
BLKDISCARD if that guarantees to zero out the image.
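
The check requested above can be scripted roughly as follows. This is a minimal sketch: the helper name and its output strings are hypothetical, and the device path in the usage line is the one from comment 11.

```shell
# Hypothetical helper: interpret the 0/1 value printed by
# "blockdev --getdiscardzeroes <device>".
discard_zeroes_verdict() {
    case "$1" in
        1) echo "zeroes-guaranteed" ;;   # discarded blocks read back as zeros
        0) echo "not-guaranteed" ;;      # QEMU has to write the zeros explicitly
        *) echo "unknown"; return 1 ;;
    esac
}

# Usage (needs root and a real block device):
# discard_zeroes_verdict "$(blockdev --getdiscardzeroes /dev/servers/srv-msql3-vol)"
```

A result of "not-guaranteed" would match the behaviour described in this comment, where the whole virtual size ends up allocated.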

Comment 17 Kevin Wolf 2016-08-31 08:36:01 UTC

Almost four weeks without an answer and the 7.3 cycle is nearing its end, so
I'll just assume that my suspicion is right and close the bug. If you
eventually find out that your scenario is different, please reopen.
