Bug 1332019
Summary: RFE: storage: report volume <physical> size in XML
Product: Red Hat Enterprise Linux 7
Reporter: Shahar Havivi <shavivi>
Component: libvirt
Assignee: John Ferlan <jferlan>
Status: CLOSED ERRATA
QA Contact: yisun
Severity: high
Docs Contact:
Priority: high
Version: 7.2
CC: baptiste.agasse, crobinso, dyuan, jferlan, jsuchane, jtomko, lcheng, michal.skrivanek, mtessun, nsoffer, pkrempa, rbalakri, shavivi, xuzhang, yisun
Target Milestone: rc
Keywords: FutureFeature, Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-3.0.0-1.el7
Doc Type: Enhancement
Doc Text:
Feature: Display the <physical> value in the volume XML definition and allow fetching the physical size via an API call.
Reason: Currently libvirt can display the <allocation> and <capacity> values for volumes, but can display the <physical> value only via virDomainGetBlockInfo.
Result: A --physical flag has been added to the 'virsh vol-info' command to display the "Physical" value instead of the "Allocation" value. To fetch the value via the API, pass the VIR_STORAGE_VOL_GET_PHYSICAL flag to the new virStorageVolGetInfoFlags API. Additionally, the <physical> value has been added to the output of 'virsh vol-dumpxml'.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-01 17:09:12 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1382404, 1456801
Description
Shahar Havivi
2016-05-01 09:03:55 UTC
I'm not quite sure what you are requesting. The capacity field represents the maximum size of the storage volume. The allocation field represents the current size of the storage volume. Capacity and allocation differ for sparse volumes such as qcow2. Both are in bytes.

Python sample:
-------------------------------------------
vol = con.storageVolLookupByPath('/home/images/kvm/fedora22.qcow2')
vol.info()
[0, 8589934592L, 6895493120L]
-------------------------------------------
info() returns the disk allocation and capacity, but not the size in bytes that you can see via the shell:

~ ls -l fedora22.qcow2
-rwxrwxrwx 1 shahar shahar 8591507456 Apr 26 11:46 fedora22.qcow2*

(In reply to Shahar Havivi from comment #3)
> python sample:
> -------------------------------------------
> vol = con.storageVolLookupByPath('/home/images/kvm/fedora22.qcow2')
> vol.info()
> [0, 8589934592L, 6895493120L]
> -------------------------------------------
> the info() returns disk allocation and capacity but not size in bytes as you
> can see via the shell:
> ~ ls -l fedora22.qcow2
> -rwxrwxrwx 1 shahar shahar 8591507456 Apr 26 11:46 fedora22.qcow2*

The problem here is with ls and sparse file handling. ls -l returns the apparent file size, the same as you get with 'du --apparent-size --block-size=1' [1]. Libvirt returns the actual size that the file takes on disk. You can get the same number by dropping the --apparent-size argument.

[1] man du states:
    --apparent-size
        print apparent sizes, rather than disk usage; although the apparent
        size is usually smaller, it may be larger due to holes in ('sparse')
        files, internal fragmentation, indirect blocks, and the like

Vdsm would like to report download progress when downloading images using virStorageVolDownload:
https://libvirt.org/html/libvirt-libvirt-storage.html#virStorageVolDownload

Testing shows that we get the actual file size on disk, not the allocation or the capacity of the disk.
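(Aside: the apparent-size vs. on-disk-size distinction explained above can be reproduced directly from Python with os.stat(); this is an illustrative sketch only, and the image path is a placeholder.)

import os

path = '/home/images/kvm/fedora22.qcow2'  # placeholder image path

st = os.stat(path)

# Apparent size: what 'ls -l' and 'du --apparent-size --block-size=1' report.
apparent_size = st.st_size

# On-disk size: st_blocks counts 512-byte blocks actually allocated, which is
# what plain 'du --block-size=1' reports and matches the "actual size on disk"
# described above for a sparse file.
on_disk_size = st.st_blocks * 512

print 'apparent size: %d bytes' % apparent_size
print 'on-disk size:  %d bytes' % on_disk_size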
We would like to get this value in virStorageVolGetInfo:
https://libvirt.org/html/libvirt-libvirt-storage.html#virStorageVolGetInfo

For example, I created a 102M empty image (using virt-manager):

$ qemu-img info /path/to/tiny.qcow2
image: /path/to/tiny.qcow2
file format: qcow2
virtual size: 102M (107374592 bytes)
disk size: 304K
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: true
    refcount bits: 16
    corrupt: false

Downloading it using the download script below:

$ python download.py /path/to/tiny.qcow2
written 107741184 bytes

$ qemu-img info download.qcow2
image: download.qcow2
file format: qcow2
virtual size: 102M (107374592 bytes)
disk size: 103M
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: true
    refcount bits: 16
    corrupt: false

We discussed this issue here:
https://www.redhat.com/archives/libvirt-users/2016-April/msg00110.html

Daniel suggested using <physical> - the actual on-disk size:
https://www.redhat.com/archives/libvirt-users/2016-April/msg00112.html

But this is not available yet, and Cole asked that we open an RFE for adding it:
https://www.redhat.com/archives/libvirt-users/2016-April/msg00114.html

----
import sys
import libvirt

if len(sys.argv) < 2:
    print "Usage: download PATH"
    sys.exit(2)

path = sys.argv[1]

con = libvirt.open('qemu:///system')
vol = con.storageVolLookupByPath(path)
stream = con.newStream()
vol.download(stream, 0, 0, 0)

total = 0
with open('download.qcow2', 'wb', 1048576) as f:
    while True:
        buf = stream.recv(1048576)
        if len(buf) == 0:
            stream.finish()
            break
        f.write(buf)
        total += len(buf)

print 'written %d bytes' % total
----

We won't be able to extend VolGetInfo() since the return values are hardcoded in the API. But we can put a <physical> value in the XML. So the end result for something like qcow2 will be:

capacity: the virtual disk size visible to the VM, read from the qcow2 header
allocation: the amount of actual host disk allocation that is used
physical: the size of the disk image on the host if it was fully allocated, which will be larger than capacity due to qcow2 metadata, internal snapshots, and maybe other bits

Example: I currently have a file f23.qcow2 with capacity=40G, allocation=17G, physical=45G, due to several internal snapshots (I suspect). The physical value _is_ useful for the case Nir specifies: knowing the full size of the file we will download.

Move to consideration for 7.4.

So does parsing the output XML really suffice for the needs here? That would mean code would have to use the parse function, determine whether the <physical> tag was supplied, and then, if so, make a different decision than when it is not found (which is no different than today). One thing I am concerned about is that adding <physical> to the XML could lead down the path of confusion for input XML, at least with respect to someone expecting that, if they supplied <physical> on input, it would be honored. It almost feels as though a new API is needed that could supply the physical value, plus perhaps a few more things similar to the stats APIs added for domains, except this would be for a volume. Still, code would need to be created that would be able to use some new API and then handle the results. The "details" you are looking for can be found in virDomainBlockInfo - it's just that the physical level of detail wasn't carried over into the storage/volume code.

Does virDomainBlockInfo work fine with files as well? If so, we can use that for the physical size.

Another idea came to me...
How about if I create a virStorageVolInfoFlags API which accepts a 'flags' parameter where if the flags is some value (e.g. VIR_STORAGE_VOL_GET_PHYSICAL), then instead of returning the allocation value, the API will return the physical value in the allocation field (using the same logic as virDomainBlockInfo would use for an inactive domain).

I have a qcow2 file in my default pool:

# ls -al /home/vm-images/test-1g.qcow2
-rw-r--r--. 1 root root 1074135040 Jan 12  2016 /home/vm-images/test-1g.qcow2
# du -b /home/vm-images/test-1g.qcow2
1074135040      /home/vm-images/test-1g.qcow2

For virsh we have:

# virsh vol-info --pool default test-1g.qcow2 --bytes
Name:           test-1g.qcow2
Type:           file
Capacity:       1073741824 bytes
Allocation:     1074143232 bytes

But now I add a --physical option and I'd get:

# virsh vol-info --pool default test-1g.qcow2 --bytes --physical
Name:           test-1g.qcow2
Type:           file
Capacity:       1073741824 bytes
Physical:       1074135040 bytes

Which is I believe what you're looking for.

I have some patches which I'll post on libvir-list to see what kind of feedback I get. Maybe someone else has a more brilliant idea or maybe they dislike reusing the allocation field.

The python equivalent would be:

>>> import os
>>> import libvirt
>>> con = libvirt.open('qemu:///system')
>>> vol = con.storageVolLookupByPath('/home/vm-images/test-1g.qcow2')
>>> vol.infoFlags(0)
[0, 1073741824L, 1074143232L]
>>> vol.infoFlags(libvirt.VIR_STORAGE_VOL_GET_PHYSICAL)
[0, 1073741824L, 1074135040L]
>>> quit()

(In reply to John Ferlan from comment #13)
> Another idea came to me...
>
> How about if I create a virStorageVolInfoFlags API which accepts a 'flags'
> parameter where if the flags is some value (e.g.
> VIR_STORAGE_VOL_GET_PHYSICAL),

Sounds good...

> then instead of returning the allocation
> value, the API will return the physical value in the allocation field (using
> the same logic as virDomainBlockInfo would use for an inactive domain).

I think it should return physical additionally, not instead.

> But now I add a --physical option and I'd get:
>
> # virsh vol-info --pool default test-1g.qcow2 --bytes --physical
> Name:           test-1g.qcow2
> Type:           file
> Capacity:       1073741824 bytes
> Physical:       1074135040 bytes

I think this should be:

# virsh vol-info --pool default test-1g.qcow2 --bytes --physical
Name:           test-1g.qcow2
Type:           file
Capacity:       1073741824 bytes
Allocation:     1074143232 bytes
Physical:       1074135040 bytes

> Which is I believe what you're looking for.
>
> I have some patches which I'll post on libvir-list to see what kind of
> feedback I get. Maybe someone else has a more brilliant idea or maybe they
> dislike reusing the allocation field.
>
> The python equivalent would be:
>
> >>> import os
> >>> import libvirt
> >>> con = libvirt.open('qemu:///system')
> >>> vol = con.storageVolLookupByPath('/home/vm-images/test-1g.qcow2')
> >>> vol.infoFlags(0)
> [0, 1073741824L, 1074143232L]
> >>> vol.infoFlags(libvirt.VIR_STORAGE_VOL_GET_PHYSICAL)
> [0, 1073741824L, 1074135040L]

If you add the new flag, you can return 4 values instead of 3, there is no compatibility issue since old code cannot call the new flag.
So it will return:

>>> vol.infoFlags(libvirt.VIR_STORAGE_VOL_GET_PHYSICAL)
[0, 1073741824L, 1074135040L, 1074135040L]

So again, back to the original problem - virStorageVolGetInfo uses this struct:

struct _virStorageVolInfo {
    int type;                      /* virStorageVolType flags */
    unsigned long long capacity;   /* Logical size bytes */
    unsigned long long allocation; /* Current allocation bytes */
};

We cannot extend that structure as part of our API contract. So while virsh could make two calls to get all the data it needs, the API calls (what python ends up using) would still only be able to return two of the three values per call. I'm adding virStorageVolGetInfoFlags as a "shim" of sorts for virStorageVolGetInfo, and that means I'm using the same data structure to pass data. This is the "standard" usage model for other similar APIs when a "Flags" argument (or something similar) has been added.

BTW: Clients would also have to be able to "handle" when the target server doesn't have the *Flags API and fall back to the "old" API. That is, make a call to the *Flags API, then if there's an error, check for VIR_ERR_NO_SUPPORT as being the reason. If so, then only the old API will work.

In order to return all 3 values in one call, there'd have to be a new API, say for example virStorageVolGetStats, that would mimic the other stats APIs and would use the typed-params model in order to allow the data collected to be extended more easily. That's a more involved adjustment, though, and would also require client-side adjustments to parse the returned data. There are plenty of examples; search around for 'nparams' usages.

BTW, does virDomainBlockInfo return the size info for files as well by design, i.e. will it continue to return this info for local files in the future?

Not exactly sure of the question - although I do see you've now posted on libvirt-users as well, which I'll digest a bit later. Although I think what you could do is try things out yourself for the various formats you're concerned about and see what you get.

FWIW: The "public" docs on the structure are at:
http://libvirt.org/html/libvirt-libvirt-domain.html#virDomainBlockInfo

So, if I add that same test-1g.qcow2 to a domain, then I get:

# virsh domblkinfo f23 vdb
Capacity:       1073741824
Allocation:     1074139136
Physical:       1074135040

*when* the domain is *not* running. That I'd expect from the code algorithm. It's essentially making the same calls as the volume code would make (although different functions). However, when the domain was running that changed to:

# virsh domblkinfo f23 vdb
Capacity:       1073741824
Allocation:     0
Physical:       1074139136

because when it's running we ask qemu via 'query-blockstats', and nothing has touched the device yet, so nothing's been updated. I think that's a bug and have a patch for that which would set Allocation to 1074139136. With or without that patch, once I do something like 'parted /dev/vdb mklabel gpt --script', then I get:

# virsh domblkinfo f23 vdb
Capacity:       1073741824
Allocation:     1074135040
Physical:       1074139136

Which, yes, seems to be backwards unless you read the comments for the virDomainBlockInfo struct in libvirt-domain.h, which state (in part):

 * - qcow2 file in filesystem
 *   * capacity: logical size from qcow2 header
 *   * allocation: disk space occupied by file
 *   * physical: reported size of qcow2 file
 * - qcow2 file in a block device
 *   * capacity: logical size from qcow2 header
 *   * allocation: highest qcow extent written for an active domain
 *   * physical: size of the block device container

This would be the "former", where allocation is listed as "disk space occupied by file"...
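For reference, the client-side fallback described above (call the new *Flags API, and on VIR_ERR_NO_SUPPORT fall back to the old API) might look roughly like this in Python. This is an illustrative sketch only: vol_sizes() is a hypothetical helper, and it assumes python bindings new enough to expose infoFlags() and VIR_STORAGE_VOL_GET_PHYSICAL (libvirt >= 3.0).

import libvirt

def vol_sizes(vol):
    # Hypothetical helper: returns (capacity, allocation, physical);
    # physical is None when the server lacks the *Flags API.
    try:
        # With VIR_STORAGE_VOL_GET_PHYSICAL the third field carries the
        # physical value instead of the allocation.
        _, capacity, physical = vol.infoFlags(libvirt.VIR_STORAGE_VOL_GET_PHYSICAL)
    except libvirt.libvirtError as e:
        if e.get_error_code() != libvirt.VIR_ERR_NO_SUPPORT:
            raise
        physical = None
    # The old API always works and returns the allocation.
    _, capacity, allocation = vol.info()
    return capacity, allocation, physical

con = libvirt.open('qemu:///system')
vol = con.storageVolLookupByPath('/home/vm-images/test-1g.qcow2')  # path from the example above
print vol_sizes(vol)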
And of course now I'm thoroughly confused too (I didn't write the API, I'm just now reading/looking at it). I haven't tried the qcow2 block device yet. qemuDomainGetBlockInfo doesn't distinguish between raw/qcow2 when setting physical - it just takes what it got back from qemu. I have to investigate a bit more...

Thank you for the explanation. We probe the information only when the VM is down, so getting the physical size via virDomainBlockInfo is fine from the vdsm point of view. We just need to know whether virDomain.blockPeek is an issue, as we currently use virStorageVol.download (performance-wise - as posted in the mailing list).

If so, we can close this bug.

(In reply to Shahar Havivi from comment #18)
> Thank you for the explanation,
> We probe the information only when the VM is down, so getting the physical
> size via virDomainBlockInfo is fine from vdsm point of view,
> we just need to know if virDomain.blockPeek is not an issue as we use
> virStorageVol.download (performance wise - as posted in the mailing list).
>
> if so we can close this bug.

If we can close the bug:
1. Please do so.
2. Please solve bug 1382404 - which so far depended on this one.

I am not sure. If we can use virDomainBlockInfo for a file and get the physical size when the VM is down, we can close this bug, and bug 1382404 can be solved with that solution. Peter, can we rely on that? (libvirt didn't answer my mailing-list post.)

If only the answers to your questions were simple... It's the details that make things a bit more difficult. I've been doing a bit of research. During my research I found this gem of a commit - it really describes things well, at least for the GetBlockInfo results:

http://libvirt.org/git/?p=libvirt.git;a=commit;h=0282ca45a0a4c384226d526d4878cacd77936ef4

which more or less summarizes to:

capacity: logical size in bytes of the image (how much storage the guest will see)
allocation: host storage in bytes occupied by the image (such as highest allocated extent if there are no holes, similar to 'du')
physical: host physical size in bytes of the image container (last offset, similar to 'ls')

For a non-running guest, the physical value is the result of either an "fstat" call returning "st_size" (for guest devices backed by a file) or an "lseek" call returning the SEEK_END value (for guest devices backed by a block device).

Where this gets interesting/tricky is devices in a guest using block storage backed by some file that's a container (such as qcow2). I can create a 1G qcow2 "file" and have it managed through iSCSI as some /dev/sdX device; if the original file is not fully preallocated, the physical value will show up as essentially what is seen as the "allocation". An "active" system would be able to get the 1G size, and it'd be known as the "capacity".

So how does this relate to blockPeek? Well, the implementation of blockPeek in the qemu_driver code (qemuDomainBlockPeek) is essentially an lseek(). In my limited testing it would seem that blockPeek would return the same value as DomainBlockInfo 'physical', but I haven't gone through an exhaustive list of combinations. I do know that if the domain is active, in more recent libvirt releases DomainBlockInfo seems to be providing the wrong answer, especially when the backing source device is either sparse or some container (there are patches upstream for those).

So can virDomainBlockInfo be used for file-backed storage? Yes... Is it providing the right answer... Well, I hope so - if not for some specific case, then we need to know...
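As a concrete illustration of the offline-probe case discussed above, the same three values that 'virsh domblkinfo' prints can be fetched from Python via virDomainGetBlockInfo. A minimal sketch; the domain name and disk target are taken from the earlier example and serve only as placeholders:

import libvirt

con = libvirt.open('qemu:///system')
dom = con.lookupByName('f23')  # placeholder domain name

# blockInfo() wraps virDomainGetBlockInfo and returns
# [capacity, allocation, physical] in bytes for the given disk target.
capacity, allocation, physical = dom.blockInfo('vdb')

print 'Capacity:   %d' % capacity
print 'Allocation: %d' % allocation
print 'Physical:   %d' % physical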
And would virDomainBlockPeek provide the same answer as virDomainBlockInfo for file- or block-backed storage? It would seem so. Again, if it does not, then the details of which options are getting the right answer would need to be known.

FYI: I pushed a couple of patches related to this, so I'm going to move this to POST.

commit f62e418c86d95acba938e8fcead8fd720bd58fb1
Author: John Ferlan <jferlan>
Date:   Tue Nov 29 13:25:42 2016 -0500

    virsh: Allow display of the physical volume size

    Add a new qualifier '--physical' to the 'vol-info' command in order to
    dispaly the physical size of the volume. The size can differ from the
    allocation value depending on the volume file time. In particular, qcow2
    volumes will have a physical value larger than allocation. This also
    occurs for sparse files, although for those the capacity is the largest
    size; whereas, for qcow2 capacity is the logical size.

which requires:

commit 0c234889c4b0bd21ab2103c8bbac0290db1ff600
Author: John Ferlan <jferlan>
Date:   Tue Nov 29 10:44:36 2016 -0500

    storage: Introduce virStorageVolInfoFlags

    This function will essentially be a wrapper to virStorageVolInfo in
    order to provide a mechanism to have the "physical" size of the volume
    returned instead of the "allocation" size. This will provide similar
    capabilities to the virDomainBlockInfo which can return both allocation
    and physical of a domain storage volume.

    NB: Since we're reusing the _virStorageVolInfo and not creating a new
    _virStorageVolInfoFlags structure, we'll need to generate the rpc APIs
    remoteStorageVolGetInfoFlags and remoteDispatchStorageVolGetInfoFlags
    (although both were originally created from gendispatch.pl and then
    just copied into daemon/remote.c and src/remote/remote_driver.c).

    The new API will allow the usage of a VIR_STORAGE_VOL_GET_PHYSICAL flag
    and will make the decision to return the physical or allocation value
    into the allocation field. In order to get that physical value,
    virStorageBackendUpdateVolTargetInfoFD adds logic to fill in physical
    value matching logic in qemuStorageLimitsRefresh used by
    virDomainBlockInfo when the domain is inactive.

I also added:

commit 78661cb1f45742f27430b2629056f0397ceb2fd2
Author: John Ferlan <jferlan>
Date:   Tue Dec 13 10:56:21 2016 -0500

    conf: Display <physical> in output of voldef

    Although the virStorageBackendUpdateVolTargetInfo will update the
    target.physical value, there is no way to provide that information via
    the virStorageGetVolInfo API since it only returns the capacity and
    allocation of a volume. So as described in commit id '0282ca45', it
    should be possible to generate an output only <physical> value for that
    purpose.

    This patch generates the <physical> value in the volume XML output for
    the sole purpose of being able to view/see the value to allow someone
    to parse the XML in order to obtain the value.

    Update the documentation to describe the output only nature.

...

Tested against:
libvirt-3.0.0-2.el7.x86_64
qemu-kvm-rhev-2.8.0-4.el7.x86_64

PASSED

==========================================
Scenario 1: test with qcow2 img
==========================================
1. create a 100M qcow2 file

## qemu-img create -f qcow2 /var/lib/libvirt/images/test.qcow2 100M
Formatting '/var/lib/libvirt/images/test.qcow2', fmt=qcow2 size=104857600 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

2. start a vm using this qcow2 img

## virsh dumpxml vm1 | grep test.qcow2 -a5
...
   <disk type='file' device='disk'>
     <driver name='qemu' type='qcow2'/>
     <source file='/var/lib/libvirt/images/test.qcow2'/>
     <target dev='vdb' bus='virtio'/>
     <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
   </disk>
...

## virsh start vm1
Domain vm1 started

3. check the img size info

## virsh vol-info /var/lib/libvirt/images/test.qcow2
Name:           test.qcow2
Type:           file
Capacity:       100.00 MiB
Allocation:     196.00 KiB

## virsh vol-info /var/lib/libvirt/images/test.qcow2 --bytes
Name:           test.qcow2
Type:           file
Capacity:       104857600 bytes
Allocation:     200704 bytes

## du /var/lib/libvirt/images/test.qcow2 -h -B 1
200704  /var/lib/libvirt/images/test.qcow2    <== correct

## virsh vol-info /var/lib/libvirt/images/test.qcow2 --physical
Name:           test.qcow2
Type:           file
Capacity:       100.00 MiB
Physical:       192.01 KiB

## virsh vol-info /var/lib/libvirt/images/test.qcow2 --bytes --physical
Name:           test.qcow2
Type:           file
Capacity:       104857600 bytes
Physical:       196616 bytes

## ll /var/lib/libvirt/images/test.qcow2
-rw-r--r--. 1 qemu qemu 196616 Mar  1 16:41 /var/lib/libvirt/images/test.qcow2    <== correct

4. login vm and do some disk io to vdb

## virsh console vm1
Connected to domain vm1
[root@yisun_vm1 ~]# mkfs.ext4 /dev/vdb
...
Writing superblocks and filesystem accounting information: done
[root@yisun_vm1 ~]# mount /dev/vdb /mnt
[root@yisun_vm1 ~]# cd /mnt
[root@yisun_vm1 mnt]# dd if=/dev/urandom of=/mnt/file.10M bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.551078 s, 19.0 MB/s
[root@yisun_vm1 mnt]# sync

5. On host, check the vol-info again

## virsh vol-info /var/lib/libvirt/images/test.qcow2 --bytes
Name:           test.qcow2
Type:           file
Capacity:       104857600 bytes
Allocation:     33689600 bytes

## du /var/lib/libvirt/images/test.qcow2 -h -B 1
33689600        /var/lib/libvirt/images/test.qcow2    <== correct

## virsh vol-info /var/lib/libvirt/images/test.qcow2 --bytes --physical
Name:           test.qcow2
Type:           file
Capacity:       104857600 bytes
Physical:       19136512 bytes

## ll /var/lib/libvirt/images/test.qcow2
-rw-r--r--. 1 qemu qemu 19136512 Mar  1 16:56 /var/lib/libvirt/images/test.qcow2    <== correct

6. check the vol's xml output

## virsh vol-dumpxml /var/lib/libvirt/images/test.qcow2
<volume type='file'>
  <name>test.qcow2</name>
  <key>/var/lib/libvirt/images/test.qcow2</key>
  <source>
  </source>
  <capacity unit='bytes'>104857600</capacity>
  <allocation unit='bytes'>33689600</allocation>    <====== correct
  <physical unit='bytes'>19136512</physical>        <====== correct
  <target>
    <path>/var/lib/libvirt/images/test.qcow2</path>
    <format type='qcow2'/>
    <permissions>
      <mode>0644</mode>
      <owner>107</owner>
      <group>107</group>
      <label>system_u:object_r:svirt_image_t:s0:c61,c223</label>
    </permissions>
    <timestamps>
      <atime>1488358654.397527003</atime>
      <mtime>1488358606.935899625</mtime>
      <ctime>1488358606.935899625</ctime>
    </timestamps>
    <compat>1.1</compat>
    <features/>
  </target>
</volume>

==========================================
Scenario 2: test with raw img
==========================================
1. prepare a 100M raw img

## qemu-img create -f raw /var/lib/libvirt/images/test.raw 100M
Formatting '/var/lib/libvirt/images/test.raw', fmt=raw size=104857600

2. start a vm using this img as vdb

## virsh dumpxml vm1 | grep test.raw -a5
     <alias name='virtio-disk0'/>
     <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
   </disk>
   <disk type='file' device='disk'>
     <driver name='qemu' type='raw'/>
     <source file='/var/lib/libvirt/images/test.raw'/>
     <backingStore/>
     <target dev='vdb' bus='virtio'/>
     <alias name='virtio-disk1'/>
     <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
   </disk>

## virsh start vm1
Domain vm1 started

3. login vm and create a 10M file in vdb

[root@yisun_vm1 ~]# mkfs.ext4 /dev/vdb
...
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done
[root@yisun_vm1 ~]# mount /dev/vdb /mnt
[root@yisun_vm1 ~]# dd if=/dev/urandom of=/mnt/file bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.552296 s, 19.0 MB/s
[root@yisun_vm1 ~]# sync

4. In host check the img's size info with vol-info

## virsh vol-info /var/lib/libvirt/images/test.raw
Name:           test.raw
Type:           file
Capacity:       100.00 MiB
Allocation:     17.51 MiB

## virsh vol-info /var/lib/libvirt/images/test.raw --bytes
Name:           test.raw
Type:           file
Capacity:       104857600 bytes
Allocation:     18362368 bytes

## du /var/lib/libvirt/images/test.raw -h -B 1
18362368        /var/lib/libvirt/images/test.raw    <=== correct

## virsh vol-info /var/lib/libvirt/images/test.raw --physical
Name:           test.raw
Type:           file
Capacity:       100.00 MiB
Physical:       100.00 MiB

## virsh vol-info /var/lib/libvirt/images/test.raw --physical --bytes
Name:           test.raw
Type:           file
Capacity:       104857600 bytes
Physical:       104857600 bytes

## ls -l /var/lib/libvirt/images/test.raw
-rw-r--r--. 1 qemu qemu 104857600 Mar  1 17:54 /var/lib/libvirt/images/test.raw    <== correct

5. check the vol's xml with vol-dumpxml

## virsh vol-dumpxml /var/lib/libvirt/images/test.raw
<volume type='file'>
  <name>test.raw</name>
  <key>/var/lib/libvirt/images/test.raw</key>
  <source>
  </source>
  <capacity unit='bytes'>104857600</capacity>
  <allocation unit='bytes'>18362368</allocation>    <=== correct
  <physical unit='bytes'>104857600</physical>       <=== correct
  <target>
    <path>/var/lib/libvirt/images/test.raw</path>
    <format type='raw'/>
    <permissions>
      <mode>0644</mode>
      <owner>107</owner>
      <group>107</group>
      <label>system_u:object_r:svirt_image_t:s0:c30,c975</label>
    </permissions>
    <timestamps>
      <atime>1488362201.883382323</atime>
      <mtime>1488362087.313870217</mtime>
      <ctime>1488362087.313870217</ctime>
    </timestamps>
  </target>
</volume>

==========================================
Scenario 3: test with disk pool volume (file system)
==========================================
1. prepare a disk pool "sdd" with a partition "sdd1"

## virsh pool-dumpxml sdd
<pool type='disk'>
  <name>sdd</name>
  <uuid>2c136b6d-8da1-43e7-b9a0-c1ed3f4fba67</uuid>
  <capacity unit='bytes'>8003197440</capacity>
  <allocation unit='bytes'>109738496</allocation>
  <available unit='bytes'>7893426688</available>
  <source>
    <device path='/dev/sdd'>
      <freeExtent start='109770752' end='8003197440'/>
    </device>
    <format type='dos'/>
  </source>
  <target>
    <path>/dev</path>
    <permissions>
      <mode>0700</mode>
      <owner>0</owner>
      <group>0</group>
    </permissions>
  </target>
</pool>

## virsh vol-list sdd --details
 Name  Path       Type   Capacity    Allocation
------------------------------------------------
 sdd1  /dev/sdd1  block  104.65 MiB  104.65 MiB

2. start a vm using this volume as virtual disk

## virsh dumpxml vm1 | grep sdd -a5
...
   <disk type='volume' device='disk'>
     <driver name='qemu' type='raw'/>
     <source pool='sdd' volume='sdd1'/>
     <backingStore/>
     <target dev='vdb' bus='virtio'/>
     <alias name='virtio-disk1'/>
     <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
   </disk>
...

## virsh start vm1
Domain vm1 started

3. login vm and create a 10M file in vdb

## virsh console vm1
Connected to domain vm1
[root@yisun_vm1 ~]# mkfs.ext4 /dev/vdb
...
Allocating group tables: done
Writing inode tables: done
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done
[root@yisun_vm1 ~]# mount /dev/vdb /mnt
[root@yisun_vm1 ~]# dd if=/dev/urandom of=/mnt/file bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.550929 s, 19.0 MB/s
[root@yisun_vm1 ~]# sync

4. check the vol-info size in host

root@localhost~ ## virsh vol-info --pool sdd sdd1
Name:           sdd1
Type:           block
Capacity:       104.65 MiB
Allocation:     104.65 MiB

root@localhost~ ## virsh vol-info --pool sdd sdd1 --bytes
Name:           sdd1
Type:           block
Capacity:       109738496 bytes
Allocation:     109738496 bytes

root@localhost~ ## virsh vol-info --pool sdd sdd1 --physical
Name:           sdd1
Type:           block
Capacity:       104.65 MiB
Physical:       104.65 MiB

root@localhost~ ## virsh vol-info --pool sdd sdd1 --physical --bytes
Name:           sdd1
Type:           block
Capacity:       109738496 bytes
Physical:       109738496 bytes

since it's a partition, use fdisk to check the size info

root@localhost~ ## fdisk -l /dev/sdd1
Disk /dev/sdd1: 109 MB, 109738496 bytes, 214333 sectors    <=== correct
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

5. check the vol's xml info with vol-dumpxml

## virsh vol-dumpxml /dev/sdd1
<volume type='block'>
  <name>sdd1</name>
  <key>/dev/sdd1</key>
  <source>
    <device path='/dev/sdd'>
      <extent start='32256' end='109770752'/>
    </device>
  </source>
  <capacity unit='bytes'>109738496</capacity>
  <allocation unit='bytes'>109738496</allocation>    <== correct
  <physical unit='bytes'>109738496</physical>        <== correct
  <target>
    <path>/dev/sdd1</path>
    <format type='none'/>
    <permissions>
      <mode>0660</mode>
      <owner>0</owner>
      <group>6</group>
      <label>system_u:object_r:fixed_disk_device_t:s0</label>
    </permissions>
    <timestamps>
      <atime>1488356966.299259373</atime>
      <mtime>1488356966.299259373</mtime>
      <ctime>1488356966.299259373</ctime>
    </timestamps>
  </target>
</volume>

(In reply to John Ferlan from comment #13)
> Another idea came to me...
>
> How about if I create a virStorageVolInfoFlags API which accepts a 'flags'
> parameter where if the flags is some value (e.g.
> VIR_STORAGE_VOL_GET_PHYSICAL), then instead of returning the allocation
> value, the API will return the physical value in the allocation field (using
> the same logic as virDomainBlockInfo would use for an inactive domain).
>
> I have a qcow2 file in my default pool:
>
> # ls -al /home/vm-images/test-1g.qcow2
> -rw-r--r--. 1 root root 1074135040 Jan 12  2016 /home/vm-images/test-1g.qcow2
> # du -b /home/vm-images/test-1g.qcow2
> 1074135040      /home/vm-images/test-1g.qcow2

Hi John,
I hit some failures in automated test cases. The reason is that we used "ls -al /path/to/file" to get the file's size and compared it with virsh's "Physical" value; sometimes they differ. But those cases are about "domblkinfo", not vol-info, since vol-info is not automated yet.
And I reread this bug and found some key info in comment 4, provided by Peter, as follows:
"""
ls -l returns the apparent file size, the same as you get with 'du --apparent-size --block-size=1' [1]. Libvirt returns the actual size that it takes on the disk. You can get the same number by *** dropping the --apparent-size argument ***.
"""
So, as Peter said, I thought the physical size is returned by:
  du /path/to/file --block-size=1
But in your comment 13, you mentioned:
  du /path/to/file -b  (= du /path/to/file --apparent-size --block-size=1)
So which one is correct? And with the domblkinfo test, when I remove --apparent-size it always passes, but with --apparent-size it occasionally fails.

Needed to dredge up my recollections in this area, as late Nov. and early Dec. seem so long ago.

First off, I'll use du [-b|-k|-m] ... depending upon the output I'm comparing the calculation against. I did not "dig into" the du code in order to determine the differences between using just '--block-size=1' and '--block-size=1 --apparent-size'. While working on this code I do recall keeping a hand-written table of the various ways the output was formatted (ls, du, vol-info, domblkinfo, and qemu-img). Suffice it to say, there's a bit of variation.

IIRC, prior to any of the changes that went into altering the libvirt code, there were 3 different methods to get block sizes: one via volume processing, one via getblkinfo if the guest is running, and one via getblkinfo when the guest wasn't running. Furthermore, each had somewhat differing results when a sparse file was involved, or, even more confusingly, when a device backed by a file was being used (e.g. iSCSI, or for pure enjoyment a device backed by a qcow2 file). In the long run, there were quite a few patches that got me to the point of being able to "converge" the algorithms used by the qemu and storage backends... The goal I had was getting a consistent response from libvirt for the output - a result that was also consistent with how qemu-img reported things, and less so with the differences between how ls and du reported things.

See the series starting at:
https://www.redhat.com/archives/libvir-list/2016-December/msg00020.html

Then a single patch:
https://www.redhat.com/archives/libvir-list/2016-December/msg00568.html

and finally the series that was for this patch:
https://www.redhat.com/archives/libvir-list/2016-December/msg00570.html

So back to what seems to be the question/issue - whether QE should use 'ls' or 'du' in order to validate results... Perhaps this is best answered by reading the tea leaves of what the "--apparent-size" output is (I think) hinting at and what it appears you've found from testing - the output generated by --apparent-size is somewhat indeterminate and can *change* depending on a number of factors (sparseness, fragmentation, indirect blocks, etc.) - so is that a result you want to trust as the be-all and end-all?

Secondary to that: how far off are the results? Is there a standard deviation? If the test doesn't change, but the calculation provided by du does change between runs and causes your tests to occasionally fail, is it the best value to use? Is that bad value always the same bad value, or does it change for each of the failures? Does the bad value match the ls output value?
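One way to compare the candidate measurements side by side on a failing file is a small helper along these lines. This is an illustrative sketch only, not part of the original test suite; the image path is a placeholder, and the 'actual-size' field of qemu-img's JSON output is the on-disk size as qemu reports it:

import json
import os
import subprocess

path = '/var/lib/libvirt/images/libvirt-test-api'  # placeholder path

st = os.stat(path)
apparent = st.st_size          # what 'ls -l' and 'du --apparent-size --block-size=1' report
on_disk = st.st_blocks * 512   # what plain 'du --block-size=1' reports

info = json.loads(subprocess.check_output(
    ['qemu-img', 'info', '--output=json', path]))
actual = info['actual-size']   # qemu-img's view of the on-disk size

print 'ls/apparent size:     %d bytes' % apparent
print 'du on-disk size:      %d bytes' % on_disk
print 'qemu-img actual-size: %d bytes' % actual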
One thing I recall during testing of all this, while also looking at the guest results, is that it was "important", from the libvirt perspective at least, that the guest actually opened the file; otherwise, the highwater mark wasn't necessarily known and would return 0. And by "guest results" I mean logging into a guest and seeing what it showed for the volume size. As soon as I did that, I would get similar results to qemu-img.

So, here's a suggestion - use qemu-img instead of du and see what happens.

(In reply to John Ferlan from comment #26)
> Needed to dredge up my recollections in this area, as late Nov. and early
> Dec. seem so long ago.
>
> First off, I'll use du [-b|-k|-m] ... depending upon the output I'm
> comparing the calculation against. I did not "dig into" the du code in order
> to determine the differences between using just '--block-size=1' and
> '--block-size=1 --apparent-size'. While working on this code I do recall
> keeping a hand-written table of the various ways the output was formatted
> (ls, du, vol-info, domblkinfo, and qemu-img). Suffice it to say, there's a
> bit of variation.

"ls -l" returns a size of "1352859648", "du --block-size=1 --apparent-size" returns "2147418112", and "qemu-img info --output=json" returns "2147418112". As follows:

02:37:46|INFO |the block size from 'ls' is 1352859648
02:37:46|INFO |the Physical value is 2147418112
02:37:46|ERROR |qemu-img info output is: {
    "virtual-size": 10737418240,
    "filename": "/var/lib/libvirt/images/libvirt-test-api",
    "cluster-size": 65536,
    "format": "qcow2",
    "actual-size": 2147418112,
    "format-specific": {
        "type": "qcow2",
        "data": {
            "compat": "1.1",
            "lazy-refcounts": false
        }
    },
    "dirty-flag": false
}

So we'll use qemu-img info as the baseline. Thanks for your confirmation.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846