Description of problem:
When creating or resizing (with the --allocate option) a volume in an NFS type pool, the allocated size is not correct.

Version-Release number of selected component (if applicable):
libvirt-1.1.1-27.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Define and start an NFS pool
# virsh pool-dumpxml temp_pool_1
<pool type='netfs'>
  <name>temp_pool_1</name>
  <uuid>80c94197-348e-4bb6-a35c-fef8e490c09c</uuid>
  <capacity unit='bytes'>105555951616</capacity>
  <allocation unit='bytes'>15330181120</allocation>
  <available unit='bytes'>90225770496</available>
  <source>
    <host name='localhost'/>
    <dir path='/home/ydu/Work/virt-test/tmp/nfs-export'/>
    <format type='auto'/>
  </source>
  <target>
    <path>/home/ydu/Work/virt-test/tmp/tmp_pool_target</path>
    <permissions>
      <mode>0755</mode>
      <owner>-1</owner>
      <group>-1</group>
    </permissions>
  </target>
</pool>

2. Create a volume in the pool
# virsh vol-create-as nfs-pool test-vol1 --capacity 10M --allocation 10M
Vol test-vol1 created

# virsh vol-info test-vol1 nfs-pool
Name:           test-vol1
Type:           file
Capacity:       10.00 MiB
Allocation:     664.00 KiB

# virsh vol-resize test-vol1 --pool nfs-pool 20M
Size of volume 'test-vol1' successfully changed to 20M

# virsh vol-info test-vol1 nfs-pool
Name:           test-vol1
Type:           file
Capacity:       20.00 MiB
Allocation:     668.00 KiB

# virsh vol-resize test-vol1 --pool nfs-pool 40M --allocate
Size of volume 'test-vol1' successfully changed to 40M

# virsh vol-info test-vol1 nfs-pool
Name:           test-vol1
Type:           file
Capacity:       40.00 MiB
Allocation:     1.30 MiB

Actual results:
After create/resize, the allocation of the volume is wrong.

Expected results:
The allocation of the volume should equal the specified size.

Additional info:
Hit this issue as well when the netfs pool has format type=glusterfs.

Description of problem:
vol-create-as --allocation fails to allocate the given size in a netfs pool with format type=glusterfs

Version-Release number of selected component (if applicable):
libvirt-1.1.1-28.el7.x86_64
qemu-kvm-1.5.3-53.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Define, build and start a netfs pool with format type=glusterfs
# virsh pool-dumpxml netfs-gluster
<pool type='netfs'>
  <name>netfs-gluster</name>
  <uuid>d5609ced-94b1-489e-b218-eff35c30336a</uuid>
  <capacity unit='bytes'>856564301824</capacity>
  <allocation unit='bytes'>43650121728</allocation>
  <available unit='bytes'>812914180096</available>
  <source>
    <host name='10.66.84.12'/>
    <dir path='/gluster-vol1'/>
    <format type='glusterfs'/>
  </source>
  <target>
    <path>/mnt/gluster</path>
    <permissions>
      <mode>0755</mode>
      <owner>-1</owner>
      <group>-1</group>
    </permissions>
  </target>
</pool>

# virsh pool-list --all
 Name                 State      Autostart
-----------------------------------------
 default              active     yes
 netfs-gluster        active     no

2. Use vol-create-as --allocation to create a volume in the netfs pool; the volume allocation is wrong
# virsh vol-create-as netfs-gluster rh7-crtas.img --capacity 4G --allocation 1G --format raw
Vol rh7-crtas.img created

# virsh vol-info rh7-crtas.img netfs-gluster
Name:           rh7-crtas.img
Type:           file
Capacity:       4.00 GiB
Allocation:     32.10 MiB

# virsh vol-create-as netfs-gluster rh7-qcow2-crtas.img --capacity 4G --allocation 1G --format qcow2
Vol rh7-qcow2-crtas.img created

# virsh vol-info rh7-qcow2-crtas.img netfs-gluster
Name:           rh7-qcow2-crtas.img
Type:           file
Capacity:       4.00 GiB
Allocation:     192.50 KiB

3. Use vol-create-as --allocation to create a volume in the default pool; the volume allocation is correct
# virsh vol-create-as default rh7-crtas-default.img --capacity 4G --allocation 1G --format raw
Vol rh7-crtas-default.img created

# virsh vol-info rh7-crtas-default.img default
Name:           rh7-crtas-default.img
Type:           file
Capacity:       4.00 GiB
Allocation:     1.00 GiB

# virsh vol-create-as default rh7-qcow2-crtas-default.img --capacity 4G --allocation 1G --format raw
Vol rh7-qcow2-crtas-default.img created

# virsh vol-info rh7-qcow2-crtas-default.img default
Name:           rh7-qcow2-crtas-default.img
Type:           file
Capacity:       4.00 GiB
Allocation:     1.00 GiB

Actual results:
In step 2, vol-create-as --allocation in the netfs pool creates a volume whose allocation is wrong.

Expected results:
In step 2, vol-create-as --allocation in the netfs pool should create a volume whose allocation is correct.
Reproducing the problem for an NFS pool is rather simple - although the expectations of the commands used are perhaps not necessarily true compared to what would happen with a "raw" file in a file-backed pool. For example, using the default pool:

# virsh vol-create-as default test-vol1 --capacity 10M --allocation 10M
Vol test-vol1 created

# virsh vol-info test-vol1 default
Name:           test-vol1
Type:           file
Capacity:       10.00 MiB
Allocation:     10.00 MiB

# virsh vol-resize test-vol1 --pool default 20M
Size of volume 'test-vol1' successfully changed to 20M

# virsh vol-info test-vol1 default
Name:           test-vol1
Type:           file
Capacity:       20.00 MiB
Allocation:     10.00 MiB

# virsh vol-resize test-vol1 --pool default 40M --allocate
Size of volume 'test-vol1' successfully changed to 40M

# virsh vol-info test-vol1 default
Name:           test-vol1
Type:           file
Capacity:       40.00 MiB
Allocation:     40.00 MiB
#

So for the resize to 20M - you'll note that the Allocation doesn't change, and I wouldn't expect it to, since you're just changing the capacity with the 20M.

In any case, raw files in both the default and nfs pools share the same code to get the capacity/allocation data. Unfortunately, using (f)stat() on an NFS pool file yields different values than for the default pool. For example, in my environment I see the following results from an "fstat(fd, sb)" for the initial creation:

st_size=10485760
st_blocks=88
bsize=1048576
BSIZE=512

The libvirt code multiplies st_blocks by BSIZE, resulting in an allocation value of 45056 bytes, or in my case 44 KiB. The stat() man page does describe possible issues with using st_blocks and bsize in an NFS environment. I'm investigating ways to resolve this.

With respect to the Gluster comment tacked on above - I would assume it's a similar issue for Gluster, although I don't have one of those configurations to test.
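For anyone who wants to see the raw numbers without going through libvirt, here is a minimal standalone sketch of the same calculation (this is not the libvirt source; the default path is just the example file from this report):

/* Print a file's apparent size (st_size) next to the st_blocks * 512
 * product that the allocation math described above is based on. */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/home/nfs_pool/target/test-vol1";
    struct stat sb;
    int fd = open(path, O_RDONLY);

    if (fd < 0 || fstat(fd, &sb) < 0) {
        perror(path);
        return EXIT_FAILURE;
    }

    /* st_blocks is counted in 512-byte units regardless of st_blksize. */
    printf("st_size   = %lld bytes (capacity)\n", (long long)sb.st_size);
    printf("st_blocks = %lld -> allocation = %lld bytes\n",
           (long long)sb.st_blocks, (long long)sb.st_blocks * 512);

    close(fd);
    return EXIT_SUCCESS;
}

On a local filesystem the two values agree for a fully allocated file; on the NFS mount the second one is the small 44 KiB figure shown above.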
A bit more digging on this makes me believe this issue has nothing to do with libvirt, but rather is either "how it works" or an NFS issue. My example pool is:

<pool type='netfs'>
  <name>test_nfs_pool</name>
  <source>
    <host name='localhost'/>
    <dir path='/home/nfs_pool/nfs-export'/>
    <format type='auto'/>
  </source>
  <target>
    <path>/home/nfs_pool/target</path>
    <permissions>
      <mode>0755</mode>
      <owner>-1</owner>
      <group>-1</group>
    </permissions>
  </target>
</pool>

# virsh vol-create-as test_nfs_pool test-vol1 --capacity 10M --allocation 10M
Vol test-vol1 created

# virsh vol-info test-vol1 test_nfs_pool
Name:           test-vol1
Type:           file
Capacity:       10.00 MiB
Allocation:     44.00 KiB
#

Let's also consider what 'du' shows:

# du -s /home/nfs_pool/target/*
44      /home/nfs_pool/target/test-vol1
# du -k /home/nfs_pool/target/*
44      /home/nfs_pool/target/test-vol1
# du -b /home/nfs_pool/target/*
10485760        /home/nfs_pool/target/test-vol1
# du -m /home/nfs_pool/target/*
1       /home/nfs_pool/target/test-vol1
#

The 44K comes from stat's st_blocks=88 * 512 / 1024 (512 being the block size unit and 1024 converting to the KiB value). Reading the man page on du, the '-b' option is equivalent to '--apparent-size --block-size=1', which thus uses the st_size value for its output.

I suppose libvirt could do the same with respect to using the "st_size" value for 'nfs' volumes only - I'll try a patch with that and see what kind of response I get.
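For illustration, a minimal sketch of that idea - with the caveat that detecting NFS via statfs(2) and NFS_SUPER_MAGIC is an assumption of this sketch, not necessarily how an actual patch would detect it:

/* Report the apparent size (st_size, as du -b does) for files on NFS,
 * and the usual st_blocks * 512 everywhere else. */
#include <sys/stat.h>
#include <sys/vfs.h>
#include <linux/magic.h>   /* NFS_SUPER_MAGIC */

long long volume_allocation(const char *path)
{
    struct stat sb;
    struct statfs fs;

    if (stat(path, &sb) < 0 || statfs(path, &fs) < 0)
        return -1;

    if (fs.f_type == NFS_SUPER_MAGIC)
        return (long long)sb.st_size;         /* apparent size */

    return (long long)sb.st_blocks * 512;     /* blocks actually in use */
}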
Since the base problem is an NFS issue and Gluster generally goes through different code paths, can you create a separate bug report for Gluster? Please also recheck your results using the top of the libvirt tree, though, as there have been many modifications to upstream libvirt since your comment was added.
In order to resolve the base problem regarding NFS - I have sent a patch upstream for review, see:

http://www.redhat.com/archives/libvir-list/2014-August/msg00110.html

Note that the issue can also be seen from 'virsh domblkinfo' once the storage volume is added to the guest.
Cannot reproduce the bug in a netfs pool with format type=glusterfs with libvirt-1.2.7-1.el7.x86_64.

Steps:
1. Create a netfs pool with format type=glusterfs using the following xml
# virsh pool-dumpxml netfs-gluster
<pool type='netfs'>
  <name>netfs-gluster</name>
  <uuid>2075f686-9695-476d-8938-7bfe11a75a48</uuid>
  <capacity unit='bytes'>84551532544</capacity>
  <allocation unit='bytes'>26831412736</allocation>
  <available unit='bytes'>57720119808</available>
  <source>
    <host name='10.66.106.32'/>
    <dir path='gluster-vol1'/>
    <format type='glusterfs'/>
  </source>
  <target>
    <path>/tmp/gluster-test</path>
    <permissions>
      <mode>0755</mode>
      <owner>0</owner>
      <group>0</group>
    </permissions>
  </target>
</pool>

# virsh pool-list --all
 Name                 State      Autostart
-------------------------------------------
 default              active     yes
 netfs-gluster        active     no

2. Create a volume with format=raw
# virsh vol-create-as netfs-gluster vol1 --capacity 1G --allocation 10M --format raw
Vol vol1 created

Check the vol info:
# virsh vol-info vol1 netfs-gluster
Name:           vol1
Type:           file
Capacity:       1.00 GiB
Allocation:     10.00 MiB

3. Resize the volume
# virsh vol-resize vol1 --pool netfs-gluster 2G --allocate
Size of volume 'vol1' successfully changed to 2G

Check the vol info:
# virsh vol-info vol1 netfs-gluster
Name:           vol1
Type:           file
Capacity:       2.00 GiB
Allocation:     2.00 GiB

4. Create a vol and then add it to a guest
# virsh vol-create-as netfs-gluster vol2 --capacity 1G --allocation 20M --format raw
Vol vol2 created

# virsh vol-info vol2 netfs-gluster
Name:           vol2
Type:           file
Capacity:       1.00 GiB
Allocation:     20.00 MiB

Create a guest with the following xml:
....
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/tmp/gluster-test/vol2'/>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>
....

Check the disk info from "virsh domblkinfo":
# virsh domblkinfo rhel7 vda
Capacity:       1073741824
Allocation:     20971520
Physical:       20971520

I get the expected results.
In reply to comment 4: with glusterfs-server-3.4.0.59rhs, I can reproduce https://bugzilla.redhat.com/show_bug.cgi?id=1077068#c1, but after updating glusterfs-server to glusterfs-server-3.6.0.27rhs, I can't reproduce it any more.
Some upstream comments have been made - of most relevance is:

http://www.redhat.com/archives/libvir-list/2014-August/msg00367.html

It seems there's some sort of disconnect when using posix_fallocate() against an NFS server. In my example, the block size of the server directory (4096) doesn't match the mount point "wsize" value (1048576) - when writes are done, the calculation of the number of writes seems to be based on the wsize value being < the requested size; however, the writes are only done in 4096-byte increments, with 11 writes being completed (it would have been 10 if the math used a <= comparison). This results in a 45056-byte file (11*4096), which equates to the 44 KiB seen in the vol-info output.

I'm trying to figure out whether the bug exists in posix_fallocate() or somehow in the unexpected configuration of, or writes to, an NFS share. There are ways to work around the issue, and I'm assuming that when the NFS code was first added, NFS block sizes were much smaller and thus we didn't run into this. Perhaps not really a libvirt issue, although it is workaroundable by libvirt.
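A minimal reproducer sketch, independent of libvirt (the path below is just an example - point it at a file on the NFS mount):

/* Allocate 10 MiB with posix_fallocate() and then compare the apparent
 * size with the block-based size, as described in the comment above. */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main(void)
{
    const char *path = "/home/nfs_pool/target/fallocate-test";
    off_t len = 10 * 1024 * 1024;    /* 10 MiB */
    struct stat sb;
    int fd = open(path, O_RDWR | O_CREAT, 0644);

    if (fd < 0) {
        perror("open");
        return EXIT_FAILURE;
    }

    /* posix_fallocate returns an errno value rather than setting errno. */
    int err = posix_fallocate(fd, 0, len);
    if (err != 0) {
        fprintf(stderr, "posix_fallocate: error %d\n", err);
        return EXIT_FAILURE;
    }

    if (fstat(fd, &sb) == 0)
        printf("st_size = %lld, st_blocks*512 = %lld\n",
               (long long)sb.st_size, (long long)sb.st_blocks * 512);

    close(fd);
    return EXIT_SUCCESS;
}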
Just a quick update - I was asked to file a glibc bug since it's believed this could be a bug in posix_fallocate; this was done today:

https://sourceware.org/bugzilla/show_bug.cgi?id=17322

Most recent review:
http://www.redhat.com/archives/libvir-list/2014-August/msg00491.html

Direct link to the review with the request:
http://www.redhat.com/archives/libvir-list/2014-August/msg01074.html
I am going to move this to rhel-7.2 since that's what the depended-upon bug has done. More research on this (details placed in the depends-on bug) shows that the problem is not that a 10 MiB file isn't created "properly" by posix_fallocate (and eventually resized); rather, it's that stat.st_blocks is off by a factor of 256, resulting in libvirt (and du) making a math error when displaying the block size of the file.
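To put rough numbers on that factor of 256 (my arithmetic, using the values from the earlier comments): a fully allocated 10 MiB file should report 10485760 / 512 = 20480 blocks in st_blocks, yet stat returned st_blocks=88, and 88 * 512 = 45056 bytes is the 44 KiB shown by vol-info. Multiplying the reported block count by 256 gives 22528 * 512 = 11534336 bytes, back in the neighborhood of the real 10 MiB allocation.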
Fix has been committed upstream:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=7fe9e2e089f4990b7d18d0798f591ab276b15f2b

The fix is in glibc 2.22; this will only be a validation effort once the fix has been applied downstream.
The depends-on bz was moved to 7.3, so this one moves as well.
bz 1140250 now in POST for RHEL 7.3
Verified with:
libvirt-1.2.17-13.el7_2.4.x86_64
glibc-2.17-133.el7.x86_64

First, I tried to reproduce this with a lower version of glibc, and it's reproducible.

# rpm -qa | grep glibc
glibc-common-2.17-105.el7.x86_64
glibc-headers-2.17-105.el7.x86_64
glibc-2.17-105.el7.x86_64
glibc-devel-2.17-105.el7.x86_64

# virsh pool-dumpxml netfs-gluster
<pool type='netfs'>
  <name>netfs-gluster</name>
  <uuid>04f7784c-16c5-4d7c-8a47-7f84c2ac8c29</uuid>
  <capacity unit='bytes'>481173700608</capacity>
  <allocation unit='bytes'>20691550208</allocation>
  <available unit='bytes'>460482150400</available>
  <source>
    <host name='10.66.5.88'/>
    <dir path='/var/lib/libvirt/images/nfs'/>
    <format type='auto'/>
  </source>
  <target>
    <path>/mnt/gluster</path>
    <permissions>
      <mode>0755</mode>
      <owner>107</owner>
      <group>107</group>
      <label>system_u:object_r:nfs_t:s0</label>
    </permissions>
  </target>
</pool>

# virsh vol-create-as netfs-gluster rh7-crtas.img --capacity 4G --allocation 1G --format raw
Vol rh7-crtas.img created

# virsh vol-info rh7-crtas.img netfs-gluster
Name:           rh7-crtas.img
Type:           file
Capacity:       4.00 GiB
Allocation:     5.00 MiB    <== wrong allocation data

Then I tried this with the latest glibc, and it's fixed.

# rpm -qa | grep glibc
glibc-headers-2.17-133.el7.x86_64
glibc-common-2.17-133.el7.x86_64
glibc-devel-2.17-133.el7.x86_64
glibc-2.17-133.el7.x86_64

# virsh vol-create-as netfs-gluster rh7-crtas.img --capacity 4G --allocation 1G --format raw
Vol rh7-crtas.img created

# virsh vol-info rh7-crtas.img netfs-gluster
Name:           rh7-crtas.img
Type:           file
Capacity:       4.00 GiB
Allocation:     1.00 GiB    <== allocation data corrected now
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2016-2577.html