Bug 1450759
| Summary: | Creating fallocated image using qemu-img using gfapi fails | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | SATHEESARAN <sasundar> |
| Component: | qemu-kvm-rhev | Assignee: | Jeff Cody <jcody> |
| Status: | CLOSED ERRATA | QA Contact: | Ping Li <pingl> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | 7.3 | CC: | aliang, coli, knoel, ndevos, ngu, rhs-bugs, sabose, sasundar, storage-qa-internal, virt-maint |
| Target Milestone: | pre-dev-freeze | Keywords: | Patch |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | qemu-kvm-rhev-2.9.0-8.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | Hyperconverged Infra |
| : | 1450903 | | |
| Last Closed: | 2017-08-02 04:38:29 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1411323, 1450903, 1485863 | | |
| Attachments: | | | |

Description
SATHEESARAN
2017-05-15 06:52:17 UTC
Error message says that GlusterFS doesn't support zerofill API

# qemu-img create -f qcow2 -o preallocation=falloc gluster://host1.example.com/vol/vm1.img 1G
.... .... .... ....
qemu-img: gluster://dhcp37-191.lab.eng.blr.redhat.com/voltemp/vm1.img: Invalid preallocation mode: 'falloc' or GlusterFS doesn't support zerofill API

What type of volume are you using? Not all Gluster xlators implement the fallocate() FOP. We'll need to examine the (client+server) graph to see where this is missing.

(In reply to Niels de Vos from comment #3)
> What type of volume are you using? Not all Gluster xlators implement the
> fallocate() FOP. We'll need to examine the (client+server) graph to see
> where this is missing.

Hi Niels,

I was using a replica 3 sharded volume. Here is the volume info:

gluster volume info
Volume Name: vmstore
Type: Replicate
Volume ID: da887213-8669-479c-88c0-f63554507528
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: host1.lab.eng.blr.redhat.com:/gluster/brick1/b1
Brick2: host2.lab.eng.blr.redhat.com:/gluster/brick1/b1
Brick3: host3.lab.eng.blr.redhat.com:/gluster/brick1/b1
Options Reconfigured:
cluster.granular-entry-heal: enable
performance.strict-o-direct: on
network.ping-timeout: 30
server.allow-insecure: on
storage.owner-gid: 36
storage.owner-uid: 36
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: disable
performance.low-prio-threads: 32
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
auth.allow: *
user.cifs: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

I can provide the live setup for analyzing more details.

QEMU only supports 'falloc' when glfs_zerofill() is available during build time (from 'configure' in the QEMU sources):

##########################################
# glusterfs probe
if test "$glusterfs" != "no" ; then
  if $pkg_config --atleast-version=3 glusterfs-api; then
    glusterfs="yes"
    glusterfs_cflags=$($pkg_config --cflags glusterfs-api)
    glusterfs_libs=$($pkg_config --libs glusterfs-api)
    if $pkg_config --atleast-version=4 glusterfs-api; then
      glusterfs_xlator_opt="yes"
    fi
    if $pkg_config --atleast-version=5 glusterfs-api; then
      glusterfs_discard="yes"
    fi
    if $pkg_config --atleast-version=6 glusterfs-api; then
      glusterfs_zerofill="yes"
    fi
  else
    if test "$glusterfs" = "yes" ; then
      feature_not_found "GlusterFS backend support" \
          "Install glusterfs-api devel >= 3"
    fi
    glusterfs="no"
  fi
fi

QEMU version: qemu-kvm-rhev-2.6.0-28.el7_3.9.x86_64

This was built with glusterfs-api-devel-3.7.9-12.el7.x86_64.rpm that contains usr/lib64/pkgconfig/glusterfs-api.pc:

...
Name: glusterfs-api
Description: GlusterFS API
/* This is the API version, NOT package version */
Version: 7.3.7.9
...

This shows that the version is high enough, and that the QEMU build should have enabled support for 'falloc'. A little more inspection is needed...

Confirmation that QEMU was built to use the glfs_zerofill() functions:

$ cd qemu-kvm-rhev-2.6.0-28.el7_3.9.x86_64.d
$ ldd usr/libexec/qemu-kvm | grep gfapi
libgfapi.so.0 => /lib64/libgfapi.so.0 (0x00007f7078f53000)
$ objdump -T usr/libexec/qemu-kvm | grep glfs_zerofill
0000000000000000 DF *UND* 0000000000000000 GFAPI_3.5.0 glfs_zerofill_async
0000000000000000 DF *UND* 0000000000000000 GFAPI_3.5.0 glfs_zerofill

Will need to find out why QEMU reports "GlusterFS doesn't support zerofill API".

Hmm, it seems that 'falloc' is not a valid pre-allocation option for the block/gluster driver in qemu-kvm-rhev-2.6.0-28.el7_3.9:
[block/gluster.c:qemu_gluster_create()]
    tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
    if (!tmp || !strcmp(tmp, "off")) {
        prealloc = 0;
    } else if (!strcmp(tmp, "full") && gluster_supports_zerofill()) {
        prealloc = 1;
    } else {
        error_setg(errp, "Invalid preallocation mode: '%s'"
                   " or GlusterFS doesn't support zerofill API", tmp);
        ret = -EINVAL;
        goto out;
    }
The only pre-allocation options accepted here are "off" and "full". The Gluster driver's implementation in QEMU is somewhat simpler than that of the raw-posix driver, which is used for filesystem mounts (FUSE).
Compare this to the fuller implementation in block/raw-posix.c:raw_create():
    buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
    prealloc = qapi_enum_parse(PreallocMode_lookup, buf,
                               PREALLOC_MODE__MAX, PREALLOC_MODE_OFF,
                               &local_err);
....
    switch (prealloc) {
#ifdef CONFIG_POSIX_FALLOCATE
    case PREALLOC_MODE_FALLOC:
        /* posix_fallocate() doesn't set errno. */
        result = -posix_fallocate(fd, 0, total_size);
        if (result != 0) {
            error_setg_errno(errp, -result,
                             "Could not preallocate data for the new file");
        }
        break;
#endif
    case PREALLOC_MODE_FULL:
    {
        int64_t num = 0, left = total_size;
        buf = g_malloc0(65536);

        while (left > 0) {
            num = MIN(left, 65536);
            result = write(fd, buf, num);
            if (result < 0) {
                result = -errno;
                error_setg_errno(errp, -result,
                                 "Could not write to the new file");
                break;
            }
            left -= result;
        }
        if (result >= 0) {
            result = fsync(fd);
            if (result < 0) {
                result = -errno;
                error_setg_errno(errp, -result,
                                 "Could not flush new file to disk");
            }
        }
        g_free(buf);
        break;
    }
    case PREALLOC_MODE_OFF:
        break;
    default:
        result = -EINVAL;
        error_setg(errp, "Unsupported preallocation mode: %s",
                   PreallocMode_lookup[prealloc]);
        break;
    }
So, in order to have QEMU support preallocation=falloc, the block/gluster.c sources need to be updated.
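To make the gap concrete, the two checks quoted above can be modeled side by side. This is only a shell sketch of the decision logic, not QEMU code: the function names gluster_2_6_accepts and raw_posix_accepts are hypothetical, and the model assumes that gluster_supports_zerofill() returns true and that CONFIG_POSIX_FALLOCATE is defined.

```shell
# Model of the mode check in block/gluster.c:qemu_gluster_create() as quoted
# above (qemu-kvm-rhev-2.6.0): only an unset mode, "off", or "full" pass.
# Assumes gluster_supports_zerofill() is true. Hypothetical helper name.
gluster_2_6_accepts() {
    case "${1-off}" in
        off|full) return 0 ;;
        *)        return 1 ;;
    esac
}

# Model of the switch in block/raw-posix.c:raw_create() as quoted above:
# "falloc" is handled as well (assuming CONFIG_POSIX_FALLOCATE is defined).
raw_posix_accepts() {
    case "${1-off}" in
        off|falloc|full) return 0 ;;
        *)               return 1 ;;
    esac
}

# Print how each driver model treats each mode.
for mode in off falloc full; do
    if gluster_2_6_accepts "$mode"; then g=accepted; else g=rejected; fi
    if raw_posix_accepts "$mode"; then r=accepted; else r=rejected; fi
    echo "$mode: gluster=$g raw-posix=$r"
done
```

In this model only the "falloc" row differs between the two drivers, which matches the error seen with gluster:// URLs.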
Created attachment 1278970
gluster: add support for PREALLOC_MODE_FALLOC

Initial patch, completely untested. I'll get some testing done before sending it upstream for review and inclusion. Updated versions will become available on https://github.com/nixpanic/qemu/tree/gfapi/fallocate/rhbz1450759 as well.

Posted for review at http://lists.nongnu.org/archive/html/qemu-block/2017-05/msg00667.html

Results of a test-build with the posted patch (testing done before posting the patch, of course):

1. verification of the error message with a relatively current QEMU

[root@vm013 ~]# rpm -q qemu-img
qemu-img-2.7.1-6.fc25.x86_64
[root@vm013 ~]# qemu-img create -f qcow2 -o preallocation=falloc gluster://vm015.example.com/one-brick/bz1450759.falloc.img 20M
Formatting 'gluster://vm015.example.com/one-brick/bz1450759.falloc.img', fmt=qcow2 size=20971520 encryption=off cluster_size=65536 preallocation=falloc lazy_refcounts=off refcount_bits=16
qemu-img: gluster://vm015.example.com/one-brick/bz1450759.falloc.img: Invalid preallocation mode: 'falloc' or GlusterFS doesn't support zerofill API

2. verification of the fix with the test-build:

[root@vm013 ~]# rpm -q qemu-img
qemu-img-2.9.0-1.fc25.0.1bz1450759.x86_64
[root@vm013 ~]# qemu-img create -f qcow2 -o preallocation=falloc gluster://vm015.example.com/one-brick/bz1450759.falloc.img 20M
Formatting 'gluster://vm015.example.com/one-brick/bz1450759.falloc.img', fmt=qcow2 size=20971520 encryption=off cluster_size=65536 preallocation=falloc lazy_refcounts=off refcount_bits=16
[root@vm013 ~]# qemu-img create -f qcow2 -o preallocation=full gluster://vm015.example.com/one-brick/bz1450759.full.img 20M
Formatting 'gluster://vm015.example.com/one-brick/bz1450759.full.img', fmt=qcow2 size=20971520 encryption=off cluster_size=65536 preallocation=full lazy_refcounts=off refcount_bits=16
[root@vm013 ~]# qemu-img create -f qcow2 -o preallocation=off gluster://vm015.example.com/one-brick/bz1450759.off.img 20M
Formatting 'gluster://vm015.example.com/one-brick/bz1450759.off.img', fmt=qcow2 size=20971520 encryption=off cluster_size=65536 preallocation=off lazy_refcounts=off refcount_bits=16

Requesting exception flag because this is important for Gluster support (layered products). The change is low risk and doesn't affect RHEL users, as this is for qemu-kvm-rhev.

Fix included in qemu-kvm-rhev-2.9.0-8.el7

Reproduced bug with qemu-kvm-rhev-2.9.0-7.el7:
1. For qcow2 image
1.1 off mode ------> pass
# qemu-img create -f qcow2 -o preallocation=off gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
Formatting 'gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 preallocation=off lazy_refcounts=off refcount_bits=16
# qemu-img info gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 193K
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
1.2 metadata mode ------> pass
# qemu-img create -f qcow2 -o preallocation=metadata gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
Formatting 'gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16
# qemu-img info gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 516K
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
1.3 falloc mode ------> fail
# qemu-img create -f qcow2 -o preallocation=falloc gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
qemu-img: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img: Invalid preallocation mode: 'falloc' or GlusterFS doesn't support zerofill API
1.4 full mode ------> pass
# qemu-img create -f qcow2 -o preallocation=full gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
# qemu-img info gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 1.0G
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
2. For raw image
2.1 off mode ------> pass
# qemu-img create -f raw -o preallocation=off gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
Formatting 'gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img', fmt=raw size=1073741824 preallocation=off
# qemu-img info gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: raw
virtual size: 1.0G (1073741824 bytes)
disk size: 0
2.2 falloc mode ------> fail
# qemu-img create -f raw -o preallocation=falloc gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
Formatting 'gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img', fmt=raw size=1073741824 preallocation=falloc
qemu-img: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img: Invalid preallocation mode: 'falloc' or GlusterFS doesn't support zerofill API
2.3 full mode ------> pass
# qemu-img create -f raw -o preallocation=full gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
Formatting 'gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img', fmt=raw size=1073741824 preallocation=full
# qemu-img info gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: raw
virtual size: 1.0G (1073741824 bytes)
disk size: 1.0G
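
The manual matrix above lends itself to a small loop. The sketch below is a convenience helper under stated assumptions, not part of the QA procedure recorded in this report: check_prealloc_modes and the QEMU_IMG override variable are hypothetical names, and the loop relies only on the "qemu-img create -f FMT -o preallocation=MODE URL SIZE" invocation shown in the report, assuming it exits non-zero on failure.

```shell
# Command to run; override QEMU_IMG to test a different binary.
# (QEMU_IMG is a hypothetical convenience variable, not a qemu-img feature.)
QEMU_IMG=${QEMU_IMG:-qemu-img}

# check_prealloc_modes FORMAT URL SIZE MODE...
# Tries to create the image once per preallocation mode and prints one
# "FORMAT MODE pass|fail" line for each; returns non-zero if any mode failed.
check_prealloc_modes() {
    fmt=$1 url=$2 size=$3
    shift 3
    failed=0
    for mode in "$@"; do
        if "$QEMU_IMG" create -f "$fmt" -o "preallocation=$mode" \
                "$url" "$size" >/dev/null 2>&1; then
            echo "$fmt $mode pass"
        else
            echo "$fmt $mode fail"
            failed=1
        fi
    done
    return "$failed"
}
```

With a reachable volume, the matrix in this report would correspond to "check_prealloc_modes qcow2 URL 1G off metadata falloc full" followed by "check_prealloc_modes raw URL 1G off falloc full", where URL is the gluster:// image location.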
Verified the issue with qemu-kvm-rhev-2.9.0-9.el7:
1. For qcow2 image
1.1 off mode ------> pass
# qemu-img create -f qcow2 -o preallocation=off gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
Formatting 'gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 preallocation=off lazy_refcounts=off refcount_bits=16
# qemu-img info gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 193K
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
1.2 metadata mode ------> pass
# qemu-img create -f qcow2 -o preallocation=metadata gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
Formatting 'gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16
# qemu-img info gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 516K
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
1.3 falloc mode ------> pass
# qemu-img create -f qcow2 -o preallocation=falloc gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
Formatting 'gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 preallocation=falloc lazy_refcounts=off refcount_bits=16
# qemu-img info gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 1.0G
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
1.4 full mode ------> pass
# qemu-img create -f qcow2 -o preallocation=full gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
Formatting 'gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 preallocation=full lazy_refcounts=off refcount_bits=16
# qemu-img info gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: qcow2
virtual size: 1.0G (1073741824 bytes)
disk size: 1.0G
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
2. For raw image
2.1 off mode ------> pass
# qemu-img create -f raw -o preallocation=off gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
Formatting 'gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img', fmt=raw size=1073741824 preallocation=off
# qemu-img info gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: raw
virtual size: 1.0G (1073741824 bytes)
disk size: 0
2.2 falloc mode ------> pass
# qemu-img create -f raw -o preallocation=falloc gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
Formatting 'gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img', fmt=raw size=1073741824 preallocation=falloc
# qemu-img info gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: raw
virtual size: 1.0G (1073741824 bytes)
disk size: 1.0G
2.3 full mode ------> pass
# qemu-img create -f raw -o preallocation=full gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img 1G
Formatting 'gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img', fmt=raw size=1073741824 preallocation=full
# qemu-img info gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: raw
virtual size: 1.0G (1073741824 bytes)
disk size: 1.0G
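
What distinguishes the modes in the results above is the "disk size" line reported by qemu-img info: small for off and metadata, the full 1.0G for falloc and full. A minimal sketch of pulling that field out of captured output follows; disk_size_of is a hypothetical helper, and it assumes the human-readable output layout shown in this report.

```shell
# disk_size_of: read `qemu-img info` text on stdin and print the value of
# its "disk size:" line (for example "1.0G"). Hypothetical helper name.
disk_size_of() {
    sed -n 's/^disk size: *//p'
}

# Sample taken from the raw/falloc verification output in this report.
sample='image: gluster://bootp-73-199-197.lab.eng.pek2.redhat.com/gv0/vm1.img
file format: raw
virtual size: 1.0G (1073741824 bytes)
disk size: 1.0G'

printf '%s\n' "$sample" | disk_size_of
```

On a live system the same helper could be fed directly, as in "qemu-img info URL | disk_size_of", and the result compared against the expected allocation for the mode under test.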
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:2392