Note: This bug is displayed in read-only format because
the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Description of problem:
qemu-img: block/gluster.c:1307: find_allocation: Assertion `offs >= start' failed. when Convert the image on gluster server.
Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.9.0-3.el7.x86_64
How reproducible:
100%
Steps to Reproduce:
1.
qemu-img convert -f qcow2 -O raw gluster://bootp-73-199-200.lab.eng.pek2.redhat.com:0/gv0/rhel74-64-virtio.qcow2 convert.img
2.
3.
Actual results:
qemu-img: block/gluster.c:1307: find_allocation: Assertion `offs >= start' failed and core dump
Expected results:
Additional info:
1. gluster log:
[2017-05-16 05:23:39.056896] I [MSGID: 104045] [glfs-master.c:91:notify] 0-gfapi: New graph 696e7465-6c2d-6533-3132-32352d382d32 (0) coming up
[2017-05-16 05:23:39.056943] I [MSGID: 114020] [client.c:2356:notify] 0-gv0-client-0: parent translators are ready, attempting connect on transport
[2017-05-16 05:23:39.060411] I [MSGID: 114020] [client.c:2356:notify] 0-gv0-client-1: parent translators are ready, attempting connect on transport
[2017-05-16 05:23:39.062864] I [rpc-clnt.c:1999:rpc_clnt_reconfig] 0-gv0-client-0: changing port to 49152 (from 0)
[2017-05-16 05:23:39.065748] I [rpc-clnt.c:1999:rpc_clnt_reconfig] 0-gv0-client-1: changing port to 49152 (from 0)
[2017-05-16 05:23:39.067621] I [MSGID: 114057] [client-handshake.c:1439:select_server_supported_programs] 0-gv0-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-05-16 05:23:39.068830] I [MSGID: 114046] [client-handshake.c:1215:client_setvolume_cbk] 0-gv0-client-0: Connected to gv0-client-0, attached to remote volume '/data/brick1/gv0'.
[2017-05-16 05:23:39.068847] I [MSGID: 114047] [client-handshake.c:1226:client_setvolume_cbk] 0-gv0-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2017-05-16 05:23:39.068894] I [MSGID: 108005] [afr-common.c:4659:afr_notify] 0-gv0-replicate-0: Subvolume 'gv0-client-0' came back up; going online.
[2017-05-16 05:23:39.069815] I [MSGID: 114035] [client-handshake.c:201:client_set_lk_version_cbk] 0-gv0-client-0: Server lk version = 1
[2017-05-16 05:23:39.070667] I [MSGID: 114057] [client-handshake.c:1439:select_server_supported_programs] 0-gv0-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2017-05-16 05:23:39.071812] I [MSGID: 114046] [client-handshake.c:1215:client_setvolume_cbk] 0-gv0-client-1: Connected to gv0-client-1, attached to remote volume '/data/brick1/gv0'.
[2017-05-16 05:23:39.071828] I [MSGID: 114047] [client-handshake.c:1226:client_setvolume_cbk] 0-gv0-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2017-05-16 05:23:39.082302] I [MSGID: 114035] [client-handshake.c:201:client_set_lk_version_cbk] 0-gv0-client-1: Server lk version = 1
[2017-05-16 05:23:39.085735] I [MSGID: 104041] [glfs-resolve.c:885:__glfs_active_subvol] 0-gv0: switched to graph 696e7465-6c2d-6533-3132-32352d382d32 (0)
[2017-05-16 05:23:39.089087] W [MSGID: 114031] [client-rpc-fops.c:2211:client3_3_seek_cbk] 0-gv0-client-0: remote operation failed [No such device or address]
2. qemu-img: block/gluster.c:1307: find_allocation: Assertion `offs >= start' failed.
3. call trace
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `qemu-img convert -f qcow2 -O raw gluster://bootp-73-199-200.lab.eng.pek2.redhat'.
Program terminated with signal 6, Aborted.
#0 0x00007f2f7128d1d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0 0x00007f2f7128d1d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f2f7128e8c8 in __GI_abort () at abort.c:90
#2 0x00007f2f71286146 in __assert_fail_base (fmt=0x7f2f713d73a8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f2f87d7601b "offs >= start",
file=file@entry=0x7f2f87d7b658 "block/gluster.c", line=line@entry=1307, function=function@entry=0x7f2f87d7bdd0 <__PRETTY_FUNCTION__.24790> "find_allocation") at assert.c:92
#3 0x00007f2f712861f2 in __GI___assert_fail (assertion=assertion@entry=0x7f2f87d7601b "offs >= start", file=file@entry=0x7f2f87d7b658 "block/gluster.c", line=line@entry=1307,
function=function@entry=0x7f2f87d7bdd0 <__PRETTY_FUNCTION__.24790> "find_allocation") at assert.c:101
#4 0x00007f2f87cfdf42 in find_allocation (bs=0x7f2f88d31400, hole=<synthetic pointer>, data=<synthetic pointer>, start=2358575104) at block/gluster.c:1307
#5 qemu_gluster_co_get_block_status (bs=0x7f2f88d31400, sector_num=<optimized out>, nb_sectors=1408, pnum=0x7f2f5d0dbd6c, file=0x7f2f5d0dbd70) at block/gluster.c:1368
#6 0x00007f2f87ce5c96 in bdrv_co_get_block_status (bs=0x7f2f88d31400, sector_num=sector_num@entry=4606592, nb_sectors=1408, pnum=pnum@entry=0x7f2f5d0dbd6c,
file=file@entry=0x7f2f5d0dbd70) at block/io.c:1786
#7 0x00007f2f87ce5df1 in bdrv_co_get_block_status (bs=bs@entry=0x7f2f88d2e000, sector_num=sector_num@entry=7733248, nb_sectors=<optimized out>, nb_sectors@entry=4194303,
pnum=pnum@entry=0x7f2f5d0dbe8c, file=file@entry=0x7f2f5d0dbe90) at block/io.c:1820
#8 0x00007f2f87ce5e8b in bdrv_co_get_block_status_above (file=0x7f2f5d0dbe90, pnum=0x7f2f5d0dbe8c, nb_sectors=4194303, sector_num=7733248, base=0x0, bs=<optimized out>)
at block/io.c:1856
#9 bdrv_get_block_status_above_co_entry (opaque=opaque@entry=0x7f2f5d0dbe00) at block/io.c:1872
#10 0x00007f2f87ce5b90 in bdrv_get_block_status_above (bs=0x7f2f88d2e000, base=<optimized out>, sector_num=<optimized out>, nb_sectors=<optimized out>, pnum=<optimized out>,
file=<optimized out>) at block/io.c:1904
#11 0x00007f2f87c9b8e2 in convert_iteration_sectors (s=0x7ffe4be4f730, sector_num=7733248) at qemu-img.c:1557
#12 0x00007f2f87c9cadc in convert_co_do_copy (opaque=0x7ffe4be4f730) at qemu-img.c:1766
#13 0x00007f2f87d63aba in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79
#14 0x00007f2f7129ecf0 in ?? () from /usr/lib64/libc-2.17.so
#15 0x00007ffe4be4ee80 in ?? ()
#16 0x0000000000000000 in ?? ()
Here is the code that is asserting:
offs = glfs_lseek(s->fd, start, SEEK_HOLE);
if (offs < 0) {
return -errno; /* D1 and (H3 or H4) */
}
assert(offs >= start);
I think this means that glfs_lseek() is returning bad data. SEEK_HOLE means "Adjust the file offset to the next hole in the file greater than or equal to offset", i.e. for this call it means adjust the file offset to the next hole >= start.
So the call to glfs_lseek() should either return error (offs < 0), or offs >= start. I can't think of any valid way glfs_lseek() could return offs < start && offs >= 0.
That said, if versions of gluster already released are returning invalid glfs_lseek() results, we might need a more graceful way to handle it in QEMU.
I am able to reproduce this locally now, with source images > ~6GB. What is being returned is not offs == 0, but (offs > 0 && offs < start), which seems to be an invalid lseek return.
This is a duplicate of BZ #1425293 (see comment #13 on that BZ), however I am leaving this BZ open for QEMU, because we will need some sort of workaround for gluster versions already released.
Comment 7Miroslav Rezanina
2017-06-06 08:54:36 UTC
Result: PASS
convert the image successfully and boot up image successfully
Package: qemu-kvm-rhev-2.9.0-8.el7.x86_64
Steps:
1. Convert image on glusterfs server
qemu-img convert -f qcow2 -O raw gluster://bootp-73-199-200.lab.eng.pek2.redhat.com:0/gv0/rhel74-64-virtio.qcow2 convert.img
2. boot up convert.img image
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2017:2392
Description of problem: qemu-img: block/gluster.c:1307: find_allocation: Assertion `offs >= start' failed. when Convert the image on gluster server. Version-Release number of selected component (if applicable): qemu-kvm-rhev-2.9.0-3.el7.x86_64 How reproducible: 100% Steps to Reproduce: 1. qemu-img convert -f qcow2 -O raw gluster://bootp-73-199-200.lab.eng.pek2.redhat.com:0/gv0/rhel74-64-virtio.qcow2 convert.img 2. 3. Actual results: qemu-img: block/gluster.c:1307: find_allocation: Assertion `offs >= start' failed and core dump Expected results: Additional info: 1. gluster log: [2017-05-16 05:23:39.056896] I [MSGID: 104045] [glfs-master.c:91:notify] 0-gfapi: New graph 696e7465-6c2d-6533-3132-32352d382d32 (0) coming up [2017-05-16 05:23:39.056943] I [MSGID: 114020] [client.c:2356:notify] 0-gv0-client-0: parent translators are ready, attempting connect on transport [2017-05-16 05:23:39.060411] I [MSGID: 114020] [client.c:2356:notify] 0-gv0-client-1: parent translators are ready, attempting connect on transport [2017-05-16 05:23:39.062864] I [rpc-clnt.c:1999:rpc_clnt_reconfig] 0-gv0-client-0: changing port to 49152 (from 0) [2017-05-16 05:23:39.065748] I [rpc-clnt.c:1999:rpc_clnt_reconfig] 0-gv0-client-1: changing port to 49152 (from 0) [2017-05-16 05:23:39.067621] I [MSGID: 114057] [client-handshake.c:1439:select_server_supported_programs] 0-gv0-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330) [2017-05-16 05:23:39.068830] I [MSGID: 114046] [client-handshake.c:1215:client_setvolume_cbk] 0-gv0-client-0: Connected to gv0-client-0, attached to remote volume '/data/brick1/gv0'. [2017-05-16 05:23:39.068847] I [MSGID: 114047] [client-handshake.c:1226:client_setvolume_cbk] 0-gv0-client-0: Server and Client lk-version numbers are not same, reopening the fds [2017-05-16 05:23:39.068894] I [MSGID: 108005] [afr-common.c:4659:afr_notify] 0-gv0-replicate-0: Subvolume 'gv0-client-0' came back up; going online. [2017-05-16 05:23:39.069815] I [MSGID: 114035] [client-handshake.c:201:client_set_lk_version_cbk] 0-gv0-client-0: Server lk version = 1 [2017-05-16 05:23:39.070667] I [MSGID: 114057] [client-handshake.c:1439:select_server_supported_programs] 0-gv0-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330) [2017-05-16 05:23:39.071812] I [MSGID: 114046] [client-handshake.c:1215:client_setvolume_cbk] 0-gv0-client-1: Connected to gv0-client-1, attached to remote volume '/data/brick1/gv0'. [2017-05-16 05:23:39.071828] I [MSGID: 114047] [client-handshake.c:1226:client_setvolume_cbk] 0-gv0-client-1: Server and Client lk-version numbers are not same, reopening the fds [2017-05-16 05:23:39.082302] I [MSGID: 114035] [client-handshake.c:201:client_set_lk_version_cbk] 0-gv0-client-1: Server lk version = 1 [2017-05-16 05:23:39.085735] I [MSGID: 104041] [glfs-resolve.c:885:__glfs_active_subvol] 0-gv0: switched to graph 696e7465-6c2d-6533-3132-32352d382d32 (0) [2017-05-16 05:23:39.089087] W [MSGID: 114031] [client-rpc-fops.c:2211:client3_3_seek_cbk] 0-gv0-client-0: remote operation failed [No such device or address] 2. qemu-img: block/gluster.c:1307: find_allocation: Assertion `offs >= start' failed. 3. call trace [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". Core was generated by `qemu-img convert -f qcow2 -O raw gluster://bootp-73-199-200.lab.eng.pek2.redhat'. Program terminated with signal 6, Aborted. #0 0x00007f2f7128d1d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 56 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig); (gdb) bt #0 0x00007f2f7128d1d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f2f7128e8c8 in __GI_abort () at abort.c:90 #2 0x00007f2f71286146 in __assert_fail_base (fmt=0x7f2f713d73a8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7f2f87d7601b "offs >= start", file=file@entry=0x7f2f87d7b658 "block/gluster.c", line=line@entry=1307, function=function@entry=0x7f2f87d7bdd0 <__PRETTY_FUNCTION__.24790> "find_allocation") at assert.c:92 #3 0x00007f2f712861f2 in __GI___assert_fail (assertion=assertion@entry=0x7f2f87d7601b "offs >= start", file=file@entry=0x7f2f87d7b658 "block/gluster.c", line=line@entry=1307, function=function@entry=0x7f2f87d7bdd0 <__PRETTY_FUNCTION__.24790> "find_allocation") at assert.c:101 #4 0x00007f2f87cfdf42 in find_allocation (bs=0x7f2f88d31400, hole=<synthetic pointer>, data=<synthetic pointer>, start=2358575104) at block/gluster.c:1307 #5 qemu_gluster_co_get_block_status (bs=0x7f2f88d31400, sector_num=<optimized out>, nb_sectors=1408, pnum=0x7f2f5d0dbd6c, file=0x7f2f5d0dbd70) at block/gluster.c:1368 #6 0x00007f2f87ce5c96 in bdrv_co_get_block_status (bs=0x7f2f88d31400, sector_num=sector_num@entry=4606592, nb_sectors=1408, pnum=pnum@entry=0x7f2f5d0dbd6c, file=file@entry=0x7f2f5d0dbd70) at block/io.c:1786 #7 0x00007f2f87ce5df1 in bdrv_co_get_block_status (bs=bs@entry=0x7f2f88d2e000, sector_num=sector_num@entry=7733248, nb_sectors=<optimized out>, nb_sectors@entry=4194303, pnum=pnum@entry=0x7f2f5d0dbe8c, file=file@entry=0x7f2f5d0dbe90) at block/io.c:1820 #8 0x00007f2f87ce5e8b in bdrv_co_get_block_status_above (file=0x7f2f5d0dbe90, pnum=0x7f2f5d0dbe8c, nb_sectors=4194303, sector_num=7733248, base=0x0, bs=<optimized out>) at block/io.c:1856 #9 bdrv_get_block_status_above_co_entry (opaque=opaque@entry=0x7f2f5d0dbe00) at block/io.c:1872 #10 0x00007f2f87ce5b90 in bdrv_get_block_status_above (bs=0x7f2f88d2e000, base=<optimized out>, sector_num=<optimized out>, nb_sectors=<optimized out>, pnum=<optimized out>, file=<optimized out>) at block/io.c:1904 #11 0x00007f2f87c9b8e2 in convert_iteration_sectors (s=0x7ffe4be4f730, sector_num=7733248) at qemu-img.c:1557 #12 0x00007f2f87c9cadc in convert_co_do_copy (opaque=0x7ffe4be4f730) at qemu-img.c:1766 #13 0x00007f2f87d63aba in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79 #14 0x00007f2f7129ecf0 in ?? () from /usr/lib64/libc-2.17.so #15 0x00007ffe4be4ee80 in ?? () #16 0x0000000000000000 in ?? ()