Bug 1088817
| Summary: | qemu-img segfaults while creating a large number of images w/ gluster backend | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | mazhang <mazhang> | ||||
| Component: | glusterfs | Assignee: | Bug Updates Notification Mailing List <rhs-bugs> | ||||
| Status: | CLOSED WONTFIX | QA Contact: | storage-qa-internal <storage-qa-internal> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 6.5 | CC: | jcody, juzhang, michen, mkenneth, qzhang, rbalakri, rpacheco, virt-maint | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2017-12-06 11:12:27 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
Created attachment 887104
dmesg
The gluster volume is already configured to allow connections from unprivileged ports, by running the following on the gluster server:

# gluster volume set gv0 server.allow-insecure on

gluster> volume info
Volume Name: gv0
Type: Distribute
Volume ID: 6e6a0709-0dc0-4500-ae55-81bc062c0d6c
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.66.4.217:/home/brick
Options Reconfigured:
server.allow-insecure: on

(In reply to mazhang from comment #0)
> 2. qemu-img segfault
>

Do you have the backtrace (from the coredump) of this segfault? The call trace below is from the kernel.

> [root@amd-9600b-8-1 ~]# dmesg
> qemu-img[2086]: segfault at 20 ip 00007f408b3ac0e6 sp 00007fffc6ca9510 error
> 4 in libgfapi.so.0.0.0[7f408b3a0000+16000]
> qemu-img[2202]: segfault at 20 ip 00007f99dc4500e6 sp 00007fffaa6bcfd0 error
> 4 in libgfapi.so.0.0.0[7f99dc444000+16000]
> qemu-img[3178]: segfault at 20 ip 00007f5af33ac0e6 sp 00007fffd266fb50 error
> 4 in libgfapi.so.0.0.0[7f5af33a0000+16000]
> INFO: task qemu-img:2130 blocked for more than 120 seconds.
> Not tainted 2.6.32-431.17.1.el6.x86_64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> qemu-img D 0000000000000002 0 2130 1 0x00000080
> ffff88022938fc98 0000000000000082 0000000000000000 ffff88022938fc00
> ffff88022938fc68 ffffffff810aefd0 0000000000000202 0000000000000000
> ffff88022938bab8 ffff88022938ffd8 000000000000fbc8 ffff88022938bab8
> Call Trace:
> [<ffffffff810aefd0>] ? exit_robust_list+0x90/0x160
> [<ffffffff81076745>] exit_mm+0x95/0x180
> [<ffffffff81076b8f>] do_exit+0x15f/0x870
> [<ffffffff810772f8>] do_group_exit+0x58/0xd0
> [<ffffffff8108cca6>] get_signal_to_deliver+0x1f6/0x460
> [<ffffffff8100a265>] do_signal+0x75/0x800
> [<ffffffff810b186b>] ? sys_futex+0x7b/0x170
> [<ffffffff8100aa80>] do_notify_resume+0x90/0xc0
> [<ffffffff8100b341>] int_signal+0x12/0x17
> INFO: task qemu-img:2134 blocked for more than 120 seconds.
> Not tainted 2.6.32-431.17.1.el6.x86_64 #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>

(gdb) bt full
#0 0x00007f1b64a620e6 in glfs_lseek () from /usr/lib64/libgfapi.so.0
No symbol table info available.
#1 0x00007f1b65e21568 in qemu_gluster_getlength (bs=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/block/gluster.c:514
s = <value optimized out>
ret = <value optimized out>
#2 0x00007f1b65df489f in refresh_total_sectors (bs=<value optimized out>, hint=<value optimized out>) at /usr/src/debug/qemu-kvm-0.12.1.2/block.c:532
length = <value optimized out>
drv = <value optimized out>
#3 0x00007f1b65dfac5f in bdrv_open_common (bs=0x7f1b671f8af0, filename=0x7fff11738776 "gluster://10.66.65.117/gv1/test-718", flags=<value optimized out>, drv=0x7f1b66053200)
at /usr/src/debug/qemu-kvm-0.12.1.2/block.c:626
ret = 0
open_flags = 2
__PRETTY_FUNCTION__ = "bdrv_open_common"
#4 0x00007f1b65dfadeb in bdrv_file_open (pbs=0x7fff11736c80, filename=0x7fff11738776 "gluster://10.66.65.117/gv1/test-718", flags=2) at /usr/src/debug/qemu-kvm-0.12.1.2/block.c:664
bs = 0x7f1b671f8af0
drv = 0x7f1b66053200
ret = <value optimized out>
#5 0x00007f1b65e1492a in qcow2_create2 (filename=0x7fff11738776 "gluster://10.66.65.117/gv1/test-718", total_size=4096, backing_file=0x0, backing_format=0x0, flags=0,
cluster_size=<value optimized out>, prealloc=0, options=0x7f1b671cb570) at /usr/src/debug/qemu-kvm-0.12.1.2/block/qcow2.c:1072
cluster_bits = 16
bs = <value optimized out>
header = {magic = 0, version = 0, backing_file_offset = 219043332111, backing_file_size = 91, cluster_bits = 124, size = 472446402679, crypt_method = 1709013400, l1_size =
32539, l1_table_offset = 224, refcount_table_offset = 224, refcount_table_clusters = 1682046592, nb_snapshots = 32539, snapshots_offset = 140733486172022}
refcount_table = <value optimized out>
ret = <value optimized out>
drv = <value optimized out>
__PRETTY_FUNCTION__ = "qcow2_create2"
#6 0x00007f1b65e14e5f in qcow2_create (filename=0x7fff11738776 "gluster://10.66.65.117/gv1/test-718", options=<value optimized out>)
at /usr/src/debug/qemu-kvm-0.12.1.2/block/qcow2.c:1214
backing_file = <value optimized out>
backing_fmt = <value optimized out>
sectors = <value optimized out>
flags = <value optimized out>
cluster_size = <value optimized out>
prealloc = <value optimized out>
#7 0x00007f1b65dfb50d in bdrv_img_create (filename=0x7fff11738776 "gluster://10.66.65.117/gv1/test-718", fmt=0x7fff11738770 "qcow2", base_filename=<value optimized out>, base_fmt=
0x0, options=<value optimized out>, img_size=2097152, flags=64, errp=0x7fff11737018) at /usr/src/debug/qemu-kvm-0.12.1.2/block.c:4678
param = 0x7f1b671cb4b0
create_options = 0x7f1b671cb3a0
backing_fmt = <value optimized out>
backing_file = <value optimized out>
bs = 0x0
drv = 0x7f1b66051740
proto_drv = <value optimized out>
backing_drv = 0x0
ret = <value optimized out>
#8 0x00007f1b65dec365 in img_create (argc=<value optimized out>, argv=0x7fff11737140) at /usr/src/debug/qemu-kvm-0.12.1.2/qemu-img.c:390
c = <value optimized out>
img_size = <value optimized out>
fmt = 0x7fff11738770 "qcow2"
base_fmt = 0x0
filename = 0x7fff11738776 "gluster://10.66.65.117/gv1/test-718"
base_filename = 0x0
options = 0x0
local_err = 0x0
#9 0x00007f1b640aed1d in __libc_start_main () from /lib64/libc.so.6
No symbol table info available.
#10 0x00007f1b65deb659 in _start ()
No symbol table info available.
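The kernel lines in dmesg and frame #0 of the backtrace can be cross-checked with a little arithmetic. The sketch below is not part of the original report: it subtracts each mapping base from the faulting instruction pointer and decodes the page-fault error code, using only values quoted in this bug. The helper names `lib_offset` and `decode_fault_error` are made up for illustration.

```shell
#!/bin/bash
# Offset of the faulting instruction inside libgfapi.so.0.0.0:
# instruction pointer minus the mapping base the kernel prints in brackets.
lib_offset() {
    printf '0x%x\n' $(( $1 - $2 ))
}

# x86 page-fault error code bits:
#   bit 0 set = protection violation, clear = page not present
#   bit 1 set = write access,         clear = read access
#   bit 2 set = fault in user mode,   clear = kernel mode
decode_fault_error() {
    local code=$1
    local kind access mode
    (( code & 1 )) && kind="protection violation" || kind="page not present"
    (( code & 2 )) && access="write" || access="read"
    (( code & 4 )) && mode="user" || mode="kernel"
    echo "$mode-mode $access, $kind"
}

lib_offset 0x7f408b3ac0e6 0x7f408b3a0000   # qemu-img[2086]
lib_offset 0x7f99dc4500e6 0x7f99dc444000   # qemu-img[2202]
lib_offset 0x7f5af33ac0e6 0x7f5af33a0000   # qemu-img[3178]
decode_fault_error 4
```

All three crashes resolve to the same offset, 0xc0e6, and the low 12 bits (0x0e6) match the frame #0 address 0x00007f1b64a620e6 in glfs_lseek, so the same instruction appears to fault every time. Error code 4 is a user-mode read of an unmapped page, and the fault address 0x20 is what reading a field at offset 0x20 through a NULL pointer would look like. That last point is an inference from the numbers, not something the report confirms.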
I believe this is a libglusterfs issue, rather than a QEMU issue. Moving to gluster to investigate.

Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.

The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com/
Description of problem:
qemu-img segfaults while creating a large number of images.

Version-Release number of selected component (if applicable):

Host: amd-9600b-8-1.englab.nay.redhat.com
qemu-kvm-debuginfo-0.12.1.2-2.424.el6.x86_64
qemu-img-0.12.1.2-2.424.el6.x86_64
qemu-kvm-tools-0.12.1.2-2.424.el6.x86_64
gpxe-roms-qemu-0.9.7-6.10.el6.noarch
qemu-kvm-0.12.1.2-2.424.el6.x86_64
kernel-2.6.32-431.17.1.el6.x86_64
glusterfs-api-3.4.0.59rhs-1.el6.x86_64
glusterfs-libs-3.4.0.59rhs-1.el6.x86_64
glusterfs-fuse-3.4.0.59rhs-1.el6.x86_64
glusterfs-3.4.0.59rhs-1.el6.x86_64

Gluster Server:
glusterfs-fuse-3.4.0.59rhs-1.el6rhs.x86_64
glusterfs-debuginfo-3.4.0.59rhs-1.el6rhs.x86_64
glusterfs-libs-3.4.0.59rhs-1.el6rhs.x86_64
glusterfs-api-3.4.0.59rhs-1.el6rhs.x86_64
glusterfs-geo-replication-3.4.0.59rhs-1.el6rhs.x86_64
glusterfs-api-devel-3.4.0.59rhs-1.el6rhs.x86_64
glusterfs-3.4.0.59rhs-1.el6rhs.x86_64
glusterfs-server-3.4.0.59rhs-1.el6rhs.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Use the script below to create qcow2 images on gluster.

#!/bin/bash
COUNT=0
while [ $COUNT -lt 1000 ]
do
    qemu-img create -f qcow2 gluster://rhs/gv0/test-$COUNT 2M &
    COUNT=$((1+$COUNT))
done

Actual results:
1. I/O error.

[2014-04-17 16:40:40.666506] I [client.c:2103:client_rpc_notify] 0-gv0-client-0: disconnected from 10.66.4.217:49153. Client process will keep trying to connect to glusterd until brick's port is available.
[2014-04-17 16:40:40.666752] I [client.c:2103:client_rpc_notify] 0-gv0-client-0: disconnected from 10.66.4.217:49153. Client process will keep trying to connect to glusterd until brick's port is available.
gluster://10.66.4.217/gv0/test-700: error while creating qcow2: Connection reset by peer
gluster://10.66.4.217/gv0/test-655: error while creating qcow2: Input/output error
gluster://10.66.4.217/gv0/test-783: error while creating qcow2: Input/output error
gluster://10.66.4.217/gv0/test-538: error while creating qcow2: Input/output error
gluster://10.66.4.217/gv0/test-809: error while creating qcow2: No such file or directory
[2014-04-17 16:40:42.178949] I [client.c:2103:client_rpc_notify] 0-gv0-client-0: disconnected from 10.66.4.217:49153. Client process will keep trying to connect to glusterd until brick's port is available.

2. qemu-img segfault

[root@amd-9600b-8-1 ~]# dmesg
qemu-img[2086]: segfault at 20 ip 00007f408b3ac0e6 sp 00007fffc6ca9510 error 4 in libgfapi.so.0.0.0[7f408b3a0000+16000]
qemu-img[2202]: segfault at 20 ip 00007f99dc4500e6 sp 00007fffaa6bcfd0 error 4 in libgfapi.so.0.0.0[7f99dc444000+16000]
qemu-img[3178]: segfault at 20 ip 00007f5af33ac0e6 sp 00007fffd266fb50 error 4 in libgfapi.so.0.0.0[7f5af33a0000+16000]
INFO: task qemu-img:2130 blocked for more than 120 seconds.
Not tainted 2.6.32-431.17.1.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
qemu-img D 0000000000000002 0 2130 1 0x00000080
ffff88022938fc98 0000000000000082 0000000000000000 ffff88022938fc00
ffff88022938fc68 ffffffff810aefd0 0000000000000202 0000000000000000
ffff88022938bab8 ffff88022938ffd8 000000000000fbc8 ffff88022938bab8
Call Trace:
[<ffffffff810aefd0>] ? exit_robust_list+0x90/0x160
[<ffffffff81076745>] exit_mm+0x95/0x180
[<ffffffff81076b8f>] do_exit+0x15f/0x870
[<ffffffff810772f8>] do_group_exit+0x58/0xd0
[<ffffffff8108cca6>] get_signal_to_deliver+0x1f6/0x460
[<ffffffff8100a265>] do_signal+0x75/0x800
[<ffffffff810b186b>] ? sys_futex+0x7b/0x170
[<ffffffff8100aa80>] do_notify_resume+0x90/0xc0
[<ffffffff8100b341>] int_signal+0x12/0x17
INFO: task qemu-img:2134 blocked for more than 120 seconds.
Not tainted 2.6.32-431.17.1.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Expected results:
qemu-img works well.
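The reproducer in the description starts 1000 background jobs and never collects their exit statuses, so the I/O errors and the segfaults are only visible in mixed console output and dmesg. A possible variant, sketched here as an illustration rather than anything taken from the report, waits for each job and labels its outcome. `QEMU_IMG`, `URL_PREFIX`, `TOTAL`, `classify_status` and `run_reproducer` are hypothetical names; the defaults mirror the original script.

```shell
#!/bin/bash
# Illustrative variant of the reporter's reproducer (assumption, not from
# the bug). The three variables below are hypothetical knobs.
QEMU_IMG=${QEMU_IMG:-qemu-img}
URL_PREFIX=${URL_PREFIX:-gluster://rhs/gv0/test-}
TOTAL=${TOTAL:-1000}

# Map a wait status to a label. bash reports death by signal N as 128+N,
# so a process killed by SIGSEGV (signal 11) yields status 139.
classify_status() {
    case $1 in
        0)   echo ok ;;
        139) echo segfault ;;
        *)   echo error ;;
    esac
}

run_reproducer() {
    local pids=() i
    # Launch all creates in parallel, as the original script does.
    for (( i = 0; i < TOTAL; i++ )); do
        "$QEMU_IMG" create -f qcow2 "${URL_PREFIX}${i}" 2M >/dev/null 2>&1 &
        pids[i]=$!
    done
    # Reap every job and print one "index label" line per image.
    for (( i = 0; i < TOTAL; i++ )); do
        wait "${pids[i]}"
        echo "$i $(classify_status $?)"
    done
}
```

Calling `run_reproducer | awk '{print $2}' | sort | uniq -c` would then give a count per outcome. Note that processes stuck in the hung-task state shown in dmesg would make `wait` block, so this sketch only helps for jobs that actually exit.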