Bug 1439068 - Segmentation fault when creating a qcow2 with qemu-img
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Xavi Hernandez
Blocks: 1438813 1442933 1448345 1502812
 
Reported: 2017-04-05 07:23 UTC by Xavi Hernandez
Modified: 2017-10-26 14:36 UTC
CC List: 5 users

Fixed In Version: glusterfs-3.12.0
Clone Of: 1438813
Clones: 1442933 1448345
Last Closed: 2017-08-11 10:31:10 UTC
Regression: ---
Mount Type: ---
Documentation: ---



Description Xavi Hernandez 2017-04-05 07:23:47 UTC
+++ This bug was initially created as a clone of Bug #1438813 +++

Description of problem:

Creating a qemu qcow2 image results in a segmentation fault.

Version-Release number of selected component (if applicable):
glusterfs-client               3.10.1-1                           amd64
glusterfs-common               3.10.1-1                           amd64  
glusterfs-server               3.10.1-1                           amd64

qemu 2.7.1
Ubuntu kernel 4.4.0-68.88

Steps to Reproduce:

1.) Create a cluster of 3 nodes:

#gluster pool list
UUID					Hostname     	State
8bbaa8f0-fc2d-4696-b0cc-e2c7348eb63c	192.168.18.52	Connected 
8bbac8f0-fc2d-4696-b0cc-e2c7348eb63c	192.168.18.56	Connected 
ed1896d9-9772-428a-a0e5-c7f5e09adbdf	localhost    	Connected

2.) Create a volume:

#gluster volume create volume1 redundancy 1  transport tcp 192.168.18.51:/export/sdb1/brick 192.168.18.52:/export/sdb1/brick 192.168.18.56:/export/sdb1/brick

#gluster volume status
Status of volume: volume1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.18.51:/export/sdb1/brick      49152     0          Y       30009
Brick 192.168.18.52:/export/sdb1/brick      49152     0          Y       12738
Brick 192.168.18.56:/export/sdb1/brick      49152     0          Y       10165
Self-heal Daemon on localhost               N/A       N/A        Y       30030
Self-heal Daemon on 192.168.18.56           N/A       N/A        Y       10185
Self-heal Daemon on 192.168.18.52           N/A       N/A        Y       12758
 
Task Status of Volume volume1
------------------------------------------------------------------------------
There are no active volume tasks


#gluster volume info volume1
 
Volume Name: volume1
Type: Disperse
Volume ID: b272849c-c5b4-48a2-b8df-bf4c2a1dfb51
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.18.51:/export/sdb1/brick
Brick2: 192.168.18.52:/export/sdb1/brick
Brick3: 192.168.18.56:/export/sdb1/brick
Options Reconfigured:
transport.address-family: inet
nfs.disable: on


3.) Create a qemu image:

#qemu-img create -f qcow2 -o 'preallocation=metadata'  gluster://127.0.0.1/volume1/images/101/vm-101-disk-1.qcow2 1G

Formatting 'gluster://127.0.0.1/volume1/images/101/vm-101-disk-1.qcow2', fmt=qcow2 size=1073741824 encryption=off cluster_size=65536 preallocation=metadata lazy_refcounts=off refcount_bits=16

Segmentation fault

Additional info:

Reason for the segfault:

Frame 0:
#0  0x00007fffe91394c4 in ec_manager_seek (fop=0x7fffe782a020, state=0) at ec-inode-read.c:1577
1577	in ec-inode-read.c
(gdb) print cbk
$1 = (ec_cbk_data_t *) 0x0
(gdb) print fop
$2 = (ec_fop_data_t *) 0x7fffe782a020
(gdb) print fop->answer
$3 = (ec_cbk_data_t *) 0x7fffe6208020
(gdb) print fop->answer
$4 = (ec_cbk_data_t *) 0x7fffe6208020

gdb backtrace:

(gdb) bt
#0  0x00007fffe91394c4 in ec_manager_seek (fop=0x7fffe782a020, state=0) at ec-inode-read.c:1577
#1  0x00007fffe9123187 in __ec_manager (fop=0x7fffe782a020, error=0) at ec-common.c:2330
#2  0x00007fffe9123358 in ec_resume (fop=0x7fffe782a020, error=0) at ec-common.c:333
#3  0x00007fffe9123481 in ec_complete (fop=0x7fffe782a020) at ec-common.c:406
#4  0x00007fffe91367df in ec_seek_cbk (frame=0x7fffe6208020, cookie=0x0, this=0x7fffea8a2040, op_ret=-1, op_errno=-371255444, offset=4, xdata=0x0) at ec-inode-read.c:1534
#5  0x00007fffe93b3e03 in client3_3_seek_cbk (req=0x7fffea89fc40, iov=0x7fffe7817d20, count=0, myframe=0x7fffea83aa20) at client-rpc-fops.c:2205
#6  0x00007ffff5e0c380 in rpc_clnt_handle_reply (clnt=0x7fffe6208020, clnt@entry=0x7fffea80fc80, pollin=0x7fffea8781c0) at rpc-clnt.c:793
#7  0x00007ffff5e0c683 in rpc_clnt_notify (trans=<optimized out>, mydata=0x7fffea80fcb0, event=<optimized out>, data=0x7fffea8781c0) at rpc-clnt.c:986
#8  0x00007ffff5e089a3 in rpc_transport_notify (this=<optimized out>, event=<optimized out>, data=<optimized out>) at rpc-transport.c:538
#9  0x00007fffebc5c236 in socket_event_poll_in (this=0x7fffeaa5c240) at socket.c:2268
#10 0x00007fffebc5e3d5 in socket_event_handler (fd=<optimized out>, idx=4, data=0x7fffeaa5c240, poll_in=1, poll_out=0, poll_err=0) at socket.c:2398
#11 0x00007ffff6099746 in event_dispatch_epoll_handler (event=0x7fffe9df1cc0, event_pool=0x7ffff0a60040) at event-epoll.c:572
#12 event_dispatch_epoll_worker (data=0x7fffeb80f0c0) at event-epoll.c:675
#13 0x00007ffff3f00064 in start_thread (arg=0x7fffe9df2700) at pthread_create.c:309
#14 0x00007ffff3c3562d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

(gdb) thread apply all bt

Thread 17 (Thread 0x7fffe3bff700 (LWP 8246)):
#0  0x00007ffff3c35c03 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007ffff609963b in event_dispatch_epoll_worker (data=0x7fffeab2b2a0) at event-epoll.c:665
#2  0x00007ffff3f00064 in start_thread (arg=0x7fffe3bff700) at pthread_create.c:309
#3  0x00007ffff3c3562d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 16 (Thread 0x7ffff104c700 (LWP 8245)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007fffe4fcfd90 in iot_worker (data=0x7fffea87f240) at io-threads.c:191
#2  0x00007ffff3f00064 in start_thread (arg=0x7ffff104c700) at pthread_create.c:309
#3  0x00007ffff3c3562d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 15 (Thread 0x7fffe9df2700 (LWP 8244)):
#0  0x00007fffe91394c4 in ec_manager_seek (fop=0x7fffe782a020, state=0) at ec-inode-read.c:1577
#1  0x00007fffe9123187 in __ec_manager (fop=0x7fffe782a020, error=0) at ec-common.c:2330
#2  0x00007fffe9123358 in ec_resume (fop=0x7fffe782a020, error=0) at ec-common.c:333
#3  0x00007fffe9123481 in ec_complete (fop=0x7fffe782a020) at ec-common.c:406
#4  0x00007fffe91367df in ec_seek_cbk (frame=0x7fffe6208020, cookie=0x0, this=0x7fffea8a2040, op_ret=-1, op_errno=-371255444, offset=4, xdata=0x0) at ec-inode-read.c:1534
#5  0x00007fffe93b3e03 in client3_3_seek_cbk (req=0x7fffea89fc40, iov=0x7fffe7817d20, count=0, myframe=0x7fffea83aa20) at client-rpc-fops.c:2205
#6  0x00007ffff5e0c380 in rpc_clnt_handle_reply (clnt=0x7fffe6208020, clnt@entry=0x7fffea80fc80, pollin=0x7fffea8781c0) at rpc-clnt.c:793
#7  0x00007ffff5e0c683 in rpc_clnt_notify (trans=<optimized out>, mydata=0x7fffea80fcb0, event=<optimized out>, data=0x7fffea8781c0) at rpc-clnt.c:986
#8  0x00007ffff5e089a3 in rpc_transport_notify (this=<optimized out>, event=<optimized out>, data=<optimized out>) at rpc-transport.c:538
#9  0x00007fffebc5c236 in socket_event_poll_in (this=0x7fffeaa5c240) at socket.c:2268
#10 0x00007fffebc5e3d5 in socket_event_handler (fd=<optimized out>, idx=4, data=0x7fffeaa5c240, poll_in=1, poll_out=0, poll_err=0) at socket.c:2398
#11 0x00007ffff6099746 in event_dispatch_epoll_handler (event=0x7fffe9df1cc0, event_pool=0x7ffff0a60040) at event-epoll.c:572
#12 event_dispatch_epoll_worker (data=0x7fffeb80f0c0) at event-epoll.c:675
#13 0x00007ffff3f00064 in start_thread (arg=0x7fffe9df2700) at pthread_create.c:309
#14 0x00007ffff3c3562d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 14 (Thread 0x7fffea5f3700 (LWP 8243)):
#0  0x00007ffff3f0149b in pthread_join (threadid=140737117103872, thread_return=thread_return@entry=0x0) at pthread_join.c:92
#1  0x00007ffff6099b8b in event_dispatch_epoll (event_pool=0x7ffff0a60040) at event-epoll.c:759
#2  0x00007ffff631a1d4 in glfs_poller (data=<optimized out>) at glfs.c:654
#3  0x00007ffff3f00064 in start_thread (arg=0x7fffea5f3700) at pthread_create.c:309
#4  0x00007ffff3c3562d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 13 (Thread 0x7fffeb7ff700 (LWP 8242)):
#0  0x00007ffff3f0714d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007ffff604f926 in gf_timer_proc (data=0x7fffea840380) at timer.c:164
#2  0x00007ffff3f00064 in start_thread (arg=0x7fffeb7ff700) at pthread_create.c:309
#3  0x00007ffff3c3562d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 12 (Thread 0x7fffe77ff700 (LWP 8241)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007ffff6079b68 in syncenv_task (proc=proc@entry=0x7fffea8f4400) at syncop.c:603
#2  0x00007ffff607a8a0 in syncenv_processor (thdata=0x7fffea8f4400) at syncop.c:695
#3  0x00007ffff3f00064 in start_thread (arg=0x7fffe77ff700) at pthread_create.c:309
#4  0x00007ffff3c3562d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 11 (Thread 0x7fffed072700 (LWP 8240)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007ffff6079b68 in syncenv_task (proc=proc@entry=0x7fffea8f4040) at syncop.c:603
#2  0x00007ffff607a8a0 in syncenv_processor (thdata=0x7fffea8f4040) at syncop.c:695
#3  0x00007ffff3f00064 in start_thread (arg=0x7fffed072700) at pthread_create.c:309
#4  0x00007ffff3c3562d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 3 (Thread 0x7fffef5fe700 (LWP 8231)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007ffff6079b68 in syncenv_task (proc=proc@entry=0x7ffff0a24400) at syncop.c:603
#2  0x00007ffff607a8a0 in syncenv_processor (thdata=0x7ffff0a24400) at syncop.c:695
#3  0x00007ffff3f00064 in start_thread (arg=0x7fffef5fe700) at pthread_create.c:309
#4  0x00007ffff3c3562d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 2 (Thread 0x7fffefdff700 (LWP 8230)):
#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
#1  0x00007ffff6079b68 in syncenv_task (proc=proc@entry=0x7ffff0a24040) at syncop.c:603
#2  0x00007ffff607a8a0 in syncenv_processor (thdata=0x7ffff0a24040) at syncop.c:695
#3  0x00007ffff3f00064 in start_thread (arg=0x7fffefdff700) at pthread_create.c:309
#4  0x00007ffff3c3562d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x7ffff7fd9940 (LWP 8226)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007ffff6089153 in syncop_seek (subvol=subvol@entry=0x7fffea8aa440, fd=fd@entry=0x7ffff0832420, offset=offset@entry=0, what=what@entry=GF_SEEK_DATA, xdata_in=xdata_in@entry=0x0, 
    off=off@entry=0x7ffff09c0c48) at syncop.c:2865
#2  0x00007ffff632127d in glfs_seek (whence=3, offset=0, glfd=0x7fffea8c5100) at glfs-fops.c:597
#3  pub_glfs_lseek (glfd=glfd@entry=0x7fffea8c5100, offset=0, whence=whence@entry=3) at glfs-fops.c:646
#4  0x00005555555dcfc3 in qemu_gluster_test_seek (fd=0x7fffea8c5100) at block/gluster.c:672
#5  qemu_gluster_open (bs=<optimized out>, options=<optimized out>, bdrv_flags=<optimized out>, errp=<optimized out>) at block/gluster.c:738
#6  0x000055555558334d in bdrv_open_common (errp=<optimized out>, options=<optimized out>, file=<optimized out>, bs=<optimized out>) at block.c:984
#7  bdrv_open_inherit (filename=0x7ffff08be100 "", reference=0x7fffea904000 "", options=0x7fffea8de000, flags=32770, parent=0x0, child_role=child_role@entry=0x0, errp=0x7ffff09c0f30) at block.c:1687
#8  0x0000555555584391 in bdrv_open (filename=filename@entry=0x7ffff080e280 "gluster://127.0.0.1/volume1/images/101/vm-101-disk-1.qcow2", reference=reference@entry=0x0, options=options@entry=0x0, 
    flags=flags@entry=32770, errp=errp@entry=0x7ffff09c0f30) at block.c:1778
#9  0x00005555555bcbcb in blk_new_open (filename=0x7ffff080e280 "gluster://127.0.0.1/volume1/images/101/vm-101-disk-1.qcow2", reference=0x0, options=0x0, flags=32770, errp=0x7ffff09c0f30)
    at block/block-backend.c:160
#10 0x00005555555a1579 in qcow2_create2 (errp=<optimized out>, refcount_order=<optimized out>, version=<optimized out>, opts=<optimized out>, prealloc=<optimized out>, cluster_size=<optimized out>, 
    flags=<optimized out>, backing_format=<optimized out>, backing_file=<optimized out>, total_size=<optimized out>, filename=<optimized out>) at block/qcow2.c:2179
#11 qcow2_create (filename=0x7ffff09c06c4 "\001", opts=0x0, errp=0x1) at block/qcow2.c:2405
#12 0x000055555557e697 in bdrv_create_co_entry (opaque=0x7fffffffe950) at block.c:305
#13 0x00005555556369fa in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:78
#14 0x00007ffff3b92f00 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#15 0x00007fffffffe1c0 in ?? ()
#16 0x0000000000000000 in ?? ()

--- Additional comment from Xavier Hernandez on 2017-04-05 08:45:19 CEST ---

This seems to be a bug in the disperse module. I'll debug it.

Comment 1 Worker Ant 2017-04-05 07:56:00 UTC
REVIEW: https://review.gluster.org/16998 (cluster/ec: fix incorrect answer check in seek fop) posted (#1) for review on master by Xavier Hernandez (xhernandez@datalab.es)

Comment 2 Worker Ant 2017-05-09 09:51:22 UTC
COMMIT: https://review.gluster.org/16998 committed in master by Pranith Kumar Karampuri (pkarampu@redhat.com) 
------
commit af226d250bcced782d19412bd7de1ca32834c8eb
Author: Xavier Hernandez <xhernandez@datalab.es>
Date:   Wed Apr 5 09:52:39 2017 +0200

    cluster/ec: fix incorrect answer check in seek fop
    
    A bad check in the answer of a seek request caused a segmentation
    fault when seek reported an error.
    
    Change-Id: Ifb25ae8bf7cc4019d46171c431f7b09b376960e8
    BUG: 1439068
    Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
    Reviewed-on: https://review.gluster.org/16998
    Smoke: Gluster Build System <jenkins@build.gluster.org>
    NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
    CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
    Reviewed-by: Amar Tumballi <amarts@redhat.com>
    Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>

Comment 3 Shyamsundar 2017-09-05 17:26:58 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.0, please open a new bug report.

glusterfs-3.12.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-September/000082.html
[2] https://www.gluster.org/pipermail/gluster-users/

