Bug 810450
Summary: | glusterfs process crashed while running parallel dbench on multiple clients
---|---
Product: | [Community] GlusterFS
Component: | fuse
Status: | CLOSED CURRENTRELEASE
Severity: | high
Priority: | unspecified
Version: | mainline
Hardware: | x86_64
OS: | Linux
Reporter: | Vijaykumar Koppad <vkoppad>
Assignee: | shishir gowda <sgowda>
QA Contact: | Vijaykumar Koppad <vkoppad>
CC: | bbandari, gluster-bugs, nsathyan, shmohan, vbellur, vbhat
Fixed In Version: | glusterfs-3.4.0
Doc Type: | Bug Fix
Type: | Bug
Last Closed: | 2013-07-24 17:37:58 UTC
Bug Blocks: | 817967
CHANGE: http://review.gluster.com/3190 (stripe: make sure we have complete set of subvolumes before making fop) merged in master by Vijay Bellur (vijay)

*** Bug 804274 has been marked as a duplicate of this bug. ***

*** Bug 786094 has been marked as a duplicate of this bug. ***
Created attachment 575653
Client log file

Description of problem:
The glusterfs client process crashed while dbench was running on multiple clients simultaneously, in a geo-replication setup with a distributed-stripe volume as master.

Version-Release number of selected component (if applicable):
[cc2e9ad0751da55dfdcd86fea2d5b312a1cbd1b5]

Steps to Reproduce:
1. Set up ssh geo-replication with a distributed-stripe volume as master.
2. Run dbench on about 10 clients simultaneously.

Backtrace from gdb:
############################################################################
#0 0x00007f6583b1299b in stripe_writev (frame=0x7f658729ade4, this=0x16199d0, fd=0x16e4db4, vector=0x7f657401c240, count=1, offset=131072, flags=32770, iobref=0x7f65740330c0, xdata=0x0) at stripe.c:3508
#1 0x00007f65838d7469 in dht_writev (frame=0x7f6587291a14, this=0x161a590, fd=0x16e4db4, vector=0x7f657401c240, count=1, off=131072, flags=32770, iobref=0x7f65740330c0, xdata=0x0) at dht-inode-write.c:158
#2 0x00007f6583687251 in wb_sync (frame=0x7f6587093bf4, file=0x17b1420, winds=0x7f657bffe630) at write-behind.c:547
#3 0x00007f658368d8bd in wb_do_ops (frame=0x7f6587093bf4, file=0x17b1420, winds=0x7f657bffe630, unwinds=0x7f657bffe620, other_requests=0x7f657bffe610) at write-behind.c:1884
#4 0x00007f658368e131 in wb_process_queue (frame=0x7f6587093bf4, file=0x17b1420) at write-behind.c:2074
#5 0x00007f658368ebca in wb_writev (frame=0x7f658729a5d4, this=0x161b870, fd=0x16e4db4, vector=0x7f657400f6f0, count=1, offset=131072, flags=32770, iobref=0x7f657400fd90, xdata=0x0) at write-behind.c:2197
#6 0x00007f658347bea1 in ra_writev (frame=0x7f658729512c, this=0x161caa0, fd=0x16e4db4, vector=0x7f657400f6f0, count=1, offset=131072, flags=32770, iobref=0x7f657400fd90, xdata=0x0) at read-ahead.c:691
#7 0x00007f658326a258 in ioc_writev (frame=0x7f6587296758, this=0x161dc50, fd=0x16e4db4, vector=0x7f657400f6f0, count=1, offset=131072, flags=32770, iobref=0x7f657400fd90, xdata=0x0) at io-cache.c:1250
#8 0x00007f658304e884 in qr_writev (frame=0x7f6587298238, this=0x161ee00, fd=0x16e4db4, vector=0x7f657400f6f0, count=1, off=131072, wr_flags=32770, iobref=0x7f657400fd90, xdata=0x0) at quick-read.c:1544
#9 0x00007f6582e3e3d2 in mdc_writev (frame=0x7f658729ade4, this=0x1620010, fd=0x16e4db4, vector=0x7f657400f6f0, count=1, offset=131072, flags=32770, iobref=0x7f657400fd90, xdata=0x0) at md-cache.c:1342
#10 0x00007f6582c2c666 in io_stats_writev (frame=0x7f6587291a14, this=0x1621290, fd=0x16e4db4, vector=0x7f657400f6f0, count=1, offset=131072, flags=32770, iobref=0x7f657400fd90, xdata=0x0) at io-stats.c:2082
#11 0x00007f6586bb0b96 in fuse_write_resume (state=0x7f657400eff0) at fuse-bridge.c:2042
#12 0x00007f6586ba4164 in fuse_resolve_done (state=0x7f657400eff0) at fuse-resolve.c:453
#13 0x00007f6586ba423a in fuse_resolve_all (state=0x7f657400eff0) at fuse-resolve.c:482
#14 0x00007f6586ba412d in fuse_resolve (state=0x7f657400eff0) at fuse-resolve.c:439
#15 0x00007f6586ba4211 in fuse_resolve_all (state=0x7f657400eff0) at fuse-resolve.c:478
#16 0x00007f6586ba42b4 in fuse_resolve_continue (state=0x7f657400eff0) at fuse-resolve.c:498
#17 0x00007f6586ba3ef0 in fuse_resolve_fd (state=0x7f657400eff0) at fuse-resolve.c:351
#18 0x00007f6586ba40db in fuse_resolve (state=0x7f657400eff0) at fuse-resolve.c:428
#19 0x00007f6586ba41bc in fuse_resolve_all (state=0x7f657400eff0) at fuse-resolve.c:471
#20 0x00007f6586ba42f2 in fuse_resolve_and_resume (state=0x7f657400eff0, fn=0x7f6586bb05e5 <fuse_write_resume>) at fuse-resolve.c:511
#21 0x00007f6586bb0db3 in fuse_write (this=0x1601ad0, finh=0x7f6574002ce0, msg=0x7f658787c000) at fuse-bridge.c:2089
#22 0x00007f6586bba6e8 in fuse_thread_proc (data=0x1601ad0) at fuse-bridge.c:3962
#23 0x0000003259c077f1 in start_thread () from /lib64/libpthread.so.0
#24 0x00000032594e5ccd in clone () from /lib64/libc.so.6

(gdb) f 0
#0 0x00007f6583b1299b in stripe_writev (frame=0x7f658729ade4, this=0x16199d0, fd=0x16e4db4, vector=0x7f657401c240, count=1, offset=131072, flags=32770, iobref=0x7f65740330c0, xdata=0x0) at stripe.c:3508
3508 STACK_WIND (frame, stripe_writev_cbk, fctx->xl_array[idx],
(gdb) f 1
#1 0x00007f65838d7469 in dht_writev (frame=0x7f6587291a14, this=0x161a590, fd=0x16e4db4, vector=0x7f657401c240, count=1, off=131072, flags=32770, iobref=0x7f65740330c0, xdata=0x0) at dht-inode-write.c:158
158 STACK_WIND (frame, dht_writev_cbk,
(gdb) f 3
#3 0x00007f658368d8bd in wb_do_ops (frame=0x7f6587093bf4, file=0x17b1420, winds=0x7f657bffe630, unwinds=0x7f657bffe620, other_requests=0x7f657bffe610) at write-behind.c:1884
1884 ret = wb_sync (frame, file, winds);
(gdb) f 2
#2 0x00007f6583687251 in wb_sync (frame=0x7f6587093bf4, file=0x17b1420, winds=0x7f657bffe630) at write-behind.c:547
547 STACK_WIND (sync_frame, wb_sync_cbk,

#######################################################
Backtrace from the log:
#######################################################
[2012-04-05 04:10:25.870674] I [rpc-clnt.c:1669:rpc_clnt_reconfig] 0-doa-client-0: changing port to 24009 (from 0)
[2012-04-05 04:10:25.870873] I [rpc-clnt.c:1669:rpc_clnt_reconfig] 0-doa-client-1: changing port to 24010 (from 0)
[2012-04-05 04:10:25.871021] I [rpc-clnt.c:1669:rpc_clnt_reconfig] 0-doa-client-2: changing port to 24011 (from 0)
[2012-04-05 04:10:25.871178] I [rpc-clnt.c:1669:rpc_clnt_reconfig] 0-doa-client-3: changing port to 24012 (from 0)
[2012-04-05 04:10:25.871332] I [client.c:136:client_register_grace_timer] 0-doa-client-0: Registering a grace timer
[2012-04-05 04:10:25.871389] I [client.c:136:client_register_grace_timer] 0-doa-client-1: Registering a grace timer
[2012-04-05 04:10:25.871428] I [client.c:136:client_register_grace_timer] 0-doa-client-2: Registering a grace timer
[2012-04-05 04:10:25.871465] I [client.c:136:client_register_grace_timer] 0-doa-client-3: Registering a grace timer
[2012-04-05 04:10:29.833480] W [client.c:2078:client_rpc_notify] 0-doa-client-0: Cancelling the grace timer
[2012-04-05 04:10:29.833684] I [client-handshake.c:1632:select_server_supported_programs] 0-doa-client-0: Using Program GlusterFS 3git, Num (1298437), Version (330)
[2012-04-05 04:10:29.834137] I [client-handshake.c:1429:client_setvolume_cbk] 0-doa-client-0: Connected to 172.17.251.54:24009, attached to remote volume '/exportdir/d1'.
[2012-04-05 04:10:29.834170] I [client-handshake.c:1441:client_setvolume_cbk] 0-doa-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2012-04-05 04:10:29.892033] I [client-handshake.c:456:client_set_lk_version_cbk] 0-doa-client-0: Server lk version = 1
[2012-04-05 04:10:29.892582] W [client.c:2078:client_rpc_notify] 0-doa-client-1: Cancelling the grace timer
[2012-04-05 04:10:29.892777] I [client-handshake.c:1632:select_server_supported_programs] 0-doa-client-1: Using Program GlusterFS 3git, Num (1298437), Version (330)
[2012-04-05 04:10:29.893162] I [client-handshake.c:1429:client_setvolume_cbk] 0-doa-client-1: Connected to 172.17.251.54:24010, attached to remote volume '/exportdir/d2'.
[2012-04-05 04:10:29.893195] I [client-handshake.c:1441:client_setvolume_cbk] 0-doa-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2012-04-05 04:10:29.893858] I [client-handshake.c:456:client_set_lk_version_cbk] 0-doa-client-1: Server lk version = 1
[2012-04-05 04:10:29.899012] W [client.c:2078:client_rpc_notify] 0-doa-client-2: Cancelling the grace timer
[2012-04-05 04:10:29.899178] I [client-handshake.c:1632:select_server_supported_programs] 0-doa-client-2: Using Program GlusterFS 3git, Num (1298437), Version (330)
[2012-04-05 04:10:29.899548] I [client-handshake.c:1429:client_setvolume_cbk] 0-doa-client-2: Connected to 172.17.251.54:24011, attached to remote volume '/exportdir/d3'.
[2012-04-05 04:10:29.899591] I [client-handshake.c:1441:client_setvolume_cbk] 0-doa-client-2: Server and Client lk-version numbers are not same, reopening the fds
[2012-04-05 04:10:29.899868] I [client-handshake.c:456:client_set_lk_version_cbk] 0-doa-client-2: Server lk version = 1
[2012-04-05 04:10:29.904640] W [client.c:2078:client_rpc_notify] 0-doa-client-3: Cancelling the grace timer
[2012-04-05 04:10:29.904841] I [client-handshake.c:1632:select_server_supported_programs] 0-doa-client-3: Using Program GlusterFS 3git, Num (1298437), Version (330)
[2012-04-05 04:10:29.905198] I [client-handshake.c:1429:client_setvolume_cbk] 0-doa-client-3: Connected to 172.17.251.54:24012, attached to remote volume '/exportdir/d4'.
[2012-04-05 04:10:29.905229] I [client-handshake.c:1441:client_setvolume_cbk] 0-doa-client-3: Server and Client lk-version numbers are not same, reopening the fds
[2012-04-05 04:10:29.932341] I [fuse-bridge.c:4081:fuse_graph_setup] 0-fuse: switched to graph 0
[2012-04-05 04:10:29.932548] I [client-handshake.c:456:client_set_lk_version_cbk] 0-doa-client-3: Server lk version = 1
[2012-04-05 04:10:29.932877] I [fuse-bridge.c:3358:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.13
[2012-04-05 04:19:01.353877] W [client3_1-fops.c:263:client3_1_mknod_cbk] 0-doa-client-2: remote operation failed: File exists. Path: /clients/client13/~dmtmp/PM/MOVED.DOC
[2012-04-05 04:19:02.085299] W [client3_1-fops.c:263:client3_1_mknod_cbk] 0-doa-client-2: remote operation failed: File exists. Path: /clients/client10/~dmtmp/PM/T1.XLS
pending frames:
frame : type(1) op(WRITE)
frame : type(1) op(WRITE)
frame : type(1) op(WRITE)
frame : type(1) op(WRITE)
frame : type(1) op(WRITE)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2012-04-05 04:19:34
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3git
/lib64/libc.so.6[0x3259432900]
/usr/local/lib/glusterfs/3git/xlator/cluster/stripe.so(stripe_writev+0x798)[0x7f8fa8c7d99b]
/usr/local/lib/glusterfs/3git/xlator/cluster/distribute.so(dht_writev+0x50d)[0x7f8fa8a42469]
/usr/local/lib/glusterfs/3git/xlator/performance/write-behind.so(wb_sync+0x852)[0x7f8fa87f2251]
/usr/local/lib/glusterfs/3git/xlator/performance/write-behind.so(wb_do_ops+0x144)[0x7f8fa87f88bd]
/usr/local/lib/glusterfs/3git/xlator/performance/write-behind.so(wb_process_queue+0x2d1)[0x7f8fa87f9131]
/usr/local/lib/glusterfs/3git/xlator/performance/write-behind.so(wb_writev+0x8e0)[0x7f8fa87f9bca]
/usr/local/lib/glusterfs/3git/xlator/performance/read-ahead.so(ra_writev+0x407)[0x7f8fa85e6ea1]
/usr/local/lib/glusterfs/3git/xlator/performance/io-cache.so(ioc_writev+0x49a)[0x7f8fa83d5258]
/usr/local/lib/glusterfs/3git/xlator/performance/quick-read.so(qr_writev+0x790)[0x7f8fa81b9884]
/usr/local/lib/glusterfs/3git/xlator/performance/md-cache.so(mdc_writev+0x28f)[0x7f8fa3dfa3d2]
/usr/local/lib/glusterfs/3git/xlator/debug/io-stats.so(io_stats_writev+0x433)[0x7f8fa3be8666]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(fuse_write_resume+0x5b1)[0x7f8fabd1bb96]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(+0x8164)[0x7f8fabd0f164]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(+0x823a)[0x7f8fabd0f23a]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(+0x812d)[0x7f8fabd0f12d]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(+0x8211)[0x7f8fabd0f211]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(fuse_resolve_continue+0x24)[0x7f8fabd0f2b4]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(+0x7ef0)[0x7f8fabd0eef0]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(+0x80db)[0x7f8fabd0f0db]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(+0x81bc)[0x7f8fabd0f1bc]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(fuse_resolve_and_resume+0x37)[0x7f8fabd0f2f2]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(+0x14db3)[0x7f8fabd1bdb3]
/usr/local/lib/glusterfs/3git/xlator/mount/fuse.so(+0x1e6e8)[0x7f8fabd256e8]
/lib64/libpthread.so.0[0x3259c077f1]
/lib64/libc.so.6(clone+0x6d)[0x32594e5ccd]
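
Note on the crash site: frame #0 stops at stripe.c:3508, where the write is wound to fctx->xl_array[idx], and the merged change is titled "stripe: make sure we have complete set of subvolumes before making fop". The sketch below only illustrates that general kind of guard; it is not the actual patch. The struct layout, the function names (stripe_ctx_is_complete, stripe_pick_subvol), the round-robin offset-to-index mapping, and the choice of ENOTCONN are all assumptions made for the example and do not come from the GlusterFS source.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the GlusterFS structures. */
struct subvol {
    const char *name;
};

struct stripe_fd_ctx_sketch {
    size_t stripe_count;       /* number of subvolumes the file is striped over */
    size_t stripe_size;        /* bytes per stripe unit */
    struct subvol **xl_array;  /* one slot per stripe; NULL while unresolved */
};

/* Returns 0 when every slot is populated, otherwise a negative errno value. */
int stripe_ctx_is_complete(const struct stripe_fd_ctx_sketch *fctx)
{
    if (fctx == NULL || fctx->xl_array == NULL ||
        fctx->stripe_count == 0 || fctx->stripe_size == 0)
        return -EINVAL;

    for (size_t i = 0; i < fctx->stripe_count; i++) {
        if (fctx->xl_array[i] == NULL)
            return -ENOTCONN;  /* assumed error code for a missing subvolume */
    }
    return 0;
}

/*
 * Picks the subvolume for a write at `offset`. If the context is incomplete,
 * returns NULL and stores a positive errno in *op_errno, so the caller can
 * fail the operation instead of dereferencing an empty slot.
 */
struct subvol *stripe_pick_subvol(const struct stripe_fd_ctx_sketch *fctx,
                                  unsigned long long offset, int *op_errno)
{
    int ret = stripe_ctx_is_complete(fctx);
    if (ret != 0) {
        *op_errno = -ret;
        return NULL;
    }

    /* Assumed round-robin mapping of stripe units onto subvolumes. */
    size_t idx = (size_t)((offset / fctx->stripe_size) % fctx->stripe_count);
    *op_errno = 0;
    return fctx->xl_array[idx];
}
```

With a two-slot context whose second slot is still NULL, stripe_pick_subvol returns NULL and sets *op_errno to ENOTCONN rather than handing back an empty slot; whether the real fix reports the condition this way is not stated in this report.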