Bug 1242423 - Disperse volume : client glusterfs crashed while running IO
Summary: Disperse volume : client glusterfs crashed while running IO
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: disperse
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.1.0
Assignee: Bug Updates Notification Mailing List
QA Contact: Bhaskarakiran
URL:
Whiteboard:
Depends On:
Blocks: 1202842 1223636 1243187
Reported: 2015-07-13 10:10 UTC by Bhaskarakiran
Modified: 2016-11-23 23:11 UTC (History)

Fixed In Version: glusterfs-3.7.1-10
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1243187
Environment:
Last Closed: 2015-07-29 05:11:57 UTC
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2015:1495
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Gluster Storage 3.1 update
Last Updated: 2015-07-29 08:26:26 UTC

Description Bhaskarakiran 2015-07-13 10:10:47 UTC
Description of problem:
======================

The glusterfs client crashed while running IO on a FUSE mount (file creation with dd using varying block sizes, directory creation, and Linux untars).

Backtrace:
=========

(gdb) bt
#0  ec_unlock_timer_del (link=0x7fe43c231c2c) at ec-common.c:1696
#1  0x00007fe451f28713 in gf_timer_proc (ctx=0x7fe452d71010) at timer.c:194
#2  0x00007fe450ff1a51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fe45095b96d in clone () from /lib64/libc.so.6
(gdb)  t a a  bt

Thread 9 (Thread 0x7fe437fff700 (LWP 15718)):
#0  0x00007fe450953a57 in writev () from /lib64/libc.so.6
#1  0x00007fe4492251ff in send_fuse_iov (this=<value optimized out>, finh=<value optimized out>, iov_out=0x7fe437ffe120, count=2) at fuse-bridge.c:191
#2  0x00007fe44922541a in send_fuse_data (this=<value optimized out>, finh=<value optimized out>, data=<value optimized out>, size=<value optimized out>)
    at fuse-bridge.c:230
#3  0x00007fe449235bda in fuse_writev_cbk (frame=0x7fe44f9c0958, cookie=<value optimized out>, this=0x7fe452d9e600, op_ret=65536, op_errno=0, 
    stbuf=0x7fe437ffe500, postbuf=0x7fe437ffe500, xdata=0x0) at fuse-bridge.c:2269
#4  0x00007fe43eb2a3ae in io_stats_writev_cbk (frame=0x7fe44fb0b380, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7fe437ffe500, postbuf=0x7fe437ffe500, xdata=0x0) at io-stats.c:1404
#5  0x00007fe43ef4ddb0 in mdc_writev_cbk (frame=0x7fe44fb57fb0, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, 
    op_errno=<value optimized out>, prebuf=0x7fe437ffe500, postbuf=0x7fe437ffe500, xdata=0x0) at md-cache.c:1524
#6  0x00007fe451f1706c in default_writev_cbk (frame=0x7fe44fb28dd8, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=<value optimized out>, postbuf=0x7fe437ffe500, xdata=0x0) at defaults.c:1021
#7  0x00007fe451f1706c in default_writev_cbk (frame=0x7fe44fb3a0a4, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=<value optimized out>, postbuf=0x7fe437ffe500, xdata=0x0) at defaults.c:1021
#8  0x00007fe43f56e974 in ioc_writev_cbk (frame=0x7fe44fb2da70, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7fe437ffe500, postbuf=0x7fe437ffe500, xdata=0x0) at io-cache.c:1244
#9  0x00007fe43f982bd9 in ra_writev_cbk (frame=0x7fe44fb35360, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7fe437ffe500, postbuf=0x7fe437ffe500, xdata=0x0) at read-ahead.c:661
#10 0x00007fe43fb8f30c in wb_do_unwinds (wb_inode=<value optimized out>, lies=0x7fe437ffe5d0) at write-behind.c:924
#11 0x00007fe43fb91b9c in wb_process_queue (wb_inode=0x7fe4258cb4a0) at write-behind.c:1213
#12 0x00007fe43fb924f8 in wb_writev (frame=0x7fe44fadae2c, this=<value optimized out>, fd=0x7fe441da9cd0, vector=0x7fe3eb17e6f0, count=1, offset=6160384, 
    flags=32769, iobref=0x7fe401611f30, xdata=0x0) at write-behind.c:1333
#13 0x00007fe43f98288a in ra_writev (frame=0x7fe44fb35360, this=0x7fe439c61a20, fd=0x7fe441da9cd0, vector=0x7fe3eb17e6f0, count=1, offset=6160384, 
    flags=32769, iobref=0x7fe401611f30, xdata=0x0) at read-ahead.c:689
#14 0x00007fe451f0f960 in default_writev (frame=0x7fe44fb35360, this=0x7fe439c62780, fd=0x7fe441da9cd0, vector=0x7fe3eb17e6f0, count=<value optimized out>, 
    off=<value optimized out>, flags=32769, iobref=0x7fe401611f30, xdata=0x0) at defaults.c:1836
#15 0x00007fe43f56e6c6 in ioc_writev (frame=0x7fe44fb2da70, this=0x7fe439c63610, fd=0x7fe441da9cd0, vector=0x7fe3eb17e6f0, count=1, offset=6160384, 
    flags=32769, iobref=0x7fe401611f30, xdata=0x0) at io-cache.c:1285
#16 0x00007fe43f362040 in qr_writev (frame=0x7fe44fb3a0a4, this=0x7fe439c64360, fd=0x7fe441da9cd0, iov=0x7fe3eb17e6f0, count=1, offset=6160384, flags=32769, 
    iobref=0x7fe401611f30, xdata=0x0) at quick-read.c:636
#17 0x00007fe451f13a23 in default_writev_resume (frame=0x7fe44fb28dd8, this=0x7fe439c65120, fd=0x7fe441da9cd0, vector=0x7fe3eb17e6f0, count=1, off=6160384, 
    flags=32769, iobref=0x7fe401611f30, xdata=0x0) at defaults.c:1395
#18 0x00007fe451f2d3aa in call_resume_wind (stub=<value optimized out>) at call-stub.c:2124
#19 0x00007fe451f31640 in call_resume (stub=0x7fe44f5a5c24) at call-stub.c:2576
#20 0x00007fe43f158760 in open_and_resume (this=0x7fe439c65120, fd=0x7fe441da9cd0, stub=0x7fe44f5a5c24) at open-behind.c:242
#21 0x00007fe43f159dfb in ob_writev (frame=0x7fe44fb28dd8, this=0x7fe439c65120, fd=0x7fe441da9cd0, iov=<value optimized out>, count=<value optimized out>, 
    offset=<value optimized out>, flags=32769, iobref=0x7fe401611f30, xdata=0x0) at open-behind.c:414
#22 0x00007fe43ef4ab3b in mdc_writev (frame=0x7fe44fb57fb0, this=0x7fe439c65ee0, fd=0x7fe441da9cd0, vector=0x7fe405cc40c0, count=1, offset=6160384, 
    flags=32769, iobref=0x7fe401611f30, xdata=0x0) at md-cache.c:1542
#23 0x00007fe451f0f960 in default_writev (frame=0x7fe44fb57fb0, this=0x7fe439c68030, fd=0x7fe441da9cd0, vector=0x7fe405cc40c0, count=<value optimized out>, 
    off=<value optimized out>, flags=32769, iobref=0x7fe401611f30, xdata=0x0) at defaults.c:1836
#24 0x00007fe43eb227a7 in io_stats_writev (frame=0x7fe44fb0b380, this=0x7fe439c69180, fd=0x7fe441da9cd0, vector=0x7fe405cc40c0, count=1, offset=6160384, 
    flags=32769, iobref=0x7fe401611f30, xdata=0x0) at io-stats.c:2164
#25 0x00007fe451f0f960 in default_writev (frame=0x7fe44fb0b380, this=0x7fe439c6a180, fd=0x7fe441da9cd0, vector=0x7fe405cc40c0, count=<value optimized out>, 
    off=<value optimized out>, flags=32769, iobref=0x7fe401611f30, xdata=0x0) at defaults.c:1836
#26 0x00007fe43e90904e in meta_writev (frame=0x7fe44fb0b380, this=0x7fe439c6a180, fd=0x7fe441da9cd0, iov=0x7fe405cc40c0, count=<value optimized out>, 
    offset=<value optimized out>, flags=32769, iobref=0x7fe401611f30, xdata=0x0) at meta.c:147
#27 0x00007fe4492313eb in fuse_write_resume (state=<value optimized out>) at fuse-bridge.c:2309
#28 0x00007fe4492245f5 in fuse_resolve_done (state=<value optimized out>) at fuse-resolve.c:644
#29 fuse_resolve_all (state=<value optimized out>) at fuse-resolve.c:671
#30 0x00007fe449224326 in fuse_resolve (state=0x7fe405cc39d0) at fuse-resolve.c:635
#31 0x00007fe44922463e in fuse_resolve_all (state=<value optimized out>) at fuse-resolve.c:667
#32 0x00007fe4492246a3 in fuse_resolve_continue (state=0x7fe405cc39d0) at fuse-resolve.c:687
#33 0x00007fe4492242ce in fuse_resolve_fd (state=0x7fe405cc39d0) at fuse-resolve.c:547
#34 fuse_resolve (state=0x7fe405cc39d0) at fuse-resolve.c:624
#35 0x00007fe44922461e in fuse_resolve_all (state=<value optimized out>) at fuse-resolve.c:660
#36 0x00007fe449224668 in fuse_resolve_and_resume (state=0x7fe405cc39d0, fn=0x7fe449231170 <fuse_write_resume>) at fuse-resolve.c:699
#37 0x00007fe449239240 in fuse_thread_proc (data=0x7fe452d9e600) at fuse-bridge.c:4903
#38 0x00007fe450ff1a51 in start_thread () from /lib64/libpthread.so.0
#39 0x00007fe45095b96d in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x7fe44881a700 (LWP 15709)):
#0  0x00007fe450ff9535 in sigwait () from /lib64/libpthread.so.0
#1  0x00007fe4523d002b in glusterfs_sigwaiter ()
#2  0x00007fe450ff1a51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fe45095b96d in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x7fe4523b3740 (LWP 15707)):
#0  0x00007fe450ff22ad in pthread_join () from /lib64/libpthread.so.0
#1  0x00007fe451f6a53d in event_dispatch_epoll (event_pool=0x7fe452d8fc90) at event-epoll.c:762
#2  0x00007fe4523d1ef1 in main ()

Thread 6 (Thread 0x7fe4375fe700 (LWP 15719)):
#0  0x00007fe450ff87dd in read () from /lib64/libpthread.so.0
#1  0x00007fe44922c9b3 in read (data=<value optimized out>) at /usr/include/bits/unistd.h:45
#2  notify_kernel_loop (data=<value optimized out>) at fuse-bridge.c:3824
#3  0x00007fe450ff1a51 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fe45095b96d in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7fe43e903700 (LWP 15715)):
#0  0x00007fe45095bf63 in epoll_wait () from /lib64/libc.so.6
#1  0x00007fe451f6a8c1 in event_dispatch_epoll_worker (data=0x7fe440028e40) at event-epoll.c:668
#2  0x00007fe450ff1a51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fe45095b96d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7fe447418700 (LWP 15711)):
#0  0x00007fe450ff5a0e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fe451f4dd6b in syncenv_task (proc=0x7fe452db1a50) at syncop.c:607
#2  0x00007fe451f52b80 in syncenv_processor (thdata=0x7fe452db1a50) at syncop.c:699
#3  0x00007fe450ff1a51 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fe45095b96d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7fe447e19700 (LWP 15710)):
#0  0x00007fe450ff5a0e in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fe451f4dd6b in syncenv_task (proc=0x7fe452db1690) at syncop.c:607
#2  0x00007fe451f52b80 in syncenv_processor (thdata=0x7fe452db1690) at syncop.c:699
#3  0x00007fe450ff1a51 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fe45095b96d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7fe444fdf700 (LWP 15714)):
#0  0x00007fe45095bf63 in epoll_wait () from /lib64/libc.so.6
#1  0x00007fe451f6a8c1 in event_dispatch_epoll_worker (data=0x7fe452de4c00) at event-epoll.c:668
#2  0x00007fe450ff1a51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fe45095b96d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7fe44921b700 (LWP 15708)):
#0  ec_unlock_timer_del (link=0x7fe43c231c2c) at ec-common.c:1696
#1  0x00007fe451f28713 in gf_timer_proc (ctx=0x7fe452d71010) at timer.c:194
#2  0x00007fe450ff1a51 in start_thread () from /lib64/libpthread.so.0
#3  0x00007fe45095b96d in clone () from /lib64/libc.so.6
(gdb) 
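
To see at a glance where each thread was when the core was taken, the innermost frame (#0) of every thread can be pulled out of a saved "thread apply all bt" dump. The following is only a sketch and not part of the original report: the helper name top_frames and the temporary sample file are hypothetical, and the sample lines are copied from the backtrace above.

```shell
#!/bin/sh
# Hypothetical helper: print the innermost frame (#0) of each thread
# found in a saved "thread apply all bt" dump.
top_frames() {
    awk '
        # Remember the thread number from lines such as "Thread 8 (Thread 0x... (LWP ...)):"
        /^Thread [0-9]+ / { tid = $2 }
        # Frames with an address read "#0 0x... in func (...)"; frames without read "#0 func (...)"
        /^#0 /            { fn = ($3 == "in") ? $4 : $2
                            print "Thread " tid ": " fn }
    ' "$1"
}

# Small sample copied from the backtrace in this report
sample=$(mktemp)
cat > "$sample" <<'EOF'
Thread 8 (Thread 0x7fe44881a700 (LWP 15709)):
#0  0x00007fe450ff9535 in sigwait () from /lib64/libpthread.so.0
#1  0x00007fe4523d002b in glusterfs_sigwaiter ()

Thread 1 (Thread 0x7fe44921b700 (LWP 15708)):
#0  ec_unlock_timer_del (link=0x7fe43c231c2c) at ec-common.c:1696
#1  0x00007fe451f28713 in gf_timer_proc (ctx=0x7fe452d71010) at timer.c:194
EOF

summary=$(top_frames "$sample")
printf '%s\n' "$summary"
rm -f "$sample"
```

Run against the full dump, this would list Thread 1 in ec_unlock_timer_del (called from gf_timer_proc), matching the first backtrace shown above.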


Version-Release number of selected component (if applicable):
=============================================================
3.7.1-9

How reproducible:
=================
Seen once

Steps to Reproduce:
As in description. Run IO on fuse mount.
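
A minimal sketch of the IO mix named in the description is given below. The report does not give the exact commands, so everything concrete here is an assumption: the MNT and TARBALL variables, the block sizes, the counts, and the file and directory names are illustrative only. MNT should point at the FUSE mount of the disperse volume; if it is unset, the script falls back to a temporary directory so that it can be tried safely.

```shell
#!/bin/sh
# Sketch of the workload: dd with varying block sizes, directory
# creation, and an optional untar. MNT and TARBALL are assumptions;
# the report names neither.
MNT=${MNT:-$(mktemp -d)}   # FUSE mount point of the disperse volume
TARBALL=${TARBALL:-}       # e.g. a Linux source tarball; skipped if unset

# File creation with dd, varying the block size (in bytes)
for bs in 4096 65536 1048576; do
    dd if=/dev/zero of="$MNT/file_bs_$bs" bs="$bs" count=4 2>/dev/null
done

# Directory creation
i=1
while [ "$i" -le 10 ]; do
    mkdir -p "$MNT/dir_$i"
    i=$((i + 1))
done

# Untar, only when a tarball was supplied
if [ -n "$TARBALL" ]; then
    mkdir -p "$MNT/untar"
    tar -xf "$TARBALL" -C "$MNT/untar"
fi
```

The crash was intermittent (seen once, then reproduced a second time), so a single pass of a script like this may not trigger it.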

Actual results:


Expected results:


Additional info:
================
The core file will be copied to the sosreports folder.

Comment 2 Bhaskarakiran 2015-07-13 11:25:15 UTC
Reproduced the bug a second time with plain IO (dd, mkdir and Linux untars).

Comment 5 Bhaskarakiran 2015-07-17 13:08:24 UTC
Verified this on 3.7.1-10 and did not see the crash. Moving this to fixed.

Comment 6 errata-xmlrpc 2015-07-29 05:11:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html

