Hide Forgot
This looks to be a duplicate of bug 762600. Bug 868 has been fixed in the 3.0.5 release candidates. Can you please check if your crash is fixed by 3.0.5rc9 or by applying this patch on to GlusterFS 3.0.4? http://patches.gluster.com/patch/3213
Replicated setup. 2 servers and 6 clients. On 4 clients everything is ok on 2 other clients I see multiple crashs during the day. On these 2 "bad" clients we are using a script which does many reads in one particular directory (mounted with glusterfs). This script is in PHP and uses threads. I post below the (same) configuration file used on all clients and full gdb backtrace. ** Servers [...] ################# # CONF BK2010 ################# volume posix4 type storage/posix option directory /data/exports/export_conf2010 end-volume volume locks4 type features/locks subvolumes posix4 end-volume volume brick4 type performance/io-threads option thread-count 8 subvolumes locks4 end-volume [...] ** Clients ## file auto generated by /bin/glusterfs-volgen (mount.vol) # Cmd line: # $ /bin/glusterfs-volgen --name bkconf2010 --raid 1 10.110.0.41:/data/exports/export_conf2010 10.110.0.4 2:/data/exports/export_conf2010 # RAID 1 # TRANSPORT-TYPE tcp volume 10.110.0.42-4 type protocol/client option transport-type tcp option remote-host 10.110.0.42 option transport.socket.nodelay on option transport.remote-port 6996 option remote-subvolume brick4 end-volume volume 10.110.0.41-4 type protocol/client option transport-type tcp option remote-host 10.110.0.41 option transport.socket.nodelay on option transport.remote-port 6996 option remote-subvolume brick4 end-volume volume mirror-4 type cluster/replicate subvolumes 10.110.0.41-4 10.110.0.42-4 end-volume volume readahead4 type performance/read-ahead option page-count 4 subvolumes mirror-4 end-volume volume iocache4 type performance/io-cache option cache-size `echo $(( $(grep 'MemTotal' /proc/meminfo | sed 's/[^0-9]//g') / 5120 ))`MB option cache-timeout 1 subvolumes readahead4 end-volume volume quickread4 type performance/quick-read option cache-timeout 1 option max-file-size 64kB subvolumes iocache4 end-volume volume writebehind4 type performance/write-behind option enable-trickling-writes false option flush-behind on option cache-size 4MB subvolumes quickread4 end-volume volume statprefetch4 type performance/stat-prefetch subvolumes writebehind4 end-volume ** Backtrace shell# gdb /sbin/glusterfs /core.17091 GNU gdb 6.8-debian Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu"... warning: Can't read pathname for load map: Input/output error. Reading symbols from /lib/libglusterfs.so.0...done. Loaded symbols for /lib/libglusterfs.so.0 Reading symbols from /lib/libdl.so.2...done. Loaded symbols for /lib/libdl.so.2 Reading symbols from /lib/libpthread.so.0...done. Loaded symbols for /lib/libpthread.so.0 Reading symbols from /lib/libc.so.6...done. Loaded symbols for /lib/libc.so.6 Reading symbols from /lib/ld-linux-x86-64.so.2...done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Reading symbols from /lib/glusterfs/3.0.4/xlator/protocol/client.so...done. Loaded symbols for /lib/glusterfs/3.0.4/xlator/protocol/client.so Reading symbols from /lib/glusterfs/3.0.4/xlator/cluster/replicate.so...done. Loaded symbols for /lib/glusterfs/3.0.4/xlator/cluster/replicate.so Reading symbols from /lib/glusterfs/3.0.4/xlator/performance/read-ahead.so...done. Loaded symbols for /lib/glusterfs/3.0.4/xlator/performance/read-ahead.so Reading symbols from /lib/glusterfs/3.0.4/xlator/performance/io-cache.so...done. Loaded symbols for /lib/glusterfs/3.0.4/xlator/performance/io-cache.so Reading symbols from /lib/glusterfs/3.0.4/xlator/performance/quick-read.so...done. Loaded symbols for /lib/glusterfs/3.0.4/xlator/performance/quick-read.so Reading symbols from /lib/glusterfs/3.0.4/xlator/performance/write-behind.so...done. Loaded symbols for /lib/glusterfs/3.0.4/xlator/performance/write-behind.so Reading symbols from /lib/glusterfs/3.0.4/xlator/performance/stat-prefetch.so...done. Loaded symbols for /lib/glusterfs/3.0.4/xlator/performance/stat-prefetch.so Reading symbols from /lib/glusterfs/3.0.4/xlator/mount/fuse.so...done. Loaded symbols for /lib/glusterfs/3.0.4/xlator/mount/fuse.so Reading symbols from /lib/glusterfs/3.0.4/transport/socket.so...done. Loaded symbols for /lib/glusterfs/3.0.4/transport/socket.so Reading symbols from /lib/libnss_files.so.2...done. Loaded symbols for /lib/libnss_files.so.2 Reading symbols from /lib/libgcc_s.so.1...done. Loaded symbols for /lib/libgcc_s.so.1 Core was generated by `/sbin/glusterfs --log-level=NORMAL --volfile=/etc/glusterfs/bkconf2010.vol /dat'. Program terminated with signal 11, Segmentation fault. [New process 17097] [New process 17091] [New process 17092] #0 0x00007f40d3fea114 in ra_fstat (frame=0x7f40c404f5b0, this=0x2534bd0, fd=0x25eaca0) at read-ahead.c:750 750 read-ahead.c: No such file or directory. in read-ahead.c (gdb) thread apply all bt full Thread 3 (process 17092): #0 0x00007f40d4f12071 in nanosleep () from /lib/libc.so.6 No symbol table info available. #1 0x00007f40d4f3d534 in usleep () from /lib/libc.so.6 No symbol table info available. #2 0x00007f40d5610652 in gf_timer_proc (ctx=0x252d010) at timer.c:177 now = 1278460080777811 now_tv = {tv_sec = 1278460080, tv_usec = 777811} event = (gf_timer_t *) 0x253c550 reg = (gf_timer_registry_t *) 0x25380a0 __FUNCTION__ = "gf_timer_proc" #3 0x00007f40d51cdfc7 in start_thread () from /lib/libpthread.so.0 No symbol table info available. #4 0x00007f40d4f4364d in clone () from /lib/libc.so.6 No symbol table info available. #5 0x0000000000000000 in ?? () No symbol table info available. Thread 2 (process 17091): #0 0x00007f40d51d2645 in ?? () from /lib/libpthread.so.0 No symbol table info available. #1 0x00007f40d3dda73c in ioc_open_cbk (frame=0x7f40c404cf40, cookie=0x7f40c404bd80, this=0x25354b0, op_ret=0, op_errno=117, fd=0x25903c0) at io-cache.c:474 tmp_ioc_inode = 0 local = (ioc_local_t *) 0x7f40c404bcb0 table = (ioc_table_t *) 0x2538730 ioc_inode = (ioc_inode_t *) 0x0 inode = (inode_t *) 0x7f40cc000bb0 weight = 4294967295 path = 0x7f40c401a9c0 "/extMapping.php" __FUNCTION__ = "ioc_open_cbk" #2 0x00007f40d3fe7454 in ra_open_cbk (frame=0x7f40c404bd80, cookie=0x7f40c404bde0, this=0x2534bd0, op_ret=0, op_errno=117, fd=0x25903c0) at read-ahead.c:116 fn = (fop_open_cbk_t) 0x7f40d3dda698 <ioc_open_cbk> _parent = (call_frame_t *) 0x7f40c404cf40 old_THIS = (xlator_t *) 0x2534bd0 conf = (ra_conf_t *) 0x2538810 file = (ra_file_t *) 0x7f40c404c400 ret = 0 wbflags = 0 __FUNCTION__ = "ra_open_cbk" #3 0x00007f40d421a79a in afr_open_cbk (frame=0x7f40c404bde0, cookie=0x0, this=0x2534350, op_ret=0, op_errno=0, fd=0x25903c0) at afr-open.c:140 fn = (fop_open_cbk_t) 0x7f40d3fe7150 <ra_open_cbk> _parent = (call_frame_t *) 0x7f40c404bd80 old_THIS = (xlator_t *) 0x2534350 __local = (afr_local_t *) 0x7f40c404d040 __this = (xlator_t *) 0x2534350 local = (afr_local_t *) 0x7f40c404d040 priv = (afr_private_t *) 0x25389c0 child_index = 0 ctx = 139916143515392 fd_ctx = (afr_fd_ctx_t *) 0x7f40c4085700 ret = 0 call_count = 0 __FUNCTION__ = "afr_open_cbk" #4 0x00007f40d4462f49 in client_open_cbk (frame=0x7f40c401aab0, hdr=0x7f40c404c760, hdrlen=116, iobuf=0x0) at client-protocol.c:4090 fn = (ret_fn_t) 0x7f40d421a2f5 <afr_open_cbk> _parent = (call_frame_t *) 0x7f40c404bde0 old_THIS = (xlator_t *) 0x2533dd0 op_ret = 0 op_errno = 0 fd = (fd_t *) 0x25903c0 remote_fd = 7 rsp = (gf_fop_open_rsp_t *) 0x7f40c404c7cc local = (client_local_t *) 0x7f40c404b850 conf = (client_conf_t *) 0x2538b80 fdctx = (client_fd_ctx_t *) 0x7f40c404c7e0 ino = 36135004 gen = 5480397927761314920 #5 0x00007f40d446ac19 in protocol_client_interpret (this=0x2533dd0, trans=0x2539700, hdr_p=0x7f40c404c760 "", hdrlen=116, iobuf=0x0) at client-protocol.c:6529 ret = -1 frame = (call_frame_t *) 0x7f40c401aab0 hdr = (gf_hdr_common_t *) 0x7f40c404c760 callid = 292901 type = 4 op = 10 conn = (client_connection_t *) 0x2539b60 __FUNCTION__ = "protocol_client_interpret" #6 0x00007f40d446b8df in protocol_client_pollin (this=0x2533dd0, trans=0x2539700) at client-protocol.c:6827 conf = (client_conf_t *) 0x2538b80 ret = 0 iobuf = (struct iobuf *) 0x0 hdr = 0x7f40c404c760 "" hdrlen = 116 #7 0x00007f40d446bf53 in notify (this=0x2533dd0, event=2, data=0x2539700) at client-protocol.c:6946 i = 0 ret = -1 child_down = 1 was_not_down = 0 trans = (transport_t *) 0x2539700 conn = (client_connection_t *) 0x0 conf = (client_conf_t *) 0x2538b80 parent = (xlator_list_t *) 0x0 __FUNCTION__ = "notify" #8 0x00007f40d56015dc in xlator_notify (xl=0x2533dd0, event=2, data=0x2539700) at xlator.c:924 old_THIS = (xlator_t *) 0x7f40d58417c0 ret = 0 #9 0x00007f40d337f1ef in socket_event_poll_in (this=0x2539700) at socket.c:731 ret = 0 #10 0x00007f40d337f4e9 in socket_event_handler (fd=9, idx=1, data=0x2539700, poll_in=1, poll_out=0, poll_err=0) at socket.c:831 this = (transport_t *) 0x2539700 priv = (socket_private_t *) 0x2539a90 ret = 0 #11 0x00007f40d5626203 in event_dispatch_epoll_handler (event_pool=0x252e900, events=0x253b790, i=0) at event.c:804 event_data = (struct event_data *) 0x253b794 handler = (event_handler_t) 0x7f40d337f432 <socket_event_handler> data = (void *) 0x2539700 idx = 1 ret = -1 __FUNCTION__ = "event_dispatch_epoll_handler" #12 0x00007f40d56263d2 in event_dispatch_epoll (event_pool=0x252e900) at event.c:867 events = (struct epoll_event *) 0x253b790 size = 1 i = 0 ret = 1 __FUNCTION__ = "event_dispatch_epoll" #13 0x00007f40d56266e3 in event_dispatch (event_pool=0x252e900) at event.c:975 ret = -1 __FUNCTION__ = "event_dispatch" #14 0x00000000004063d3 in main (argc=4, argv=0x7fffdda5e748) at glusterfsd.c:1425 ctx = (glusterfs_ctx_t *) 0x252d010 cmd_args = (cmd_args_t *) 0x252d010 pool = (call_pool_t *) 0x252e9a0 stbuf = {st_dev = 139916439052352, st_ino = 15773923, st_nlink = 4294967487, st_mode = 3584355008, st_uid = 32576, st_gid = 3584410456, pad0 = 32576, st_rdev = 140736912025280, st_size = 139916439044096, st_blksize = 4200312, st_blocks = 0, st_atim = {tv_sec = 4429057797, tv_nsec = 705412764499181569}, st_mtim = {tv_sec = 139916438987952, tv_nsec = 140736912025416}, st_ctim = {tv_sec = 1970169159, tv_nsec = 11}, __unused = {32768, 4, 140736912025416}} tmp_logfile = '\0' <repeats 1023 times> tmp_logfile_dyn = 0x0 tmp_logfilebase = 0x0 timestr = '\0' <repeats 255 times> utime = 1278420147 tm = (struct tm *) 0x0 ret = 63 lim = {rlim_cur = 18446744073709551615, rlim_max = 18446744073709551615} specfp = (FILE *) 0x252ec60 graph = (xlator_t *) 0x252eec0 trav = (xlator_t *) 0x0 fuse_volume_found = 0 xl_count = 8 process_mode = 1 '\001' pipe_fd = {6, 7} gf_success = 0 gf_failure = -1 __FUNCTION__ = "main" Thread 1 (process 17097): #0 0x00007f40d3fea114 in ra_fstat (frame=0x7f40c404f5b0, this=0x2534bd0, fd=0x25eaca0) at read-ahead.c:750 file = (ra_file_t *) 0x7f40c404c400 iter_fd = (fd_t *) 0x25903c0 inode = (inode_t *) 0x7f40cc000bb0 ret = 0 tmp_file = 139916143281152 #1 0x00007f40d5606c1c in default_fstat (frame=0x7f40c404f550, this=0x25354b0, fd=0x25eaca0) at defaults.c:673 _new = (call_frame_t *) 0x7f40c404f5b0 old_THIS = (xlator_t *) 0x25354b0 tmp_cbk = (fop_fstat_cbk_t) 0x7f40d56069c6 <default_fstat_cbk> #2 0x00007f40d3bceca5 in qr_fstat (frame=0x7f40c404f4f0, this=0x2535e50, fd=0x25eaca0) at quick-read.c:1184 _new = (call_frame_t *) 0x7f40c404f550 old_THIS = (xlator_t *) 0x2535e50 tmp_cbk = (fop_fstat_cbk_t) 0x7f40d3bce64f <qr_fstat_cbk> qr_fd_ctx = (qr_fd_ctx_t *) 0x253f600 need_open = 0 '\0' can_wind = 1 '\001' need_unwind = 0 '\0' value = 39056896 ret = 0 op_ret = -1 op_errno = -1 stub = (call_stub_t *) 0x0 loc = {path = 0x0, name = 0x0, ino = 0, inode = 0x0, parent = 0x0} path = 0x254f0e0 "/extMapping.php" flags = 32768 #3 0x00007f40d39be44d in wb_fstat_helper (frame=0x7f40c404c3a0, this=0x25367f0, fd=0x25eaca0) at write-behind.c:698 _new = (call_frame_t *) 0x7f40c404f4f0 old_THIS = (xlator_t *) 0x25367f0 tmp_cbk = (fop_fstat_cbk_t) 0x7f40d39be170 <wb_fstat_cbk> #4 0x00007f40d561bacd in call_resume_wind (stub=0x7f40c404f170) at call-stub.c:2607 __FUNCTION__ = "call_resume_wind" #5 0x00007f40d562180a in call_resume (stub=0x7f40c404f170) at call-stub.c:4304 old_THIS = (xlator_t *) 0x25367f0 __FUNCTION__ = "call_resume" #6 0x00007f40d39c15ce in wb_resume_other_requests (frame=0x7f40c404c3a0, file=0x254f1c0, other_requests=0x42656e20) at write-behind.c:1621 ret = 0 request = (wb_request_t *) 0x7f40c404f480 dummy = (wb_request_t *) 0x42656df0 fops_removed = 0 wind = 1 '\001' stub = (call_stub_t *) 0x7f40c404f170 #7 0x00007f40d39c1678 in wb_do_ops (frame=0x7f40c404c3a0, file=0x254f1c0, winds=0x42656e40, unwinds=0x42656e30, other_requests=0x42656e20) at write-behind.c:1649 ret = 0 #8 0x00007f40d39c1cef in wb_process_queue (frame=0x7f40c404c3a0, file=0x254f1c0, flush_all=1 '\001') at write-behind.c:1817 winds = {next = 0x42656e40, prev = 0x42656e40} unwinds = {next = 0x42656e30, prev = 0x42656e30} other_requests = {next = 0x7f40c404f4b0, prev = 0x7f40c404f4b0} size = 131072 conf = (wb_conf_t *) 0x25384d0 count = 1 ret = -1 #9 0x00007f40d39be7b7 in wb_fstat (frame=0x7f40c404c3a0, this=0x25367f0, fd=0x25eaca0) at write-behind.c:755 file = (wb_file_t *) 0x254f1c0 local = (wb_local_t *) 0x7f40c404c570 tmp_file = 39121344 stub = (call_stub_t *) 0x7f40c404f170 request = (wb_request_t *) 0x7f40c404f480 ret = -1 op_errno = 22 __FUNCTION__ = "wb_fstat" #10 0x00007f40d5606c1c in default_fstat (frame=0x7f40c404c9f0, this=0x2537270, fd=0x25eaca0) at defaults.c:673 _new = (call_frame_t *) 0x7f40c404c3a0 old_THIS = (xlator_t *) 0x2537270 tmp_cbk = (fop_fstat_cbk_t) 0x7f40d56069c6 <default_fstat_cbk> #11 0x00007f40d358cf34 in fuse_getattr (this=0x252eec0, finh=0x7f40c404cd40, msg=0x7f40c404cd68) at fuse-bridge.c:812 _new = (call_frame_t *) 0x7f40c404c9f0 old_THIS = (xlator_t *) 0x252eec0 tmp_cbk = (fop_fstat_cbk_t) 0x7f40d358c057 <fuse_attr_cbk> frame = (call_frame_t *) 0x7f40c404c988 xl = (xlator_t *) 0x2537270 state = (fuse_state_t *) 0x7f40c404c830 fd = (fd_t *) 0x25eaca0 ret = 0 __FUNCTION__ = "fuse_getattr" #12 0x00007f40d3598cb9 in fuse_thread_proc (data=0x252eec0) at fuse-bridge.c:3190 mount_point = 0x0 this = (xlator_t *) 0x252eec0 priv = (fuse_private_t *) 0x2538280 res = 56 iobuf = (struct iobuf *) 0x252e8c8 finh = (fuse_in_header_t *) 0x7f40c404cd40 iov_in = {{iov_base = 0x7f40c404cd40, iov_len = 80}, {iov_base = 0x7f40d4e54000, iov_len = 131072}} msg = (void *) 0x7f40c404cd68 msg0_size = 168 ret = 0 now = {tv_sec = 1278420147, tv_usec = 623243} timeout = {tv_sec = 1278420148, tv_nsec = 623243000} __FUNCTION__ = "fuse_thread_proc" #13 0x00007f40d51cdfc7 in start_thread () from /lib/libpthread.so.0 No symbol table info available. #14 0x00007f40d4f4364d in clone () from /lib/libc.so.6 No symbol table info available. #15 0x0000000000000000 in ?? () No symbol table info available.
(En réponse au commentaire 1) > This looks to be a duplicate of bug 762600. Bug 868 has been fixed in the 3.0.5 > release candidates. Can you please check if your crash is fixed by 3.0.5rc9 or > by applying this patch on to GlusterFS 3.0.4? > > http://patches.gluster.com/patch/3213 I applied your patch on glusterfs 3.0.4 and now everything is perfect. I set this ticket has "solved". Thanks !