The commit ID might not be correct. The date is Sep 26, and the tree was pulled from git at 11 AM on master (if this helps).
Reported by Lakshmipathi. The crash happened on a replicated volume. The client crashed due to possible split-brain errors.

Valgrind log says:

==8508== Invalid read of size 8
==8508==    at 0x81D65FA: ioc_open_cbk (io-cache.c:554)
==8508==    by 0x7FC680F: ra_open_cbk (read-ahead.c:119)
==8508==    by 0x7DB9936: wb_open_cbk (write-behind.c:1390)
==8508==    by 0x7B5CB6B: afr_open_cbk (afr-open.c:210)
==8508==    by 0x790FB27: client3_1_open_cbk (client3_1-fops.c:375)
==8508==    by 0x4EA0904: rpc_clnt_handle_reply (rpc-clnt.c:789)
==8508==    by 0x4EA0C51: rpc_clnt_notify (rpc-clnt.c:902)
==8508==    by 0x4E9CD87: rpc_transport_notify (rpc-transport.c:498)
==8508==    by 0x8E46212: socket_event_poll_in (socket.c:1675)
==8508==    by 0x8E46796: socket_event_handler (socket.c:1790)
==8508==    by 0x4C561E3: event_dispatch_epoll_handler (event.c:794)
==8508==    by 0x4C56406: event_dispatch_epoll (event.c:856)
==8508==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

Client log:

[2011-09-26 11:56:51.620256] I [afr-self-heal-common.c:713:afr_mark_sources] 0-afr-replicate-0: split-brain possible, no source detected
[2011-09-26 11:56:51.620546] E [afr-self-heal-data.c:697:afr_sh_data_fix] 0-afr-replicate-0: Unable to self-heal contents of '/run8525/ltp/large_file' (possible split-brain). Please delete the file from all but the preferred subvolume.
[2011-09-26 11:56:51.629579] E [afr-self-heal-common.c:2009:afr_self_heal_completion_cbk] 0-afr-replicate-0: background data missing-entry gfid self-heal failed on /run8525/ltp/large_file
[2011-09-26 12:04:36.273144] W [client3_1-fops.c:554:client3_1_rmdir_cbk] 0-afr-client-0: remote operation failed: Directory not empty
[2011-09-26 12:04:36.293733] W [client3_1-fops.c:554:client3_1_rmdir_cbk] 0-afr-client-1: remote operation failed: Directory not empty
[2011-09-26 12:05:16.861358] W [client3_1-fops.c:554:client3_1_rmdir_cbk] 0-afr-client-0: remote operation failed: Directory not empty
[2011-09-26 12:05:16.869475] W [client3_1-fops.c:554:client3_1_rmdir_cbk] 0-afr-client-1: remote operation failed: Directory not empty
[2011-09-26 12:05:21.898045] I [afr-self-heal-common.c:713:afr_mark_sources] 0-afr-replicate-0: split-brain possible, no source detected
[2011-09-26 12:05:21.900249] W [stat-prefetch.c:2634:sp_unlink_helper] 0-afr-stat-prefetch: lookup-behind has failed for path (/run8525/ltp/large_file)(Input/output error), unwinding unlink call waiting on it
[2011-09-26 12:05:21.901963] W [fuse-bridge.c:908:fuse_unlink_cbk] 0-glusterfs-fuse: 91779: UNLINK() /run8525/ltp/large_file => -1 (Input/output error)

pending frames:
frame : type(1) op(OPEN)
frame : type(1) op(OPEN)
frame : type(1) op(OPEN)
frame : type(1) op(OPEN)
frame : type(1) op(OPEN)
frame : type(1) op(OPEN)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2011-09-26 12:05:21
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3git
/lib64/libc.so.6[0x3d920332f0]
/opt/glusterfs/nightly_valgrind/lib/glusterfs/3git/xlator/performance/io-cache.so(ioc_open_cbk+0x125)[0x81d65fa]
/opt/glusterfs/nightly_valgrind/lib/glusterfs/3git/xlator/performance/read-ahead.so(ra_open_cbk+0x458)[0x7fc6810]
/opt/glusterfs/nightly_valgrind/lib/glusterfs/3git/xlator/performance/write-behind.so(wb_open_cbk+0x356)[0x7db9937]
/opt/glusterfs/nightly_valgrind/lib/glusterfs/3git/xlator/cluster/replicate.so(afr_open_cbk+0x676)[0x7b5cb6c]
/opt/glusterfs/nightly_valgrind/lib/glusterfs/3git/xlator/protocol/client.so(client3_1_open_cbk+0x433)[0x790fb28]
/opt/glusterfs/nightly_valgrind/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0x211)[0x4ea0905]
/opt/glusterfs/nightly_valgrind/lib/libgfrpc.so.0(rpc_clnt_notify+0x285)[0x4ea0c52]
/opt/glusterfs/nightly_valgrind/lib/libgfrpc.so.0(rpc_transport_notify+0x130)[0x4e9cd88]
/opt/glusterfs/nightly_valgrind/lib/glusterfs/3git/rpc-transport/socket.so(socket_event_poll_in+0x54)[0x8e46213]
/opt/glusterfs/nightly_valgrind/lib/glusterfs/3git/rpc-transport/socket.so(socket_event_handler+0x21d)[0x8e46797]
/opt/glusterfs/nightly_valgrind/lib/libglusterfs.so.0[0x4c561e4]
/opt/glusterfs/nightly_valgrind/lib/libglusterfs.so.0[0x4c56407]
/opt/glusterfs/nightly_valgrind/lib/libglusterfs.so.0(event_dispatch+0x88)[0x4c56792]
/opt/glusterfs/nightly_valgrind/sbin/glusterfs(main+0x238)[0x407a00]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x3d9201ea4d]
/opt/glusterfs/nightly_valgrind/sbin/glusterfs[0x403ce9]

Lakshmipathi will be updating the proper commit-id later.
Trying to reproduce this on master. Will update soon.
Unable to reproduce this on master (commit "b6eee04da4a699c7cd850bf2121825cc67f14707") with a replicate volume.