Description of problem:

I created a DHT volume, then used postmark to write 100000 files of 100k each in /mountpoint/a/b/c. While postmark was running, I executed a forced replace-brick and fix-layout. In the same way, I also ran add-brick many times. Sometimes one of the glusterfsd processes (not the new brick) would dump core.

This is the backtrace from the core:

Core was generated by `/usr/sbin/glusterfsd -s node-1.jdm --volfile-id test.node-1.jdm.glusterfs-w'.
Program terminated with signal 6, Aborted.
#0  0x00007fd41c14b8a5 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glusterfs-3.6.1-0.44.git890879a.el6.x86_64
(gdb) bt
#0  0x00007fd41c14b8a5 in raise () from /lib64/libc.so.6
#1  0x00007fd41c14d085 in abort () from /lib64/libc.so.6
#2  0x00007fd41c144a1e in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007fd41c144ae0 in __assert_fail () from /lib64/libc.so.6
#4  0x00007fd41d0f0512 in __inode_path (inode=0x7fd3ec00dd2c, name=0x7fd3ec00dfb8 "a2", bufp=0x7fd40ae815f8) at inode.c:1175
#5  0x00007fd41d0f08e5 in inode_path (inode=0x7fd3ec00dd2c, name=0x7fd3ec00dfb8 "a2", bufp=0x7fd40ae815f8) at inode.c:1274
#6  0x00007fd40c727ea9 in marker_inode_loc_fill (inode=0x7fd3ec00dd2c, name=0x7fd3ec00dfb8 "a2", loc=0x7fd40ae816d0) at marker.c:108
#7  0x00007fd40c737016 in marker_readdirp_cbk (frame=0x79c27c, cookie=0x79c6ac, this=0x655bc0, op_ret=754, op_errno=0, entries=0x7fd40ae81970, xdata=0x0) at marker.c:2791
#8  0x00007fd41d0dbedb in default_readdirp_cbk (frame=0x79c6ac, cookie=0x7fd3ec00d04c, this=0x651890, op_ret=754, op_errno=0, entries=0x7fd40ae81970, xdata=0x0) at defaults.c:1235
#9  0x00007fd40cf7e137 in pl_readdirp_cbk (frame=0x7fd3ec00d04c, cookie=0x7fd3ec00e29c, this=0x650570, op_ret=754, op_errno=0, entries=0x7fd40ae81970, xdata=0x0) at posix.c:2120
#10 0x00007fd40d198042 in posix_acl_readdirp_cbk (frame=0x7fd3ec00e29c, cookie=0x7fd3ec00de4c, this=0x64f170, op_ret=754, op_errno=0, entries=0x7fd40ae81970, xdata=0x0) at posix-acl.c:1580
#11 0x00007fd40d7de5d3 in posix_do_readdir (frame=0x7fd3ec00de4c, this=0x64ac30, fd=0x76ecbc, size=130944, off=0, whichop=40, dict=0x83468c) at posix.c:5157
#12 0x00007fd40d7de947 in posix_readdirp (frame=0x7fd3ec00de4c, this=0x64ac30, fd=0x76ecbc, size=130944, off=0, dict=0x83468c) at posix.c:5204
#13 0x00007fd41d0e52fa in default_readdirp (frame=0x7fd3ec00de4c, this=0x64c700, fd=0x76ecbc, size=130944, off=0, xdata=0x83468c) at defaults.c:2078
#14 0x00007fd40d1983e4 in posix_acl_readdirp (frame=0x7fd3ec00e29c, this=0x64f170, fd=0x76ecbc, size=130944, offset=0, dict=0x83468c) at posix-acl.c:1614
#15 0x00007fd40cf7e501 in pl_readdirp (frame=0x7fd3ec00d04c, this=0x650570, fd=0x76ecbc, size=130944, offset=0, dict=0x83468c) at posix.c:2150
#16 0x00007fd41d0e21e2 in default_readdirp_resume (frame=0x79c6ac, this=0x651890, fd=0x76ecbc, size=130944, off=0, xdata=0x83468c) at defaults.c:1645
#17 0x00007fd41d0fd1b8 in call_resume_wind (stub=0xf189fc) at call-stub.c:2492
#18 0x00007fd41d104b5c in call_resume (stub=0xf189fc) at call-stub.c:2841
#19 0x00007fd40cd662e7 in iot_worker (data=0x674c60) at io-threads.c:214
#20 0x00007fd41c84d851 in start_thread () from /lib64/libpthread.so.0
#21 0x00007fd41c20190d in clone () from /lib64/libc.so.6
(gdb)

I have also hit this problem in other scenarios. It is caused by the assert in __inode_path(), as follows:

        if (!inode || uuid_is_null (inode->gfid)) {
                GF_ASSERT (0);
                gf_log_callingfn (THIS->name, GF_LOG_WARNING, "invalid inode");
                return -1;
        }

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
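For readers unfamiliar with the check quoted in the description, the following is a minimal standalone sketch of the same condition: an inode pointer is treated as invalid when it is NULL or when its 16-byte gfid is all zeros. The names demo_inode, demo_gfid_is_null, demo_inode_check and the DEMO_ABORT_ON_INVALID macro are hypothetical and are not part of GlusterFS; the sketch only models the condition and is not a proposed fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an inode; this is NOT the GlusterFS inode_t. */
struct demo_inode {
    unsigned char gfid[16];
};

/* Returns 1 when all 16 gfid bytes are zero, 0 otherwise. */
static int demo_gfid_is_null(const unsigned char gfid[16])
{
    static const unsigned char zero[16]; /* zero-initialized */
    return memcmp(gfid, zero, sizeof(zero)) == 0;
}

/*
 * Models the quoted condition: returns -1 for a NULL inode or a null gfid,
 * 0 otherwise. When the (hypothetical) DEMO_ABORT_ON_INVALID macro is
 * defined, an invalid inode triggers assert(0), which aborts the process
 * with SIGABRT unless NDEBUG is defined.
 */
static int demo_inode_check(const struct demo_inode *inode)
{
    if (inode == NULL || demo_gfid_is_null(inode->gfid)) {
#ifdef DEMO_ABORT_ON_INVALID
        assert(0);
#endif
        return -1;
    }
    return 0;
}
```

Built without DEMO_ABORT_ON_INVALID, demo_inode_check() simply reports -1 for an invalid inode; built with it, the same input ends in abort(), which is the kind of termination (signal 6) shown in frames #0 to #3 of the backtrace.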
Can you please post the exact commands, so that it is easier for us to reproduce? Also, I am assuming you have done a compile+install (make, make install) of mainline code on the nodes.
*** This bug has been marked as a duplicate of bug 1176393 ***
(In reply to Lalatendu Mohanty from comment #1)
> Can you please put the exact commands, so that it would be easier for us to
> reproduce ?
>
> Also I am assuming you have done compile+install (make, make install) of
> mainline code on the nodes.

Yes, it's a duplicate of bug 1176393.