+++ This bug was initially created as a clone of Bug #768840 +++ Description of problem: In the nightly sanity tests glusterfs client crashed due to assert (gfid was NULL). Version-Release number of selected component (if applicable): git build with head at cd24be007c64bd10d8c28e8e9b1c988478a32c8c. How reproducible: Nightly sanity tests some times crashes. Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info: Backtrace: 59: volume vol 60: type debug/io-stats 61: option latency-measurement off 62: option count-fop-hits off 63: subvolumes vol-stat-prefetch 64: end-volume +------------------------------------------------------------------------------+ [2011-12-17 05:05:50.937004] I [rpc-clnt.c:1597:rpc_clnt_reconfig] 0-vol-client-0: changing port to 24009 (from 0) [2011-12-17 05:05:50.937292] I [rpc-clnt.c:1597:rpc_clnt_reconfig] 0-vol-client-1: changing port to 24010 (from 0) [2011-12-17 05:05:50.937518] I [rpc-clnt.c:1597:rpc_clnt_reconfig] 0-vol-client-2: changing port to 24011 (from 0) [2011-12-17 05:05:50.937664] I [rpc-clnt.c:1597:rpc_clnt_reconfig] 0-vol-client-3: changing port to 24012 (from 0) [2011-12-17 05:05:54.936181] I [client-handshake.c:1085:select_server_supported_programs] 0-vol-client-1: Using Program GlusterFS 3git, Num (1 298437), Version (310) [2011-12-17 05:05:54.936419] I [client-handshake.c:1085:select_server_supported_programs] 0-vol-client-2: Using Program GlusterFS 3git, Num (1 298437), Version (310) [2011-12-17 05:05:54.936631] I [client-handshake.c:917:client_setvolume_cbk] 0-vol-client-2: Connected to 127.0.0.1:24011, attached to remote volume '/export/nightly/data/export3'. [2011-12-17 05:05:54.936677] I [client-handshake.c:917:client_setvolume_cbk] 0-vol-client-1: Connected to 127.0.0.1:24010, attached to remote volume '/export/nightly/data/export2'. [2011-12-17 05:05:54.936883] I [client-handshake.c:1085:select_server_supported_programs] 0-vol-client-0: Using Program GlusterFS 3git, Num (1 298437), Version (310) [2011-12-17 05:05:54.936945] I [client-handshake.c:1085:select_server_supported_programs] 0-vol-client-3: Using Program GlusterFS 3git, Num (1 298437), Version (310) [2011-12-17 05:05:54.940457] I [client-handshake.c:917:client_setvolume_cbk] 0-vol-client-3: Connected to 127.0.0.1:24012, attached to remote volume '/export/nightly/data/export4'. [2011-12-17 05:05:54.940506] I [client-handshake.c:917:client_setvolume_cbk] 0-vol-client-0: Connected to 127.0.0.1:24009, attached to remote volume '/export/nightly/data/export1'. [2011-12-17 05:05:54.946834] I [fuse-bridge.c:3637:fuse_graph_setup] 0-fuse: switched to graph 0 [2011-12-17 05:05:54.946976] I [fuse-bridge.c:3217:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.10 pending frames: frame : type(1) op(READDIR) frame : type(1) op(READDIR) patchset: git://git.gluster.com/glusterfs.git signal received: 6 time of crash: 2011-12-17 05:06:09 configuration details: argp 1 backtrace 1 dlfcn 1 fdatasync 1 libpthread 1 llistxattr 1 setfsid 1 spinlock 1 epoll.h 1 xattr.h 1 st_atim.tv_nsec 1 package-string: glusterfs 3git /lib64/libc.so.6[0x3e58c302d0] /lib64/libc.so.6(gsignal+0x35)[0x3e58c30265] /lib64/libc.so.6(abort+0x110)[0x3e58c31d10] /lib64/libc.so.6(__assert_fail+0xf6)[0x3e58c296e6] /usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client3_1_stat+0x11e)[0x2aaaab1c0991] /usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client_stat+0x137)[0x2aaaab1acfd6] /usr/local/lib/glusterfs/3git/xlator/cluster/stripe.so(stripe_readdirp_cbk+0x636)[0x2aaaab3fb4df] /usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client3_1_readdirp_cbk+0x2f2)[0x2aaaab1be0cd] /usr/local/lib/libgfrpc.so.0(rpc_clnt_handle_reply+0x1f4)[0x2ac935e4f25e] /usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x2b9)[0x2ac935e4f5b6] /usr/local/lib/libgfrpc.so.0(rpc_transport_notify+0x149)[0x2ac935e4b9f3] /usr/local/lib/glusterfs/3git/rpc-transport/socket.so(socket_event_poll_in+0x4b)[0x2aaaaad7dee3] /usr/local/lib/glusterfs/3git/rpc-transport/socket.so(socket_event_handler+0x213)[0x2aaaaad7e425] /usr/local/lib/libglusterfs.so.0[0x2ac935bf3854] /usr/local/lib/libglusterfs.so.0[0x2ac935bf3a59] /usr/local/lib/libglusterfs.so.0(event_dispatch+0x85)[0x2ac935bf3db3] glusterfs(main+0x229)[0x4078c2] /lib64/libc.so.6(__libc_start_main+0xf4)[0x3e58c1d994] glusterfs[0x403c29] --- Additional comment from rabhat on 2011-12-19 01:37:51 EST --- This is the backtrace of the core generated. Core was generated by `glusterfs --volfile-server=Nightly-Sanity --volfile-id=vol /export/nightly/moun'. Program terminated with signal 6, Aborted. #0 0x0000003e58c30265 in raise () from /lib64/libc.so.6 (gdb) bt #0 0x0000003e58c30265 in raise () from /lib64/libc.so.6 #1 0x0000003e58c31d10 in abort () from /lib64/libc.so.6 #2 0x0000003e58c296e6 in __assert_fail () from /lib64/libc.so.6 #3 0x00002aaaab1c0991 in client3_1_stat (frame=0x2b5edec1a5ac, this=0x1cc35f50, data=0x7fff367b0520) at ../../../../../xlators/protocol/client/src/client3_1-fops.c:2676 #4 0x00002aaaab1acfd6 in client_stat (frame=0x2b5edec1a5ac, this=0x1cc35f50, loc=0x7fff367b0680) at ../../../../../xlators/protocol/client/src/client.c:296 #5 0x00002aaaab3fb4df in stripe_readdirp_cbk (frame=0x2b5edec1a454, cookie=0x2b5edec1a500, this=0x1cc3a110, op_ret=9, op_errno=2, orig_entries=0x7fff367b0790) at ../../../../../xlators/cluster/stripe/src/stripe.c:4226 #6 0x00002aaaab1be0cd in client3_1_readdirp_cbk (req=0x2aaaacae904c, iov=0x2aaaacae908c, count=1, myframe=0x2b5edec1a500) at ../../../../../xlators/protocol/client/src/client3_1-fops.c:1955 #7 0x00002b5eddb2725e in rpc_clnt_handle_reply (clnt=0x1cc4df30, pollin=0x1cc4f140) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:789 #8 0x00002b5eddb275b6 in rpc_clnt_notify (trans=0x1cc4e200, mydata=0x1cc4df60, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x1cc4f140) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:908 #9 0x00002b5eddb239f3 in rpc_transport_notify (this=0x1cc4e200, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x1cc4f140) at ../../../../rpc/rpc-lib/src/rpc-transport.c:498 #10 0x00002aaaaad7dee3 in socket_event_poll_in (this=0x1cc4e200) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1675 #11 0x00002aaaaad7e425 in socket_event_handler (fd=14, idx=4, data=0x1cc4e200, poll_in=1, poll_out=0, poll_err=0) at ../../../../../rpc/rpc-transport/socket/src/socket.c:1790 #12 0x00002b5edd8cb854 in event_dispatch_epoll_handler (event_pool=0x1cc2a9a0, events=0x1cc30310, i=0) at ../../../libglusterfs/src/event.c:794 #13 0x00002b5edd8cba59 in event_dispatch_epoll (event_pool=0x1cc2a9a0) at ../../../libglusterfs/src/event.c:856 #14 0x00002b5edd8cbdb3 in event_dispatch (event_pool=0x1cc2a9a0) at ../../../libglusterfs/src/event.c:956 #15 0x00000000004078c2 in main (argc=6, argv=0x7fff367b0d98) at ../../../glusterfsd/src/glusterfsd.c:1601 (gdb) f 3 #3 0x00002aaaab1c0991 in client3_1_stat (frame=0x2b5edec1a5ac, this=0x1cc35f50, data=0x7fff367b0520) at ../../../../../xlators/protocol/client/src/client3_1-fops.c:2676 2676 GF_ASSERT_AND_GOTO_WITH_ERROR (this->name, (gdb) l 2671 if (!uuid_is_null (args->loc->inode->gfid)) 2672 memcpy (req.gfid, args->loc->inode->gfid, 16); 2673 else 2674 memcpy (req.gfid, args->loc->gfid, 16); 2675 2676 GF_ASSERT_AND_GOTO_WITH_ERROR (this->name, 2677 !uuid_is_null (*((uuid_t*)req.gfid)), 2678 unwind, op_errno, EINVAL); 2679 req.path = (char *)args->loc->path; 2680 conf = this->private; (gdb) p req.gfid $1 = '\000' <repeats 15 times> (gdb) p *args->loc $2 = {path = 0x1cc51cd0 "/run8724/coverage/dir/hardlink", name = 0x1cc51ce6 "hardlink", inode = 0x2aaaacdb329c, parent = 0x2aaaacdb3208, gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>} (gdb) p *args->loc->inode $3 = {table = 0x1cc50270, gfid = '\000' <repeats 15 times>, lock = 1, nlookup = 0, ref = 1, ia_type = IA_INVAL, fd_list = { next = 0x2aaaacdb32cc, prev = 0x2aaaacdb32cc}, dentry_list = {next = 0x2aaaacdb32dc, prev = 0x2aaaacdb32dc}, hash = { next = 0x2aaaacdb32ec, prev = 0x2aaaacdb32ec}, list = {next = 0x2aaaacdb3268, prev = 0x1cc502d0}, _ctx = 0x1cc53870} (gdb) f 6 #6 0x00002aaaab1be0cd in client3_1_readdirp_cbk (req=0x2aaaacae904c, iov=0x2aaaacae908c, count=1, myframe=0x2b5edec1a500) at ../../../../../xlators/protocol/client/src/client3_1-fops.c:1955 1955 STACK_UNWIND_STRICT (readdirp, frame, rsp.op_ret, (gdb) up #7 0x00002b5eddb2725e in rpc_clnt_handle_reply (clnt=0x1cc4df30, pollin=0x1cc4f140) at ../../../../rpc/rpc-lib/src/rpc-clnt.c:789 789 req->cbkfn (req, req->rsp, req->rspcnt, saved_frame->frame); (gdb) down #6 0x00002aaaab1be0cd in client3_1_readdirp_cbk (req=0x2aaaacae904c, iov=0x2aaaacae908c, count=1, myframe=0x2b5edec1a500) at ../../../../../xlators/protocol/client/src/client3_1-fops.c:1955 1955 STACK_UNWIND_STRICT (readdirp, frame, rsp.op_ret, (gdb) down #5 0x00002aaaab3fb4df in stripe_readdirp_cbk (frame=0x2b5edec1a454, cookie=0x2b5edec1a500, this=0x1cc3a110, op_ret=9, op_errno=2, orig_entries=0x7fff367b0790) at ../../../../../xlators/cluster/stripe/src/stripe.c:4226 4226 STACK_WIND_COOKIE (frame, stripe_readdirp_entry_stat_cbk, (gdb) inode = inode_new (local->fd->inode->table); (gdb) 4205 if (!inode) 4206 goto out; 4207 4208 loc.inode = inode; 4209 loc.parent = local->fd->inode; 4210 ret = inode_path (local->fd->inode, local_entry->d_name, &path); 4211 if (ret != -1) { 4212 loc.path = path; 4213 } else if (inode) { 4214 ret = inode_path (inode, NULL, &path); (gdb) 4215 if (ret != -1) { 4216 loc.path = path; 4217 } else { 4218 goto out; 4219 } 4220 } 4221 4222 loc.name = strrchr (loc.path, '/'); 4223 loc.name++; 4224 trav = this->children; (gdb) 4225 while (trav) { 4226 STACK_WIND_COOKIE (frame, stripe_readdirp_entry_stat_cbk, 4227 local_entry, trav->xlator, 4228 trav->xlator->fops->stat, &loc) In stripe_readdirp_cbk we are creating a new inode, without any gfid set and stackwinding to the below xlators. protocol/client cant see the gfid in both loc and loc->inode and thus asserts. --- Additional comment from aavati on 2011-12-19 11:58:21 EST --- CHANGE: http://review.gluster.com/806 (cluster/stripe: copy the gfid from the dentry structure to loc) merged in master by Vijay Bellur (vijay) --- Additional comment from aavati on 2012-01-04 02:09:52 EST --- CHANGE: http://review.gluster.com/2544 (cluster/stripe: copy the gfid from the dentry structure to loc) merged in release-3.2 by Vijay Bellur (vijay)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0538.html