Description of problem:

The glusterfs client crashed during a getxattr call, immediately after mounting. The NFS server and glustershd crashed as well. A replicate volume (2 replicas) had just been created and started, quota was enabled on it, and it was then mounted. This is the backtrace from the generated core.

Core was generated by `/usr/local/sbin/glusterfs --volfile-id=mirror --volfile-server=hyperspace /mnt/'.
Program terminated with signal 11, Segmentation fault.
#0  __strlen_sse42 () at ../sysdeps/x86_64/multiarch/strlen-sse4.S:32
32      ../sysdeps/x86_64/multiarch/strlen-sse4.S: No such file or directory.
        in ../sysdeps/x86_64/multiarch/strlen-sse4.S
(gdb) bt
#0  __strlen_sse42 () at ../sysdeps/x86_64/multiarch/strlen-sse4.S:32
#1  0x00007f30e28c23b7 in gf_strdup (src=0x0)
    at ../../../../../libglusterfs/src/mem-pool.h:119
#2  0x00007f30e28dd752 in client3_1_getxattr (frame=0x7f30e56c006c, this=0x1b37fd0, data=0x7fff345ab800)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:4641
#3  0x00007f30e28bc962 in client_getxattr (frame=0x7f30e56c006c, this=0x1b37fd0, loc=0x7f30dff049f0, name=0x0, xdata=0x0)
    at ../../../../../xlators/protocol/client/src/client.c:1452
#4  0x00007f30e2666172 in afr_sh_metadata_sync_prepare (frame=0x7f30e54b784c, this=0x1b3b840)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-metadata.c:419
#5  0x00007f30e2666817 in afr_sh_metadata_fix (frame=0x7f30e54b784c, this=0x1b3b840, op_ret=0, op_errno=0)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-metadata.c:522
#6  0x00007f30e2660d46 in afr_sh_common_lookup_cbk (frame=0x7f30e54b784c, cookie=0x1, this=0x1b3b840, op_ret=0, op_errno=0, inode=0x7f30de97504c, buf=0x7fff345abc90, xattr=0x7f30e532c82c, postparent=0x7fff345abc20)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-common.c:1311
#7  0x00007f30e28d3422 in client3_1_lookup_cbk (req=0x7f30df38fa80, iov=0x7f30df38fac0, count=1, myframe=0x7f30e56bffc0)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:2636
#8  0x00007f30e71f5c91 in rpc_clnt_handle_reply (clnt=0x1b7e310, pollin=0x7f30d80017d0)
    at ../../../../rpc/rpc-lib/src/rpc-clnt.c:788
#9  0x00007f30e71f6009 in rpc_clnt_notify (trans=0x1b8dea0, mydata=0x1b7e340, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f30d80017d0)
    at ../../../../rpc/rpc-lib/src/rpc-clnt.c:907
#10 0x00007f30e71f2021 in rpc_transport_notify (this=0x1b8dea0, event=RPC_TRANSPORT_MSG_RECEIVED, data=0x7f30d80017d0)
    at ../../../../rpc/rpc-lib/src/rpc-transport.c:489
#11 0x00007f30e350f3a5 in socket_event_poll_in (this=0x1b8dea0)
    at ../../../../../rpc/rpc-transport/socket/src/socket.c:1677
#12 0x00007f30e350f919 in socket_event_handler (fd=15, idx=6, data=0x1b8dea0, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../../rpc/rpc-transport/socket/src/socket.c:1792
#13 0x00007f30e744ccf1 in event_dispatch_epoll_handler (event_pool=0x1b1fcd0, events=0x1b2df00, i=0)
    at ../../../libglusterfs/src/event.c:785
#14 0x00007f30e744cf0b in event_dispatch_epoll (event_pool=0x1b1fcd0)
    at ../../../libglusterfs/src/event.c:847
#15 0x00007f30e744d27d in event_dispatch (event_pool=0x1b1fcd0)
    at ../../../libglusterfs/src/event.c:947
#16 0x0000000000408858 in main (argc=4, argv=0x7fff345ac1f8)
    at ../../../glusterfsd/src/glusterfsd.c:1674
(gdb) f 1
#1  0x00007f30e28c23b7 in gf_strdup (src=0x0)
    at ../../../../../libglusterfs/src/mem-pool.h:119
119             len = strlen (src) + 1;
(gdb) p src
$1 = 0x0
(gdb) f 2
#2  0x00007f30e28dd752 in client3_1_getxattr (frame=0x7f30e56c006c, this=0x1b37fd0, data=0x7fff345ab800)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:4641
4641            local->name = gf_strdup (args->name);
(gdb) p args->name
$2 = 0x0
(gdb) l
4636                    op_errno = ENOMEM;
4637                    goto unwind;
4638            }
4639
4640            loc_copy (&local->loc, args->loc);
4641            local->name = gf_strdup (args->name);
4642            frame->local = local;
4643
4644            rsp_iobref = iobref_new ();
4645            if (rsp_iobref == NULL) {
(gdb) l -
4626            }
4627            args = data;
4628
4629            if (!(args->loc && args->loc->inode)) {
4630                    op_errno = EINVAL;
4631                    goto unwind;
4632            }
4633
4634            local = mem_get0 (this->local_pool);
4635            if (!local) {
(gdb) up
#2  0x00007f30e28dd752 in client3_1_getxattr (frame=0x7f30e56c006c, this=0x1b37fd0, data=0x7fff345ab800)
    at ../../../../../xlators/protocol/client/src/client3_1-fops.c:4641
4641            local->name = gf_strdup (args->name);
(gdb) up
#3  0x00007f30e28bc962 in client_getxattr (frame=0x7f30e56c006c, this=0x1b37fd0, loc=0x7f30dff049f0, name=0x0, xdata=0x0)
    at ../../../../../xlators/protocol/client/src/client.c:1452
1452                    ret = proc->fn (frame, this, &args);
(gdb) l
1447                            "rpc procedure not found for %s",
1448                            gf_fop_list[GF_FOP_GETXATTR]);
1449                    goto out;
1450            }
1451            if (proc->fn)
1452                    ret = proc->fn (frame, this, &args);
1453    out:
1454            if (ret)
1455                    STACK_UNWIND_STRICT (getxattr, frame, -1, ENOTCONN, NULL, NULL);
1456
(gdb) p name
$3 = 0x0
(gdb) up
#4  0x00007f30e2666172 in afr_sh_metadata_sync_prepare (frame=0x7f30e54b784c, this=0x1b3b840)
    at ../../../../../xlators/cluster/afr/src/afr-self-heal-metadata.c:419
419             STACK_WIND (frame, afr_sh_metadata_getxattr_cbk,
(gdb) l
414             gf_log (this->name, GF_LOG_TRACE,
415                     "syncing metadata of %s from subvolume %s to %d active sinks",
416                     local->loc.path, priv->children[source]->name,
417                     sh->active_sinks);
418
419             STACK_WIND (frame, afr_sh_metadata_getxattr_cbk,
420                         priv->children[source],
421                         priv->children[source]->fops->getxattr,
422                         &local->loc, NULL, NULL);
423
(gdb)

Since name can be NULL in getxattr calls (afr metadata self-heal passes NULL, as frame #4 shows), args->name has to be checked for NULL before it is passed to gf_strdup.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

ame, reopening the fds
[2012-05-17 13:02:00.740348] I [afr-common.c:3623:afr_notify] 0-mirror-replicate-2: Subvolume 'mirror-client-4' came back up; going online.
[2012-05-17 13:02:00.740425] I [client-handshake.c:453:client_set_lk_version_cbk] 0-mirror-client-4: Server lk version = 1
[2012-05-17 13:02:00.740503] I [client-handshake.c:1425:client_setvolume_cbk] 0-mirror-client-5: Connected to 127.0.1.1:24023, attached to remote volume '/mnt/sda8/export5'.
[2012-05-17 13:02:00.740538] I [client-handshake.c:1437:client_setvolume_cbk] 0-mirror-client-5: Server and Client lk-version numbers are not same, reopening the fds
[2012-05-17 13:02:00.749775] I [fuse-bridge.c:4192:fuse_graph_setup] 0-fuse: switched to graph 0
[2012-05-17 13:02:00.749935] I [client-handshake.c:453:client_set_lk_version_cbk] 0-mirror-client-5: Server lk version = 1
[2012-05-17 13:02:00.750042] I [fuse-bridge.c:3376:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.16
[2012-05-17 13:02:00.751047] I [afr-common.c:1965:afr_set_root_inode_on_first_lookup] 0-mirror-replicate-0: added root inode
[2012-05-17 13:02:00.751807] I [afr-common.c:1965:afr_set_root_inode_on_first_lookup] 0-mirror-replicate-1: added root inode
[2012-05-17 13:02:00.752141] I [afr-common.c:1965:afr_set_root_inode_on_first_lookup] 0-mirror-replicate-2: added root inode
[2012-05-17 13:02:00.752209] I [afr-common.c:1341:afr_launch_self_heal] 0-mirror-replicate-2: background meta-data self-heal triggered. path: /, reason: lookup detected pending operations
pending frames:
frame : type(1) op(NULL)
frame : type(1) op(NULL)
frame : type(1) op(NULL)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2012-05-17 13:02:00
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3git
/lib/x86_64-linux-gnu/libc.so.6(+0x33d80)[0x7f30e66bbd80]
/lib/x86_64-linux-gnu/libc.so.6(+0x12f49f)[0x7f30e67b749f]
/usr/local/lib/glusterfs/3git/xlator/protocol/client.so(+0x153b7)[0x7f30e28c23b7]
/usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client3_1_getxattr+0x173)[0x7f30e28dd752]
/usr/local/lib/glusterfs/3git/xlator/protocol/client.so(client_getxattr+0x16f)[0x7f30e28bc962]
/usr/local/lib/glusterfs/3git/xlator/cluster/replicate.so(afr_sh_metadata_sync_prepare+0x3df)[0x7f30e2666172]
/usr/local/lib/glusterfs/3git/xlator/cluster/replicate.so(afr_sh_metadata_fix+0x68d)[0x7f30e2666817]
/usr/local/lib/glusterfs/3git/xlator/cluster/replicate.so(+0x3fd46)[0x7f30e2660d46]

: gluster volume info

Volume Name: mirror
Type: Distributed-Replicate
Volume ID: 6ff5ec4a-7781-48ec-a9bb-b34a547ad682
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: hyperspace:/mnt/sda7/export3
Brick2: hyperspace:/mnt/sda8/export3
Brick3: hyperspace:/mnt/sda7/export4
Brick4: hyperspace:/mnt/sda8/export4
Brick5: hyperspace:/mnt/sda7/export5
Brick6: hyperspace:/mnt/sda8/export5
Options Reconfigured:
features.limit-usage: /:250GB
features.quota: on
CHANGE: http://review.gluster.com/3353 (protocol/client: check if the name is NULL before duping it) merged in master by Anand Avati (avati)
Checked with glusterfs-3.3.0qa42: the process no longer crashes in client3_1_getxattr, since the name is now checked for NULL before it is duplicated.
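
For illustration, the kind of NULL check described in this report can be sketched in standalone C. This is not the merged patch: dup_string() is a hypothetical stand-in for gf_strdup() that, like the code shown in frame #1, calls strlen() on its argument, and copy_optional_name() is a hypothetical helper name chosen for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for gf_strdup(): it calls strlen() on src,
 * so passing NULL here would crash exactly as in frame #1. */
static char *dup_string(const char *src)
{
    size_t len = strlen(src) + 1;
    char *dup = malloc(len);

    if (dup != NULL)
        memcpy(dup, src, len);
    return dup;
}

/* Hypothetical helper: copy an optional xattr name.
 * A NULL name is treated as valid and leaves *out set to NULL.
 * Returns 0 on success, -ENOMEM if the copy could not be allocated. */
static int copy_optional_name(const char *name, char **out)
{
    assert(out != NULL);

    *out = NULL;
    if (name == NULL)
        return 0;               /* nothing to duplicate, not an error */

    *out = dup_string(name);
    return (*out != NULL) ? 0 : -ENOMEM;
}
```

Called with name == NULL, the helper returns 0 and leaves the copy as NULL instead of dereferencing the pointer; called with a real name (for example the illustrative string "user.example"), it returns a heap copy that the caller must free().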