Description of problem:
Gluster NFS process crashes when the "Steps to Reproduce" are followed.

Version-Release number of selected component (if applicable):
upstream mainline

How reproducible:
always (3/3)

Steps to Reproduce:
1. Create a single-brick volume.
2. Mount via NFS, hold locks on many files simultaneously on the mount point, and sleep for about 200 seconds.
3. Add a brick and start rebalance.
4. After rebalance completes, wait until unlocks are attempted.

Actual results:
Gluster NFS process crashes during unlock with the backtrace:

Core was generated by `/usr/local/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glu'.
Program terminated with signal 11, Segmentation fault.
#0  dht_migration_complete_check_task (data=0x7f890981f02c) at dht-helper.c:743
743             ret = syncop_open (dst_node, &local->loc,
(gdb) bt
#0  dht_migration_complete_check_task (data=0x7f890981f02c) at dht-helper.c:743
#1  0x00007f890b619383 in synctask_wrap (old_task=0x1124040) at syncop.c:161
#2  0x00007f890abf7650 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x0000000000000000 in ?? ()

Expected results:
Rebalance should not cause crashes on unlocking.

Additional info:
Script for locking, lock.py:
--------------------------------------------------------------------------
#!/usr/bin/python
import fcntl
import time
import sys

def main():
    file_path = sys.argv[1]
    FH = open(file_path, "w")
    try:
        fcntl.lockf(FH, fcntl.LOCK_EX, 10, 0, 0)
    except IOError:
        print "lock could not be acquired."
        return False
    print "lock acquired"
    time.sleep(200)
    print "out of sleep"
    try:
        fcntl.lockf(FH, fcntl.LOCK_UN)
    except IOError:
        print "unlock failed"
        return False
    print "unlock happened"
    return True

if __name__ == "__main__":
    main()
-----------------------------------------------------------------------
To hold locks on multiple files, say {a..z}:

# for i in {a..z}; do lock $i & done
-----------------------------------------------------------------------
NFS logs when it crashed:

[2013-03-28 07:17:04.712942] W [client-rpc-fops.c:526:client3_3_stat_cbk] 0-vol-client-0: remote operation failed: No such file or directory
[2013-03-28 07:17:04.714454] W [client-rpc-fops.c:526:client3_3_stat_cbk] 0-vol-client-0: remote operation failed: No such file or directory
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2013-03-28 07:17:04
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3git
/usr/local/sbin/glusterfs(glusterfsd_print_trace+0x1f)[0x408085]
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f890abe54a0]
/usr/local/lib/glusterfs/3git/xlator/cluster/distribute.so(dht_migration_complete_check_task+0x7a3)[0x7f89064b3fd3]
/usr/local/lib/libglusterfs.so.0(synctask_wrap+0x37)[0x7f890b619383]
/lib/x86_64-linux-gnu/libc.so.6(+0x48650)[0x7f890abf7650]
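For reference, the shell loop above starts one lock.py process per file. A hypothetical single-process equivalent (not part of the original report; the file names, hold time, helper name `hold_locks`, and Python 3 syntax are my own) that holds exclusive locks on several files at once, which is the state the reproducer needs in place before the rebalance step, could look like:

```python
#!/usr/bin/env python3
# Hypothetical sketch, not the script used in the report: acquire an
# exclusive POSIX lock on each file, hold them all, then release them.
import fcntl
import os
import tempfile
import time


def hold_locks(paths, hold_seconds):
    """Lock every path exclusively, hold for hold_seconds, then unlock."""
    handles = []
    for path in paths:
        fh = open(path, "w")
        # Lock the first 10 bytes exclusively, as lock.py does.
        fcntl.lockf(fh, fcntl.LOCK_EX, 10, 0, 0)
        handles.append(fh)
    time.sleep(hold_seconds)
    for fh in handles:
        fcntl.lockf(fh, fcntl.LOCK_UN)
        fh.close()
    return len(handles)


if __name__ == "__main__":
    # Demo on throwaway local files; on the reproducer these would be
    # paths on the NFS mount point and a ~200 second hold.
    tmpdir = tempfile.mkdtemp()
    files = [os.path.join(tmpdir, name) for name in "abc"]
    print(hold_locks(files, 0))
```

Against the NFS mount the per-process bash loop is still closer to the original test, since each lock then belongs to a distinct lock owner.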
More info from the core:

(gdb) p (call_frame_t*) data
$1 = (call_frame_t *) 0x7f890981f02c
(gdb) p *$
$2 = {root = 0x7f89096100a4, parent = 0x7f890981f4e0, next = 0x7f890981f4e0, prev = 0x7f890981f184, local = 0x7f88fed7e7f0, this = 0x1097d80, ret = 0x7f8906292cac <io_stats_stat_cbk>, ref_count = 0, lock = 1, cookie = 0x7f890981f02c, complete = _gf_false, op = GF_FOP_NULL, begin = {tv_sec = 0, tv_usec = 0}, end = {tv_sec = 0, tv_usec = 0}, wind_from = 0x7f89062a2642 <__FUNCTION__.11339> "io_stats_stat", wind_to = 0x7f89062a1a2c "FIRST_CHILD(this)->fops->stat", unwind_from = 0x0, unwind_to = 0x7f89062a1a4a "io_stats_stat_cbk"}
(gdb) info locals
ret = -1
src_node = 0x1095f80
dst_node = 0x1097210
local = 0x7f88fed7e7f0
dict = 0x0
layout = 0x10c59d0
stbuf = {ia_ino = 9242341100452143058, ia_gfid = "\032\217{dL\255\200CdCIq", <incomplete sequence \322>, ia_dev = 2055, ia_type = IA_IFREG, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 1 '\001', write = 1 '\001', exec = 0 '\000'}, group = {read = 1 '\001', write = 0 '\000', exec = 0 '\000'}, other = {read = 1 '\001', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 1, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 4096, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 1364454824, ia_mtime_nsec = 0, ia_ctime = 1364454843, ia_ctime_nsec = 822884540}
this = 0x1097d80
frame = 0x7f890981f02c
tmp_loc = {path = 0x0, name = 0x0, inode = 0x0, parent = 0x0, gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>}
path = 0x0
conf = 0x10c4db0
__FUNCTION__ = "dht_migration_complete_check_task"
(gdb) f 1
#1  0x00007f890b619383 in synctask_wrap (old_task=0x1124040) at syncop.c:161
161             task->ret = task->syncfn (task->opaque);
(gdb) f 0
#0  dht_migration_complete_check_task (data=0x7f890981f02c) at dht-helper.c:743
743             ret = syncop_open (dst_node, &local->loc,
(gdb) p local->loc
$3 = {path = 0x10919b0 "<gfid:1adfb78f-7b64-4cad-8043-644349715fd2>", name = 0x0, inode = 0x7f8904f430c8, parent = 0x0, gfid = "\032\217{dL\255\200CdCIq", <incomplete sequence \322>, pargfid = '\000' <repeats 15 times>}
(gdb) p dst_node
$4 = (xlator_t *) 0x1097210
(gdb) p *$
$5 = {name = 0x1090bf0 "vol-client-1", type = 0x1097060 "protocol/client", next = 0x1095f80, prev = 0x1097d80, parents = 0x1098e60, children = 0x0, options = 0x7f89091d625c, dlhandle = 0x10969e0, fops = 0x7f89069495a0 <fops>, cbks = 0x7f8906949580 <cbks>, dumpops = 0x7f8906949840 <dumpops>, volume_options = {next = 0x1097bb0, prev = 0x1097cc0}, fini = 0x7f890671c234 <fini>, init = 0x7f890671bfe0 <init>, reconfigure = 0x7f890671bdcf <reconfigure>, mem_acct_init = 0x7f890671b82d <mem_acct_init>, notify = 0x7f890671b43b <notify>, loglevel = GF_LOG_NONE, latencies = {{min = 0, max = 0, total = 0, std = 0, mean = 0, count = 0} <repeats 46 times>}, history = 0x0, ctx = 0x105a010, graph = 0x1091c10, itable = 0x0, init_succeeded = 1 '\001', private = 0x10c5a30, mem_acct = {num_types = 0, rec = 0x0}, winds = 0, switched = 0 '\000', local_pool = 0x10c5b10, is_autoloaded = _gf_false}
(gdb) l
738          */
739         SYNCTASK_SETID(0, 0);
740         /* if 'local->fd' (ie, fd based operation), send a 'open()' on
741            destination if not already done */
742         if (local->loc.inode) {
743                 ret = syncop_open (dst_node, &local->loc,
744                                    local->fd->flags, local->fd);
745         } else {
746                 tmp_loc.inode = local->fd->inode;
747                 inode_path (local->fd->inode, NULL, &path);
(gdb) p local->fd
$6 = (fd_t *) 0x0
(gdb) info args
data = 0x7f890981f02c
(gdb) p (call_frame_t*) data
$7 = (call_frame_t *) 0x7f890981f02c
(gdb) info locals
ret = -1
src_node = 0x1095f80
dst_node = 0x1097210
local = 0x7f88fed7e7f0
dict = 0x0
layout = 0x10c59d0
stbuf = {ia_ino = 9242341100452143058, ia_gfid = "\032\217{dL\255\200CdCIq", <incomplete sequence \322>, ia_dev = 2055, ia_type = IA_IFREG, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 1 '\001', write = 1 '\001', exec = 0 '\000'}, group = {read = 1 '\001', write = 0 '\000', exec = 0 '\000'}, other = {read = 1 '\001', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 1, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 4096, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 1364454824, ia_mtime_nsec = 0, ia_ctime = 1364454843, ia_ctime_nsec = 822884540}
this = 0x1097d80
frame = 0x7f890981f02c
tmp_loc = {path = 0x0, name = 0x0, inode = 0x0, parent = 0x0, gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>}
path = 0x0
conf = 0x10c4db0
__FUNCTION__ = "dht_migration_complete_check_task"
(gdb) f
#0  dht_migration_complete_check_task (data=0x7f890981f02c) at dht-helper.c:743
743             ret = syncop_open (dst_node, &local->loc,
(gdb) p *local
$8 = {call_cnt = 1, loc = {path = 0x10919b0 "<gfid:1adfb78f-7b64-4cad-8043-644349715fd2>", name = 0x0, inode = 0x7f8904f430c8, parent = 0x0, gfid = "\032\217{dL\255\200CdCIq", <incomplete sequence \322>, pargfid = '\000' <repeats 15 times>}, loc2 = {path = 0x0, name = 0x0, inode = 0x0, parent = 0x0, gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>}, op_ret = -1, op_errno = 117, layout_mismatch = 0, stbuf = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, prebuf = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, preoldparent = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, postoldparent = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, preparent = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, postparent = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, statvfs = {f_bsize = 0, f_frsize = 0, f_blocks = 0, f_bfree = 0, f_bavail = 0, f_files = 0, f_ffree = 0, f_favail = 0, f_fsid = 0, f_flag = 0, f_namemax = 0, __f_spare = {0, 0, 0, 0, 0, 0}}, fd = 0x0, inode = 0x0, params = 0x0, xattr = 0x0, xattr_req = 0x0, layout = 0x10c5910, size = 0, ia_ino = 0, src_hashed = 0x0, src_cached = 0x0, dst_hashed = 0x0, dst_cached = 0x0, cached_subvol = 0x1097210, hashed_subvol = 0x0, need_selfheal = 0 '\000', file_count = 0, dir_count = 0, main_frame = 0x0, fop_succeeded = 0, linkfile = {linkfile_cbk = 0x0, stbuf = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, loc = {path = 0x0, name = 0x0, inode = 0x0, parent = 0x0, gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>}, inode = 0x0, xattr = 0x0, srcvol = 0x0}, selfheal = {hole_cnt = 0, overlaps_cnt = 0, down = 0, misc = 0, dir_cbk = 0x0, layout = 0x0}, uid = 0, gid = 0, flags = 0, mode = 0, rdev = 0, umask = 0, xattr_val = 0x0, key = 0x0, xsel = '\000' <repeats 255 times>, alloc_len = 0, newpath = 0x0, gfid = '\000' <repeats 15 times>, marker = {volmark = 0x0, data = 0x0, host_timebuf = {0, 0}, net_timebuf = {0, 0}, call_count = 0, has_xtime = 0, enoent_count = 0, enotconn_count = 0, enodata_count = 0, noxtime_count = 0, esomerr = 0, xl_specf_unwind = 0x0, xl_local = 0x0, vol_uuid = 0x0, retval = 0 '\000'}, return_estale = 0 '\000', need_lookup_everywhere = 0 '\000', fop = GF_FOP_STAT, linked = _gf_false, link_subvol = 0x0, rebalance = {from_subvol = 0x0, target_node = 0x0, offset = 0, size = 0, flags = 0, count = 0, iobref = 0x0, vector = 0x0, stbuf = {ia_ino = 0, ia_gfid = '\000' <repeats 15 times>, ia_dev = 0, ia_type = IA_INVAL, ia_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, group = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}, other = {read = 0 '\000', write = 0 '\000', exec = 0 '\000'}}, ia_nlink = 0, ia_uid = 0, ia_gid = 0, ia_rdev = 0, ia_size = 0, ia_blksize = 0, ia_blocks = 0, ia_atime = 0, ia_atime_nsec = 0, ia_mtime = 0, ia_mtime_nsec = 0, ia_ctime = 0, ia_ctime_nsec = 0}, target_op_fn = 0x7f89064ed57e <dht_attr2>, xdata = 0x0}}
(gdb)
Repeated the test as per #Description and found that the crash is not present.

Code base: master (f4488e34553ec0bf68c270034af179ab14fd2b58)

There are quite a few changes in the area where the old crash was observed, dht_migration_complete_check_task. See commit 98e796e50198945adc660e42f3f5ab5b668f7bba.

Reverted to code before the above-mentioned commit, but crashes were still not visible. Overall, repeated on both code bases ~5 times without any crash.

There is, however, a functional failure in unlock in the latest code base w.r.t. the NFS code (always repeatable):

[2013-12-08 14:55:45.843630] W [nlm4.c:1688:nlm4_unlock_resume] 0-nfs-NLM: nlm_get_uniq() returned NULL
[2013-12-08 14:55:45.843694] W [nlm4.c:1702:nlm4_unlock_resume] 0-nfs-NLM: unable to unlock_fd_resume

This is due to a graph change initiated by the brick addition, which causes NFS to restart; nlm_client_list hence becomes empty, causing nlm_get_uniq to return NULL and the unlock to fail from there on.

Moving this to NFS. @spradhan, can you confirm the behaviour?
(In reply to Shyamsundar from comment #3)
> There is a functional failure in unlock in the latest code base w.r.t NFS
> code as follows though (always repeatable),
>
> [2013-12-08 14:55:45.843630] W [nlm4.c:1688:nlm4_unlock_resume] 0-nfs-NLM:
> nlm_get_uniq() returned NULL
> [2013-12-08 14:55:45.843694] W [nlm4.c:1702:nlm4_unlock_resume] 0-nfs-NLM:
> unable to unlock_fd_resume
>
> This is due to a graph change initiated due to the brick addition that
> causes NFS to restart and hence nlm_client_list is becomes empty, causing
> nlm_get_uniq to return a NULL and from there on the unlock to fail.
>
> Moving this to NFS, @spradhan can you confirm the behaviour.

This won't be a functional failure if the client does not reclaim the LOCK after the server recovers. That is:

1. The client takes the LOCK on a file and goes into sleep().
2. Add-brick/rebalance (a topology change) on the server causes the NFS server to restart.
3. Once the server is back online, it asks the clients to reclaim their LOCK(s). If the server does not inform the client(s) about its recovery status, then it is a server issue.
4. The clients should then reclaim their LOCK. If a client does not reclaim the LOCK (i.e., does not send the LOCK call again), then it is a client issue.
5. If by that time the client comes out of the sleep and the process finishes execution, it will attempt to UNLOCK the file on which it held the LOCK. That would log the behaviour you just noticed.

Could you increase the sleep time and retry? A packet capture would help you monitor the behaviour. If the crash is not seen, then it is just a timing issue with (lock/unlock)'ing. It can be tested with a single file.

Reassigning the issue to you.

Thanks,
Santosh
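To make the suggested retry with a longer hold time easier, lock.py can be parameterized. This is a hypothetical variant (the command-line hold time, the helper name `hold_and_release`, and the Python 3 syntax are my additions), not the script used in the original report:

```python
#!/usr/bin/env python3
# Hypothetical variant of lock.py: the hold time comes from the command
# line, so the lock can be held well past the NFS restart and the
# lock-reclaim window that comment #3 and this comment discuss.
import fcntl
import sys
import time


def hold_and_release(file_path, hold_seconds):
    fh = open(file_path, "w")
    try:
        # Exclusive lock on the first 10 bytes, as in the original lock.py.
        fcntl.lockf(fh, fcntl.LOCK_EX, 10, 0, 0)
    except IOError:
        print("lock could not be acquired.")
        return False
    print("lock acquired")
    time.sleep(hold_seconds)
    print("out of sleep")
    try:
        fcntl.lockf(fh, fcntl.LOCK_UN)
    except IOError:
        print("unlock failed")
        return False
    print("unlock happened")
    return True


if __name__ == "__main__" and len(sys.argv) >= 3:
    # e.g.  ./lock.py /mnt/nfs/a 600   to hold the lock for 10 minutes
    sys.exit(0 if hold_and_release(sys.argv[1], float(sys.argv[2])) else 1)
```

With a single file and a generous hold time, a packet capture during the window should show whether the server sends the reclaim notification and whether the client re-sends the LOCK.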
As per comment #3, Shyam followed the steps to reproduce the issue, but no SEGV was seen. So we should try to reproduce the issue with the latest code drop. If it is not reproducible, we can close the issue. Let me know if anybody is working on this.
Not reproducible anymore with the test case. Closing it.