Description of problem:
========================
In a dist-rep volume, all the files/dirs were deleted from the actual volume via a FUSE mount (to simulate accidental removal of files). All the files were then copied back to the actual volume from one of the snapshots taken earlier. While calculating arequal-checksum on the snap volume's mount point, the glusterfs mount process of the snap volume crashed.

[2014-09-26 06:54:02.000347] I [dht-common.c:1892:dht_lookup_cbk] 0-84ffc336efc54efe893ab182bf8107bb-dht: linkfile not having link subvol for /E_new_dir.1/E_new_file.3
pending frames:
frame : type(1) op(LOOKUP)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-09-26 06:54:02
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.6.0.29
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x397981ff06]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x397983a59f]
/lib64/libc.so.6[0x34066329a0]
/usr/lib64/glusterfs/3.6.0.29/xlator/cluster/distribute.so(dht_lookup_everywhere_done+0x6e3)[0x7fd81ed43c03]
/usr/lib64/glusterfs/3.6.0.29/xlator/cluster/distribute.so(dht_lookup_everywhere_cbk+0x403)[0x7fd81ed485c3]
/usr/lib64/glusterfs/3.6.0.29/xlator/cluster/replicate.so(afr_lookup_cbk+0x558)[0x7fd81efcbb18]
/usr/lib64/glusterfs/3.6.0.29/xlator/protocol/client.so(client3_3_lookup_cbk+0x647)[0x7fd81f20a267]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x397a00e9c5]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x13f)[0x397a00fe4f]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x397a00b668]
/usr/lib64/glusterfs/3.6.0.29/rpc-transport/socket.so(+0x9275)[0x7fd81f84e275]
/usr/lib64/glusterfs/3.6.0.29/rpc-transport/socket.so(+0xac5d)[0x7fd81f84fc5d]
/usr/lib64/libglusterfs.so.0[0x3979876367]
/usr/sbin/glusterfs(main+0x603)[0x407e93]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x340661ed1d]
/usr/sbin/glusterfs[0x4049a9]
---------
Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.6.0.29 built on Sep 18 2014 23:55:46

How reproducible:
=================
1/1

Steps to Reproduce:
===================
1. Create a 2 x 2 dist-rep volume. Start the volume.
2. From 2 client machines create 1 FUSE mount each.
3. From the 1st client machine's FUSE mount execute the script "self_heal_sanity_create.sh".
4. From both client FUSE mount points calculate arequal-checksum.
5. Create snapshot snap1.
6. Set self-heal-daemon to off.
7. Crash brick1 and brick3 (using the godown utility).
8. From the 1st client machine's FUSE mount execute the script "self_heal_sanity_modify.sh".
9. From both client FUSE mount points calculate arequal-checksum.
10. Bring back brick1 and brick3.
11. Immediately create snapshot snap2.
12. From both client FUSE mount points calculate arequal-checksum (this self-heals all the data).
13. From the 1st client machine's FUSE mount perform "rm -rf *" (simulating accidental deletion of data).
14. Create a FUSE mount for snap volume "snap2" from one of the clients.
15. From the snap volume mount calculate arequal-checksum.
16. Copy the contents from the snap mount point to the actual volume mount point: "cp -rp * <actual_volume_mount>".
17. Calculate the arequal-checksum of both the snap mount point and the actual volume mount point.
    Expected: they should be the same. Actual: they differ.
18. Unmount the snap volume and remount it with the option "use-readdirp=NO".
19. Calculate arequal-checksum.

Actual results:
===============
Observed the crash. Further, executing stat or ls on the file crashes the process.

Expected results:
=================
The glusterfs process shouldn't crash.
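The steps above can be sketched as a command sequence. This is a hypothetical illustration only: the volume name (testvol), server hostnames, brick paths, mount points, and snapshot mount path are my assumptions, not taken from the report; the two self_heal_sanity scripts, the godown utility, and arequal-checksum are the tools named in the steps.

```shell
# Assumed names throughout: testvol, server1/server2, /bricks/*, /mnt/*.
# Step 1: 2 x 2 distribute-replicate volume.
gluster volume create testvol replica 2 \
    server1:/bricks/b1 server2:/bricks/b2 \
    server1:/bricks/b3 server2:/bricks/b4
gluster volume start testvol

# Steps 2-4: FUSE mount on each client, create data, checksum.
mount -t glusterfs server1:/testvol /mnt/testvol
./self_heal_sanity_create.sh /mnt/testvol     # 1st client only
arequal-checksum -p /mnt/testvol              # on both clients

# Steps 5-7: snapshot, disable the self-heal daemon, kill two bricks.
gluster snapshot create snap1 testvol
gluster volume set testvol self-heal-daemon off
# ... crash brick1 and brick3 with the godown utility on the servers ...

# Steps 8-12: modify data, bring bricks back, snapshot again, checksum.
./self_heal_sanity_modify.sh /mnt/testvol
arequal-checksum -p /mnt/testvol
gluster volume start testvol force            # restarts the downed bricks
gluster snapshot create snap2 testvol
arequal-checksum -p /mnt/testvol              # access self-heals the data

# Steps 13-17: simulate accidental deletion, restore from the snapshot.
rm -rf /mnt/testvol/*
gluster snapshot activate snap2
mount -t glusterfs server1:/snaps/snap2/testvol /mnt/snap2
arequal-checksum -p /mnt/snap2
cp -rp /mnt/snap2/* /mnt/testvol/
arequal-checksum -p /mnt/snap2
arequal-checksum -p /mnt/testvol              # observed: checksums differ

# Steps 18-19: remount the snap with readdirp disabled and re-checksum.
umount /mnt/snap2
mount -t glusterfs -o use-readdirp=no server1:/snaps/snap2/testvol /mnt/snap2
arequal-checksum -p /mnt/snap2                # snap mount process crashed here
```

The commands require a live GlusterFS cluster, so they are a repro sketch rather than a runnable script; the /snaps/<snapname>/<volname> mount path assumes the snapshot has been activated first.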
Successfully verified the bug on build glusterfs 3.6.0.36.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-0038.html