Bug 801149 - [glusterfs-3.3.0qa25]: self-heal daemon crashed since dentry was NULL
Summary: [glusterfs-3.3.0qa25]: self-heal daemon crashed since dentry was NULL
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: replicate
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Raghavendra Bhat
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 817967
 
Reported: 2012-03-07 19:16 UTC by Raghavendra Bhat
Modified: 2015-12-01 16:45 UTC
CC List: 2 users

Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-07-24 17:31:09 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Raghavendra Bhat 2012-03-07 19:16:27 UTC
Description of problem:
On a 2x2 distributed-replicate volume with one FUSE client and one NFS client, the FUSE client was running the sanity script while the NFS client ran rdd and fs-perf-test one after another in a loop.

A brick was brought down and, after some time, brought back up. Volume set operations were running in parallel. glustershd crashed with the backtrace below.

Core was generated by `/usr/local/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /etc/'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f5aacc291a3 in __is_dentry_cyclic (dentry=0x0) at ../../../libglusterfs/src/inode.c:217
217                                              dentry->inode);
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.25.el6_1.3.x86_64 libgcc-4.4.5-6.el6.x86_64
(gdb) bt
#0  0x00007f5aacc291a3 in __is_dentry_cyclic (dentry=0x0) at ../../../libglusterfs/src/inode.c:217
#1  0x00007f5aacc2a7ce in __inode_link (inode=0x7f5a84964c6c, parent=0x9c189c, name=0x0, iatt=0x7f5a8c2014f0)
    at ../../../libglusterfs/src/inode.c:816
#2  0x00007f5aacc2a8cc in inode_link (inode=0x7f5a84964c6c, parent=0x9c189c, name=0x0, iatt=0x7f5a8c2014f0)
    at ../../../libglusterfs/src/inode.c:847
#3  0x00007f5aa866a5de in _process_entries (this=0x98ded0, parentloc=0x7f5a8c201730, entries=0x7f5a8c201620, offset=0x7f5a8c2016d0, 
    crawl_data=0x7f5a8c0013a0) at ../../../../../xlators/cluster/afr/src/afr-self-heald.c:754
#4  0x00007f5aa866a93b in _crawl_directory (fd=0x9c759c, loc=0x7f5a8c201730, crawl_data=0x7f5a8c0013a0)
    at ../../../../../xlators/cluster/afr/src/afr-self-heald.c:814
#5  0x00007f5aa866aea0 in afr_dir_crawl (data=0x7f5a8c0013a0) at ../../../../../xlators/cluster/afr/src/afr-self-heald.c:943
#6  0x00007f5aa866b16a in afr_dir_exclusive_crawl (data=0x7f5a8c0013a0)
    at ../../../../../xlators/cluster/afr/src/afr-self-heald.c:996
#7  0x00007f5aacc54753 in synctask_wrap (old_task=0x7f5a8c001400) at ../../../libglusterfs/src/syncop.c:144
#8  0x00000034c2243690 in ?? () from /lib64/libc.so.6
#9  0x0000000000000000 in ?? ()
(gdb)  f 0
#0  0x00007f5aacc291a3 in __is_dentry_cyclic (dentry=0x0) at ../../../libglusterfs/src/inode.c:217
217                                              dentry->inode);
(gdb) p dentry
$1 = (dentry_t *) 0x0
(gdb) f 1
#1  0x00007f5aacc2a7ce in __inode_link (inode=0x7f5a84964c6c, parent=0x9c189c, name=0x0, iatt=0x7f5a8c2014f0)
    at ../../../libglusterfs/src/inode.c:816
816                             if (old_inode && __is_dentry_cyclic (dentry)) {
(gdb) l
811             if (parent) {
812                     old_dentry = __dentry_grep (table, parent, name);
813
814                     if (!old_dentry || old_dentry->inode != link_inode) {
815                             dentry = __dentry_create (link_inode, parent, name);
816                             if (old_inode && __is_dentry_cyclic (dentry)) {
817                                     __dentry_unset (dentry);
818                                     return NULL;
819                             }
820                             __dentry_hash (dentry);
(gdb) p link_inode
$2 = (inode_t *) 0x9c2020
(gdb) p parent
$3 = (inode_t *) 0x9c189c
(gdb) p name
$4 = 0x0
(gdb) p *link_inode
$5 = {table = 0x9c1690, gfid = "\377A\024L听\022l>\325\321", <incomplete sequence \344>, lock = 1, nlookup = 0, ref = 1, 
  ia_type = IA_IFDIR, fd_list = {next = 0x9c2050, prev = 0x9c2050}, dentry_list = {next = 0x9c2060, prev = 0x9c2060}, hash = {
    next = 0x7f5aa3586e70, prev = 0x7f5aa3586e70}, list = {next = 0x9c1a24, prev = 0x9c5238}, _ctx = 0x7f5a84000c30}
(gdb) f 3
#3  0x00007f5aa866a5de in _process_entries (this=0x98ded0, parentloc=0x7f5a8c201730, entries=0x7f5a8c201620, offset=0x7f5a8c2016d0, 
    crawl_data=0x7f5a8c0013a0) at ../../../../../xlators/cluster/afr/src/afr-self-heald.c:754
754                     inode_link (entry_loc.inode, parentloc->inode, NULL, &iattr);
(gdb) p entry_loc
$6 = {path = 0x7f5a840219d0 "/run6686", name = 0x7f5a840219d1 "run6686", inode = 0x7f5a84964c6c, parent = 0x9c189c, 
  gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>, "\001"}
(gdb) 




Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. With clients running some tests, bring a brick down.
2. Bring the brick back up and let self-heal happen.
3. In parallel, keep running volume set operations while the above operations are in progress.
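The steps above can be sketched with the gluster CLI. This is a rough outline only; the volume name (`mirror`), brick path, and `<brick-pid>` are placeholders, and killing the brick's glusterfsd process is just one common way to simulate a brick failure:

```shell
# 1. With client I/O running, take one brick down by killing its
#    glusterfsd process (its pid appears in the Pid column of
#    `gluster volume status mirror`).
kill -9 <brick-pid>

# 2. Bring the brick back up and let the self-heal daemon start crawling:
gluster volume start mirror force

# 3. Meanwhile, keep issuing volume set operations in a loop:
while true; do
    gluster volume set mirror cluster.self-heal-daemon on
    sleep 1
done
```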
  
Actual results:

glustershd crashed

Expected results:

glustershd should not crash

Additional info:
th: /sync_field/file-965, reason: lookup detected pending operations
[2012-03-07 07:46:24.969746] I [afr-self-heal-algorithm.c:131:sh_loop_driver_done] 0-mirror-replicate-0: diff self-heal on /sync_field/file-965: completed. (21 blocks of 51 were different (41.18%))
[2012-03-07 07:46:24.972670] I [afr-self-heal-common.c:2028:afr_self_heal_completion_cbk] 0-mirror-replicate-0: background  data self-heal completed on /sync_field/file-965
[2012-03-07 07:46:24.974567] I [afr-common.c:1290:afr_launch_self_heal] 0-mirror-replicate-0: background  data self-heal triggered. path: /sync_field/file-974, reason: lookup detected pending operations
[2012-03-07 07:46:26.251466] I [afr-self-heal-algorithm.c:131:sh_loop_driver_done] 0-mirror-replicate-0: diff self-heal on /sync_field/file-974: completed. (22 blocks of 50 were different (44.00%))
[2012-03-07 07:46:26.294914] I [afr-self-heal-common.c:2028:afr_self_heal_completion_cbk] 0-mirror-replicate-0: background  data self-heal completed on /sync_field/file-974
[2012-03-07 07:46:26.507573] I [afr-self-heald.c:949:afr_dir_crawl] 0-mirror-replicate-0: Crawl completed on mirror-replicate-0
[2012-03-07 07:46:26.509695] I [afr-self-heald.c:890:afr_find_child_position] 0-mirror-replicate-0: child mirror-client-0 is local
[2012-03-07 07:46:26.739686] W [inode.c:487:__dentry_create] (-->/usr/local/lib/glusterfs/3.3.0qa25/xlator/cluster/replicate.so(+0x5e5de) [0x7f5aa866a5de] (-->/usr/local/lib/libglusterfs.so.0(inode_link+0xc2) [0x7f5aacc2a8cc] (-->/usr/local/lib/libglusterfs.so.0(+0x347b7) [0x7f5aacc2a7b7]))) 0-mirror-replicate-0: inode || parent || name not found
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2012-03-07 07:46:26
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3.0qa25
/lib64/libc.so.6[0x34c2232980]
/usr/local/lib/libglusterfs.so.0(+0x331a3)[0x7f5aacc291a3]
/usr/local/lib/libglusterfs.so.0(+0x347ce)[0x7f5aacc2a7ce]

Comment 1 Amar Tumballi 2012-03-12 09:46:25 UTC
Please update these bugs w.r.t. 3.3.0qa27; this needs to be worked on as per the target milestone set.

Comment 2 Anand Avati 2012-03-14 10:53:41 UTC
CHANGE: http://review.gluster.com/2893 (cluster/afr: handle sending NULL dentry name for inode link in self-heal-daemon) merged in master by Vijay Bellur (vijay)

Comment 3 Raghavendra Bhat 2012-04-05 10:52:10 UTC
Checked with glusterfs-3.3.0qa33. glustershd did not crash because of the NULL dentry.

