Bug 1602866 - dht: Crash seen in thread dht_dir_attr_heal
Summary: dht: Crash seen in thread dht_dir_attr_heal
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: distribute
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Nithya Balachandran
QA Contact:
URL:
Whiteboard:
Depends On: 1601331
Blocks:
 
Reported: 2018-07-18 16:37 UTC by Nithya Balachandran
Modified: 2018-10-23 15:14 UTC (History)
7 users

Fixed In Version: glusterfs-5.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1601331
Environment:
Last Closed: 2018-10-23 15:14:52 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Nithya Balachandran 2018-07-18 16:37:06 UTC
+++ This bug was initially created as a clone of Bug #1601331 +++

Description of problem:

While testing two of my geo-rep patches [1] and [2], I saw the geo-rep mount process crash. I was running the modified upstream geo-rep regression test suite on a replica 3 (6x3) volume. The geo-rep client process crashed as shown below. Note that the geo-rep mounts are aux-gfid mounts.

I looked into the backtrace; the crash happens during dht attr heal, and the gfid is null in both loc and loc->inode. I don't have much context on afr/dht, so I could not debug it further.

(gdb) 
#0  0x00007f260e71b765 in raise () from /lib64/libc.so.6
#1  0x00007f260e71d36a in abort () from /lib64/libc.so.6
#2  0x00007f260e713f97 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007f260e714042 in __assert_fail () from /lib64/libc.so.6
#4  0x00007f2602149ec2 in client_pre_inodelk (this=0x7f25fc00ef20, req=0x7f25e4217670, loc=0x7f25e400a298, cmd=6, flock=0x7f25e400a4b8, volume=0x7f25fc0132b0 "slave-replicate-1", xdata=0x0)
    at client-common.c:841
#5  0x00007f2602138b24 in client3_3_inodelk (frame=0x7f25e4015290, this=0x7f25fc00ef20, data=0x7f25e4217760) at client-rpc-fops.c:5307
#6  0x00007f260210d9d9 in client_inodelk (frame=0x7f25e4015290, this=0x7f25fc00ef20, volume=0x7f25fc0132b0 "slave-replicate-1", loc=0x7f25e400a298, cmd=6, lock=0x7f25e400a4b8, xdata=0x0)
    at client.c:1679
#7  0x00007f2601ea4444 in afr_nonblocking_inodelk (frame=0x7f25e400f680, this=0x7f25fc015230) at afr-lk-common.c:1093
#8  0x00007f2601e9d149 in afr_lock (frame=0x7f25e400f680, this=0x7f25fc015230) at afr-transaction.c:1652
#9  0x00007f2601e9eb84 in afr_transaction_start (local=0x7f25e4009e60, this=0x7f25fc015230) at afr-transaction.c:2333
#10 0x00007f2601e9eec0 in afr_transaction (frame=0x7f25e400f680, this=0x7f25fc015230, type=AFR_METADATA_TRANSACTION) at afr-transaction.c:2402
#11 0x00007f2601e875d7 in afr_setattr (frame=0x7f25e400ece0, this=0x7f25fc015230, loc=0x7f25e4008e58, buf=0x7f25e4008f58, valid=7, xdata=0x0) at afr-inode-write.c:895
#12 0x00007f261011681d in syncop_setattr (subvol=0x7f25fc015230, loc=0x7f25e4008e58, iatt=0x7f25e4008f58, valid=7, preop=0x0, postop=0x0, xdata_in=0x0, xdata_out=0x0) at syncop.c:1811
#13 0x00007f2601bc0448 in dht_dir_attr_heal (data=0x7f25e4007c60) at dht-selfheal.c:2497
#14 0x00007f261010f894 in synctask_wrap () at syncop.c:375
#15 0x00007f260e72fb60 in ?? () from /lib64/libc.so.6
#16 0x0000000000000000 in ?? ()
(gdb) f 4
#4  0x00007f2602149ec2 in client_pre_inodelk (this=0x7f25fc00ef20, req=0x7f25e4217670, loc=0x7f25e400a298, cmd=6, flock=0x7f25e400a4b8, volume=0x7f25fc0132b0 "slave-replicate-1", xdata=0x0)
    at client-common.c:841
841	        GF_ASSERT_AND_GOTO_WITH_ERROR (this->name,
(gdb) p *loc
$1 = {path = 0x7f25e40102f0 "/.gfid/00000000-0000-0000-0000-", '0' <repeats 11 times>, "1/rsnapshot_symlinkbug", name = 0x7f25e401031c "rsnapshot_symlinkbug", inode = 0x7f25ec030d30, 
  parent = 0x7f25fc078870, gfid = '\000' <repeats 15 times>, pargfid = '\000' <repeats 15 times>, "\001"}
(gdb) p *loc->inode
$2 = {table = 0x7f25fc078770, gfid = '\000' <repeats 15 times>, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, 
        __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, nlookup = 0, fd_count = 0, active_fd_count = 0, ref = 3, 
  ia_type = IA_INVAL, fd_list = {next = 0x7f25ec030d88, prev = 0x7f25ec030d88}, dentry_list = {next = 0x7f25ec030d98, prev = 0x7f25ec030d98}, hash = {next = 0x7f25ec030da8, 
    prev = 0x7f25ec030da8}, list = {next = 0x7f25ec03aa98, prev = 0x7f25fc0787d0}, _ctx = 0x7f25ec032580}
(gdb) 
$3 = {table = 0x7f25fc078770, gfid = '\000' <repeats 15 times>, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, 
        __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, nlookup = 0, fd_count = 0, active_fd_count = 0, ref = 3, 
  ia_type = IA_INVAL, fd_list = {next = 0x7f25ec030d88, prev = 0x7f25ec030d88}, dentry_list = {next = 0x7f25ec030d98, prev = 0x7f25ec030d98}, hash = {next = 0x7f25ec030da8, 
    prev = 0x7f25ec030da8}, list = {next = 0x7f25ec03aa98, prev = 0x7f25fc0787d0}, _ctx = 0x7f25ec032580}

Version-Release number of selected component (if applicable):

3.4 source install, with the two patches [1] and [2] applied on top of
commit 8c9028b560b1f0fd816e7d2a9e0bec70cc526c1a


How reproducible:
Rarely; I have hit it only once.

Steps to Reproduce:
1. Run the upstream regression test suite, modifying the volume type of both master and slave to 6x3:
   # prove -v tests/00-geo-rep/georep-basic-dr-rsync.t


Actual results:
The mount process crashed.


[1]  https://code.engineering.redhat.com/gerrit/143400
[2]  https://code.engineering.redhat.com/gerrit/143826

Expected results:
No crash should be seen

Additional info:
The mount is created by the geo-rep worker process and is a gfid-access fuse mount.
The fuse mount is mounted with the option "-o aux-gfid-mount".

Comment 1 Nithya Balachandran 2018-07-18 16:38:14 UTC
RCA:

The process crashed because both loc->gfid and loc->inode->gfid are null. dht_dir_attr_heal was triggered in dht_lookup_dir_cbk(). This is a fresh named lookup, so loc->gfid is null, and loc->inode->gfid is also null because loc->inode has not yet been linked.

(gdb) bt
#0  0x00007f260e71b765 in raise () from ./lib64/libc.so.6
#1  0x00007f260e71d36a in abort () from ./lib64/libc.so.6
#2  0x00007f260e713f97 in __assert_fail_base () from ./lib64/libc.so.6
#3  0x00007f260e714042 in __assert_fail () from ./lib64/libc.so.6
#4  0x00007f2602149ec2 in client_pre_inodelk (this=0x7f25fc00ef20, req=0x7f25e4217670, loc=0x7f25e400a298, cmd=6, flock=0x7f25e400a4b8, 
    volume=0x7f25fc0132b0 "slave-replicate-1", xdata=0x0) at client-common.c:841
#5  0x00007f2602138b24 in client3_3_inodelk (frame=0x7f25e4015290, this=0x7f25fc00ef20, data=0x7f25e4217760) at client-rpc-fops.c:5307
#6  0x00007f260210d9d9 in client_inodelk (frame=0x7f25e4015290, this=0x7f25fc00ef20, volume=0x7f25fc0132b0 "slave-replicate-1", 
    loc=0x7f25e400a298, cmd=6, lock=0x7f25e400a4b8, xdata=0x0) at client.c:1679
#7  0x00007f2601ea4444 in afr_nonblocking_inodelk (frame=0x7f25e400f680, this=0x7f25fc015230) at afr-lk-common.c:1093
#8  0x00007f2601e9d149 in afr_lock (frame=0x7f25e400f680, this=0x7f25fc015230) at afr-transaction.c:1652
#9  0x00007f2601e9eb84 in afr_transaction_start (local=0x7f25e4009e60, this=0x7f25fc015230) at afr-transaction.c:2333
#10 0x00007f2601e9eec0 in afr_transaction (frame=0x7f25e400f680, this=0x7f25fc015230, type=AFR_METADATA_TRANSACTION) at afr-transaction.c:2402
#11 0x00007f2601e875d7 in afr_setattr (frame=0x7f25e400ece0, this=0x7f25fc015230, loc=0x7f25e4008e58, buf=0x7f25e4008f58, valid=7, xdata=0x0)
    at afr-inode-write.c:895
#12 0x00007f261011681d in syncop_setattr (subvol=0x7f25fc015230, loc=0x7f25e4008e58, iatt=0x7f25e4008f58, valid=7, preop=0x0, postop=0x0, 
    xdata_in=0x0, xdata_out=0x0) at syncop.c:1811
#13 0x00007f2601bc0448 in dht_dir_attr_heal (data=0x7f25e4007c60) at dht-selfheal.c:2497
#14 0x00007f261010f894 in synctask_wrap () at syncop.c:375
#15 0x00007f260e72fb60 in ?? () from ./lib64/libc.so.6
#16 0x0000000000000000 in ?? ()


(gdb) f 13
#13 0x00007f2601bc0448 in dht_dir_attr_heal (data=0x7f25e4007c60) at dht-selfheal.c:2497
2497	                        ret = syncop_setattr (subvol, &local->loc, &local->mds_stbuf,
(gdb) p local->loc
$119 = {path = 0x7f25e4016d50 "/.gfid/00000000-0000-0000-0000-", '0' <repeats 11 times>, "1/rsnapshot_symlinkbug", 
  name = 0x7f25e4016d7c "rsnapshot_symlinkbug", inode = 0x7f25ec030d30, parent = 0x7f25fc078870, gfid = '\000' <repeats 15 times>, 
  pargfid = '\000' <repeats 15 times>, "\001"}
(gdb) p local->loc->gfid
$120 = '\000' <repeats 15 times>
(gdb) p *local->loc->inode
$121 = {table = 0x7f25fc078770, gfid = '\000' <repeats 15 times>, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 0, __owner = 0, 
        __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, 
      __align = 0}}, nlookup = 0, fd_count = 0, active_fd_count = 0, ref = 3, ia_type = IA_INVAL, fd_list = {next = 0x7f25ec030d88, 
    prev = 0x7f25ec030d88}, dentry_list = {next = 0x7f25ec030d98, prev = 0x7f25ec030d98}, hash = {next = 0x7f25ec030da8, prev = 0x7f25ec030da8}, 
  list = {next = 0x7f25ec03aa98, prev = 0x7f25fc0787d0}, _ctx = 0x7f25ec032580}



In dht_lookup_dir_cbk, the function unwinds immediately after spawning the dht_dir_attr_heal synctask:


                                ret = synctask_new (this->ctx->env,
                                                    dht_dir_attr_heal,
                                                    dht_dir_attr_heal_done,
                                                    copy, copy);
                                if (ret) {
                                        gf_msg (this->name, GF_LOG_ERROR, ENOMEM,
                                                DHT_MSG_DIR_ATTR_HEAL_FAILED,
                                                "Synctask creation failed to heal attr "
                                                "for path %s gfid %s ",
                                                local->loc.path, local->gfid);
                                        DHT_STACK_DESTROY (copy);
                                }
                        }
                }

skip_attr_heal:
                DHT_STRIP_PHASE1_FLAGS (&local->stbuf);
                dht_set_fixed_dir_stat (&local->postparent);
                /* Delete mds xattr at the time of STACK UNWIND */
                if (local->xattr)
                        GF_REMOVE_INTERNAL_XATTR (conf->mds_xattr_key, local->xattr);
                DHT_STACK_UNWIND (lookup, frame, local->op_ret, local->op_errno,
                                  local->inode, &local->stbuf, local->xattr,
                                  &local->postparent);
        }



Most of the time, the unwind reaches the fuse layer and the inode is linked before the dht_dir_attr_heal call stack reaches client_pre_inodelk. Once the inode is linked, loc->inode->gfid is non-null and client_pre_inodelk proceeds successfully. If the inode has not been linked by that point, the assert fires.

I was able to hit this by:
1. setting local->need_attrheal to 1
2. putting a sleep(10) before the line DHT_STRIP_PHASE1_FLAGS (&local->stbuf)  
3. performing a fresh lookup on an existing directory

Comment 2 Worker Ant 2018-07-18 16:55:08 UTC
REVIEW: https://review.gluster.org/20530 (cluster/dht: Set loc->gfid before healing attr) posted (#1) for review on master by N Balachandran

Comment 3 Worker Ant 2018-07-19 03:04:40 UTC
COMMIT: https://review.gluster.org/20530 committed in master by "N Balachandran" <nbalacha> with a commit message- cluster/dht: Set loc->gfid before healing attr

AFR takes inodelks when setting attrs. The loc->gfid
and loc->inode->gfid were both null when dht_dir_attr_heal
was called during a fresh lookup of an existing directory.
As the gfid is null, client_pre_inodelk asserts in the gfid
check.
We now set the loc->gfid before calling dht_dir_attr_heal.

Change-Id: I457f5a73fd301d97a03ca032587e73d4803298ac
fixes: bz#1602866
Signed-off-by: N Balachandran <nbalacha>

Comment 4 Shyamsundar 2018-10-23 15:14:52 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-5.0, please open a new bug report.

glusterfs-5.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-October/000115.html
[2] https://www.gluster.org/pipermail/gluster-users/

