Bug 1567033

Summary: glusterfsd process crashing on two
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: nravinas
Component: glusterfs Assignee: Raghavendra Bhat <rabhat>
Status: CLOSED UPSTREAM QA Contact: Bala Konda Reddy M <bmekala>
Severity: high Docs Contact:
Priority: high    
Version: rhgs-3.3 CC: atumball, bkunal, dwojslaw, kdhananj, nravinas, pkarampu, rabhat, rgowdapp, rhs-bugs, sankarshan, sheggodu, srangana, vbellur
Target Milestone: --- Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-22 07:00:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Comment 3 Amar Tumballi 2018-04-13 10:26:48 UTC
Crash is happening in 'posix-locks' component:

One thread:

Thread 27:
#8  0x00007fb210053987 in pl_update_refkeeper (this=this@entry=0x7fb20c015550, inode=0x7fb20c40d9d0) at common.c:380

Thread 25:
#8  0x00007fb210053987 in pl_update_refkeeper (this=this@entry=0x7fb20c015550, inode=0x7fb20c40d9d0) at common.c:380

Thread 10:
#3  0x00007fb21f5222db in fd_ref (fd=0x7fb20cb44f60) at fd.c:456

Thread 15:
#3  0x00007fb21f5224ce in gf_fd_fdptr_get (fdtable=0x7fb1c417d630, fd=fd@entry=1015) at fd.c:423

Comment 9 Pranith Kumar K 2018-04-16 17:37:02 UTC
Krutika,
     The pl_update_refkeeper() code doesn't seem to be safe against concurrent ref/unref. In the case where we set need_ref = 1, the mutex is released before inode_ref() is called, so an unref can happen on the inode in some other thread even before the ref happens. This may not be the root cause (RC) of this particular crash, but the code has that possibility.

void
pl_update_refkeeper (xlator_t *this, inode_t *inode)
{
        pl_inode_t *pl_inode  = NULL;
        int         is_empty  = 0;
        int         need_unref = 0;
        int         need_ref = 0;

        pl_inode = pl_inode_get (this, inode);

        pthread_mutex_lock (&pl_inode->mutex);
        {
                is_empty = __pl_inode_is_empty (pl_inode);

                if (is_empty && pl_inode->refkeeper) {
                        need_unref = 1;
                        pl_inode->refkeeper = NULL;
                }

                if (!is_empty && !pl_inode->refkeeper) {
                        need_ref = 1;
                        pl_inode->refkeeper = inode;
                }
        }
        pthread_mutex_unlock (&pl_inode->mutex);

        if (need_unref)
                inode_unref (inode);

        if (need_ref)
                inode_ref (inode);
}
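
One way to close the window described above would be to take the reference while pl_inode->mutex is still held, before refkeeper is published. The following is only a minimal sketch under stated assumptions: toy_inode_t, toy_pl_inode_t, toy_inode_ref(), toy_inode_unref() and toy_update_refkeeper() are hypothetical stand-ins written for illustration, not the GlusterFS types or functions, and lock_count stands in for the result of __pl_inode_is_empty(). Whether calling the real inode_ref() under pl_inode->mutex is acceptable with respect to lock ordering in GlusterFS is not established here, and this is not claimed to be the fix that went upstream.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for inode_t: just a lock-protected refcount. */
typedef struct toy_inode {
        pthread_mutex_t lock;
        int             refcount;
} toy_inode_t;

/* Hypothetical stand-in for pl_inode_t. */
typedef struct toy_pl_inode {
        pthread_mutex_t mutex;
        toy_inode_t    *refkeeper;  /* non-NULL while locks exist on the inode */
        int             lock_count; /* stand-in for !__pl_inode_is_empty () */
} toy_pl_inode_t;

static void
toy_inode_ref (toy_inode_t *inode)
{
        pthread_mutex_lock (&inode->lock);
        inode->refcount++;
        pthread_mutex_unlock (&inode->lock);
}

static void
toy_inode_unref (toy_inode_t *inode)
{
        pthread_mutex_lock (&inode->lock);
        assert (inode->refcount > 0);
        inode->refcount--;
        pthread_mutex_unlock (&inode->lock);
}

static void
toy_update_refkeeper (toy_pl_inode_t *pl_inode, toy_inode_t *inode)
{
        int is_empty   = 0;
        int need_unref = 0;

        pthread_mutex_lock (&pl_inode->mutex);
        {
                is_empty = (pl_inode->lock_count == 0);

                if (is_empty && pl_inode->refkeeper) {
                        need_unref = 1;
                        pl_inode->refkeeper = NULL;
                }

                if (!is_empty && !pl_inode->refkeeper) {
                        /* Take the ref BEFORE publishing refkeeper, and
                         * while still holding the mutex, so no other
                         * thread can see refkeeper set and drop a ref
                         * that has not been taken yet. */
                        toy_inode_ref (inode);
                        pl_inode->refkeeper = inode;
                }
        }
        pthread_mutex_unlock (&pl_inode->mutex);

        /* The unref stays outside the mutex, as in the original. */
        if (need_unref)
                toy_inode_unref (inode);
}
```

In this sketch the lock order is always pl_inode->mutex first, then the inode's own lock, and the unref path never holds both, so the toy version cannot deadlock on itself. The real code would need the same check against the inode table lock.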

Comment 40 Amar Tumballi 2018-10-22 07:00:42 UTC
> If engg thinks there is something to be fixed, we can do that. Else I am happy to get this closed.

Bipin, we believe with our focus on stability, these issues would have been fixed. Happy to work on it if there is any further observation. Till then, will CLOSE it as UPSTREAM (to identify the focus on stability).