Bug 885008

Summary: extra unref of the fd might lead to segfault
Product: [Community] GlusterFS Reporter: Raghavendra Bhat <rabhat>
Component: fuse Assignee: Raghavendra Bhat <rabhat>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: unspecified Docs Contact:
Priority: medium    
Version: mainline CC: gluster-bugs, rwheeler, sac
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-07-11 16:16:48 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 855326, 855787    
Attachments:
Description Flags
Attached core none

Description Raghavendra Bhat 2012-12-07 09:26:43 UTC
Description of problem:

On a graph change, fuse migrates the open fds to the new graph in the function fuse_migrate_fd, which does the following:

        ret = fuse_migrate_fd_open (this, basefd, oldfd, old_subvol,
                                    new_subvol, &newfdptr);
        if (ret < 0) {
                gf_log (this->name, GF_LOG_WARNING, "open corresponding to "
                        "basefd (ptr:%p inode-gfid:%s) in new graph failed "
                        "(old-subvolume:%s-%d new-subvolume:%s-%d)", basefd,
                        uuid_utoa (basefd->inode->gfid), old_subvol->name,
                        old_subvol->graph->id, new_subvol->name,
                        new_subvol->graph->id);
                goto out;
        }

        ret = fuse_migrate_locks (this, oldfd, newfdptr, old_subvol,
                                  new_subvol);
        if (ret < 0) {
                gf_log (this->name, GF_LOG_WARNING,
                        "migrating locks from old-subvolume (%s-%d) to "
                        "new-subvolume (%s-%d) failed (inode-gfid:%s oldfd:%p "
                        "newfd:%p)", old_subvol->name, old_subvol->graph->id,
                        new_subvol->name, new_subvol->graph->id,


The fd is migrated first, and then its locks are migrated to the new graph. The new fd created in the new graph is returned through newfdptr.

In fuse_migrate_fd_open, a new fd is created for the new graph, and at the end of the function it is unrefed (thus destroying the fd too). When fuse_migrate_locks later accesses the same fd pointer, the process segfaults.

This seems to be a race condition.
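
The extra-unref pattern described above can be illustrated with a small, self-contained sketch. Everything in it (toy_fd_t, toy_open_buggy, toy_open_fixed, toy_migrate_locks) is a hypothetical stand-in and not GlusterFS code, and it is not the actual patch. To keep the example safe to run, the object is only marked dead instead of being freed, and the failure is shown deterministically rather than as a race.

```c
#include <assert.h>
#include <stddef.h>

/* Toy reference-counted fd. Hypothetical; NOT the GlusterFS fd_t. */
typedef struct toy_fd {
    int refcount; /* number of outstanding references */
    int alive;    /* 1 while valid, 0 once the last ref is dropped */
} toy_fd_t;

static void toy_fd_init(toy_fd_t *fd)
{
    fd->refcount = 1; /* the creator holds the first reference */
    fd->alive = 1;
}

static void toy_fd_unref(toy_fd_t *fd)
{
    assert(fd->refcount > 0);
    if (--fd->refcount == 0)
        fd->alive = 0; /* real code would destroy/free the fd here */
}

/* Buggy shape: the fd is handed to the caller through *out, but the
 * only reference is dropped before returning. */
static int toy_open_buggy(toy_fd_t *storage, toy_fd_t **out)
{
    toy_fd_init(storage);
    *out = storage;
    toy_fd_unref(storage); /* extra unref: caller now holds a dead fd */
    return 0;
}

/* Fixed shape: the reference is transferred to the caller, who is
 * responsible for the final unref. */
static int toy_open_fixed(toy_fd_t *storage, toy_fd_t **out)
{
    toy_fd_init(storage);
    *out = storage;
    return 0;
}

/* Stand-in for the lock-migration step: it can only succeed if the
 * fd it was given is still alive. */
static int toy_migrate_locks(const toy_fd_t *newfd)
{
    return newfd->alive ? 0 : -1;
}
```

With toy_open_buggy, the later toy_migrate_locks call sees a dead fd and returns -1; in real code, where the memory has been freed, the same access is undefined behaviour and may crash. With toy_open_fixed, the call succeeds and the caller drops the reference once it is done.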

Comment 1 Raghavendra Bhat 2012-12-07 10:17:07 UTC
To reproduce, attach gdb to the fuse client process and set a breakpoint on fuse_migrate_fd. Start running dbench on the mount point and perform graph changes. Once the breakpoint is hit, step through the statements one at a time. The process crashes after some time.

Comment 2 Amar Tumballi 2012-12-11 05:55:26 UTC
http://review.gluster.org/4282 is out for review

Comment 3 Sachidananda Urs 2013-01-09 07:20:22 UTC
(gdb) bt
#0  uuid_unpack (in=0xaaaaaab2 <Address 0xaaaaaab2 out of bounds>, uu=0x7f3109ffadd0) at ../../contrib/uuid/unpack.c:44
#1  0x0000003a39842dd6 in uuid_unparse_x (uu=<value optimized out>, out=0x7f30ec001310 "\b\001", 
    fmt=0x3a39861da8 "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at ../../contrib/uuid/unparse.c:55
#2  0x0000003a39823ff7 in uuid_utoa (uuid=0xaaaaaab2 <Address 0xaaaaaab2 out of bounds>) at common-utils.c:1836
#3  0x00007f3108f2ac99 in fuse_migrate_locks (this=0x22d4a70, oldfd=0x23ff684, newfd=0x361fe1c, old_subvol=0x2301df0, 
    new_subvol=<value optimized out>) at fuse-bridge.c:3856
#4  0x00007f3108f2dd90 in fuse_migrate_fd (this=0x22d4a70, basefd=0x23ff684, old_subvol=0x2301df0, new_subvol=0x30475e0)
    at fuse-bridge.c:3966
#5  0x00007f3108f2e1fc in fuse_handle_opened_fds (this=0x22d4a70, old_subvol=0x2301df0, new_subvol=0x30475e0)
    at fuse-bridge.c:4017
#6  0x00007f3108f2e2a9 in fuse_graph_switch_task (data=<value optimized out>) at fuse-bridge.c:4068
#7  0x0000003a398434d2 in synctask_wrap (old_task=<value optimized out>) at syncop.c:129
#8  0x000000341b443610 in ?? () from /lib64/libc.so.6
#9  0x0000000000000000 in ?? ()

Comment 4 Sachidananda Urs 2013-01-09 07:21:14 UTC
Created attachment 675288 [details]
Attached core

Comment 5 Vijay Bellur 2013-01-31 07:40:51 UTC
CHANGE: http://review.gluster.org/4282 (libglusterfs/syncop: do not hold ref on the fd in cbk) merged in master by Anand Avati (avati)