Bug 885008 - extra unref of the fd might lead to segfault
Summary: extra unref of the fd might lead to segfault
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: fuse
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: unspecified
Target Milestone: ---
Assignee: Raghavendra Bhat
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 855326 855787
Reported: 2012-12-07 09:26 UTC by Raghavendra Bhat
Modified: 2014-07-11 16:16 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-07-11 16:16:48 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
Attached core (448.00 KB, application/x-xz)
2013-01-09 07:21 UTC, Sachidananda Urs

Description Raghavendra Bhat 2012-12-07 09:26:43 UTC
Description of problem:

On a graph change, fuse migrates the open fds to the new graph through the function fuse_migrate_fd, which does the following:

        ret = fuse_migrate_fd_open (this, basefd, oldfd, old_subvol,
                                    new_subvol, &newfdptr);
        if (ret < 0) {
                gf_log (this->name, GF_LOG_WARNING, "open corresponding to "
                        "basefd (ptr:%p inode-gfid:%s) in new graph failed "
                        "(old-subvolume:%s-%d new-subvolume:%s-%d)", basefd,
                        uuid_utoa (basefd->inode->gfid), old_subvol->name,
                        old_subvol->graph->id, new_subvol->name,
                        new_subvol->graph->id);
                goto out;
        }

        ret = fuse_migrate_locks (this, oldfd, newfdptr, old_subvol,
                                  new_subvol);
        if (ret < 0) {
                gf_log (this->name, GF_LOG_WARNING,
                        "migrating locks from old-subvolume (%s-%d) to "
                        "new-subvolume (%s-%d) failed (inode-gfid:%s oldfd:%p "
                        "newfd:%p)", old_subvol->name, old_subvol->graph->id,
                        new_subvol->name, new_subvol->graph->id,


The fd is migrated first, and then its locks are migrated to the new graph. The new fd created in the new graph is returned through newfdptr.

fuse_migrate_fd_open creates a new fd for the new graph and unrefs it at the end, thus destroying the fd as well. When the same fd pointer is later accessed in fuse_migrate_locks, the process can segfault.

This seems to be a race condition.
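
The failure pattern described above can be illustrated with a minimal, self-contained sketch. It is not GlusterFS code: toy_fd_t, toy_fd_create, toy_fd_ref, toy_fd_unref and the two toy_migrate_open_* functions are hypothetical names invented for this example, and the "fixed" variant only shows the usual convention of giving the caller its own reference. It is not a description of the patch under review.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted fd; NOT the GlusterFS fd_t. */
typedef struct toy_fd {
        int  refcount;
        int *destroyed;   /* set to 1 when the object is freed */
} toy_fd_t;

/* Create an fd holding one reference, owned by the creator. */
toy_fd_t *toy_fd_create (int *destroyed)
{
        toy_fd_t *fd = calloc (1, sizeof (*fd));
        if (fd == NULL)
                return NULL;
        fd->refcount = 1;
        fd->destroyed = destroyed;
        return fd;
}

toy_fd_t *toy_fd_ref (toy_fd_t *fd)
{
        fd->refcount++;
        return fd;
}

/* Drop one reference; the last unref frees the object. */
void toy_fd_unref (toy_fd_t *fd)
{
        assert (fd != NULL && fd->refcount > 0);
        if (--fd->refcount == 0) {
                *fd->destroyed = 1;
                free (fd);
        }
}

/* Buggy pattern: the pointer is handed out, then the only ref is dropped,
 * so *out is already dangling when the function returns. */
int toy_migrate_open_buggy (toy_fd_t **out, int *destroyed)
{
        toy_fd_t *fd = toy_fd_create (destroyed);
        if (fd == NULL)
                return -1;
        *out = fd;
        toy_fd_unref (fd);   /* extra unref: destroys the fd */
        return 0;
}

/* Conventional pattern: the caller receives its own reference, so dropping
 * the local one leaves the fd alive until the caller unrefs it. */
int toy_migrate_open_fixed (toy_fd_t **out, int *destroyed)
{
        toy_fd_t *fd = toy_fd_create (destroyed);
        if (fd == NULL)
                return -1;
        *out = toy_fd_ref (fd);
        toy_fd_unref (fd);   /* drops only the local reference */
        return 0;
}
```

After toy_migrate_open_buggy returns, any later dereference of the returned pointer (the analogue of fuse_migrate_locks using newfdptr) reads freed memory, which is undefined behaviour and may or may not crash depending on timing and allocator state.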

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Raghavendra Bhat 2012-12-07 10:17:07 UTC
To reproduce, attach gdb to the fuse client process and set a breakpoint on fuse_migrate_fd. Start running dbench on the mount point and perform graph changes. Once the breakpoint is hit, step through the function one statement at a time. The process crashes after some time.

Comment 2 Amar Tumballi 2012-12-11 05:55:26 UTC
http://review.gluster.org/4282 is out for review

Comment 3 Sachidananda Urs 2013-01-09 07:20:22 UTC
(gdb) bt
#0  uuid_unpack (in=0xaaaaaab2 <Address 0xaaaaaab2 out of bounds>, uu=0x7f3109ffadd0) at ../../contrib/uuid/unpack.c:44
#1  0x0000003a39842dd6 in uuid_unparse_x (uu=<value optimized out>, out=0x7f30ec001310 "\b\001", 
    fmt=0x3a39861da8 "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at ../../contrib/uuid/unparse.c:55
#2  0x0000003a39823ff7 in uuid_utoa (uuid=0xaaaaaab2 <Address 0xaaaaaab2 out of bounds>) at common-utils.c:1836
#3  0x00007f3108f2ac99 in fuse_migrate_locks (this=0x22d4a70, oldfd=0x23ff684, newfd=0x361fe1c, old_subvol=0x2301df0, 
    new_subvol=<value optimized out>) at fuse-bridge.c:3856
#4  0x00007f3108f2dd90 in fuse_migrate_fd (this=0x22d4a70, basefd=0x23ff684, old_subvol=0x2301df0, new_subvol=0x30475e0)
    at fuse-bridge.c:3966
#5  0x00007f3108f2e1fc in fuse_handle_opened_fds (this=0x22d4a70, old_subvol=0x2301df0, new_subvol=0x30475e0)
    at fuse-bridge.c:4017
#6  0x00007f3108f2e2a9 in fuse_graph_switch_task (data=<value optimized out>) at fuse-bridge.c:4068
#7  0x0000003a398434d2 in synctask_wrap (old_task=<value optimized out>) at syncop.c:129
#8  0x000000341b443610 in ?? () from /lib64/libc.so.6
#9  0x0000000000000000 in ?? ()

Comment 4 Sachidananda Urs 2013-01-09 07:21:14 UTC
Created attachment 675288
Attached core

Comment 5 Vijay Bellur 2013-01-31 07:40:51 UTC
CHANGE: http://review.gluster.org/4282 (libglusterfs/syncop: do not hold ref on the fd in cbk) merged in master by Anand Avati (avati)

