Description of problem:

Upon a graph change, fuse migrates the open fds to the new graph through the function fuse_migrate_fd, which does the following:

        ret = fuse_migrate_fd_open (this, basefd, oldfd, old_subvol,
                                    new_subvol, &newfdptr);
        if (ret < 0) {
                gf_log (this->name, GF_LOG_WARNING,
                        "open corresponding to "
                        "basefd (ptr:%p inode-gfid:%s) in new graph failed "
                        "(old-subvolume:%s-%d new-subvolume:%s-%d)",
                        basefd, uuid_utoa (basefd->inode->gfid),
                        old_subvol->name, old_subvol->graph->id,
                        new_subvol->name, new_subvol->graph->id);
                goto out;
        }

        ret = fuse_migrate_locks (this, oldfd, newfdptr, old_subvol,
                                  new_subvol);
        if (ret < 0) {
                gf_log (this->name, GF_LOG_WARNING,
                        "migrating locks from old-subvolume (%s-%d) to "
                        "new-subvolume (%s-%d) failed (inode-gfid:%s oldfd:%p "
                        "newfd:%p)",
                        old_subvol->name, old_subvol->graph->id,
                        new_subvol->name, new_subvol->graph->id,

The fd is migrated first, and then its locks are migrated to the new graph. The new fd created in the new graph is returned in newfdptr. fuse_migrate_fd_open creates the new fd for the new graph and unrefs it at the end, which destroys the fd as well. When fuse_migrate_locks later accesses the same fd pointer, the process segfaults. This seems to be a race condition.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
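The ownership problem described above can be illustrated with a small standalone sketch. This is not the GlusterFS code and not the change under review: toy_fd, toy_fd_ref, toy_fd_unref, toy_open_buggy and toy_open_safe are hypothetical names made up for the illustration, and "destroying" the object only sets a flag so that the state can be inspected afterwards. The sketch assumes the simplest form of the bug, where the opener hands a pointer to its caller and then drops the only reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a reference-counted fd. Hypothetical; not GlusterFS's fd_t. */
typedef struct toy_fd {
        int refcount;
        int destroyed; /* set instead of free(), so callers can inspect the state */
} toy_fd;

static toy_fd *
toy_fd_create (void)
{
        toy_fd *fd = calloc (1, sizeof (*fd));

        if (fd)
                fd->refcount = 1; /* the creator owns the first reference */
        return fd;
}

static toy_fd *
toy_fd_ref (toy_fd *fd)
{
        fd->refcount++;
        return fd;
}

static void
toy_fd_unref (toy_fd *fd)
{
        assert (fd->refcount > 0);
        if (--fd->refcount == 0)
                fd->destroyed = 1; /* real code would free the object here */
}

/* Buggy shape: hands the pointer to the caller, then drops the only ref. */
static int
toy_open_buggy (toy_fd **out)
{
        toy_fd *fd = toy_fd_create ();

        if (!fd)
                return -1;
        *out = fd;
        toy_fd_unref (fd); /* refcount reaches 0: *out now points at a dead fd */
        return 0;
}

/* Safer shape: take a reference for the caller before dropping the local one. */
static int
toy_open_safe (toy_fd **out)
{
        toy_fd *fd = toy_fd_create ();

        if (!fd)
                return -1;
        *out = toy_fd_ref (fd); /* the caller now owns one reference */
        toy_fd_unref (fd);      /* drop the creator's reference */
        return 0;
}
```

With toy_open_buggy the returned pointer already refers to a destroyed object, which corresponds to the state fuse_migrate_locks appears to hit in this report. With toy_open_safe the caller holds exactly one reference and releases it with toy_fd_unref once it is done with the fd.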
To reproduce, attach gdb to the fuse client process and set a breakpoint on fuse_migrate_fd. Start running dbench on the mount point and trigger graph changes. Once the breakpoint is hit, step through the function one statement at a time. The process crashes after some time.
http://review.gluster.org/4282 is out for review
(gdb) bt
#0  uuid_unpack (in=0xaaaaaab2 <Address 0xaaaaaab2 out of bounds>, uu=0x7f3109ffadd0) at ../../contrib/uuid/unpack.c:44
#1  0x0000003a39842dd6 in uuid_unparse_x (uu=<value optimized out>, out=0x7f30ec001310 "\b\001", fmt=0x3a39861da8 "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x") at ../../contrib/uuid/unparse.c:55
#2  0x0000003a39823ff7 in uuid_utoa (uuid=0xaaaaaab2 <Address 0xaaaaaab2 out of bounds>) at common-utils.c:1836
#3  0x00007f3108f2ac99 in fuse_migrate_locks (this=0x22d4a70, oldfd=0x23ff684, newfd=0x361fe1c, old_subvol=0x2301df0, new_subvol=<value optimized out>) at fuse-bridge.c:3856
#4  0x00007f3108f2dd90 in fuse_migrate_fd (this=0x22d4a70, basefd=0x23ff684, old_subvol=0x2301df0, new_subvol=0x30475e0) at fuse-bridge.c:3966
#5  0x00007f3108f2e1fc in fuse_handle_opened_fds (this=0x22d4a70, old_subvol=0x2301df0, new_subvol=0x30475e0) at fuse-bridge.c:4017
#6  0x00007f3108f2e2a9 in fuse_graph_switch_task (data=<value optimized out>) at fuse-bridge.c:4068
#7  0x0000003a398434d2 in synctask_wrap (old_task=<value optimized out>) at syncop.c:129
#8  0x000000341b443610 in ?? () from /lib64/libc.so.6
#9  0x0000000000000000 in ?? ()
Created attachment 675288: Attached core
CHANGE: http://review.gluster.org/4282 (libglusterfs/syncop: do not hold ref on the fd in cbk) merged in master by Anand Avati (avati)