Description of problem:
A file is exclusively locked. If this file migrates as part of a rebalance, the lock semantics are not preserved on the file after migration.

Version-Release number of selected component (if applicable):
3.4.0.59rhs-1.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a dist-rep volume of 6x2 config.
2. Create a file on the mount point.
3. Write a sample C program that takes a write lock on the file and then holds it:

       /* fd is an open descriptor on the file */
       struct flock fl = {0};

       fl.l_type   = F_WRLCK;
       fl.l_start  = 0;
       fl.l_whence = SEEK_SET;
       fl.l_len    = 0;          /* 0 => lock the whole file */
       fl.l_pid    = getpid ();

       int ret = fcntl (fd, F_SETLKW, &fl);
       if (ret == -1) {
               printf ("%s", strerror (errno));
               exit (1);
       }

       while (1) { }             /* stay here; the lock is never released */

4. Open another window, go to the same mount point and run the same program again to take the write lock. This process will block in fcntl because of F_SETLKW.
5. Terminate the process started at step 4.
6. Now add-brick and rebalance; make sure that the above file got migrated.
7. Repeat step 4 from a different mount point.

Actual results:
At step 7 the lock is acquired successfully, even though the lock held in step 3 has not yet been released.

Expected results:
At step 7 the process should wait for the lock until the first lock is released.

Additional info:
I verified with write-behind off and an add-brick to check whether the graph change itself invalidates the locks; graph changes preserve the lock semantics, but they are lost after migration.
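For reference, the snippet from step 3 can be fleshed out into a self-contained program. This is a minimal sketch; the command-line file argument, the pause() loop, and the printed messages are my own additions:

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main (int argc, char *argv[])
    {
            struct flock fl = {0};
            int fd, ret;

            if (argc != 2) {
                    fprintf (stderr, "usage: %s <file>\n", argv[0]);
                    exit (1);
            }

            fd = open (argv[1], O_RDWR);
            if (fd == -1) {
                    fprintf (stderr, "open: %s\n", strerror (errno));
                    exit (1);
            }

            fl.l_type   = F_WRLCK;
            fl.l_start  = 0;
            fl.l_whence = SEEK_SET;
            fl.l_len    = 0;              /* lock the whole file */
            fl.l_pid    = getpid ();

            /* F_SETLKW blocks until the lock can be granted */
            ret = fcntl (fd, F_SETLKW, &fl);
            if (ret == -1) {
                    fprintf (stderr, "fcntl: %s\n", strerror (errno));
                    exit (1);
            }

            printf ("lock acquired by pid %d\n", getpid ());

            while (1)
                    pause ();             /* hold the lock forever */
    }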
This is existing behaviour. The rebalance process doesn't migrate locks; locks are migrated only by gfapi or fuse-bridge. Since no operations are performed from the lock-holding client between steps 6 and 7, the locks are not migrated to the new brick. So, when a new process tries to acquire a lock on the file that has moved to the new brick, the lock is granted.
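One way to confirm this from a second mount point after step 6 is a non-blocking F_GETLK query against the same file. A minimal sketch (the command-line file argument is an assumption):

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main (int argc, char *argv[])
    {
            struct flock fl = {0};
            int fd;

            if (argc != 2) {
                    fprintf (stderr, "usage: %s <file>\n", argv[0]);
                    exit (1);
            }

            fd = open (argv[1], O_RDWR);
            if (fd == -1) {
                    fprintf (stderr, "open: %s\n", strerror (errno));
                    exit (1);
            }

            fl.l_type   = F_WRLCK;
            fl.l_whence = SEEK_SET;

            /* F_GETLK does not block: it reports any conflicting lock */
            if (fcntl (fd, F_GETLK, &fl) == -1) {
                    fprintf (stderr, "fcntl: %s\n", strerror (errno));
                    exit (1);
            }

            if (fl.l_type == F_UNLCK)
                    printf ("no conflicting lock (lock lost in migration)\n");
            else
                    printf ("conflicting write lock reported, pid %d\n",
                            fl.l_pid);
            return 0;
    }

Note that over a network filesystem the l_pid reported by F_GETLK may not correspond to any local process; the interesting part here is only whether a conflicting lock is reported at all.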
The problem here is deciding when we should migrate open fds and locks to the new graph. Viewed from a single client's perspective, it is good enough to do it when we get a new operation on the mount point (until then there is no need). However, when locking from multiple clients is considered (as in this case), it is better to do the migration as soon as possible, which means on receiving the CHILD_UP event in gfapi/fuse-bridge. However, we need to be aware of races between the /dev/fuse reader thread and the poll thread, since notifications (and hence the proposed migration) are received on the poll thread.
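The thread interaction in question might be sketched like this. This is illustrative pthreads code only, not GlusterFS source; poll_thread, reader_thread and migrate_fds_and_locks are hypothetical stand-ins for the poll thread's CHILD_UP handler, the /dev/fuse reader loop, and the fd/lock migration step:

    #include <pthread.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <unistd.h>

    static pthread_mutex_t graph_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  migrated   = PTHREAD_COND_INITIALIZER;
    static bool migration_done = false;

    /* hypothetical stand-in: re-open fds and re-acquire locks
     * on the new graph */
    static void migrate_fds_and_locks (void)
    {
            printf ("migrating fds and locks\n");
    }

    /* Poll thread: CHILD_UP arrives here, so the migration runs here */
    static void *poll_thread (void *arg)
    {
            sleep (1);  /* pretend CHILD_UP arrives a little later */

            pthread_mutex_lock (&graph_lock);
            migrate_fds_and_locks ();
            migration_done = true;
            pthread_cond_broadcast (&migrated);
            pthread_mutex_unlock (&graph_lock);
            return NULL;
    }

    /* /dev/fuse reader thread: must not send fops to the new graph
     * before the migration has completed */
    static void *reader_thread (void *arg)
    {
            pthread_mutex_lock (&graph_lock);
            while (!migration_done)
                    pthread_cond_wait (&migrated, &graph_lock);
            pthread_mutex_unlock (&graph_lock);

            printf ("safe to service fops on the new graph\n");
            return NULL;
    }

    int main (void)
    {
            pthread_t p, r;

            pthread_create (&p, NULL, poll_thread, NULL);
            pthread_create (&r, NULL, reader_thread, NULL);
            pthread_join (p, NULL);
            pthread_join (r, NULL);
            return 0;
    }

The point of the pattern is only the serialization: whichever thread performs the migration signals completion under a lock, and the fop-servicing thread waits on that condition rather than racing ahead.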
Cloning this bug to 3.1. To be fixed in a future release.