Bug 1063615

Summary:	DHT: REBALANCE- Lock semantics are not preserved after rebalance migration
Product:	[Red Hat Storage] Red Hat Gluster Storage	Reporter:	shylesh <shmohan>
Component:	distribute	Assignee:	Nithya Balachandran <nbalacha>
Status:	CLOSED DEFERRED	QA Contact:	storage-qa-internal <storage-qa-internal>
Severity:	high	Docs Contact:
Priority:	high
Version:	2.1	CC:	nbalacha, nlevinki, rgowdapp, spalai, vbellur
Target Milestone:	---
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:	dht-data-loss
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:
Clones:	1286061		Environment:
Last Closed:	2015-11-27 10:28:24 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1286061

Description shylesh 2014-02-11 06:57:46 UTC

Description of problem:
If a file is exclusively locked and is then migrated as part of a rebalance, the lock semantics are not preserved on the file after migration.

Version-Release number of selected component (if applicable):
3.4.0.59rhs-1.el6rhs.x86_64

How reproducible:
always 

Steps to Reproduce:
1. Create a distributed-replicate volume in a 6x2 configuration.
2. Create a file on the mount point.
3. Write a sample C program that takes a write lock on the file:
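        struct flock fl = {0};          /* fd must be open for writing for F_WRLCK */

        fl.l_type = F_WRLCK;            /* exclusive (write) lock */
        fl.l_start = 0;
        fl.l_whence = SEEK_SET;
        fl.l_len = 0;                   /* 0 = lock through end of file */
        fl.l_pid = getpid();            /* only meaningful for F_GETLK; kept from the original */

        int ret = fcntl (fd, F_SETLKW, &fl);    /* block until the lock is granted */
        if (ret == -1)  {
                fprintf (stderr, "%s\n", strerror(errno));
                exit (1);
        }
        while (1) {
                pause ();               /* sleep instead of spinning */
        }   /* stay here; the lock is not released */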
  
4. Open another terminal, go to the same mount point, and run the same program again to request the write lock. This process blocks in fcntl() because of F_SETLKW.

5. Terminate the process started in step 4.

6. Run add-brick and rebalance, and make sure that the file was migrated.

7. Repeat step 4 from a different mount point.



Actual results:
In step 7 the lock is acquired successfully, even though the lock taken in step 3 has not been released.

Expected results:
At step 7 the process should block until the first lock is released.

Additional info:

I verified with write-behind off and add-brick to check whether the graph change itself invalidates the locks. A graph change preserves the lock semantics, but they are lost after migration.

Comment 2 Raghavendra G 2014-02-11 11:48:55 UTC

This is existing behaviour. The rebalance process does not migrate locks. Locks are migrated only by gfapi or fuse-bridge. Since no operations are performed between steps 6 and 7 from the client holding the lock, the locks are not migrated to the new brick. So, when a new process tries to acquire a lock on the file, which has moved to the new brick, the lock is granted.

Comment 3 Raghavendra G 2014-02-12 05:24:19 UTC

The question here is when we should migrate open fds and locks to the new graph. From a single client's perspective, it is good enough to do it when we get a new operation on the mount point (since until then there is no need). However, when locking from multiple clients is considered (as in this case), it is better to migrate as soon as possible, which is on receiving the CHILD_UP event in gfapi/fuse-bridge. We do need to be aware of races between the /dev/fuse reader thread and the poll thread, since notifications (and hence the proposed migration) are received on the poll thread.

Comment 6 Susant Kumar Palai 2015-11-27 10:28:24 UTC

Cloning this bug in 3.1. To be fixed in a future release.