+++ This bug was initially created as a clone of Bug #1118770 +++

Description of problem:
=======================
Create a directory from a mount point and, while creation is still in
progress (the directory has been created only on the hashed subvolume),
rename that directory from another mount point (the destination directory
does not exist, and source and destination hash to the same subvolume
here), i.e.:

from one mount point:
    mkdir dir1
from another mount point:
    mv dir1 dir2

After both operations have finished:
- different directories (at the same level) carry the same gfid
- sometimes a few files inside those directories are not listed on the
  mount and are not accessible

Version-Release number:
=======================
3.6.0.24-1.el6rhs.x86_64

How reproducible:
=================
always

Steps to Reproduce:
===================
1. Create and mount a distributed volume (mount it on multiple clients).
2. To reproduce the race, put breakpoints at dht_mkdir_hashed_dir_cbk and
   dht_rename_hashed_dir_cbk (a standalone simulation of this window is
   sketched after this report).
3. From one mount point, execute:

[root@OVM1 race]# mkdir inprogress

bricks:
[root@OVM5 race]# tree /brick*/race/
/brick1/race/
/brick2/race/
└── inprogress
/brick3/race/

1 directory, 0 files

From another mount point:

[root@OVM1 race1]# mv inprogress rename

bricks:
[root@OVM5 race]# tree /brick*/race/
/brick1/race/
└── rename
/brick2/race/
└── inprogress
/brick3/race/
└── inprogress

3 directories, 0 files

4. Now let both operations continue.
5. Verify the data from another mount point and on the bricks.

mount:
[root@OVM5 race]# ls -lR
.:
total 0
drwxr-xr-x 2 root root 18 Jul 10 12:50 rename

./rename:
total 0

[root@OVM5 race]# mkdir inprogress
mkdir: cannot create directory `inprogress': File exists
[root@OVM5 race]# ls -lR
.:
total 0
drwxr-xr-x 2 root root 18 Jul 10 12:50 inprogress
drwxr-xr-x 2 root root 18 Jul 10 12:50 rename

./inprogress:
total 0

./rename:
total 0

bricks (same gfid):
[root@OVM5 race]# getfattr -d -m . /brick3/race/* -e hex
getfattr: Removing leading '/' from absolute path names
# file: brick3/race/inprogress
trusted.gfid=0x5b3c1a8ca4b84f27912880710a165fb7
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9

# file: brick3/race/rename
trusted.gfid=0x5b3c1a8ca4b84f27912880710a165fb7
trusted.glusterfs.dht=0x000000010000000055555555aaaaaaa9

[root@OVM5 race]# tree /brick*/race/
/brick1/race/
├── inprogress
└── rename
/brick2/race/
├── inprogress
└── rename
/brick3/race/
├── inprogress
└── rename

Actual results:
===============
- different directories carry the same gfid
- sometimes files inside those directories are not listed on the mount
  and are not accessible

Expected results:
=================
- no two directories should have the same gfid
- all files inside those directories should be accessible from the mount
  point

If the destination directory already exists (rename1 here), the race
instead leaves the source directory both at the top level and inside the
destination, again with a duplicated gfid:

[root@OVM1 race]# mkdir rename
[root@OVM1 race1]# mv rename rename1

output on mount:
[root@OVM5 race]# ls -lR
.:
total 0
drwxr-xr-x 2 root root 18 Jul 10 15:00 rename
drwxr-xr-x 3 root root 57 Jul 10 15:00 rename1

./rename:
total 0

./rename1:
total 0
drwxr-xr-x 2 root root 18 Jul 10 15:00 rename

./rename1/rename:
total 0

bricks:
[root@OVM5 race]# tree /brick*/race/
/brick1/race/
├── rename
└── rename1
    └── rename
/brick2/race/
├── rename
└── rename1
    └── rename
/brick3/race/
├── rename
└── rename1
    └── rename

9 directories, 0 files

[root@OVM5 race]# getfattr -d -m . -e hex /brick3/race/* -R
getfattr: Removing leading '/' from absolute path names
# file: brick3/race/rename
trusted.gfid=0xac6b95cb620c400d91a55f3ce66ee005
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff

# file: brick3/race/rename1
trusted.gfid=0x9482dd3bf0834596bb74d6ffeffa40d2
trusted.glusterfs.dht=0x00000001000000000000000055555554

# file: brick3/race/rename1/rename
trusted.gfid=0xac6b95cb620c400d91a55f3ce66ee005
trusted.glusterfs.dht=0x0000000100000000aaaaaaaaffffffff
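The window behind this report can also be demonstrated without a volume. The following self-contained C program is a toy model, not GlusterFS code: arrays stand in for the three bricks, the gfid is a plain string, and a sleep() marks the spot where the dht_mkdir_hashed_dir_cbk breakpoint pauses the two-phase mkdir while the rename slips in. Compile with gcc -pthread; the output shows two differently named directories carrying the same gfid, matching the brick listings above.

/* Toy simulation of the mkdir/rename race; all names are illustrative. */
#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>

#define NSUBVOLS 3
#define HASHED   1          /* subvolume that "inprogress" hashes to */

/* simulated bricks: dirs[i] holds the directory name, gfids[i] its gfid */
static char dirs[NSUBVOLS][32];
static char gfids[NSUBVOLS][8];

static void *do_mkdir(void *arg)
{
    (void)arg;
    /* phase 1: create only on the hashed subvolume, with a fresh gfid */
    strcpy(dirs[HASHED], "inprogress");
    strcpy(gfids[HASHED], "gfid-A");
    sleep(1);               /* breakpoint window: the rename runs here */
    /* phase 2 (stands in for the mkdir continuation / lookup selfheal):
     * create the directory on the remaining subvolumes, reusing the
     * gfid originally assigned on the hashed subvolume */
    for (int i = 0; i < NSUBVOLS; i++)
        if (dirs[i][0] == '\0') {
            strcpy(dirs[i], "inprogress");
            strcpy(gfids[i], "gfid-A");
        }
    return NULL;
}

static void *do_rename(void *arg)
{
    (void)arg;
    /* rename the partially created directory wherever it exists so far */
    for (int i = 0; i < NSUBVOLS; i++)
        if (strcmp(dirs[i], "inprogress") == 0)
            strcpy(dirs[i], "rename");
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, do_mkdir, NULL);
    usleep(100 * 1000);     /* let mkdir finish phase 1 first */
    pthread_create(&t2, NULL, do_rename, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* both "inprogress" (recreated by phase 2) and "rename" now carry
     * gfid-A, the duplicate-gfid outcome seen on the bricks */
    for (int i = 0; i < NSUBVOLS; i++)
        printf("subvol %d: %-12s %s\n", i, dirs[i], gfids[i]);
    return 0;
}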
REVIEW: http://review.gluster.org/11880 (dht : locks in rename to avoid layout change by lookup selfheal) posted (#1) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11880 (dht: locks in rename to avoid layout change by lookup selfheal) posted (#2) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11880 (dht : locks in rename to avoid layout change by lookup selfheal) posted (#3) for review on master by Sakshi Bansal (sabansal)
REVIEW: http://review.gluster.org/11880 (dht: lock on subvols to prevent rename and lookup selfheal race) posted (#4) for review on master by Sakshi Bansal
REVIEW: http://review.gluster.org/11880 (dht: lock on subvols to prevent rename and lookup selfheal race) posted (#5) for review on master by Sakshi Bansal
REVIEW: http://review.gluster.org/11880 (dht: lock on subvols to prevent rename and lookup selfheal race) posted (#6) for review on master by Sakshi Bansal
This bug was accidentally moved from POST to MODIFIED due to an error in automation; please contact mmccune with any questions.
REVIEW: http://review.gluster.org/11880 (dht: lock on subvols to prevent rename and lookup selfheal race) posted (#7) for review on master by Sakshi Bansal
COMMIT: http://review.gluster.org/11880 committed in master by Jeff Darcy (jdarcy)
------
commit 6e3b4eae1ae559d16721f765294ab30a270820d0
Author: Sakshi <sabansal>
Date:   Wed Aug 5 16:05:22 2015 +0530

    dht: lock on subvols to prevent rename and lookup selfheal race

    This patch addresses two races while renaming directories:

    1) While renaming src to dst, if a lookup selfheal is triggered it can
       recreate src on those subvols where rename was successful. This
       leads to multiple directories (src and dst) having the same gfid.
       To avoid this we must take locks on all subvols with src.

    2) While renaming, if dst exists and a lookup selfheal is triggered it
       will find anomalies in the dst layout and try to heal the stale
       layout. To avoid this we must take a lock on any one subvol with
       dst.

    Change-Id: I637f637d3241d9065cd5be59a671c7e7ca3eed53
    BUG: 1252244
    Signed-off-by: Sakshi <sabansal>
    Reviewed-on: http://review.gluster.org/11880
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
    CentOS-regression: Gluster Build System <jenkins.com>
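Read as pseudocode, the scheme the commit message describes looks roughly like the sketch below. This is an illustration, not the dht code: pthread mutexes stand in for the cluster-wide inodelks, and every name in it is made up for the example.

/* Sketch of the rename locking order; not the GlusterFS API. */
#include <stdio.h>
#include <pthread.h>

#define NSUBVOLS 3

/* one "namespace lock" per subvolume for each of src and dst */
typedef struct { pthread_mutex_t src_lk; pthread_mutex_t dst_lk; } subvol_t;

static subvol_t subvols[NSUBVOLS] = {
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER },
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER },
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER },
};

static void rename_dir(const char *src, const char *dst, int dst_hashed)
{
    /* 1) lock src on ALL subvolumes so a lookup selfheal cannot
     *    recreate src on subvols where the rename already happened */
    for (int i = 0; i < NSUBVOLS; i++)
        pthread_mutex_lock(&subvols[i].src_lk);

    /* 2) lock dst on any one subvolume (its hashed one here) so a
     *    selfheal cannot rewrite dst's stale layout mid-rename */
    pthread_mutex_lock(&subvols[dst_hashed].dst_lk);

    printf("renaming %s -> %s under locks\n", src, dst);
    /* ... per-subvolume rename fops would be wound here ... */

    pthread_mutex_unlock(&subvols[dst_hashed].dst_lk);
    for (int i = NSUBVOLS - 1; i >= 0; i--)
        pthread_mutex_unlock(&subvols[i].src_lk);
}

int main(void)
{
    rename_dir("inprogress", "rename", 1);
    return 0;
}

Taking the per-subvolume src locks in a fixed order is the sketch's way of keeping two concurrent renames from deadlocking against each other; a selfheal that needs the same locks simply blocks until the rename completes.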
REVIEW: http://review.gluster.org/13988 (quota: setting 'read-only' option in xdata to instruct DHT to not heal) posted (#1) for review on master by Sakshi Bansal
REVIEW: http://review.gluster.org/13988 (quota: setting 'read-only' option in xdata to instruct DHT to not heal) posted (#2) for review on master by Sakshi Bansal
COMMIT: http://review.gluster.org/13988 committed in master by Raghavendra G (rgowdapp)
------
commit abd47f27848c9bb2bf5bc371367c3d41f526ad50
Author: Sakshi Bansal <sabansal>
Date:   Wed Apr 13 16:40:40 2016 +0530

    quota: setting 'read-only' option in xdata to instruct DHT to not heal

    When quota is enabled, the quota enforcer tries to get the size of the
    source directory by sending a nameless lookup to quotad. But if the
    rename has succeeded even on one subvol, or the source layout has
    anomalies, this nameless lookup in quotad tries to heal the directory,
    which requires a lock on as many subvols as it can get. But src is
    already locked as part of the rename. For the rename to proceed on the
    brick it needs to complete the cluster-wide lookup, while the
    cluster-wide lookup in quotad is blocked on the locks held by the
    rename: hence a deadlock. To avoid this, quota sends an option in
    xdata which instructs DHT not to heal.

    Change-Id: I792f9322331def0b1f4e16e88deef55d0c9f17f0
    BUG: 1252244
    Signed-off-by: Sakshi Bansal <sabansal>
    Reviewed-on: http://review.gluster.org/13988
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
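The shape of that handshake can be pictured with the toy model below. It is a simplification, not the quota or dht source: a plain struct stands in for the dict_t xdata the real patch uses, and the read_only field is an illustrative name for the 'read-only' hint the commit message describes.

/* Toy model of the xdata hint that tells DHT to skip selfheal. */
#include <stdio.h>
#include <stdbool.h>

typedef struct { bool read_only; } xdata_t;   /* stand-in for dict_t */

/* DHT side: heal layout anomalies only when the caller allows it */
static void dht_lookup(const char *path, const xdata_t *xdata)
{
    if (xdata && xdata->read_only) {
        printf("lookup %s: anomalies found, healing skipped (read-only)\n",
               path);
        return;        /* no locks taken -> cannot deadlock with rename */
    }
    printf("lookup %s: healing layout (takes locks on subvolumes)\n", path);
}

/* quota side: the size probe during a rename must not trigger a heal */
static void quota_check_size(const char *path)
{
    xdata_t xdata = { .read_only = true };    /* the 'read-only' hint */
    dht_lookup(path, &xdata);
}

int main(void)
{
    quota_check_size("/race/inprogress");     /* during rename: skip heal */
    dht_lookup("/race/inprogress", NULL);     /* normal lookup: may heal */
    return 0;
}

The point is that a lookup tagged this way takes no locks, so the quota size probe can no longer block on the locks the rename already holds.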
REVIEW: http://review.gluster.org/14371 (dht: rename takes lock on parent directory if destination exists) posted (#1) for review on master by Sakshi Bansal
This bug is being closed because a release that should address the reported issue is now available. If the problem is still present in glusterfs-3.8.0, please open a new bug report. glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user