Description of problem:
=======================
Take a snapshot when a directory has been created only on the hashed sub-volume and not on the others. On restoring that snapshot, the directory is not listed on the mount point. A lookup on the parent directory does not heal the directory on the non-hashed sub-volumes.

Version-Release number of selected component (if applicable):
=============================================================
3.5qa2-0.340.gitc193996.el6_5.x86_64

How reproducible:
================
Always

Steps to Reproduce:
1. Create a distributed volume, start it and FUSE mount it.
2. Create a directory from the mount point and make sure a snapshot of the volume is taken when the directory has been created only on the hashed sub-volume and not on any other.
3. Stop the volume and restore the snapshot.
4. Mount the volume again and list the contents of the parent directory. The newly created directory is not listed:

[root@rhs-client18 new]# ls -l
total 0

5. Verify the directory on the backend. The directory is present only on the hashed sub-volume, not on the other sub-volumes:

server1:
[root@rhs-client18 new]# ls -l
total 0

server2:
[root@OVM5 brick3]# ls -l
total 0
drwxr-xr-x 2 root root 6 Apr 24 15:01 dir1

server3:
[root@rhs-client18 new]# ls -l
total 0

Actual results:
===============
DHT self-heal does not heal the directory entry and the directory is not visible on the mount point.

Expected results:
=================
A lookup on the mount point should heal the directory entry on all up sub-volumes.

Additional info:
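For reference, a minimal command sketch of the reproduction flow. The volume name testvol, servers server1/server2/server3, brick path /rhs/brick1 and mount point /mnt/testvol are illustrative (not from the original report), and hitting the window in step 2 where mkdir has completed only on the hashed sub-volume is timing dependent:

# Step 1: create, start and FUSE mount a plain distributed volume
gluster volume create testvol server1:/rhs/brick1 server2:/rhs/brick1 server3:/rhs/brick1
gluster volume start testvol
mount -t glusterfs server1:/testvol /mnt/testvol

# Step 2: create the directory; the snapshot must land while the mkdir has
# completed only on the hashed sub-volume (timing dependent)
mkdir /mnt/testvol/new
mkdir /mnt/testvol/new/dir1 &
gluster snapshot create snap1 testvol

# Steps 3-4: stop the volume, restore the snapshot and remount
gluster volume stop testvol
gluster snapshot restore snap1
gluster volume start testvol
mount -t glusterfs server1:/testvol /mnt/testvol
ls -l /mnt/testvol/new        # dir1 is expected here but is not listed

# Step 5: check the backend bricks directly on each server
ls -l /rhs/brick1/new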
Able to reproduce without taking a snapshot. Steps (see the sketch below):
1) Send directory creation from one mount point.
2) When the directory has been created on the hashed sub-volume but not yet on the other (non-hashed) sub-volumes, send a lookup from another mount point --> the lookup does not heal the directory on the other sub-volumes.
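A rough sketch of that two-client variant, assuming the same illustrative testvol volume mounted on two clients at /mnt/c1 and /mnt/c2; actually catching the window between the mkdir on the hashed sub-volume and the mkdir on the remaining sub-volumes is timing dependent and is not automated here:

# Client 1: create the directory
mkdir /mnt/c1/new/dir1

# Client 2: while the mkdir is still pending on the non-hashed sub-volumes,
# look up the parent directory from the second mount
ls -l /mnt/c2/new      # dir1 is missing and is not healed by this lookup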
*** Bug 1092501 has been marked as a duplicate of this bug. ***
Upstream patch : http://review.gluster.org/#/c/7599/
https://code.engineering.redhat.com/gerrit/#/c/26144/
Verified with build 3.6.0.20-1.el6rhs.x86_64; works as per expectation, hence moving to VERIFIED.
Hi Susant, Please review the edited doc text for technical accuracy and sign off.
I was able to hit this issue on build glusterfs-3.6.0.25-1.
1. Create and start distributed-replicate master and slave volumes, and create a geo-replication relationship between master and slave (geo-rep appears to have nothing to do with this issue).
2. Create a directory from the mount point on the master and make sure a snapshot of the volume is taken when the directory has been created on only one of the sub-volumes and not on any other.
3. Restore the snapshot (following the steps to restore a snapshot).
4. Mount the volume again and list the contents of the parent directory. The newly created directory is not listed.
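A compressed sketch of that flow. The volume names master and slave, the slave host slavehost and the mount point /mnt/master are illustrative, and the race window in step 2 is still timing dependent:

# Step 1: geo-rep relationship between the existing master and slave volumes
gluster volume geo-replication master slavehost::slave create push-pem
gluster volume geo-replication master slavehost::slave start

# Step 2: directory creation on the master plus a snapshot taken in the race window
mkdir /mnt/master/new/dir1 &
gluster snapshot create snap1 master

# Step 3: restore the snapshot (stop the geo-rep session and the volume first)
gluster volume geo-replication master slavehost::slave stop
gluster volume stop master
gluster snapshot restore snap1
gluster volume start master

# Step 4: remount and list the parent directory
mount -t glusterfs localhost:/master /mnt/master
ls -l /mnt/master/new        # dir1 is not listed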
I was also able to hit this during snapshot restore of the geo-rep master and slave. A directory in the restored master volume was present on only one brick and not on the others. The gluster mount could not see the directory. A lookup on the parent directory did not heal it, but a lookup on the missing directory itself did heal it.
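As an illustration of that difference, assuming the restored master volume is mounted at /mnt/master and the missing directory is new/dir1 (paths are illustrative):

# Lookup on the parent directory: does NOT trigger the heal
ls -l /mnt/master/new          # dir1 is not listed

# Named lookup on the missing directory itself: triggers DHT directory self-heal
stat /mnt/master/new/dir1
ls -l /mnt/master/new          # dir1 now appears, healed on the other sub-volumes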
Root cause of the issue:
Due to the race between the snapshot and the mkdir, the snapshot captured the directory entry from a non-hashed sub-volume and not from the hashed sub-volume. As the dht_readdirp fop depends on the entry being present on the hashed sub-volume in order to filter it, the entry was neither shown on the mount point nor healed on the master volume. [Hence, not a regression.]

Relation to geo-rep:
The changelogs are captured at the brick level on the master, and since one of the bricks has the directory entry, geo-rep syncs the directory on to the slave. The resulting inconsistency is that the slave has an entry which the master does not have from the gluster mount point.

Closing this bug, as it is not a regression; a new bug needs to be created to track the geo-rep inconsistencies.
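One way to confirm that the captured entry really lives on a non-hashed sub-volume is to check which brick actually holds the entry and dump the DHT layout xattr of the parent directory on each brick; a sketch, assuming bricks under /rhs/brick1/new on each server (paths are illustrative):

# On each server: does this brick hold the entry?
ls -l /rhs/brick1/new

# On each server: dump the parent directory's DHT layout ranges; the name dir1
# hashes into exactly one of these ranges, which identifies the hashed sub-volume
getfattr -n trusted.glusterfs.dht -e hex /rhs/brick1/new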
Moving this to ON_QA. If the snapshot has captured the entry from the mkdir on the hashed sub-volume and the entry still does not get listed on the mount point, reopen this bug.
The issue mentioned in comment 17 is being tracked in Bug 1128155.
Verified with 3.6.0.28-1.el6rhs.x86_64; working as expected, hence moving to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2014-1278.html