+++ This bug was initially created as a clone of Bug #1308961 +++
+++ This bug was initially created as a clone of Bug #1306907 +++

Description of problem:
The quarantine folder becomes empty and bitrot scrub status does not display anything.

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-19.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a dist-rep volume.
2. Enable bitrot and quota on the volume.
3. Mount the volume using FUSE and create 100 1GB files using dd.
4. Corrupt some files from the backend so that the scrubber will mark them as bad.
5. Once the scrubber has scrubbed the files, run "gluster vol bitrot <vol_name> scrub status" to see the corrupted files.
6. On the mount point, perform a Linux kernel untar and rm -rf <linuxuntar> in a continuous loop.

Actual results:
After some time, scrub status does not list the files which were corrupted.

Expected results:
Scrub status should always list the files which were corrupted.

Additional info:

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-02-12 01:38:20 EST ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from RamaKasturi on 2016-02-15 02:23:27 EST ---

Hi Kotresh,

Can you please put the RCA for the bug?

Thanks,
kasturi

--- Additional comment from Kotresh HR on 2016-02-15 07:28:27 EST ---

RCA: The files under the quarantine directory are removed during inode forget. inode forget is called not only during the unlink of a file, but also when the inode table's LRU size exceeds 16k.
Hence, when a bad file is not accessed for a long time while new files are continuously created and removed, pressure on the inode table's LRU list causes it to exceed 16k, and the bad file's entry is removed from the quarantine directory. Because of this, bitrot scrub status fails to show the bad file even though it has not been deleted from the mount.

--- Additional comment from Laura Bailey on 2016-02-16 21:52:58 EST ---

This is an upstream bug, removing from the RHGS known issues tracker.

--- Additional comment from Kotresh HR on 2016-02-18 04:31:36 EST ---

Description of problem: quarantine folder becomes empty and bitrot status does not display anything.
--- Additional comment from Vijay Bellur on 2016-02-18 10:15:17 EST ---

REVIEW: http://review.gluster.org/13472 (features/bitrot: do not remove the quarantine handle in forget) posted (#1) for review on master by Raghavendra Bhat (raghavendra)

--- Additional comment from Vijay Bellur on 2016-02-18 14:14:25 EST ---

REVIEW: http://review.gluster.org/13472 (features/bitrot: do not remove the quarantine handle in forget) posted (#2) for review on master by Kotresh HR (khiremat)

--- Additional comment from Vijay Bellur on 2016-02-29 22:19:04 EST ---

COMMIT: http://review.gluster.org/13472 committed in master by Venky Shankar (vshankar)

------

commit 2102010edab355ac9882eea41a46edaca8b9d02c
Author: Raghavendra Bhat <raghavendra>
Date:   Tue Feb 16 20:22:36 2016 -0500

    features/bitrot: do not remove the quarantine handle in forget

    If an object is marked as bad, then an entry corresponding to the bad
    object is created in the .glusterfs/quarantine directory to help scrub
    status. The entry name is the gfid of the corrupted object. The
    quarantine handle is removed in the below 2 cases:

    1) When protocol/server receives the -ve lookup on an entry whose inode
       is there in the inode table (it can happen when the corrupted object
       is deleted directly from the backend for recovery purposes), it sends
       a forget on the inode, and bit-rot-stub removes the quarantine handle
       upon getting the forget. (Refer to commit
       f853ed9c61bf65cb39f859470a8ffe8973818868: http://review.gluster.org/12743)

    2) When bit-rot-stub itself realizes that lookup on a corrupted object
       has failed with ENOENT.

    But with step 1, there is a problem when bit-rot-stub receives a forget
    due to the LRU limit being exceeded in the inode table. In such cases,
    though the corrupted object is not deleted (either from the mount point
    or from the backend), the handle in the quarantine directory is removed,
    and that object is not shown in the bad objects list in the scrub status
    command. So it is better to follow only the 2nd step (i.e. bit-rot-stub
    removing the handle from the quarantine directory on -ve lookups). Also,
    the handle has to be removed when a corrupted object is unlinked from
    the mount point itself.

    Change-Id: Ibc3bbaf4bc8a5f8986085e87b729ab912cbf8cf9
    BUG: 1308961
    Original author: Raghavendra Bhat <raghavendra>
    Signed-off-by: Kotresh HR <khiremat>
    Reviewed-on: http://review.gluster.org/13472
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Venky Shankar <vshankar>
REVIEW: http://review.gluster.org/13583 (features/bitrot: do not remove the quarantine handle in forget) posted (#1) for review on release-3.7 by Raghavendra Bhat (raghavendra)
http://review.gluster.org/#/c/13552/ fixes this problem in release-3.7.
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS. If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.