Bug 1313923 - [New] - quarantine folder becomes empty and bitrot status does not list any files which are corrupted
Summary: [New] - quarantine folder becomes empty and bitrot status does not list any ...
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: bitrot
Version: 3.7.8
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Raghavendra Bhat
QA Contact:
bugs@gluster.org
URL:
Whiteboard:
Depends On: 1306907 1308961
Blocks: glusterfs-3.7.9 1313131
 
Reported: 2016-03-02 15:43 UTC by Raghavendra Bhat
Modified: 2017-03-08 10:51 UTC (History)

Fixed In Version:
Clone Of: 1308961
Environment:
Last Closed: 2017-03-08 10:51:11 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Raghavendra Bhat 2016-03-02 15:43:26 UTC
+++ This bug was initially created as a clone of Bug #1308961 +++

+++ This bug was initially created as a clone of Bug #1306907 +++

Description of problem:
quarantine folder becomes empty and bitrot status does not display anything.

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-19.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a dist-rep volume.
2. Enable bitrot and quota on the volume.
3. Mount the volume using fuse and create 100 1GB files using dd.
4. Corrupt some files from the backend so that the scrubber will mark them as bad.
5. Once the scrubber has scrubbed the files, run "gluster vol bitrot <vol_name> scrub status" to see the corrupted files.
6. In the mount point, perform a linux untar and rm -rf <linuxuntar> in a continuous loop.

Actual results:
After some time, scrub status does not list the files which were corrupted.

Expected results:
scrub status should always list the files which were corrupted.

Additional info:

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-02-12 01:38:20 EST ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from RamaKasturi on 2016-02-15 02:23:27 EST ---

Hi Kotresh,

   Can you please put the RCA for the bug?

Thanks
kasturi

--- Additional comment from Kotresh HR on 2016-02-15 07:28:27 EST ---

RCA:

The files under the quarantine directory are removed during inode forget.
Inode forget is called not only when a file is unlinked but also when the inode table's LRU list grows beyond its limit of 16k entries.

Hence, when a bad file is not accessed for a long time while new files are being created and removed, the pressure on the inode table's LRU list pushes it past 16k and the bad file's inode is forgotten. Its entry is then removed from the quarantine directory, so bitrot status fails to show the bad file even though it has not been deleted from the mount.
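
The eviction path described in the RCA can be illustrated with a toy model. This is only a sketch under stated assumptions, not GlusterFS code: InodeTable, simulate_buggy_forget and the tiny LRU limit are hypothetical names and values chosen for illustration (the real limit mentioned above is 16k). It shows how a forget callback that removes the quarantine entry loses the bad file once unrelated files churn through the LRU list.

```python
from collections import OrderedDict


class InodeTable:
    """Toy LRU inode table (hypothetical model, not the GlusterFS structure)."""

    def __init__(self, lru_limit, on_forget):
        self.lru_limit = lru_limit
        self.on_forget = on_forget      # called for every evicted inode
        self.lru = OrderedDict()        # oldest entry first

    def access(self, gfid):
        # Touching an inode makes it the most recently used entry.
        self.lru[gfid] = True
        self.lru.move_to_end(gfid)
        # Evict least recently used inodes while over the limit.
        while len(self.lru) > self.lru_limit:
            victim, _ = self.lru.popitem(last=False)
            self.on_forget(victim)


def simulate_buggy_forget(lru_limit=4, churn=10):
    """Return the quarantine contents after unrelated files churn the LRU."""
    quarantine = {"bad-gfid"}           # scrubber already marked this file bad
    # Buggy behaviour: forget removes the quarantine entry unconditionally.
    table = InodeTable(lru_limit, on_forget=quarantine.discard)
    table.access("bad-gfid")
    for i in range(churn):              # stands in for the untar / rm -rf loop
        table.access(f"tmp-{i}")
    return quarantine


if __name__ == "__main__":
    print(simulate_buggy_forget())      # quarantine is empty, file never deleted
```

In this model the bad file is never unlinked, yet its quarantine entry disappears as soon as enough other inodes have been accessed to push it off the LRU list.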

--- Additional comment from Laura Bailey on 2016-02-16 21:52:58 EST ---

This is an upstream bug, removing from the RHGS known issues tracker.

--- Additional comment from Kotresh HR on 2016-02-18 04:31:36 EST ---

Description of problem:
quarantine folder becomes empty and bitrot status does not display anything.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Create a dist-rep volume.
2. Enable bitrot and quota on the volume.
3. Mount the volume using fuse and create 100 1GB files using dd.
4. Corrupt some files from the backend so that the scrubber will mark them as bad.
5. Once the scrubber has scrubbed the files, run "gluster vol bitrot <vol_name> scrub status" to see the corrupted files.
6. In the mount point, perform a linux untar and rm -rf <linuxuntar> in a continuous loop.

Actual results:
After some time, scrub status does not list the files which were corrupted.

Expected results:
scrub status should always list the files which were corrupted.

--- Additional comment from Vijay Bellur on 2016-02-18 10:15:17 EST ---

REVIEW: http://review.gluster.org/13472 (features/bitrot: do not remove the quarantine handle in forget) posted (#1) for review on master by Raghavendra Bhat (raghavendra)

--- Additional comment from Vijay Bellur on 2016-02-18 14:14:25 EST ---

REVIEW: http://review.gluster.org/13472 (features/bitrot: do not remove the quarantine handle in forget) posted (#2) for review on master by Kotresh HR (khiremat)

--- Additional comment from Vijay Bellur on 2016-02-29 22:19:04 EST ---

COMMIT: http://review.gluster.org/13472 committed in master by Venky Shankar (vshankar) 
------
commit 2102010edab355ac9882eea41a46edaca8b9d02c
Author: Raghavendra Bhat <raghavendra>
Date:   Tue Feb 16 20:22:36 2016 -0500

    features/bitrot: do not remove the quarantine handle in forget
    
    If an object is marked as bad, then an entry corresponding to the
    bad object is created in the .glusterfs/quarantine directory to help
    scrub status. The entry name is the gfid of the corrupted object.
    The quarantine handle is removed in the below 2 cases.
    
    1) When protocol/server receives the -ve lookup on an entry whose inode
       is there in the inode table (it can happen when the corrupted object
       is deleted directly from the backend for recovery purpose) it sends a
       forget on the inode and bit-rot-stub removes the quarantine handle
       upon getting the forget.
       (refer to the below commit
       f853ed9c61bf65cb39f859470a8ffe8973818868:
       http://review.gluster.org/12743)
    
    2) When bit-rot-stub itself realizes that lookup on a corrupted object
       has failed with ENOENT.
    
    But with case 1, there is a problem when the bit-rot-stub receives forget
    due to lru limit exceeding in the inode table. In such cases, though the
    corrupted object is not deleted (either from the mount point or from the
    backend), the handle in the quarantine directory is removed and that object
    is not shown in the bad objects list in the scrub status command.
    
    So it is better to follow only case 2 (i.e. bit-rot-stub removing the handle
    from the quarantine directory in -ve lookups). Also the handle has to be removed
    when a corrupted object is unlinked from the mount point itself.
    
    Change-Id: Ibc3bbaf4bc8a5f8986085e87b729ab912cbf8cf9
    BUG: 1308961
    Original author: Raghavendra Bhat <raghavendra>
    Signed-off-by: Kotresh HR <khiremat>
    Reviewed-on: http://review.gluster.org/13472
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Venky Shankar <vshankar>
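
The behaviour change described in the commit message above can be sketched as a small model. This is an illustration under assumptions, not the actual patch: BitrotStubModel, its method names and the remove_on_forget switch are hypothetical. With the switch off (the behaviour the commit describes), forget leaves the quarantine handle alone, and only an unlink from the mount or a lookup failing with ENOENT removes it.

```python
class BitrotStubModel:
    """Toy model of quarantine handle handling (hypothetical, not bit-rot-stub)."""

    def __init__(self, remove_on_forget):
        # True models the old behaviour, False models the behaviour after the fix.
        self.remove_on_forget = remove_on_forget
        self.quarantine = set()

    def mark_bad(self, gfid):
        # Scrubber found corruption: create the quarantine handle.
        self.quarantine.add(gfid)

    def forget(self, gfid):
        # Forget may come from LRU pressure, so the fix ignores it here.
        if self.remove_on_forget:
            self.quarantine.discard(gfid)

    def unlink(self, gfid):
        # Corrupted object deleted from the mount point.
        self.quarantine.discard(gfid)

    def lookup_enoent(self, gfid):
        # Lookup failed with ENOENT: object was removed from the backend.
        self.quarantine.discard(gfid)

    def scrub_status(self):
        return sorted(self.quarantine)


def run_scenario(remove_on_forget):
    """Return scrub status after a forget, then after real deletions."""
    stub = BitrotStubModel(remove_on_forget)
    stub.mark_bad("gfid-a")
    stub.mark_bad("gfid-b")
    stub.forget("gfid-a")               # e.g. LRU limit exceeded
    after_forget = stub.scrub_status()
    stub.unlink("gfid-a")
    stub.lookup_enoent("gfid-b")
    after_delete = stub.scrub_status()
    return after_forget, after_delete


if __name__ == "__main__":
    print(run_scenario(remove_on_forget=False))
    print(run_scenario(remove_on_forget=True))
```

Comparing the two runs shows the difference: only the old behaviour drops gfid-a from scrub status after a forget, while both behaviours clear the handles once the objects are really gone.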

Comment 1 Vijay Bellur 2016-03-02 15:54:51 UTC
REVIEW: http://review.gluster.org/13583 (features/bitrot: do not remove the quarantine handle in forget) posted (#1) for review on release-3.7 by Raghavendra Bhat (raghavendra)

Comment 2 Vijay Bellur 2016-03-10 03:00:18 UTC
http://review.gluster.org/#/c/13552/ fixes this problem in release-3.7.

Comment 3 Kaushal 2017-03-08 10:51:11 UTC
This bug is getting closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.

