Bug 1306907

Summary: [New] - quarantine folder becomes empty and bitrot status does not list any files which are corrupted
Product: Red Hat Gluster Storage
Reporter: RamaKasturi <knarra>
Component: bitrot
Assignee: Kotresh HR <khiremat>
Status: CLOSED ERRATA
QA Contact: Sweta Anandpara <sanandpa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: asrivast, byarlaga, khiremat, rhinduja, rhs-bugs, sanandpa, storage-qa-internal, vshankar
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.1.3
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.7.9-1
Doc Type: Bug Fix
Doc Text:
During an inode forget operation, files under the quarantine directory are removed. The inode forget operation is called during the unlinking of a file, and also when the inode table's LRU (Least Recently Used) list grows beyond 16k entries. This means that when a corrupted file is not accessed for a long time and the LRU list exceeds 16k entries, the corrupted file is removed from the quarantine directory. As a result, the corrupted file is not shown in the BitRot status output, even though it has not been deleted from the volume itself.
Story Points: ---
Clone Of:
: 1308961
Environment:
Last Closed: 2016-06-23 05:08:00 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1268895, 1299184, 1308961, 1313131, 1313923    
Attachments:
Description    Flags
CLI logs       none
scrub logs     none

Description RamaKasturi 2016-02-12 06:38:18 UTC
Description of problem:
quarantine folder becomes empty and bitrot status does not display anything.

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-19.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a dist-rep volume.
2. Enable bitrot and quota on the volume.
3. Mount the volume using fuse and create 100 1GB files using dd.
4. Corrupt some files from the backend so that the scrubber marks them as bad.
5. Once the scrubber has scrubbed the files, run "gluster vol bitrot <vol_name> scrub status" to see the corrupted files.
6. On the mount point, perform a linux untar and rm -rf <linuxuntar> in a continuous loop.
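
The gluster CLI calls behind steps 2 and 5 can be sketched as below. This is a dry-run helper that only builds the command lines and does not execute them. The function name and the volume name "testvol" are placeholders and do not come from this report.

```python
import shlex


def repro_commands(volume):
    """Build the gluster CLI calls for steps 2 and 5 as argv lists (nothing is run)."""
    return [
        # Step 2: enable bitrot and quota on the volume.
        ["gluster", "volume", "bitrot", volume, "enable"],
        ["gluster", "volume", "quota", volume, "enable"],
        # Step 5: list the files the scrubber has marked as corrupted.
        ["gluster", "volume", "bitrot", volume, "scrub", "status"],
    ]


if __name__ == "__main__":
    # "testvol" is a hypothetical volume name used only for illustration.
    for argv in repro_commands("testvol"):
        print(shlex.join(argv))
```

Each list can be passed to subprocess.run on a node of the trusted storage pool once the volume exists.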

Actual results:
After some time, scrub status does not list the files which were corrupted.

Expected results:
scrub status should always list the files which were corrupted.

Additional info:

Comment 2 RamaKasturi 2016-02-15 07:23:27 UTC
Hi Kotresh,

   Can you please put the RCA for the bug?

Thanks
kasturi

Comment 3 Kotresh HR 2016-02-15 12:28:27 UTC
RCA:

The files under the quarantine directory are removed during inode forget.
Inode forget is called not only during unlink of a file but also when the inode table's LRU size exceeds 16k.

Hence, when a bad file is not accessed for a long time and new files are being created and removed, the pressure on the inode table's LRU list pushes it past 16k. The bad file's inode is then forgotten and its entry is removed from the quarantine directory, so bitrot status fails to show the bad file even though it has not been deleted from the mount.
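
The mechanism in this RCA can be illustrated with a toy model. The sketch below is not GlusterFS code: the class, method names and the tiny LRU limit are invented for illustration. It models an LRU inode table where forget runs both on unlink and on LRU eviction. The second mode, which drops the quarantine entry only on unlink, is an assumed behaviour that matches what comment 9 observes after the fix. It is not taken from the upstream patches.

```python
from collections import OrderedDict


class ToyInodeTable:
    """Toy LRU inode table with a quarantine set. Illustrative only."""

    def __init__(self, lru_limit, forget_removes_quarantine):
        self.lru_limit = lru_limit
        self.forget_removes_quarantine = forget_removes_quarantine
        self.lru = OrderedDict()  # inode name -> None, oldest first
        self.quarantine = set()   # names that have a quarantine entry

    def lookup(self, name):
        # Accessing an inode moves it to the most-recently-used end.
        self.lru[name] = None
        self.lru.move_to_end(name)
        # Once the limit is exceeded, the oldest inodes are forgotten.
        while len(self.lru) > self.lru_limit:
            victim, _ = self.lru.popitem(last=False)
            self._forget(victim, unlinked=False)

    def mark_bad(self, name):
        self.quarantine.add(name)

    def unlink(self, name):
        self.lru.pop(name, None)
        self._forget(name, unlinked=True)

    def _forget(self, name, unlinked):
        # Buggy mode: every forget drops the quarantine entry.
        # Assumed fixed mode: only a real unlink drops it.
        if unlinked or self.forget_removes_quarantine:
            self.quarantine.discard(name)


def bad_file_still_listed(forget_removes_quarantine):
    table = ToyInodeTable(lru_limit=4,
                          forget_removes_quarantine=forget_removes_quarantine)
    table.lookup("bad-file")
    table.mark_bad("bad-file")
    # Mimic an untar followed by rm -rf: create many files, then delete them.
    names = [f"tmp-{i}" for i in range(10)]
    for name in names:
        table.lookup(name)
    for name in names:
        table.unlink(name)
    return "bad-file" in table.quarantine


if __name__ == "__main__":
    print("buggy mode, bad file listed:", bad_file_still_listed(True))
    print("fixed mode, bad file listed:", bad_file_still_listed(False))
```

In the buggy mode the untouched bad file is evicted as soon as the temporary files push the list past its limit, and its quarantine entry disappears although the file was never unlinked.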

Comment 5 Kotresh HR 2016-02-17 05:23:16 UTC
Doc Text looks good.

Comment 7 Kotresh HR 2016-03-15 06:23:29 UTC
Upstream Patches:
http://review.gluster.org/#/c/13552/ (release/3.7)
http://review.gluster.org/#/c/13472/ (master)

Comment 9 Sweta Anandpara 2016-04-19 07:01:08 UTC
Tested and verified this on the build 3.7.9-1

Reduced the network.inode-lru-limit to 50. Created 1000 files, corrupted over 100 files, waited for the scrubber to mark the files as bad, started linux untar and rm -rf simultaneously. Scrub status continued to show the correct output. Deleted about 50 files from the mountpoint, and the changes were correctly reflected in the quarantine folder and hence the scrub output.
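
Lowering the limit as described above makes LRU eviction happen after only a few dozen inodes, which speeds up verification. A minimal sketch of the corresponding CLI call, built as an argv list and not executed, is shown below. The helper name and the volume name are placeholders.

```python
def lru_limit_command(volume, limit=50):
    """Build the 'gluster volume set' call that lowers network.inode-lru-limit."""
    return ["gluster", "volume", "set", volume,
            "network.inode-lru-limit", str(limit)]


if __name__ == "__main__":
    # "testvol" is a hypothetical volume name used only for illustration.
    print(" ".join(lru_limit_command("testvol")))
```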

Moving this BZ to fixed in 3.1.3. Detailed logs are attached.

Comment 10 Sweta Anandpara 2016-04-19 11:43:40 UTC
Created attachment 1148537 [details]
CLI logs

Comment 11 Sweta Anandpara 2016-04-19 11:44:12 UTC
Created attachment 1148538 [details]
scrub logs

Comment 13 errata-xmlrpc 2016-06-23 05:08:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240