Bug 1266841

Summary: Data Tiering:Data Loss:File migrations(flushing of data) to cold tier fails on detach tier with quota limits reached
Product: [Community] GlusterFS Reporter: Nag Pavan Chilakam <nchilaka>
Component: tiering Assignee: Raghavendra G <rgowdapp>
Status: CLOSED EOL QA Contact: bugs <bugs>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 3.7.4 CC: bugs, nchilaka, sankarshan
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1278419 Environment:
Last Closed: 2017-03-08 11:03:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1265623, 1267812    
Bug Blocks: 1260923, 1278419    

Description Nag Pavan Chilakam 2015-09-28 07:30:29 UTC
Description of problem:
======================
When quota limits have been reached and a detach tier is then attempted,
the detach tier completes but fails to flush data to the cold tier.
All the files are listed as failed.



Version-Release number of selected component (if applicable):
==========================================================
[root@zod ~]# rpm -qa|grep gluster
glusterfs-3.7.4-0.43.gitf139283.el7.centos.x86_64
glusterfs-fuse-3.7.4-0.43.gitf139283.el7.centos.x86_64
glusterfs-debuginfo-3.7.4-0.33.git1d02d4b.el7.centos.x86_64
glusterfs-api-3.7.4-0.43.gitf139283.el7.centos.x86_64
glusterfs-client-xlators-3.7.4-0.43.gitf139283.el7.centos.x86_64
glusterfs-server-3.7.4-0.43.gitf139283.el7.centos.x86_64
glusterfs-cli-3.7.4-0.43.gitf139283.el7.centos.x86_64
glusterfs-libs-3.7.4-0.43.gitf139283.el7.centos.x86_64
[root@zod ~]# gluster --version
glusterfs 3.7.4 built on Sep 19 2015 01:30:43
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@zod ~]# 


Steps to Reproduce:
==================
1. Create a tier volume.
2. Enable quota and set the hard limit of the volume root (/) to, say, 10GB.
3. Enable CTR and set the demote frequency to, say, 1500 sec.
4. Fill the volume up to the hard limit.
5. Issue a detach tier start.

The detach tier completes, but it has effectively failed: every file fails to be flushed to the cold tier. Only link files are created on the cold bricks; no actual data movement happens.
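The reproduction steps can be sketched as a small script. This is only an illustration under stated assumptions: the volume name `angola` and the mount point `/mnt/angola` are guesses based on the brick paths, the file count is approximate, and the script assumes the tiered volume already exists. By default it only prints the commands instead of running them.

```shell
#!/bin/bash
# Sketch of the reproduction; assumes a tiered volume has already been created.
VOL="${VOL:-angola}"        # hypothetical volume name
MNT="${MNT:-/mnt/angola}"   # hypothetical FUSE mount point
NFILES="${NFILES:-100}"     # assumption: ~100 files of 100 MB reach a 10GB limit
RUN="${RUN-echo}"           # default: print commands only; set RUN= to execute

repro_steps() {
    local i
    # Enable quota and set a hard limit on the volume root.
    $RUN gluster volume quota "$VOL" enable
    $RUN gluster volume quota "$VOL" limit-usage / 10GB
    # Enable CTR and set the demote frequency (seconds).
    $RUN gluster volume set "$VOL" features.ctr-enabled on
    $RUN gluster volume set "$VOL" cluster.tier-demote-frequency 1500
    # Write 100 MB files until the hard limit is reached.
    for i in $(seq 1 "$NFILES"); do
        $RUN dd if=/dev/zero of="$MNT/file.$i" bs=1000000 count=100
    done
    # Start the detach and look at the status before any commit.
    $RUN gluster volume detach-tier "$VOL" start
    $RUN gluster volume detach-tier "$VOL" status
}
```

Calling `repro_steps` as-is prints the command list for review; running the script with `RUN=` set to an empty value would execute the commands against a real cluster.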





Eg:
Cold brick:
/rhs/brick3/angola:
total 8
---------T. 2 root root    0 Sep 28 12:47 file.11
---------T. 2 root root    0 Sep 28 12:47 file.12
---------T. 2 root root    0 Sep 28 12:47 file.13
---------T. 2 root root    0 Sep 28 12:47 file.15
---------T. 2 root root    0 Sep 28 12:47 file.32
---------T. 2 root root    0 Sep 28 12:47 file.33
---------T. 2 root root    0 Sep 28 12:47 file.35
---------T. 2 root root    0 Sep 28 12:47 file.42
---------T. 2 root root    0 Sep 28 12:47 file.44
---------T. 2 root root    0 Sep 28 12:47 file.45
---------T. 2 root root    0 Sep 28 12:47 file.54
---------T. 2 root root    0 Sep 28 12:47 file.62
---------T. 2 root root    0 Sep 28 12:47 file.63
drwxr-xr-x. 2 root root 8192 Sep 28 12:46 hotdir1


hot brick:
/rhs/brick7/angola_hot:
total 1269588
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:40 file.31
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:41 file.38
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:41 file.41
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:41 file.43
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:41 file.47
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:41 file.49
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:41 file.52
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:41 file.57
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:44 file.6
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:41 file.64
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:41 file.66
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:42 file.68
-rw-r-Sr-T. 2 root root 100000000 Sep 28 12:44 file.8
drwxr-xr-x. 2 root root      8192 Sep 28 12:37 hotdir1
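The two brick listings above can be summarized mechanically. The sketch below is a hypothetical helper (`classify_listing` is not part of GlusterFS) that reads `ls -l` output and counts zero-byte entries whose mode is `---------T` (treated here as link files, matching the description above) against regular files that actually hold data. Interpreting the mode bits this way is an assumption, not something confirmed in this report.

```shell
#!/bin/bash
# Hypothetical helper: summarize an `ls -l` brick listing read from stdin.
classify_listing() {
    awk '
        # Zero-byte entry with only the sticky bit set: counted as a link file (assumption).
        substr($1, 1, 10) == "---------T" && $5 + 0 == 0 { link++; next }
        # Any other regular file with a non-zero size: counted as real data.
        substr($1, 1, 1) == "-" && $5 + 0 > 0 { data++; bytes += $5 }
        END { printf "linkfiles=%d datafiles=%d bytes=%d\n", link, data, bytes }
    '
}
```

For example, `ls -l /rhs/brick3/angola | classify_listing` on a cold brick and the same on a hot brick would show whether any data actually reached the cold tier.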






Mount point, after commit of detach tier:
========================================
[root@localhost angola]# du -sh *
0	file.1
0	file.10
0	file.11
0	file.12
0	file.13
0	file.14
0	file.15
0	file.2
0	file.3
0	file.31
0	file.32
0	file.33
0	file.34
0	file.35
0	file.36
0	file.37
0	file.38
0	file.39
0	file.4
0	file.40
0	file.41
0	file.42
0	file.43
0	file.44
0	file.45
0	file.46
0	file.47
0	file.48
0	file.49
0	file.5
0	file.50
0	file.51
0	file.52
0	file.53
0	file.54
0	file.55
0	file.56
0	file.57
0	file.58
0	file.59
0	file.6
0	file.60
0	file.61
0	file.62
0	file.63
0	file.64
0	file.65
0	file.66
0	file.67
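A quick way to quantify the symptom above is to count empty files on the mount point. This is a minimal sketch with a hypothetical helper name (`count_empty_files`); note that it checks the apparent file size, whereas `du` reports disk usage, so it assumes the lost files also appear as zero-length.

```shell
#!/bin/bash
# Hypothetical helper: count zero-length regular files directly under a directory.
count_empty_files() {
    local dir="$1" total empty
    total=$(find "$dir" -maxdepth 1 -type f | wc -l)
    empty=$(find "$dir" -maxdepth 1 -type f -size 0 | wc -l)
    # Arithmetic expansion strips any padding that wc may add.
    echo "empty=$((empty)) total=$((total))"
}
```

Running `count_empty_files` against the mount directory before and after the detach tier commit would show how many files lost their data.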

Comment 1 Nag Pavan Chilakam 2015-09-28 07:46:18 UTC
sosreports.eng.blr.redhat.com:/home/repo/sosreports/bug.1266841

Comment 2 Vijaikumar Mallikarjuna 2015-09-28 09:55:29 UTC
Hi Nag Pavan,

What is the rebalance status? Does it show any failure counts? In this case it is expected that files that failed to migrate can be lost when commit is performed.

Comment 3 Nag Pavan Chilakam 2015-09-28 11:17:19 UTC
Yes, there are failures under rebalance.
But detach tier must simply move files between bricks, and the end user does not care how that is done, so we must not see any failures.
Maybe I should change the title.
In short: "Quotas must not stop data from being moved from the hot to the cold tier during a detach tier."

Comment 4 Vijaikumar Mallikarjuna 2015-10-01 06:29:18 UTC
Patch submitted: http://review.gluster.org/#/c/12266/

Comment 7 Kaushal 2017-03-08 11:03:12 UTC
This bug is being closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.