Bug 1277368

Summary: Bit rot version and signature for the files on a tiered volume are missing after a few promotions and demotions of the files.
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: RamaKasturi <knarra>
Component: tier Assignee: Raghavendra Bhat <rabhat>
Status: CLOSED ERRATA QA Contact: RamaKasturi <knarra>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.1 CC: dlambrig, josferna, khiremat, nbalacha, nchilaka, rcyriac, rhs-bugs, sankarshan, storage-qa-internal, vshankar
Target Milestone: --- Keywords: ZStream
Target Release: RHGS 3.1.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.7.5-7 Doc Type: Bug Fix
Last Closed: 2016-03-01 05:50:40 UTC Type: Bug
Bug Blocks: 1260783, 1260923    

Description RamaKasturi 2015-11-03 07:51:58 UTC
Description of problem:
A tiered volume has bit rot detection enabled. Whenever a file is signed by the bitd daemon, it gets promoted and demoted, and this repeats in a continuous loop. After some time, I see that files in the cold tier do not have the bit rot version and signature.

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-5.el6rhs.x86_64

How reproducible:


Steps to Reproduce:
1. Create a tiered volume with an EC cold tier and a replicate hot tier.
2. Enable bit rot on the volume.
3. Set the promote and demote frequencies to 360.
4. FUSE-mount the volume and create some files.
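
Steps 2 and 3 can be sketched as a small command builder. This is only an illustration: the volume name is a placeholder, the option names cluster.tier-promote-frequency and cluster.tier-demote-frequency are assumed to be the ones meant by "promote and demote frequency", and creating the tiered volume and mounting it are left out.

```python
def repro_commands(volname, frequency=360):
    """Build the gluster CLI invocations for steps 2 and 3 (not executed here)."""
    return [
        # Step 2: enable bit rot detection on the volume.
        ["gluster", "volume", "bitrot", volname, "enable"],
        # Step 3: same value for promotion and demotion (option names assumed).
        ["gluster", "volume", "set", volname,
         "cluster.tier-promote-frequency", str(frequency)],
        ["gluster", "volume", "set", volname,
         "cluster.tier-demote-frequency", str(frequency)],
    ]


if __name__ == "__main__":
    # "test" is a placeholder volume name.
    for cmd in repro_commands("test"):
        print(" ".join(cmd))
```

Printing the commands instead of running them keeps the sketch safe to try on a machine without gluster installed.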

Actual results:
Promotions and demotions of the files happen in a continuous loop and, after some time, files in the cold tier do not have the bit rot version and signature.

Expected results:
Bit rot version and signature should not be missing from files.

Additional info:

Comment 2 RamaKasturi 2015-11-03 08:51:04 UTC
SOS Reports are present at the link below:

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1277368/

Comment 5 Joseph Elwin Fernandes 2015-11-11 12:41:23 UTC
After some investigation, we (Kasturi and I) made the following observations:
1) bitd calculates a signature for the linkto file on the hot tier and also remembers the bit-rot.version the file had before it was migrated to the cold tier.
2) When a file gets migrated to a tier, immediate write I/Os are considered part of the migration, and there is no version change until the version timeout.
(Point 2 is not a problem, just an observation.)

Proof:
=====

Setup :
=======
[root@fedora1 test]# gluster volume info
 
Volume Name: test
Type: Tier
Volume ID: 888f73b8-b5bc-4f0f-91ba-bf8dd39884d5
Status: Started
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: fedora1:/home/ssd/small_brick3/s3
Brick2: fedora1:/home/ssd/small_brick2/s2
Brick3: fedora1:/home/ssd/small_brick1/s1
Brick4: fedora1:/home/ssd/small_brick0/s0
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick5: fedora1:/home/disk/d1
Brick6: fedora1:/home/disk/d2
Brick7: fedora1:/home/disk/d3
Brick8: fedora1:/home/disk/d4
Options Reconfigured:
features.scrub: Active
features.bitrot: on
features.record-counters: on
features.ctr-enabled: on
performance.readdir-ahead: on
[root@fedora1 test]#


Create a file called "file1"

[root@fedora1 test]# echo "hello world" > file1

This is how the bricks look after the creation of the file

[root@fedora1 test]# ls -l /home/disk/d* /home/ssd/small_brick*/s*
/home/disk/d1:
total 0

/home/disk/d2:
total 0

/home/disk/d3:
total 0

/home/disk/d4:
total 0

/home/ssd/small_brick0/s0:
total 8
-rw-r--r-- 2 root root 12 Nov 11 14:46 file1

/home/ssd/small_brick1/s1:
total 8
-rw-r--r-- 2 root root 12 Nov 11 14:46 file1

/home/ssd/small_brick2/s2:
total 0

/home/ssd/small_brick3/s3:
total 0
[root@fedora1 test]# 

and this is the bit-rot version,

Every 1.0s: getfattr -d -m . -e hex /home/ssd/small_brick0/s0/file1                          Wed Nov 11 14:47:41 2015

getfattr: Removing leading '/' from absolute path names
# file: home/ssd/small_brick0/s0/file1
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x02000000000000005643068900006547
trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4


After 2 mins we see the signature for the bit-rot-version 2

getfattr: Removing leading '/' from absolute path names
# file: home/ssd/small_brick0/s0/file1
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.signature=0x010200000000000000a948904f2f0f479b8f8197694b30184b0d2ed1c1cd2a1ec0fb85d299a192a447
trusted.bit-rot.version=0x02000000000000005643068900006547
trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4


Let's write some more data to bump up the version (the file is still in the hot tier):

echo "hello world" >> file1

This is the signature for the version 3 

Every 1.0s: getfattr -d -m . -e hex /home/ssd/small_brick0/s0/file1                          Wed Nov 11 14:51:43 2015

getfattr: Removing leading '/' from absolute path names
# file: home/ssd/small_brick0/s0/file1
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.signature=0x010300000000000000ec498a36221dd860c6f24ea26cb29cec68a38479496f78e54ce35f34c8106847
trusted.bit-rot.version=0x03000000000000005643068900006547
trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4
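
The hex values above can be read with a small helper. This is a sketch under an assumed layout, inferred only from the values in this report: the version xattr starts with an 8-byte little-endian counter, and the signature xattr is one type byte, an 8-byte little-endian signed version, then the digest. The function names are made up for illustration.

```python
import struct


def _unhex(value):
    """Turn a getfattr '-e hex' value such as '0x0300...' into raw bytes."""
    return bytes.fromhex(value[2:] if value.startswith("0x") else value)


def parse_version(value):
    # Assumed layout: 8-byte little-endian version; the remaining
    # 8 bytes are not decoded here.
    (version,) = struct.unpack_from("<Q", _unhex(value), 0)
    return version


def parse_signature(value):
    # Assumed layout: 1 type byte, 8-byte little-endian signed version,
    # then the digest bytes.
    raw = _unhex(value)
    sig_type = raw[0]
    (signed_version,) = struct.unpack_from("<Q", raw, 1)
    return sig_type, signed_version, raw[9:].hex()
```

Applied to the last listing, the version xattr decodes to 3 and the signature xattr to type 1, signed version 3, and a 64-character hex digest, so the version and the signed version agree while the file sits on the hot tier.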

Now let the file get demoted to the cold tier.
This is how the bricks look after the demotion:

[root@fedora1 test]# ls -l /home/disk/d* /home/ssd/small_brick*/s*
/home/disk/d1:
total 0

/home/disk/d2:
total 0

/home/disk/d3:
total 8
-rw-r--r-- 2 root root 24 Nov 11 14:49 file1

/home/disk/d4:
total 8
-rw-r--r-- 2 root root 24 Nov 11 14:49 file1

/home/ssd/small_brick0/s0:
total 0
---------T 2 root root 0 Nov 11 14:52 file1

/home/ssd/small_brick1/s1:
total 0
---------T 2 root root 0 Nov 11 14:52 file1

/home/ssd/small_brick2/s2:
total 0

/home/ssd/small_brick3/s3:
total 0
[root@fedora1 test]# 

The hot tier has the linkto file and the cold tier has the actual file.
These are the xattrs on the linkto file and the actual file immediately after the demotion.


getfattr: Removing leading '/' from absolute path names
# file: home/ssd/small_brick0/s0/file1
trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4
trusted.tier.tier-dht.linkto=0x746573742d636f6c642d64687400
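
The linkto value is a NUL-terminated ASCII string and can be decoded directly. A minimal sketch; the helper name is made up for illustration.

```python
def decode_linkto(value):
    """Decode a hex-encoded linkto xattr into the subvolume name it points at."""
    raw = bytes.fromhex(value[2:] if value.startswith("0x") else value)
    # Drop the trailing NUL terminator before decoding.
    return raw.rstrip(b"\x00").decode("ascii")


if __name__ == "__main__":
    print(decode_linkto("0x746573742d636f6c642d64687400"))
```

For the value above this gives test-cold-dht, that is, the linkto file on the hot tier points at the cold tier of volume "test".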


Every 1.0s: getfattr -d -m . -e hex /home/disk/d3/*                                          Wed Nov 11 14:52:49 2015

getfattr: Removing leading '/' from absolute path names
# file: home/disk/d3/file1
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.version=0x02000000000000005643067600008eeb
trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4

Observe that the linkto file now has no version or signature, and the actual file has a fresh version 2.

This is how the xattrs look after bitd signs the files:

Every 1.0s: getfattr -d -m . -e hex /home/ssd/small_brick0/s0/file1                          Wed Nov 11 14:54:08 2015

getfattr: Removing leading '/' from absolute path names
# file: home/ssd/small_brick0/s0/file1
trusted.bit-rot.signature=0x010100000000000000e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4
trusted.tier.tier-dht.linkto=0x746573742d636f6c642d64687400


Every 1.0s: getfattr -d -m . -e hex /home/disk/d3/*                                          Wed Nov 11 14:54:38 2015

getfattr: Removing leading '/' from absolute path names
# file: home/disk/d3/file1
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.signature=0x010200000000000000ec498a36221dd860c6f24ea26cb29cec68a38479496f78e54ce35f34c8106847
trusted.bit-rot.version=0x02000000000000005643067600008eeb
trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4

Please observe the xattrs on the linkto file on the hot tier:
1) There is no version on it.
2) But there is a signature for version 3, which was the version on the file when it was last on the hot tier.
3) The signature "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
 on the linkto file is the same as the checksum we calculated directly using sha256sum:
[root@fedora1 ~]# sha256sum  /home/ssd/small_brick0/s0/file1
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  /home/ssd/small_brick0/s0/file1
But it is different from that of the actual file, which is
"ec498a36221dd860c6f24ea26cb29cec68a38479496f78e54ce35f34c8106847"
[root@fedora1 ~]# sha256sum /home/disk/d3/file1
ec498a36221dd860c6f24ea26cb29cec68a38479496f78e54ce35f34c8106847  /home/disk/d3/file1
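
The digest on the linkto file is the SHA-256 of empty input, which matches the 0-byte size of the linkto file in the ls -l listing above. A short check; the function name is made up for illustration.

```python
import hashlib

# SHA-256 over zero bytes of input.
EMPTY_SHA256 = hashlib.sha256(b"").hexdigest()


def is_empty_file_digest(digest):
    """Return True if a hex digest is the SHA-256 of an empty file."""
    return digest.lower() == EMPTY_SHA256


if __name__ == "__main__":
    linkto = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
    actual = "ec498a36221dd860c6f24ea26cb29cec68a38479496f78e54ce35f34c8106847"
    print(is_empty_file_digest(linkto), is_empty_file_digest(actual))
```

So the signature on the linkto file covers no file content at all; bitd signed the empty placeholder, not the data that moved to the cold tier.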


Now let's heat up the file:

[root@fedora1 test]# echo "hello world" >> file1 
[root@fedora1 test]# 
[root@fedora1 test]# 
[root@fedora1 test]# echo "hello world" >> file1 
[root@fedora1 test]# 

and let it get promoted to the hot tier:

[root@fedora1 test]# ls -l /home/disk/d* /home/ssd/small_brick*/s*
/home/disk/d1:
total 0

/home/disk/d2:
total 0

/home/disk/d3:
total 0

/home/disk/d4:
total 0

/home/ssd/small_brick0/s0:
total 8
-rw-r--r-- 2 root root 48 Nov 11 14:55 file1

/home/ssd/small_brick1/s1:
total 8
-rw-r--r-- 2 root root 48 Nov 11 14:55 file1

/home/ssd/small_brick2/s2:
total 0

/home/ssd/small_brick3/s3:
total 0
[root@fedora1 test]# 

Now observe the signature and version on the file in the hot tier immediately after the promotion. It carries the signature computed for the linkto file, tagged version 3, alongside version 4, which was incremented from the previous stale version.

Every 1.0s: getfattr -d -m . -e hex /home/ssd/small_brick0/s0/file1                          Wed Nov 11 18:04:06 2015

getfattr: Removing leading '/' from absolute path names
# file: home/ssd/small_brick0/s0/file1
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.signature=0x010300000000000000e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
trusted.bit-rot.version=0x04000000000000005643068900006547
trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4


After some time bitd signs the file with version 4.

Every 1.0s: getfattr -d -m . -e hex /home/ssd/small_brick0/s0/file1                          Wed Nov 11 18:06:03 2015

getfattr: Removing leading '/' from absolute path names
# file: home/ssd/small_brick0/s0/file1
trusted.afr.dirty=0x000000000000000000000000
trusted.bit-rot.signature=0x010400000000000000e54e4db5a177bd4d986796a020f202ec0f90f4aef037769bda181efd79cff9b2
trusted.bit-rot.version=0x04000000000000005643068900006547
trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4

Comment 6 Venky Shankar 2015-11-13 07:55:54 UTC
(In reply to Joseph Elwin Fernandes from comment #5)
> After doing some investigation we (Kasturi and me) found the following
> observation,
> 1) bitd calculate signature for linkto file on the hot tier and also
> remembers the previous bit-rot.version before the file migration to the cold
> tier.
> 2) when a file gets migrated to a tier, immediate write io's are considers
> as part of the migration and there is no version change until the version
> timeout.
> (the point 2 is not a problem but just a observation)
> 
> Proof:
> =====
> 
> Setup :
> =======
> [root@fedora1 test]# gluster volume info
>  
> Volume Name: test
> Type: Tier
> Volume ID: 888f73b8-b5bc-4f0f-91ba-bf8dd39884d5
> Status: Started
> Number of Bricks: 8
> Transport-type: tcp
> Hot Tier :
> Hot Tier Type : Distributed-Replicate
> Number of Bricks: 2 x 2 = 4
> Brick1: fedora1:/home/ssd/small_brick3/s3
> Brick2: fedora1:/home/ssd/small_brick2/s2
> Brick3: fedora1:/home/ssd/small_brick1/s1
> Brick4: fedora1:/home/ssd/small_brick0/s0
> Cold Tier:
> Cold Tier Type : Distributed-Replicate
> Number of Bricks: 2 x 2 = 4
> Brick5: fedora1:/home/disk/d1
> Brick6: fedora1:/home/disk/d2
> Brick7: fedora1:/home/disk/d3
> Brick8: fedora1:/home/disk/d4
> Options Reconfigured:
> features.scrub: Active
> features.bitrot: on
> features.record-counters: on
> features.ctr-enabled: on
> performance.readdir-ahead: on
> [root@fedora1 test]#
> 
> 
> Create a file called "file1"
> 
> [root@fedora1 test]# echo "hello world" > file1
> 
> This is how the bricks look after the creation of the file
> 
> [root@fedora1 test]# ls -l /home/disk/d* /home/ssd/small_brick*/s*
> /home/disk/d1:
> total 0
> 
> /home/disk/d2:
> total 0
> 
> /home/disk/d3:
> total 0
> 
> /home/disk/d4:
> total 0
> 
> /home/ssd/small_brick0/s0:
> total 8
> -rw-r--r-- 2 root root 12 Nov 11 14:46 file1
> 
> /home/ssd/small_brick1/s1:
> total 8
> -rw-r--r-- 2 root root 12 Nov 11 14:46 file1
> 
> /home/ssd/small_brick2/s2:
> total 0
> 
> /home/ssd/small_brick3/s3:
> total 0
> [root@fedora1 test]# 
> 
> and this is the bit-rot version,
> 
> Every 1.0s: getfattr -d -m . -e hex /home/ssd/small_brick0/s0/file1    Wed Nov 11 14:47:41 2015
> 
> getfattr: Removing leading '/' from absolute path names
> # file: home/ssd/small_brick0/s0/file1
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x02000000000000005643068900006547
> trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4
> 
> 
> After 2 mins we see the signature for the bit-rot-version 2
> 
> getfattr: Removing leading '/' from absolute path names
> # file: home/ssd/small_brick0/s0/file1
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.signature=0x010200000000000000a948904f2f0f479b8f8197694b30184b0d2ed1c1cd2a1ec0fb85d299a192a447
> trusted.bit-rot.version=0x02000000000000005643068900006547
> trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4
> 
> 
> Let write some more data to bump-up the version, (The file is still in the
> hot tier)
> 
> echo "hello world" >> file1
> 
> This is the signature for the version 3 
> 
> Every 1.0s: getfattr -d -m . -e hex /home/ssd/small_brick0/s0/file1    Wed Nov 11 14:51:43 2015
> 
> getfattr: Removing leading '/' from absolute path names
> # file: home/ssd/small_brick0/s0/file1
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.signature=0x010300000000000000ec498a36221dd860c6f24ea26cb29cec68a38479496f78e54ce35f34c8106847
> trusted.bit-rot.version=0x03000000000000005643068900006547
> trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4

Everything is good till here.

> 
> Now let the file get demoted to cold tier, 
> And this is now the bricks look after the demotion
> 
> [root@fedora1 test]# ls -l /home/disk/d* /home/ssd/small_brick*/s*
> /home/disk/d1:
> total 0
> 
> /home/disk/d2:
> total 0
> 
> /home/disk/d3:
> total 8
> -rw-r--r-- 2 root root 24 Nov 11 14:49 file1
> 
> /home/disk/d4:
> total 8
> -rw-r--r-- 2 root root 24 Nov 11 14:49 file1
> 
> /home/ssd/small_brick0/s0:
> total 0
> ---------T 2 root root 0 Nov 11 14:52 file1
> 
> /home/ssd/small_brick1/s1:
> total 0
> ---------T 2 root root 0 Nov 11 14:52 file1
> 
> /home/ssd/small_brick2/s2:
> total 0
> 
> /home/ssd/small_brick3/s3:
> total 0
> [root@fedora1 test]# 
> 
> The hot tier has the linkto file and cold tier as the actual file.
> And this is the xattrs on the linkto file and actual file immediately after
> the demotion.
> 
> 
> getfattr: Removing leading '/' from absolute path names
> # file: home/ssd/small_brick0/s0/file1
> trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4
> trusted.tier.tier-dht.linkto=0x746573742d636f6c642d64687400

The data object gets converted to a link-to file, and the xattrs got removed here. I guess the code migrates the object and does an ftruncate() followed by some setattr() calls. The ftruncate() should have resulted in the version getting incremented, but here the version and signature xattrs are missing. We need to examine why.

> 
> 
> Every 1.0s: getfattr -d -m . -e hex /home/disk/d3/*    Wed Nov 11 14:52:49 2015
> 
> getfattr: Removing leading '/' from absolute path names
> # file: home/disk/d3/file1
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.version=0x02000000000000005643067600008eeb
> trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4
> 
> Observe that the linkto file has no version or signature now and the actual
> file has the new fresh version 2

This is fine.

> 
> This is how the xattrs look like after the signing of the files by bitd
> 
> Every 1.0s: getfattr -d -m . -e hex /home/ssd/small_brick0/s0/file1    Wed Nov 11 14:54:08 2015
> 
> getfattr: Removing leading '/' from absolute path names
> # file: home/ssd/small_brick0/s0/file1
> trusted.bit-rot.signature=0x010100000000000000e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4
> trusted.tier.tier-dht.linkto=0x746573742d636f6c642d64687400

Normally, versioning would start with version = 2. It's kind of fishy here in two ways. One, the version xattr is missing (it was already missing when the data object got converted to a link-to file). Two, the signature xattr shows up from nowhere with version = 1, even though the starting version for an object is 2.

> 
> 
> Every 1.0s: getfattr -d -m . -e hex /home/disk/d3/*    Wed Nov 11 14:54:38 2015
> 
> getfattr: Removing leading '/' from absolute path names
> # file: home/disk/d3/file1
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.signature=0x010200000000000000ec498a36221dd860c6f24ea26cb29cec68a38479496f78e54ce35f34c8106847
> trusted.bit-rot.version=0x02000000000000005643067600008eeb
> trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4

This is fine.

> 
> Please observe the xattrs on the linkto file on the hot tier,
> 1) there is not version on it
> 2) but there is a signature for the version 3 ! which was the version on the
> file when it was last on hot tier
> 3) the signature
> "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
>  on the linkto file is same as the checksum we calculated directly using
> sha256sum 
> [root@fedora1 ~]# sha256sum  /home/ssd/small_brick0/s0/file1
> e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  /home/ssd/small_brick0/s0/file1
> But different than the actual file which is
> "ec498a36221dd860c6f24ea26cb29cec68a38479496f78e54ce35f34c8106847"
> [root@fedora1 ~]# sha256sum /home/disk/d3/file1
> ec498a36221dd860c6f24ea26cb29cec68a38479496f78e54ce35f34c8106847  /home/disk/d3/file1
> 
> 
> now lets heat up the file

Before debugging further, the two "unknowns" relating to the link-to file need to be resolved.

> 
> [root@fedora1 test]# echo "hello world" >> file1 
> [root@fedora1 test]# 
> [root@fedora1 test]# 
> [root@fedora1 test]# echo "hello world" >> file1 
> [root@fedora1 test]# 
> 
> and let it promote to the hot tier
> 
> [root@fedora1 test]# ls -l /home/disk/d* /home/ssd/small_brick*/s*
> /home/disk/d1:
> total 0
> 
> /home/disk/d2:
> total 0
> 
> /home/disk/d3:
> total 0
> 
> /home/disk/d4:
> total 0
> 
> /home/ssd/small_brick0/s0:
> total 8
> -rw-r--r-- 2 root root 48 Nov 11 14:55 file1
> 
> /home/ssd/small_brick1/s1:
> total 8
> -rw-r--r-- 2 root root 48 Nov 11 14:55 file1
> 
> /home/ssd/small_brick2/s2:
> total 0
> 
> /home/ssd/small_brick3/s3:
> total 0
> [root@fedora1 test]# 
> 
> Now observe the signature and version on the file in the hot tier
> immediately after the promotion. We have a signature of linkto file of
> version 3 and the incremented version 4 from the previous stale version.
> 
> Every 1.0s: getfattr -d -m . -e hex /home/ssd/small_brick0/s0/file1    Wed Nov 11 18:04:06 2015
> 
> getfattr: Removing leading '/' from absolute path names
> # file: home/ssd/small_brick0/s0/file1
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.signature=0x010300000000000000e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
> trusted.bit-rot.version=0x04000000000000005643068900006547
> trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4
> 
> 
> After some time bitd signs the file with version 4.
> 
> Every 1.0s: getfattr -d -m . -e hex /home/ssd/small_brick0/s0/file1    Wed Nov 11 18:06:03 2015
> 
> getfattr: Removing leading '/' from absolute path names
> # file: home/ssd/small_brick0/s0/file1
> trusted.afr.dirty=0x000000000000000000000000
> trusted.bit-rot.signature=0x010400000000000000e54e4db5a177bd4d986796a020f202ec0f90f4aef037769bda181efd79cff9b2
> trusted.bit-rot.version=0x04000000000000005643068900006547
> trusted.gfid=0xb2cb50645c644b1995299097bd8e44e4

Comment 7 Venky Shankar 2015-11-19 05:42:06 UTC
[Commenting on behalf of Johnny, who is busy with other things]

So, this is what Johnny (rabhat@) found in the QA cluster: the file which had the bitrot extended attributes missing also had the .glusterfs linkage missing. Bitrot relies on GFID-based operations for its correct functioning. This looks like the likely cause for the absence of the xattrs on the file.

By the looks of it (given the above cause), this does not look like a bug directly related to bitrot. However, how the file ended up in such a state is currently unknown.

Comment 8 Venky Shankar 2015-11-19 13:36:36 UTC
I looked into a fresh setup provided by QE and was able to see the bitrot anomaly once in several runs. The issue always happens in the hot tier and not in the cold tier. Before I could debug further, the file got migrated back to the cold tier and got the version/signature xattrs (cold tier).

To debug further, I changed the promote/demote frequencies to 15 and 300 respectively, for quick promotion and delayed demotion. This triggered something unusual: none of the files are getting promoted to the hot tier, but freshly created files do go to the hot tier and get demoted eventually. Also, demotions now don't leave a link-to file in the hot tier, and the log files get filled with:

/var/log/glusterfs/bricks/rhgs-brick6-b13.log:[2015-11-19 13:32:06.951477] I [MSGID: 115060] [server-rpc-fops.c:890:_gf_server_log_setxattr_failure] 0-vol1-server: 26221: SETXATTR /h11 (47389175-7d5d-48eb-a05a-36a69872defa) ==> trusted.tier.tier-dht.linkto

Setting them back to 240/240 (which is what I got in the setup) seems to get the migrations working. So, my debugging starts again...

Comment 9 Venky Shankar 2015-11-19 14:28:33 UTC
> Setting them back to 240/240 (which is what I got in the setup) seems to get
> the migrations working. So, my debugging start again...

Ummm.. migrations are still stuck.

Comment 10 Venky Shankar 2015-11-19 17:16:54 UTC
(In reply to Venky Shankar from comment #9)
> > Setting them back to 240/240 (which is what I got in the setup) seems to get
> > the migrations working. So, my debugging start again...
> 
> Ummm.. migrations are still stuck.

I get this in the tier log file (<volume>-tier.log):

[2015-11-19 16:28:03.702276] I [MSGID: 109070] [dht-common.c:1840:dht_lookup_linkfile_cbk] 0-vol1-tier-dht: Lookup of //file13 on vol1-cold-dht (following linkfile) failed ,gfid = 2801b8d5-5646-487d-9597-08a6b3087e7c [Invalid argument]
[2015-11-19 16:28:03.705075] I [MSGID: 109069] [dht-common.c:1159:dht_lookup_unlink_stale_linkto_cbk] 0-vol1-tier-dht: Returned with op_ret -1 and op_errno 16 for //file13
[2015-11-19 16:28:03.705109] E [MSGID: 109037] [tier.c:418:tier_migrate_using_query_file] 0-vol1-tier-dht: Failed to do lookup on file file13
[2015-11-19 16:28:03.708875] I [MSGID: 109070] [dht-common.c:1840:dht_lookup_linkfile_cbk] 0-vol1-tier-dht: Lookup of //file13 on vol1-cold-dht (following linkfile) failed ,gfid = b3c92be7-80b5-4a64-b93d-e0f2e8b40325 [Invalid argument]
[2015-11-19 16:28:03.711957] I [MSGID: 109069] [dht-common.c:1159:dht_lookup_unlink_stale_linkto_cbk] 0-vol1-tier-dht: Returned with op_ret -1 and op_errno 16 for //file13
[2015-11-19 16:28:03.711999] E [MSGID: 109037] [tier.c:418:tier_migrate_using_query_file] 0-vol1-tier-dht: Failed to do lookup on file file13
[2015-11-19 16:28:03.990334] W [MSGID: 109023] [dht-rebalance.c:591:__dht_rebalance_create_dst_file] 0-vol1-tier-dht: //file13: failed to set xattr on vol1-hot-dht (Permission denied)
[2015-11-19 16:28:04.337050] W [MSGID: 109023] [dht-rebalance.c:1462:dht_migrate_file] 0-vol1-tier-dht: Migrate file failed://file13: failed to set xattr on vol1-hot-dht (Permission denied)

any clues?
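
For reading the log lines above, the numeric op_errno can be mapped to its symbolic name. This sketch assumes op_errno carries a plain Linux errno value; the helper name is made up for illustration.

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME: message' for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return "{}: {}".format(name, os.strerror(code))


if __name__ == "__main__":
    # 16 is the op_errno reported by dht_lookup_unlink_stale_linkto_cbk.
    print(describe_errno(16))
    # The two textual errors in the log correspond to these constants.
    print(describe_errno(errno.EINVAL))
    print(describe_errno(errno.EACCES))
```

Under that assumption, op_errno 16 is EBUSY, while "Invalid argument" and "Permission denied" are the messages for EINVAL and EACCES.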

Comment 12 Joseph Elwin Fernandes 2015-11-20 05:26:00 UTC
I will have a look at this.

Comment 13 Joseph Elwin Fernandes 2015-11-20 09:31:34 UTC
I just cross-verified on the upstream build. The T file is left on the hot tier after demotion. Some fix might not have made it downstream. To test with the current upstream, please set:
gluster volume set test cluster.watermark-low 1

This will make sure files get demoted for a small data set.

Please refer to the volume set help for details on this option.

Let me know if this helps

Comment 14 Nithya Balachandran 2015-11-20 09:54:22 UTC
The downstream patch which prevents the linkfile from being demoted was merged recently. It should be available in the next build.
(https://code.engineering.redhat.com/gerrit/#/c/61840/)



Instead of setting cluster.watermark-low , you can use cluster.tier-mode to ignore watermarks.

gluster volume set <volname> cluster.tier-mode test

Comment 15 Venky Shankar 2015-11-20 09:58:35 UTC
(In reply to Nithya Balachandran from comment #14)
> The downstream patch which prevents the linkfile from being demoted was
> merged recently. It should be available in the next build.
> (https://code.engineering.redhat.com/gerrit/#/c/61840/)

Thanks! Another problem in the cluster was that when a file gets demoted to the cold tier, further I/Os on the file do not migrate it to the hot tier, with some lookup() failures in the tier log file as per comment #10.

Is this a side effect or something else?

> 
> 
> 
> Instead of setting cluster.watermark-low , you can use cluster.tier-mode to
> ignore watermarks.
> 
> gluster volume set <volname> cluster.tier-mode test

Comment 16 Nithya Balachandran 2015-11-20 10:02:49 UTC
That needs to be analysed. Do these errors show up in the latest master where the linkto file is not deleted?

Comment 17 Venky Shankar 2015-11-20 10:06:20 UTC
(In reply to Nithya Balachandran from comment #16)
> That needs to be analysed. Do these errors show up in the latest master
> where the linkto file is not deleted?

I haven't tried tier with latest master. Johnny (rabhat@) was able to give it a run and found that files were not getting demoted. I guess the watermark option (or cluster.tier-mode) needs to be set for aggressive demotions.

I'll try to give it a run in some time.

Comment 18 Vivek Agarwal 2015-11-23 07:05:17 UTC
This seems to work with the upstream build; it seems the patch, which is now merged, did not make it into the last build. Moving this to MODIFIED.

Comment 19 RamaKasturi 2015-11-27 12:06:43 UTC
Hi,

  Can you please add the link to the patch which has the fix for this issue ?

Thanks
kasturi

Comment 20 RamaKasturi 2015-12-01 05:49:29 UTC
Hi,

  Can you please add the link to the patch which has the fix for this issue ?

Thanks
kasturi

Comment 21 Joseph Elwin Fernandes 2015-12-01 05:58:22 UTC
1) The problem of the T file being deleted is solved by Nithya's patch, which she mentioned in this bug; see comment 14:
https://bugzilla.redhat.com/show_bug.cgi?id=1277368#c14

2) The patches for ignoring bit rot internal traffic in ctr:
https://code.engineering.redhat.com/gerrit/60999
https://code.engineering.redhat.com/gerrit/61241

Comment 22 RamaKasturi 2015-12-01 18:02:23 UTC
Verified and works fine with glusterfs-3.7.5-7.el7rhgs.x86_64.

Did not observe the bit rot version and signature going missing for files when promotions and demotions happen.

Comment 24 errata-xmlrpc 2016-03-01 05:50:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html