Bug 1289163

Summary: first file created after the hot tier is full fails to create, but gets a database entry and later ends up as a stale, erroneous file (listed with ???????????)
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Nag Pavan Chilakam <nchilaka>
Component: tier
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED DUPLICATE
QA Contact: Sweta Anandpara <sanandpa>
Severity: high
Priority: high
Docs Contact:
Version: rhgs-3.1
CC: dlambrig, rhs-bugs, storage-qa-internal, vagarwal
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.1.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.7.5-13
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1291566 (view as bug list)
Environment:
Last Closed: 2015-12-23 06:46:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1291566, 1293348

Description Nag Pavan Chilakam 2015-12-07 14:55:49 UTC
Description of problem:
======================
It looks like we create a database entry for a file well before waiting for confirmation from the brick end (i.e., the entry is created while winding the call, and the result is not checked when the stack unwinds).

After the hot tier is completely full, the first new file create always fails as below:
[root@localhost testvol]# touch waterwood
touch: cannot touch ‘waterwood’: No space left on device



If we check the database as soon as the file create fails, we can see that it already has a DB entry:

>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==
99e541df-2828-48e1-a33d-95273269fd6f|1449495715|296974|0|0|0|0|0|0|0|0
99e541df-2828-48e1-a33d-95273269fd6f|00000000-0000-0000-0000-000000000001|waterwood|/waterwood|0|0
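
As a quick cross-check (a sketch, reusing the same brick path and query as in the reproduction steps below), compare the hot brick contents with the CTR database to see whether the entry exists only in the DB:

ls -l /dummy/brick105/testvol_hot/waterwood   # check whether anything was actually created on the hot brick
echo "select * from gf_flink_tb;" | sqlite3 /dummy/brick105/testvol_hot/.glusterfs/testvol_hot.db | grep waterwood   # the orphaned DB entry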


Now, if you check the mount point again with "ls", you will see the file as below (it will not appear immediately, so wait for the next migration cycle):

??????????? ? ?    ?            ?            ? waterwood


This stale file is visible from the mount point and remains forever.
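
One way to inspect the stale entry (a sketch; the exact output depends on what is actually left behind) is to stat it from the mount point and dump the xattrs of whatever matches on the brick backends:

stat waterwood                                                 # run from the mount point; usually errors out for a stale entry
getfattr -d -m . -e hex /dummy/brick105/testvol_hot/waterwood
getfattr -d -m . -e hex /dummy/brick101/testvol/waterwood
getfattr -d -m . -e hex /dummy/brick102/testvol/waterwood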




Version-Release number of selected component (if applicable):
=====================
[root@zod ~]# rpm -qa|grep gluster
glusterfs-fuse-3.7.5-9.el7rhgs.x86_64
glusterfs-debuginfo-3.7.5-9.el7rhgs.x86_64
glusterfs-3.7.5-9.el7rhgs.x86_64
glusterfs-server-3.7.5-9.el7rhgs.x86_64
glusterfs-client-xlators-3.7.5-9.el7rhgs.x86_64
glusterfs-cli-3.7.5-9.el7rhgs.x86_64
glusterfs-libs-3.7.5-9.el7rhgs.x86_64
glusterfs-api-3.7.5-9.el7rhgs.x86_64


How reproducible:
===============
consistently


Steps to Reproduce:
==================
1. Create a tiered volume (I disabled the watermark; a setup sketch follows step 9).
2. Fill the hot tier so that the available space is only a bit more than min-free-disk (so that a new file create still lands on the hot tier).
3. Now create a file (using dd) that spills over the remaining space of the hot tier.
4. The write fails as below after some time:
[root@localhost testvol]# dd if=/dev/urandom of=yelpy bs=1024 count=3000000
dd: error writing ‘yelpy’: Input/output error
dd: closing output file ‘yelpy’: Input/output error

5. Now immediately create a new file using touch, say "touch waterwood".

Ideally, the file should be created on the cold tier, but it fails as below:
[root@localhost testvol]# touch waterwood
touch: cannot touch ‘waterwood’: No space left on device

6. Now immediately check the database on the backend:

echo "===========Date====================="; date
echo "=============ColdBrick#1 ========="
echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /dummy/brick101/testvol/.glusterfs/testvol.db
echo "=============ColdBrick#2 ========="
echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /dummy/brick102/testvol/.glusterfs/testvol.db
echo ">>>>>>>>>>>> HOTBRICK#2 <<<<<<<<=="
echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /dummy/brick105/testvol_hot/.glusterfs/testvol_hot.db


7. It can be seen that there is an entry in the hot tier database for the "waterwood" file, even though the file was not created.

8. Now create another new file; it gets created on the cold tier (as expected).
9. Wait for the next migration cycle; the file "waterwood" now shows up on the mount point, but as a junk/corrupt entry ("????????").
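
A minimal sketch of the setup in steps 1-3 (hostnames and brick paths taken from the volume info below; the exact tier CLI syntax can differ between glusterfs releases, so treat this as illustrative only):

# cold tier: 2 x 2 distributed-replicate
gluster volume create testvol replica 2 \
    zod:/dummy/brick102/testvol yarrow:/dummy/brick102/testvol \
    zod:/dummy/brick101/testvol yarrow:/dummy/brick101/testvol
gluster volume start testvol
# hot tier: 1 x 2 replicate
gluster volume tier testvol attach replica 2 \
    yarrow:/dummy/brick105/testvol_hot zod:/dummy/brick105/testvol_hot
# tier-mode "test" drives migration on a fixed cycle instead of watermarks
gluster volume set testvol cluster.tier-mode test
gluster volume set testvol cluster.tier-demote-frequency 180
# for step 2, compare free space on the hot brick with cluster.min-free-disk
df -h /dummy/brick105/testvol_hot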



[root@zod ~]# gluster v info testvol
 
Volume Name: testvol
Type: Tier
Volume ID: 3909b526-0ce5-44b3-b5a8-d4d2868e9db9
Status: Started
Number of Bricks: 6
Transport-type: tcp
Hot Tier :
Hot Tier Type : Replicate
Number of Bricks: 1 x 2 = 2
Brick1: yarrow:/dummy/brick105/testvol_hot
Brick2: zod:/dummy/brick105/testvol_hot
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick3: zod:/dummy/brick102/testvol
Brick4: yarrow:/dummy/brick102/testvol
Brick5: zod:/dummy/brick101/testvol
Brick6: yarrow:/dummy/brick101/testvol
Options Reconfigured:
cluster.tier-demote-frequency: 180
cluster.tier-mode: test
features.ctr-enabled: on
performance.readdir-ahead: on
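
For step 9, with cluster.tier-demote-frequency at 180 the next migration cycle is at most ~3 minutes away; one rough way to watch for it (the status command syntax may vary by release) is to poll the tier status:

watch -n 30 'gluster volume tier testvol status'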

Comment 3 Vivek Agarwal 2015-12-15 08:21:49 UTC
http://review.gluster.org/12969

Comment 4 Joseph Elwin Fernandes 2015-12-21 13:53:58 UTC
https://code.engineering.redhat.com/gerrit/#/c/64284/

Comment 6 Vivek Agarwal 2015-12-23 06:46:32 UTC

*** This bug has been marked as a duplicate of bug 1291566 ***