Bug 1295293 - first file created after hot tier full fails to create, but gets database entry
Summary: first file created after hot tier full fails to create, but gets database entry
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: tier
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: hari gowtham
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-01-04 06:29 UTC by Nag Pavan Chilakam
Modified: 2018-11-08 18:59 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-08 18:59:11 UTC
Embargoed:



Description Nag Pavan Chilakam 2016-01-04 06:29:52 UTC
A bug with the same problem was already raised: "1291566 - first file created after hot tier full fails to create, but gets database entry and later ends up as a stale erroneous file (file with ???????????)".

However, the fix delivered there solved only the stale erroneous file (the latter part). The db entry was not fixed, since that is a potentially sensitive code change which could end up causing regressions. Hence this bug is being raised to fix "first file created after hot tier full fails to create but gets database entry".


=========COPYING STEPS AND INFO FROM BUG=======================
Description of problem:
======================
It looks like we are creating a database entry for a file well before waiting for confirmation from the brick end (i.e., the entry is created while winding the call itself, rather than being checked for success during the unwind of the stack).

After the hot tier is completely filled, the first fresh file to be created will always fail as below:
[root@localhost testvol]# touch waterwood
touch: cannot touch ‘waterwood’: No space left on device



If we check the database as soon as the file create fails, we can see that it already has a db entry, as below:

>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==
99e541df-2828-48e1-a33d-95273269fd6f|1449495715|296974|0|0|0|0|0|0|0|0
99e541df-2828-48e1-a33d-95273269fd6f|00000000-0000-0000-0000-000000000001|waterwood|/waterwood|0|0
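
To confirm the phantom entry quickly, a check like the following can be run against the hot brick's CTR database (a minimal sketch, reusing the db path, table name and file name from this setup; nothing about the table schema is assumed, the query just selects everything and greps for the name):

sqlite3 /dummy/brick105/testvol_hot/.glusterfs/testvol_hot.db \
    "select * from gf_flink_tb;" | grep waterwood

If the failed create had been rolled back on the unwind path, this would return nothing; here it returns the row shown above.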


Now, if you check the mount point again by issuing "ls", you will see the file as below (you won't see it immediately, so wait for the next migration cycle):

??????????? ? ?    ?            ?            ? waterwood


This stale file is visible from the mount point and remains forever.




Version-Release number of selected component (if applicable):
=====================
[root@zod ~]# rpm -qa|grep gluster
glusterfs-fuse-3.7.5-9.el7rhgs.x86_64
glusterfs-debuginfo-3.7.5-9.el7rhgs.x86_64
glusterfs-3.7.5-9.el7rhgs.x86_64
glusterfs-server-3.7.5-9.el7rhgs.x86_64
glusterfs-client-xlators-3.7.5-9.el7rhgs.x86_64
glusterfs-cli-3.7.5-9.el7rhgs.x86_64
glusterfs-libs-3.7.5-9.el7rhgs.x86_64
glusterfs-api-3.7.5-9.el7rhgs.x86_64


How reproducible:
===============
consistently


Steps to Reproduce:
==================
1. Create a tiered volume (I disabled the watermark); a rough command sketch follows after these steps.
2. Now fill the hot tier so that the available space is only a little more than min-free-disk (so that a new file create will still go to the hot tier).
3. Now create a file (use dd) which will spill over the remaining space of the hot tier.
4. The write will fail as below after some time:
[root@localhost testvol]# dd if=/dev/urandom of=yelpy bs=1024 count=3000000
dd: error writing ‘yelpy’: Input/output error
dd: closing output file ‘yelpy’: Input/output error

5. Now immediately create a new file using touch, say "touch waterwood".

Ideally, the file should be created on the cold tier, but it will fail as below:
[root@localhost testvol]# touch waterwood
touch: cannot touch ‘waterwood’: No space left on device

6. Now immediately check the database on the backend:

 echo "===========Date=====================";date; echo "=============ColdBrick#1 =========" ;  echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /dummy/brick101/testvol/.glusterfs/testvol.db; echo "=============ColdBrick#2 =========" ;  echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /dummy/brick102/testvol/.glusterfs/testvol.db;echo ">>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /dummy/brick105/testvol_hot/.glusterfs/testvol_hot.db; 


7. It can be seen that there is an entry in the hot brick database for the file "waterwood", even though the file was not created.

8. Now create another new file; this one will get created on the cold tier (as expected).
9. Wait for the next migration cycle; the file "waterwood" will now be visible on the mount point, but as a junk/corrupt entry shown as "????????".
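
For reference, the rough shell sequence behind steps 1-5 would look something like the following. This is only a sketch: the attach-tier syntax varies between releases, and the /mnt/testvol mount path plus the cluster.min-free-disk value of 10% are assumptions rather than values taken from this setup; the brick paths and tier options are copied from the volume info below.

gluster volume create testvol replica 2 \
    zod:/dummy/brick102/testvol yarrow:/dummy/brick102/testvol \
    zod:/dummy/brick101/testvol yarrow:/dummy/brick101/testvol
gluster volume start testvol
gluster volume attach-tier testvol replica 2 \
    yarrow:/dummy/brick105/testvol_hot zod:/dummy/brick105/testvol_hot
gluster volume set testvol cluster.tier-mode test
gluster volume set testvol cluster.tier-demote-frequency 180
gluster volume set testvol cluster.min-free-disk 10%

# fill the hot tier past its remaining capacity, then immediately retry a create
dd if=/dev/urandom of=/mnt/testvol/yelpy bs=1024 count=3000000
touch /mnt/testvol/waterwood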



[root@zod ~]# gluster v info testvol
 
Volume Name: testvol
Type: Tier
Volume ID: 3909b526-0ce5-44b3-b5a8-d4d2868e9db9
Status: Started
Number of Bricks: 6
Transport-type: tcp
Hot Tier :
Hot Tier Type : Replicate
Number of Bricks: 1 x 2 = 2
Brick1: yarrow:/dummy/brick105/testvol_hot
Brick2: zod:/dummy/brick105/testvol_hot
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick3: zod:/dummy/brick102/testvol
Brick4: yarrow:/dummy/brick102/testvol
Brick5: zod:/dummy/brick101/testvol
Brick6: yarrow:/dummy/brick101/testvol
Options Reconfigured:
cluster.tier-demote-frequency: 180
cluster.tier-mode: test
features.ctr-enabled: on
performance.readdir-ahead: on

Comment 4 hari gowtham 2018-11-08 18:59:11 UTC
As tier is not being actively developed, I'm closing this bug. Feel free to reopen it if necessary.

