Bug 1314660 - File gets stuck in hot tier after volume restart
Summary: File gets stuck in hot tier after volume restart
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: tier
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: Nag Pavan Chilakam
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-03-04 08:09 UTC by hari gowtham
Modified: 2018-11-08 18:28 UTC (History)
CC List: 1 user

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-08 18:28:07 UTC
Embargoed:


Attachments
sos report attached. (5.41 MB, application/x-xz)
2016-03-04 08:09 UTC, hari gowtham

Description hari gowtham 2016-03-04 08:09:12 UTC
Created attachment 1133090 [details]
sos report attached.

Description of problem:
On a newly created tiered volume where demotion is taking place, the volume is stopped while the migration is in progress and then started again so that the migration resumes. At this point one file might fail to get demoted; this file stays on the hot brick forever.
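
To confirm that demotion is actually in progress at the moment the volume is stopped, the migration counters can be watched with the same command used in the reproduction script further below (v1 is the volume name used throughout this report):

gluster v rebal v1 status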

Version-Release number of selected component (if applicable):
3.1.2
(the issue has been found to be fixed in master, i.e. 3.7.8 and above)

How reproducible:
At least once in three attempts.

Steps to Reproduce:
1. Create a tiered volume and mount it.
2. Create a large number of files so that demoting them takes a while.
3. While demotion is in progress, stop the volume.
4. Start the volume (demotion also restarts).
5. Once the demotion is done, check the files.
6. One file alone might end up on the hot tier.

Actual results:
One file ends up on the hot tier.

Expected results:
The file shouldn't remain on the hot tier; it should get demoted.

Additional info:

Set the tier mode to test and the demote frequency to a low value:

cluster.tier-demote-frequency: 10
cluster.tier-mode: test
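
For reference, these options were set with the gluster CLI; the same commands appear in the reproduction script below:

gluster v set v1 cluster.tier-mode test
gluster v set v1 cluster.tier-demote-frequency 10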

I have pasted below the script which I used to come across this bug:

#!/bin/bash
rm /home/md5sum/all_md5sum.txt
glusterd
# Create and start a two-brick volume, then attach a two-brick hot tier.
gluster v create v1 10.70.42.193:/home/bricks/b1 10.70.42.193:/home/bricks/b2 force
gluster v start v1
gluster v tier v1 attach 10.70.42.193:/home/bricks/hb1 10.70.42.193:/home/bricks/hb2 force
# Test mode with a low demote frequency so demotion starts quickly.
gluster v set v1 cluster.tier-mode test
gluster v set v1 cluster.tier-demote-frequency 10
mkdir /data/gluster/mount
mount -t glusterfs 10.70.42.193:v1 /data/gluster/mount/
# Create files of random sizes and record their checksums.
crefi --single --random --min=1k --max=10k /data/gluster/mount/
md5sum /data/gluster/mount/* > /home/md5sum/all_md5sum.txt
gluster v rebal v1 status
ls /home/bricks/*
sleep 12
ls /home/bricks/*
ps aux | ag tier
# Stop the volume while demotion is in progress, then start it again.
gluster v stop v1
ls /home/bricks/*
sleep 5
gluster v start v1
sleep 12
ps aux | ag tier
ls /home/bricks/*
gluster v rebal v1 status
# Verify that the files are intact after migration.
md5sum -c /home/md5sum/all_md5sum.txt

The configuration might vary as per one's setup. The md5sum is used to check whether the file was intact (it is).
I used crefi to generate files of various random sizes (if the script is going to be used, crefi has to be installed).

Once these steps are done, check the hot brick after a while (10 seconds or more). It will still have the file.
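
A quick way to tell whether the leftover entry on the hot brick is a real data file rather than a zero-byte DHT link file (link files show up with size 0 and mode ---------T) is sketched below, assuming the brick paths from the script; FILENAME is a placeholder for the stuck file's name:

ls -l /home/bricks/hb1 /home/bricks/hb2
stat /home/bricks/hb1/FILENAME
md5sum /home/bricks/hb1/FILENAME    # compare against /home/md5sum/all_md5sum.txt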

===============================================================================

Volume Name: v1
Type: Tier
Volume ID: 18bd9d73-1449-445f-98f4-5a9fcfd5150a
Status: Started
Number of Bricks: 4
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 2
Brick1: 10.70.42.193:/home/bricks/hb2
Brick2: 10.70.42.193:/home/bricks/hb1
Cold Tier:
Cold Tier Type : Distribute
Number of Bricks: 2
Brick3: 10.70.42.193:/home/bricks/b1
Brick4: 10.70.42.193:/home/bricks/b2
Options Reconfigured:
cluster.tier-demote-frequency: 10
cluster.tier-mode: test
features.ctr-enabled: on
performance.readdir-ahead: on
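
For reference, the volume layout and options above appear to be volume info output; assuming the volume name from the script, the same output can be retrieved with:

gluster v info v1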

Comment 3 hari gowtham 2018-11-08 18:28:07 UTC
As tier is not being actively developed, I'm closing this bug. Feel free to reopen it if necessary.

