Bug 1291969 - [Tiering]: When files are heated continuously, promotions are so aggressive that files are promoted well beyond the high watermark
Summary: [Tiering]: When files are heated continuously, promotions are too aggressive ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: tier
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.1.2
Assignee: Bug Updates Notification Mailing List
QA Contact: krishnaram Karthick
URL:
Whiteboard:
Depends On:
Blocks: 1291549 1293932
 
Reported: 2015-12-16 05:16 UTC by krishnaram Karthick
Modified: 2016-09-17 15:37 UTC
CC List: 7 users

Fixed In Version: glusterfs-3.7.5-15
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1293932
Environment:
Last Closed: 2016-03-01 06:04:07 UTC
Embargoed:




Links:
  System:       Red Hat Product Errata
  ID:           RHBA-2016:0193
  Private:      0
  Priority:     normal
  Status:       SHIPPED_LIVE
  Summary:      Red Hat Gluster Storage 3.1 update 2
  Last Updated: 2016-03-01 10:20:36 UTC

Description krishnaram Karthick 2015-12-16 05:16:32 UTC
Description of problem:

When 100 files were continuously heated by repeated writes, promotions were so aggressive that disk usage on the hot tier crossed the high watermark. Promotions continued until usage reached almost 90%, while the high watermark was set at 60%.

Total hot tier size: 20G
High watermark at 60%: 12G
Low watermark at 30%: 6G

Promotions continued until usage reached 90%, i.e., 18G, and the hot tier stayed at 90% for at least 30 minutes.

However, when the system was left in the same state with overnight I/O, disk usage on the hot tier came down to 53%, i.e., 11G.

[root@dhcp43-19 ~]#  gluster vol info
 
Volume Name: tier-vol-01
Type: Tier
Volume ID: 78d1fc18-5d0a-452d-ab26-b78c15655c60
Status: Started
Number of Bricks: 13
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 1
Brick1: 10.70.42.47:/rhs/brick4/leg2
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick2: 10.70.42.47:/rhs/brick1/leg1
Brick3: 10.70.43.19:/rhs/brick1/leg1
Brick4: 10.70.42.177:/rhs/brick1/leg1
Brick5: 10.70.42.10:/rhs/brick1/leg1
Brick6: 10.70.43.140:/rhs/brick1/leg1
Brick7: 10.70.42.87:/rhs/brick1/leg1
Brick8: 10.70.42.228:/rhs/brick1/leg1
Brick9: 10.70.42.183:/rhs/brick1/leg1
Brick10: 10.70.42.47:/rhs/brick2/leg1
Brick11: 10.70.43.19:/rhs/brick2/leg1
Brick12: 10.70.42.177:/rhs/brick2/leg1
Brick13: 10.70.42.10:/rhs/brick2/leg1
Options Reconfigured:
cluster.tier-mode: cache
features.ctr-enabled: on
cluster.tier-promote-frequency: 120
cluster.tier-demote-frequency: 120
features.record-counters: on
cluster.watermark-hi: 60
cluster.watermark-low: 10
performance.readdir-ahead: on
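
For reference, the tiering-related options above correspond to "gluster volume set" commands of the following form (a sketch; the exact commands run on this setup are not part of the report):

gluster volume set tier-vol-01 cluster.tier-mode cache
gluster volume set tier-vol-01 cluster.tier-promote-frequency 120
gluster volume set tier-vol-01 cluster.tier-demote-frequency 120
gluster volume set tier-vol-01 cluster.watermark-hi 60
gluster volume set tier-vol-01 cluster.watermark-low 10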

Files written from the mount point: /mnt/tier-vol-01/dd/file-{901..999}

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-11.el7rhgs.x86_64

How reproducible:
Not yet determined whether this issue is always reproducible; the behavior has been seen at least twice.

Steps to Reproduce:
1. Keep the system idle with multiple files on the cold tier
2. While disk usage on the hot tier is below the low watermark, keep writing to 100 existing files continuously
3. Monitor disk usage on the hot tier (a sketch of one way to script steps 2 and 3 follows below)
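
One way to script steps 2 and 3, assuming the file names, sizes, and hot brick path from the description above (the exact commands used in the test are not recorded here):

# keep rewriting the existing ~100MB files so they stay hot
while true; do
    for i in $(seq 901 999); do
        dd if=/dev/urandom of=/mnt/tier-vol-01/dd/file-$i bs=1M count=100 conv=notrunc
    done
done &

# watch disk usage on the hot tier brick
watch -n 60 df -h /rhs/brick4/leg2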

Actual results:
Promotions continue well beyond the high watermark setting.

Expected results:
Promotions should stop once the high watermark is reached.

Additional info:
No new files were created during this test; existing files were written to in order to mark them for promotion.

Comment 1 krishnaram Karthick 2015-12-16 05:21:25 UTC
sosreports are available here --> http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1291969/

Comment 3 Dan Lambright 2015-12-22 21:09:28 UTC
Hi Karthick. What is the file size you are using?

Comment 4 krishnaram Karthick 2015-12-23 01:29:39 UTC
File sizes were in the range of 100MB for the initial test, when the total hot tier size was 20GB.

Comment 5 Dan Lambright 2015-12-23 05:30:21 UTC
Each node may independently pick a file to promote. There are 12 nodes on the cold tier, so no more than 1.2G should be moved in parallel in a single direction. This could lead to an overshoot of 1.2G (I think), but you are seeing more than that.

What each node should be doing is: (1) check free space on the entire hot tier and abort if there is not enough space to move a particular 100MB file, (2) if there is space, move the file, (3) go back to (1).

On code inspection, it looks like step (1) is improperly implemented. I'll see if I can recreate this and write a fix if my inspection proves to be correct.
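
Schematically, the per-node check described above amounts to something like the following shell sketch (illustrative only; the real logic lives in the tier translator, and the brick path, file size, and watermark here are simply the values from this report):

HOT_BRICK=/rhs/brick4/leg2
FILE_SIZE_MB=100
WATERMARK_HI=60

# (1) check free space on the hot tier; abort if the high watermark is already
#     reached or there is not enough room for this particular file
used_pct=$(df --output=pcent "$HOT_BRICK" | tail -n1 | tr -dc '0-9')
avail_mb=$(df --output=avail -BM "$HOT_BRICK" | tail -n1 | tr -dc '0-9')
if [ "$used_pct" -ge "$WATERMARK_HI" ] || [ "$avail_mb" -lt "$FILE_SIZE_MB" ]; then
    echo "skip promotion: usage ${used_pct}% / free space ${avail_mb}MB disallows it"
else
    # (2) there is space: move the file, then (3) go back to (1)
    echo "promote file"
fi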

Comment 9 Dan Lambright 2016-01-08 14:06:57 UTC
Patch 64888 submitted for this problem.

Comment 10 krishnaram Karthick 2016-01-11 05:44:07 UTC
Verified the fix on build glusterfs-3.7.5-15

The following tests were performed to verify the fix.

Test 1:
1) Created a dist-rep volume (300GB) and enabled quota
2) FUSE mounted the volume and created 1000 files of 10MB each
3) Attached a replicated hot tier (50GB)
4) Set the watermarks to 45% (low) and 70% (high); see the CLI sketch below the result
5) Heated all 1000 files created in step 2; no new files were written

Result: Disk usage hit 70% and never exceeded it.
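
The setup in steps 1-4 corresponds roughly to the following CLI sequence (a sketch with placeholder host names and brick paths, not the exact commands from the verification run):

gluster volume create dist-rep-vol replica 2 host{1..4}:/rhs/brick1/b1
gluster volume start dist-rep-vol
gluster volume quota dist-rep-vol enable
mount -t glusterfs host1:/dist-rep-vol /mnt/dist-rep-vol
gluster volume attach-tier dist-rep-vol replica 2 host5:/rhs/brick2/hot host6:/rhs/brick2/hot
gluster volume set dist-rep-vol cluster.watermark-low 45
gluster volume set dist-rep-vol cluster.watermark-hi 70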

Test 2: After Test 1, increased the high watermark to 80%.
Result: Disk usage hit 80% and never exceeded it.

Test 3: After Test 2, decreased the high watermark to 70%.
Result: Disk usage reached 70% and never exceeded it.

Test 4: Stopped heating files and left the system idle.
Result: Disk usage dropped to 45% and never went below it.

Test 5: Repeated Test 1 and Test 4 with the promotion cycle at 300s and the demotion cycle at 1800s.
Results were the same.

Moving the bug to verified state.

Comment 12 errata-xmlrpc 2016-03-01 06:04:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0193.html

