Bug 1294790 - promotions and demotions not happening after attach tier due to fix layout taking very long time (3 days)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: tier
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.1.3
Assignee: sankarshan
QA Contact: krishnaram Karthick
URL:
Whiteboard: tier-migration
Duplicates: 1303406
Depends On:
Blocks: 1268895 1299184 1313228 1323016
Reported: 2015-12-30 10:33 UTC by Nag Pavan Chilakam
Modified: 2019-04-03 09:15 UTC

Fixed In Version: glusterfs-3.7.9-2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1313228
Environment:
Last Closed: 2016-06-23 05:01:11 UTC
Embargoed:


Links
System: Red Hat Product Errata
ID: RHBA-2016:1240
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Gluster Storage 3.1 Update 3
Last Updated: 2016-06-23 08:51:28 UTC

Description Nag Pavan Chilakam 2015-12-30 10:33:36 UTC
Description of problem:
=======================
I created a distributed-disperse (dist-EC) volume and ran the following I/O on it:
1) Create a parent dir; in this dir create another dir, copy the Linux kernel tarball into it, and untar the kernel.
Then, under the same parent dir, create another dir, copy the Linux kernel tarball, untar it, and so on in a loop of about 1000 iterations, giving dir.1, dir.2, ..., dir.1000.
2) With a lag of an hour or so, start renaming (moving) dir.1 to rename_dir.1, and so on for all the dirs.

In total this created about 100 GB of data.

I kept this I/O running for about a day and then attached a tier.

After attaching the tier I changed some values for the watermarks and other options (see the volume info below).

I then started pumping in more file creates and other I/O.

Problem: even after 3 days, not a single file has been promoted or demoted.

Also, the binary files under /var/run/gluster (the lists of files to promote/demote) have not been created.


Perceived Impact: If an existing customer with a huge data set wants to attach a tier to their volume, tiering will not function for a few days because fix layout has to finish first.





[root@zod ~]# gluster v info stress;gluster v status stress;gluster v rebal stress status;gluster v tier stress status
 
Volume Name: stress
Type: Tier
Volume ID: 67a53277-2e05-4240-b0fb-38728d7e1bcd
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/rhs/brick5/stress_hot
Brick2: zod:/rhs/brick5/stress_hot
Brick3: yarrow:/rhs/brick7/stress_hot
Brick4: zod:/rhs/brick7/stress_hot
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: zod:/rhs/brick1/stress
Brick6: yarrow:/rhs/brick1/stress
Brick7: zod:/rhs/brick2/stress
Brick8: yarrow:/rhs/brick2/stress
Brick9: zod:/rhs/brick3/stress
Brick10: yarrow:/rhs/brick3/stress
Brick11: zod:/rhs/brick4/stress
Brick12: yarrow:/rhs/brick4/stress
Brick13: zod:/rhs/brick5/stress
Brick14: yarrow:/rhs/brick5/stress
Brick15: yarrow:/rhs/brick6/stress
Brick16: zod:/rhs/brick6/stress
Options Reconfigured:
cluster.tier-max-files: 100000000000
cluster.tier-max-mb: 10000000000
cluster.watermark-hi: 50
cluster.watermark-low: 15
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
cluster.tier-mode: cache
features.ctr-enabled: on
performance.readdir-ahead: on
Status of volume: stress
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Hot Bricks:
Brick yarrow:/rhs/brick5/stress_hot         49163     0          Y       1877 
Brick zod:/rhs/brick5/stress_hot            49163     0          Y       11298
Brick yarrow:/rhs/brick7/stress_hot         49162     0          Y       1858 
Brick zod:/rhs/brick7/stress_hot            49162     0          Y       11279
Cold Bricks:
Brick zod:/rhs/brick1/stress                49156     0          Y       7639 
Brick yarrow:/rhs/brick1/stress             49156     0          Y       31165
Brick zod:/rhs/brick2/stress                49157     0          Y       7658 
Brick yarrow:/rhs/brick2/stress             49157     0          Y       31186
Brick zod:/rhs/brick3/stress                49158     0          Y       7677 
Brick yarrow:/rhs/brick3/stress             49158     0          Y       31209
Brick zod:/rhs/brick4/stress                49159     0          Y       7696 
Brick yarrow:/rhs/brick4/stress             49159     0          Y       31228
Brick zod:/rhs/brick5/stress                49160     0          Y       7715 
Brick yarrow:/rhs/brick5/stress             49160     0          Y       31247
Brick yarrow:/rhs/brick6/stress             49161     0          Y       31266
Brick zod:/rhs/brick6/stress                49161     0          Y       7734 
NFS Server on localhost                     2049      0          Y       21614
Self-heal Daemon on localhost               N/A       N/A        Y       21624
Quota Daemon on localhost                   N/A       N/A        Y       19827
NFS Server on yarrow                        2049      0          Y       4333 
Self-heal Daemon on yarrow                  N/A       N/A        Y       4373 
Quota Daemon on yarrow                      N/A       N/A        Y       10391
 
Task Status of Volume stress
------------------------------------------------------------------------------
Task                 : Tier migration      
ID                   : 47b3bd56-9d0e-4025-a84b-a21a6ab99d4e
Status               : in progress         
 
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost                0        0Bytes             0             0             0          in progress          183055.00
                                  yarrow                0        0Bytes             0             0             0          in progress          183052.00
volume rebalance: stress: success
Node                 Promoted files       Demoted files        Status              
---------            ---------            ---------            ---------           
localhost            0                    0                    in progress         
yarrow               0                    0                    in progress         
Tiering Migration Functionality: stress: success
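For anyone who wants to watch for this condition without reading the table by eye, the tier status output above could be parsed roughly as follows. This is only a sketch: the function names are made up for illustration, and it assumes the table keeps the column layout shown in this report (node, promoted count, demoted count, status).

```python
def parse_tier_status(text):
    """Parse the per-node table printed by `gluster v tier <vol> status`.

    Assumes the layout shown in this bug report:
    <node> <promoted files> <demoted files> <status words...>
    """
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        # Header, separator, and trailer lines have no numeric columns,
        # so they are skipped by the isdigit() checks.
        if len(fields) >= 4 and fields[1].isdigit() and fields[2].isdigit():
            rows[fields[0]] = {
                "promoted": int(fields[1]),
                "demoted": int(fields[2]),
                "status": " ".join(fields[3:]),
            }
    return rows


def no_migration_yet(rows):
    """True if at least one node was parsed and none has moved any file."""
    return bool(rows) and all(
        r["promoted"] == 0 and r["demoted"] == 0 for r in rows.values()
    )


# Sample copied from the output in this report.
SAMPLE = """\
Node                 Promoted files       Demoted files        Status
---------            ---------            ---------            ---------
localhost            0                    0                    in progress
yarrow               0                    0                    in progress
Tiering Migration Functionality: stress: success
"""

if __name__ == "__main__":
    rows = parse_tier_status(SAMPLE)
    for node, info in rows.items():
        print(node, info)
    print("no promotions/demotions yet:", no_migration_yet(rows))
```

In practice the text would come from running the status command on a node; the sample string is used here so the sketch can be run without a Gluster cluster.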

Comment 4 Mohammed Rafi KC 2016-02-01 05:45:05 UTC
*** Bug 1303406 has been marked as a duplicate of this bug. ***

Comment 9 Mike McCune 2016-03-28 22:48:26 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation. Please see mmccune with any questions.

Comment 10 Joseph Elwin Fernandes 2016-04-06 05:40:57 UTC
https://code.engineering.redhat.com/gerrit/#/c/71484/

Comment 13 krishnaram Karthick 2016-05-18 03:53:16 UTC
Verified the fix in build - glusterfs-3.7.9-5.el7rhgs.x86_64


1) Files are promoted immediately after the attach tier operation
2) Fix layout happens in the background
3) The trusted.tier.fix.layout.complete xattr is set once fix layout is complete

Moving the bug to verified.
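
For anyone repeating this verification, the xattr check in point 3 could be scripted roughly as below. This is only a sketch under assumptions: the helper name is made up, and the comment above does not say which path carries the xattr, so the path to pass in is left to the caller. Reading trusted.* xattrs normally requires root on the brick host.

```python
import errno
import os

# xattr name taken from the verification comment above.
TIER_FIX_LAYOUT_XATTR = "trusted.tier.fix.layout.complete"


def fix_layout_complete(path, getxattr=None):
    """Return True if the fix-layout-complete xattr is present on `path`.

    `getxattr` is injectable so the logic can be exercised without a
    real Gluster brick; by default os.getxattr (Linux only) is used.
    """
    if getxattr is None:
        getxattr = os.getxattr
    try:
        getxattr(path, TIER_FIX_LAYOUT_XATTR)
        return True
    except OSError as exc:
        # ENODATA means the attribute is not set; anything else
        # (permissions, missing path) is reported to the caller.
        if exc.errno == errno.ENODATA:
            return False
        raise
```

The function only reports whether the attribute exists; it does not interpret its value, since the report does not say what value is stored.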

Comment 17 errata-xmlrpc 2016-06-23 05:01:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240

