Description of problem:
On an 8-node gluster cluster, a distributed-replicate (2x2) hot tier is attached to a 2 x (4+2) distributed-disperse volume. A script generating a thousand 100 MB files using dd was run overnight. The data files are expected to be written to the hot tier and, since the hot tier is dist-repl, to be distributed across its two replica pairs. Instead, the files are seen on only one pair of bricks; the other pair contains only directories and no files:

Pair-1:
Brick3: 10.70.37.121:/rhs/brick4/leg1 --> 34G of data
Brick4: 10.70.37.191:/rhs/brick4/leg1 --> 34G of data

Pair-2:
Brick1: 10.70.37.111:/rhs/brick4/leg1 --> 6.0M of data
Brick2: 10.70.37.154:/rhs/brick4/leg1 --> 6.0M of data

This is not expected. However, when 100 new files were created, they were distributed equally between the two brick pairs, so the behavior is not consistent. sosreports from all nodes will be attached shortly. Please note that the gluster servers and client are on RHEL 6.7. Dev can have a look at the system.
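A minimal sketch of the overnight dd workload, assuming the tiered volume is FUSE-mounted on the client; the mount path, file naming, and the scaled-down defaults here are illustrative (the original run wrote a thousand 100 MB files):

```shell
#!/bin/sh
# Sketch of the overnight dd workload (illustrative reconstruction).
# The original run used NUM_FILES=1000 and 100 MB per file; the defaults
# here are scaled down so the sketch is cheap to dry-run.
MOUNT=${MOUNT:-./tier-mnt}   # assumed: point this at the volume's FUSE mount
NUM_FILES=${NUM_FILES:-5}
BLOCKS=${BLOCKS:-4}          # number of 1 KB blocks per file

mkdir -p "$MOUNT/dd"
i=1
while [ "$i" -le "$NUM_FILES" ]; do
    # Writes land on the hot tier first; DHT should spread the files
    # across both replica pairs of the 2x2 hot tier.
    dd if=/dev/zero of="$MOUNT/dd/file-$i" bs=1024 count="$BLOCKS" 2>/dev/null
    i=$((i + 1))
done
ls "$MOUNT/dd" | wc -l       # number of files created
```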
[root@dhcp37-191 ~]# gluster vol info

Volume Name: tier-test-vol-01
Type: Tier
Volume ID: 96318a9f-6033-4d86-9519-db89ec44246d
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.111:/rhs/brick4/leg1
Brick2: 10.70.37.154:/rhs/brick4/leg1
Brick3: 10.70.37.121:/rhs/brick4/leg1
Brick4: 10.70.37.191:/rhs/brick4/leg1
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: 10.70.37.191:/rhs/brick1/leg1
Brick6: 10.70.37.121:/rhs/brick1/leg1
Brick7: 10.70.37.154:/rhs/brick1/leg1
Brick8: 10.70.37.111:/rhs/brick1/leg1
Brick9: 10.70.37.140:/rhs/brick15/leg1
Brick10: 10.70.37.132:/rhs/brick15/leg1
Brick11: 10.70.37.48:/rhs/brick15/leg1
Brick12: 10.70.37.180:/rhs/brick15/leg1
Brick13: 10.70.37.191:/rhs/brick2/leg1
Brick14: 10.70.37.121:/rhs/brick2/leg1
Brick15: 10.70.37.154:/rhs/brick2/leg1
Brick16: 10.70.37.111:/rhs/brick2/leg1
Options Reconfigured:
cluster.tier-promote-frequency: 120
cluster.write-freq-threshold: 3
features.record-counters: on
cluster.watermark-hi: 30
cluster.watermark-low: 10
cluster.tier-demote-frequency: 7200
cluster.tier-mode: cache
features.ctr-enabled: on
performance.readdir-ahead: on

Version-Release number of selected component (if applicable):
glusterfs-server-3.7.5-9.el6rhs.x86_64

How reproducible:
We will have to try reproducing.

Steps to Reproduce:
<These are not exact steps to reproduce, but the steps that led to this issue>
1. Configure a 2 x (4+2) distributed-disperse volume.
2. Attach a dist-repl (2x2) hot tier to the volume created above.
3. Modify the configurable parameters as shown below:
   cluster.tier-promote-frequency: 120
   cluster.write-freq-threshold: 3
   cluster.watermark-hi: 30
   cluster.watermark-low: 10
   cluster.tier-demote-frequency: 7200
4. On the client, create multiple files under multiple directories so as to fill data up to the high watermark, and keep writing more files.
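The setup in steps 1-3 can be sketched with the glusterfs-3.7 CLI roughly as follows. This is a sketch only, not the exact commands used in the test: the brick list and option values are taken from the vol info output above, but the create/attach-tier syntax should be verified against the installed release before use.

```shell
# 1. Create and start the 2 x (4+2) distributed-disperse cold tier
#    (disperse 6 = 4 data + 2 redundancy per subvolume).
gluster volume create tier-test-vol-01 disperse 6 redundancy 2 \
    10.70.37.191:/rhs/brick1/leg1 10.70.37.121:/rhs/brick1/leg1 \
    10.70.37.154:/rhs/brick1/leg1 10.70.37.111:/rhs/brick1/leg1 \
    10.70.37.140:/rhs/brick15/leg1 10.70.37.132:/rhs/brick15/leg1 \
    10.70.37.48:/rhs/brick15/leg1 10.70.37.180:/rhs/brick15/leg1 \
    10.70.37.191:/rhs/brick2/leg1 10.70.37.121:/rhs/brick2/leg1 \
    10.70.37.154:/rhs/brick2/leg1 10.70.37.111:/rhs/brick2/leg1
gluster volume start tier-test-vol-01

# 2. Attach the 2x2 distributed-replicate hot tier.
gluster volume attach-tier tier-test-vol-01 replica 2 \
    10.70.37.111:/rhs/brick4/leg1 10.70.37.154:/rhs/brick4/leg1 \
    10.70.37.121:/rhs/brick4/leg1 10.70.37.191:/rhs/brick4/leg1

# 3. Tune the tiering parameters used in this test.
gluster volume set tier-test-vol-01 cluster.tier-promote-frequency 120
gluster volume set tier-test-vol-01 cluster.write-freq-threshold 3
gluster volume set tier-test-vol-01 cluster.watermark-hi 30
gluster volume set tier-test-vol-01 cluster.watermark-low 10
gluster volume set tier-test-vol-01 cluster.tier-demote-frequency 7200
```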
Actual results:
Data in the hot tier is not distributed.

Expected results:
Data in the hot tier has to be distributed.

Additional info:
sosreports can be found here --> http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1289021/
This is a reproducible issue. What I understand now is that new files created are properly distributed on the hot tier, but promoted files are not placed on one replica pair of the dist-repl hot tier. So, the steps to reproduce are:

1) Create a tiered volume with a dist-repl hot tier, as shown in the gluster vol info output below.
2) Create and write 100 files (/dd/file-{301..400}) --> the files are written on the dist-repl hot tier and distributed as expected.
3) Allow these 100 files to get demoted by writing more new files, so that the initially created 100 files are moved to the cold tier --> all 100 files (file-301..file-400) are demoted.
4) Do not create any new files, but 'heat' the already created 100 files by continuously writing to them --> only the files that hash to one set of bricks (10.70.37.191:/rhs/brick10/leg1 and 10.70.37.121:/rhs/brick10/leg1) are promoted. Files that hash to the other set of bricks (10.70.37.111:/rhs/brick10/leg1 and 10.70.37.154:/rhs/brick10/leg1) aren't promoted.

[root@dhcp37-191 ~]# gluster vol info

Volume Name: tier-volume-01
Type: Tier
Volume ID: 4ccf14ac-ecbe-4a7f-ade1-34a19a506fbb
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: 10.70.37.111:/rhs/brick10/leg1
Brick2: 10.70.37.154:/rhs/brick10/leg1
Brick3: 10.70.37.121:/rhs/brick10/leg1
Brick4: 10.70.37.191:/rhs/brick10/leg1
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: 10.70.37.191:/rhs/brick1/leg1
Brick6: 10.70.37.121:/rhs/brick1/leg1
Brick7: 10.70.37.154:/rhs/brick1/leg1
Brick8: 10.70.37.111:/rhs/brick1/leg1
Brick9: 10.70.37.140:/rhs/brick15/leg1
Brick10: 10.70.37.132:/rhs/brick15/leg1
Brick11: 10.70.37.48:/rhs/brick15/leg1
Brick12: 10.70.37.180:/rhs/brick15/leg1
Brick13: 10.70.37.191:/rhs/brick2/leg1
Brick14: 10.70.37.121:/rhs/brick2/leg1
Brick15: 10.70.37.154:/rhs/brick2/leg1
Brick16: 10.70.37.111:/rhs/brick2/leg1
Options Reconfigured:
cluster.tier-demote-frequency: 3600
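Step 4 above ('heating' the demoted files) can be sketched as repeated appends to each file, enough to exceed cluster.write-freq-threshold (3) within one promote cycle. This is an illustrative reconstruction; the mount path and the reduced file range are assumptions:

```shell
#!/bin/sh
# Re-heat already-demoted files by writing to each of them more than
# cluster.write-freq-threshold (3) times, making them promotion candidates.
MOUNT=${MOUNT:-./tier-mnt}   # assumed: point this at the volume's FUSE mount
mkdir -p "$MOUNT/dd"
for n in 301 302 303; do     # the original test heated file-301..file-400
    f="$MOUNT/dd/file-$n"
    : > "$f"                 # on a real volume the demoted files already exist
    pass=1
    while [ "$pass" -le 4 ]; do   # 4 writes per file > threshold of 3
        printf 'heat pass %d\n' "$pass" >> "$f"
        pass=$((pass + 1))
    done
done
wc -l "$MOUNT/dd/file-301"   # 4 lines written per heated file
```

After the next promote cycle has elapsed, the heated files should all be promotion candidates regardless of which hot-tier subvolume they hash to.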
cluster.watermark-hi: 30
cluster.watermark-low: 10
cluster.tier-mode: cache
features.ctr-enabled: on
performance.readdir-ahead: on
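To see which replica pair a given file was promoted to, the hot-tier bricks can be inspected directly. A sketch, assuming passwordless ssh to the server nodes and a client mount at /mnt/tier-volume-01 (both assumptions):

```shell
# Compare the data held by each hot-tier brick (run from any node
# with ssh access to the servers).
for h in 10.70.37.111 10.70.37.154 10.70.37.121 10.70.37.191; do
    printf '%s: ' "$h"
    ssh "$h" du -sh /rhs/brick10/leg1
done

# From the client mount, report which brick(s) back a particular file.
getfattr -n trusted.glusterfs.pathinfo /mnt/tier-volume-01/dd/file-301
```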