Description of problem:
======================
When record counters are enabled and a file in the cold tier is accessed more often than the threshold, the file is promoted as expected but is then demoted again almost immediately. This wastes I/O and breaks tiering functionality.

Version-Release number of selected component (if applicable):
glusterfs-server-3.7.5-6.el7rhgs.x86_64

How reproducible:
===================
Easy

Steps to Reproduce:
1. Create a tiered volume that is spread across a 2-node cluster.
2. Turn on features.record-counters.
3. Set the thresholds (zero by default) to some value, say 3, as below:
   cluster.write-freq-threshold: 3
   cluster.read-freq-threshold: 3
4. Create a file; it is placed in the hot tier.
5. Wait for the file to be demoted.
6. Once it is demoted, append one line to it through echo, 4 times. The db shows the counter being recorded.
7. Note down the gluster v tier <vol> status and gluster v rebal <vol> status output.
8. Keep checking the db continuously (otherwise you will miss the promote and demote, as the window is small). In the next cycle the file is promoted and then immediately demoted.
9. Recheck the tier and rebal status; they show one more promote and one more demote.

Actual results:
The file is promoted and then immediately demoted, meaning that a file ends up on the cold tier even under continuous access. Hence tiering functionality is broken.

NOTE: There is a microsecond difference in time between the two nodes.
I suspect this is the reason, as one brick of an AFR pair might lag in promotion.

CLI Logs:
=========
[root@zod distrep]# gluster v info count

Volume Name: count
Type: Tier
Volume ID: 8475b2fe-db84-473a-832e-947034cf78a0
Status: Started
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/rhs/brick7/count_hot
Brick2: zod:/rhs/brick7/count_hot
Brick3: yarrow:/rhs/brick6/count_hot
Brick4: zod:/rhs/brick6/count_hot
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick5: zod:/rhs/brick1/count
Brick6: yarrow:/rhs/brick1/count
Brick7: zod:/rhs/brick2/count
Brick8: yarrow:/rhs/brick2/count
Options Reconfigured:
diagnostics.brick-log-level: TRACE
cluster.write-freq-threshold: 3
cluster.read-freq-threshold: 3
features.record-counters: off
features.ctr-enabled: on
performance.readdir-ahead: on

[root@zod distrep]# gluster v tier count status;gluster v rebal count status
Node         Promoted files   Demoted files   Status
---------    ---------        ---------       ---------
localhost    12               0               in progress
yarrow       0                22              in progress
Tiering Migration Functionality: count: success
Node         Rebalanced-files   size     scanned   failures   skipped   status        run time in secs
---------    -----------        ------   -------   --------   -------   -----------   ----------------
localhost    12                 0Bytes   15        0          0         in progress   7310.00
yarrow       22                 0Bytes   26        0          0         in progress   7298.00
volume rebalance: count: success

[root@zod distrep]# gluster v tier count status;gluster v rebal count status
Node         Promoted files   Demoted files   Status
---------    ---------        ---------       ---------
localhost    13               0               in progress
yarrow       0                23              in progress
Tiering Migration Functionality: count: success
Node         Rebalanced-files   size     scanned   failures   skipped   status        run time in secs
---------    -----------        ------   -------   --------   -------   -----------   ----------------
localhost    13                 0Bytes   16        0          0         in progress   7463.00
yarrow       23                 0Bytes   27        0          0         in progress   7451.00
volume rebalance: count: success

[root@zod distrep]# gluster v tier count status;gluster v rebal count status
Node         Promoted files   Demoted files   Status
---------    ---------        ---------       ---------
localhost    13               0               in progress
yarrow       0                24              in progress
Tiering Migration Functionality: count: success
Node         Rebalanced-files   size     scanned   failures   skipped   status        run time in secs
---------    -----------        ------   -------   --------   -------   -----------   ----------------
localhost    13                 0Bytes   17        0          0         in progress   7516.00
yarrow       24                 0Bytes   28        0          0         in progress   7504.00
volume rebalance: count: success

[root@zod distrep]# gluster v tier count status;gluster v rebal count status
Node         Promoted files   Demoted files   Status
---------    ---------        ---------       ---------
localhost    13               0               in progress
yarrow       0                24              in progress
Tiering Migration Functionality: count: success
Node         Rebalanced-files   size     scanned   failures   skipped   status        run time in secs
---------    -----------        ------   -------   --------   -------   -----------   ----------------
localhost    13                 0Bytes   17        0          0         in progress   7614.00
yarrow       24                 0Bytes   28        0          0         in progress   7602.00
volume rebalance: count: success

[root@zod distrep]# gluster v info count

Volume Name: count
Type: Tier
Volume ID: 8475b2fe-db84-473a-832e-947034cf78a0
Status: Started
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/rhs/brick7/count_hot
Brick2: zod:/rhs/brick7/count_hot
Brick3: yarrow:/rhs/brick6/count_hot
Brick4: zod:/rhs/brick6/count_hot
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick5: zod:/rhs/brick1/count
Brick6: yarrow:/rhs/brick1/count
Brick7: zod:/rhs/brick2/count
Brick8: yarrow:/rhs/brick2/count
Options Reconfigured:
diagnostics.brick-log-level: TRACE
cluster.write-freq-threshold: 3
cluster.read-freq-threshold: 3
features.record-counters: off
features.ctr-enabled: on
performance.readdir-ahead: on
[root@zod distrep]#

Brick logs attached.
Created attachment 1097271: brick logs of both nodes, showing the file being promoted and immediately demoted, including the recording of counters
The vol info output you have pasted shows features.record-counters: off !
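A quick way to check what a pasted `gluster v info` output says about a single option is to filter it. The sketch below is only an illustration: `volinfo_option` is a hypothetical helper (not a gluster command) that reads volume info text on stdin and prints the value of the named option, assuming the `key: value` layout seen in the logs above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of gluster): print the value of one option
# from `gluster volume info` style text supplied on stdin.
# Exits non-zero when the option does not appear in the input.
volinfo_option() {
    local key="$1"
    awk -v key="$key" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)          # tolerate leading whitespace
            n = index(line, ": ")             # split at the first ": "
            if (n && substr(line, 1, n - 1) == key) {
                print substr(line, n + 2)
                found = 1
            }
        }
        END { exit !found }'
}

# Usage (assumes a volume named "count", as in this report):
#   gluster v info count | volinfo_option features.record-counters
```

If this prints `off`, the counters are not being recorded for that volume, regardless of the threshold settings.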
You can look into following log: [root@zod ~]# echo "===========Date=====================";date; echo "=============ColdBrick#1 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick1/mobi/.glusterfs/mobi.db; echo "=============ColdBrick#2 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick2/mobi/.glusterfs/mobi.db; echo ">>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick7/mobi_hot/.glusterfs/mobi_hot.db;echo ">>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick6/mobi_hot/.glusterfs/mobi_hot.db ===========Date===================== Mon Nov 23 11:07:56 IST 2015 =============ColdBrick#1 ========= 3aca3400-b6b1-495d-a433-1c094762358b|1448257054|993447|0|0|0|0|0|0|5|0 3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0 =============ColdBrick#2 ========= da996a66-5ebe-44f8-8482-5cb5eea52c97|0|0|0|0|0|0|0|0|0|0 bf2accc6-035b-4af5-95eb-06e6c8106263|0|0|0|0|0|0|0|0|1|1 f697a256-bf6d-45d5-b483-4711c7515e18|0|0|0|0|0|0|0|0|1|1 da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0 bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0 f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0 >>>>>>>>>>>> HOTBRICK#1 <<<<<<<<== >>>>>>>>>>>> HOTBRICK#2 <<<<<<<<== [root@zod ~]# echo "===========Date=====================";date; echo "=============ColdBrick#1 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick1/mobi/.glusterfs/mobi.db; echo "=============ColdBrick#2 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick2/mobi/.glusterfs/mobi.db; echo ">>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 
/rhs/brick7/mobi_hot/.glusterfs/mobi_hot.db;echo ">>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick6/mobi_hot/.glusterfs/mobi_hot.db ===========Date===================== Mon Nov 23 11:07:57 IST 2015 =============ColdBrick#1 ========= 3aca3400-b6b1-495d-a433-1c094762358b|1448257054|993447|0|0|0|0|0|0|5|0 3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0 =============ColdBrick#2 ========= da996a66-5ebe-44f8-8482-5cb5eea52c97|0|0|0|0|0|0|0|0|0|0 bf2accc6-035b-4af5-95eb-06e6c8106263|0|0|0|0|0|0|0|0|1|1 f697a256-bf6d-45d5-b483-4711c7515e18|0|0|0|0|0|0|0|0|1|1 da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0 bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0 f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0 >>>>>>>>>>>> HOTBRICK#1 <<<<<<<<== >>>>>>>>>>>> HOTBRICK#2 <<<<<<<<== [root@zod ~]# echo "===========Date=====================";date; echo "=============ColdBrick#1 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick1/mobi/.glusterfs/mobi.db; echo "=============ColdBrick#2 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick2/mobi/.glusterfs/mobi.db; echo ">>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick7/mobi_hot/.glusterfs/mobi_hot.db;echo ">>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick6/mobi_hot/.glusterfs/mobi_hot.db ;ll /rhs/brick*/mobi* ===========Date===================== Mon Nov 23 11:08:10 IST 2015 =============ColdBrick#1 ========= 3aca3400-b6b1-495d-a433-1c094762358b|0|0|0|0|0|0|0|0|1|1 3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0 =============ColdBrick#2 ========= 
da996a66-5ebe-44f8-8482-5cb5eea52c97|0|0|0|0|0|0|0|0|0|0 bf2accc6-035b-4af5-95eb-06e6c8106263|0|0|0|0|0|0|0|0|0|0 f697a256-bf6d-45d5-b483-4711c7515e18|0|0|0|0|0|0|0|0|0|0 da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0 bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0 f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0 >>>>>>>>>>>> HOTBRICK#1 <<<<<<<<== >>>>>>>>>>>> HOTBRICK#2 <<<<<<<<== /rhs/brick1/mobi: total 4 -rw-r--r--. 2 root root 541 Nov 23 11:07 c2 /rhs/brick2/mobi: total 8 -rw-r--r--. 2 root root 541 Nov 23 11:04 c1 -rw-r--r--. 2 root root 46 Nov 23 11:04 h1 -rw-r--r--. 2 root root 0 Nov 23 2015 h2 /rhs/brick6/mobi_hot: total 0 /rhs/brick7/mobi_hot: total 0 [root@zod ~]# [root@zod ~]# echo "===========Date=====================";date; echo "=============ColdBrick#1 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick1/mobi/.glusterfs/mobi.db; echo "=============ColdBrick#2 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick2/mobi/.glusterfs/mobi.db; echo ">>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick7/mobi_hot/.glusterfs/mobi_hot.db;echo ">>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick6/mobi_hot/.glusterfs/mobi_hot.db ;ll /rhs/brick*/mobi* ===========Date===================== Mon Nov 23 11:08:15 IST 2015 =============ColdBrick#1 ========= 3aca3400-b6b1-495d-a433-1c094762358b|0|0|0|0|0|0|0|0|1|1 3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0 =============ColdBrick#2 ========= da996a66-5ebe-44f8-8482-5cb5eea52c97|0|0|0|0|0|0|0|0|0|0 bf2accc6-035b-4af5-95eb-06e6c8106263|0|0|0|0|0|0|0|0|0|0 f697a256-bf6d-45d5-b483-4711c7515e18|0|0|0|0|0|0|0|0|0|0 
da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0 bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0 f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0 >>>>>>>>>>>> HOTBRICK#1 <<<<<<<<== >>>>>>>>>>>> HOTBRICK#2 <<<<<<<<== /rhs/brick1/mobi: total 4 -rw-r--r--. 2 root root 541 Nov 23 11:07 c2 /rhs/brick2/mobi: total 8 -rw-r--r--. 2 root root 541 Nov 23 11:04 c1 -rw-r--r--. 2 root root 46 Nov 23 11:04 h1 -rw-r--r--. 2 root root 0 Nov 23 2015 h2 /rhs/brick6/mobi_hot: total 0 /rhs/brick7/mobi_hot: total 0 [root@zod ~]# date Mon Nov 23 11:08:26 IST 2015 [root@zod ~]# [root@zod ~]# gluster v tier mobi status;gluster v rebal mobi status Node Promoted files Demoted files Status --------- --------- --------- --------- localhost 1 0 in progress yarrow 0 3 in progress Tiering Migration Functionality: mobi: success Node Rebalanced-files size scanned failures skipped status run time in secs --------- ----------- ----------- ----------- ----------- ----------- ------------ -------------- localhost 1 0Bytes 3 0 0 in progress 466.00 yarrow 3 0Bytes 3 0 0 in progress 463.00 volume rebalance: mobi: success [root@zod ~]# [root@zod ~]# [root@zod ~]# gluster v tier mobi status;gluster v rebal mobi status Node Promoted files Demoted files Status --------- --------- --------- --------- localhost 2 0 in progress yarrow 0 4 in progress Tiering Migration Functionality: mobi: success Node Rebalanced-files size scanned failures skipped status run time in secs --------- ----------- ----------- ----------- ----------- ----------- ------------ -------------- localhost 2 0Bytes 4 0 0 in progress 497.00 yarrow 4 0Bytes 4 0 0 in progress 494.00 volume rebalance: mobi: success [root@zod ~]# #gluster v get mobi all|grep thres [root@zod ~]# gluster v tier mobi status;gluster v rebal mobi status Node Promoted files Demoted files Status --------- --------- --------- --------- localhost 2 0 in progress 
yarrow 0 4 in progress Tiering Migration Functionality: mobi: success Node Rebalanced-files size scanned failures skipped status run time in secs --------- ----------- ----------- ----------- ----------- ----------- ------------ -------------- localhost 2 0Bytes 4 0 0 in progress 563.00 yarrow 4 0Bytes 4 0 0 in progress 560.00 volume rebalance: mobi: success [root@zod ~]# gluster v info mobi Volume Name: mobi Type: Tier Volume ID: b2493bb9-08ee-4b18-b78c-04aeabe377c2 Status: Started Number of Bricks: 8 Transport-type: tcp Hot Tier : Hot Tier Type : Distributed-Replicate Number of Bricks: 2 x 2 = 4 Brick1: yarrow:/rhs/brick7/mobi_hot Brick2: zod:/rhs/brick7/mobi_hot Brick3: yarrow:/rhs/brick6/mobi_hot Brick4: zod:/rhs/brick6/mobi_hot Cold Tier: Cold Tier Type : Distributed-Replicate Number of Bricks: 2 x 2 = 4 Brick5: zod:/rhs/brick1/mobi Brick6: yarrow:/rhs/brick1/mobi Brick7: zod:/rhs/brick2/mobi Brick8: yarrow:/rhs/brick2/mobi Options Reconfigured: cluster.read-freq-threshold: 4 cluster.write-freq-threshold: 4 features.record-counters: on features.ctr-enabled: on performance.readdir-ahead: on
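The raw `select *` dumps above are hard to scan. As a minimal sketch, the hypothetical helper below (`ctr_counters`, not part of gluster) reduces the dump to one line per file. It assumes that `gf_file_tb` rows have 11 `|`-separated fields with the write counter second to last and the read counter last; the write-counter position matches what the logs in this report show, while the read-counter position is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read sqlite3 dump lines on stdin and print
# "<gfid> <write_counter> <read_counter>" for each gf_file_tb row.
# Rows with a different field count (for example the 6-field gf_flink_tb
# rows) are skipped.
ctr_counters() {
    awk -F'|' 'NF == 11 { print $1, $10, $11 }'
}

# Usage (database path taken from the commands in this report):
#   echo "select * from gf_file_tb;" \
#       | sqlite3 /rhs/brick1/mobi/.glusterfs/mobi.db | ctr_counters
```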
I tested the problem reported in this bug on the same setup and the same volume that Nag Pavan had used previously, and found the counters to be working. This is what I did.

gluster volume info output:

Volume Name: mobi
Type: Tier
Volume ID: b2493bb9-08ee-4b18-b78c-04aeabe377c2
Status: Started
Number of Bricks: 8
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick1: yarrow:/rhs/brick7/mobi_hot
Brick2: zod:/rhs/brick7/mobi_hot
Brick3: yarrow:/rhs/brick6/mobi_hot
Brick4: zod:/rhs/brick6/mobi_hot
Cold Tier:
Cold Tier Type : Distributed-Replicate
Number of Bricks: 2 x 2 = 4
Brick5: zod:/rhs/brick1/mobi
Brick6: yarrow:/rhs/brick1/mobi
Brick7: zod:/rhs/brick2/mobi
Brick8: yarrow:/rhs/brick2/mobi
Options Reconfigured:
cluster.read-freq-threshold: 4
cluster.write-freq-threshold: 4
features.record-counters: on
features.ctr-enabled: on
performance.readdir-ahead: on

I had 5 files created and demoted to the cold tier: h1, h2, c1, c2 and c3.

This is what the bricks look like:

/rhs/brick1/mobi:
total 8.0K
-rw-r--r--. 2 root root 2.8K Nov 23 17:04 c2
-rw-r--r--. 2 root root 2.8K Nov 23 17:04 c3

/rhs/brick2/mobi:
total 16K
-rw-r--r--. 2 root root 2.8K Nov 23 17:04 c1
-rw-r--r--. 2 root root 2.8K Nov 23 17:04 h1
-rw-r--r--. 2 root root 2.8K Nov 23 17:04 h2
-rw-r--r--. 2 root root   30 Nov 23 17:32 newf

/rhs/brick6/mobi_hot:
total 0
---------T. 2 root root 0 Nov 23 17:30 c3
---------T. 2 root root 0 Nov 23 19:39 newf

/rhs/brick7/mobi_hot:
total 0

And this is what the db entries look like:

===========Date=====================
Mon Nov 23 19:47:42 IST 2015
=============ColdBrick#1 =========
3aca3400-b6b1-495d-a433-1c094762358b|0|0|0|0|0|0|0|0|0|0
cd9d9c86-cd8e-46c9-98d6-9b21e9f97c4c|0|0|0|0|0|0|0|0|0|0
3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0
cd9d9c86-cd8e-46c9-98d6-9b21e9f97c4c|00000000-0000-0000-0000-000000000001|c3|/c3|0|0
=============ColdBrick#2 =========
bf2accc6-035b-4af5-95eb-06e6c8106263|0|0|0|0|0|0|0|0|0|0
da996a66-5ebe-44f8-8482-5cb5eea52c97|0|0|0|0|0|0|0|0|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|0|0|0|0|0|0|0|0|0|0
4c169529-b24a-4f5c-9c91-05c4ed102a70|0|0|0|0|0|0|0|0|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0
da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0
4c169529-b24a-4f5c-9c91-05c4ed102a70|00000000-0000-0000-0000-000000000001|newf|/newf|0|0
>>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==
>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==

Now let's heat the files with 4 writes each so that they become candidates for promotion:

echo "heelo world" >> h1
echo "heelo world" >> h1
echo "heelo world" >> h1
echo "heelo world" >> h1
echo "heelo world" >> h2
echo "heelo world" >> h2
echo "heelo world" >> h2
echo "heelo world" >> h2
echo "heelo world" >> c1
echo "heelo world" >> c1
echo "heelo world" >> c1
echo "heelo world" >> c1
echo "heelo world" >> c2
echo "heelo world" >> c2
echo "heelo world" >> c2
echo "heelo world" >> c2
echo "heelo world" >> c3
echo "heelo world" >> c3
echo "heelo world" >> c3
echo "heelo world" >> c3

And this is what the db entries look like with the heated files; notice that the write heat counters have the value 4:

===========Date=====================
Mon Nov 23 19:48:25 IST 2015
=============ColdBrick#1 =========
3aca3400-b6b1-495d-a433-1c094762358b|1448288301|774919|0|0|0|0|0|0|4|0
cd9d9c86-cd8e-46c9-98d6-9b21e9f97c4c|1448288301|783661|0|0|0|0|0|0|4|0
3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0
cd9d9c86-cd8e-46c9-98d6-9b21e9f97c4c|00000000-0000-0000-0000-000000000001|c3|/c3|0|0
=============ColdBrick#2 =========
bf2accc6-035b-4af5-95eb-06e6c8106263|1448288301|765976|0|0|0|0|0|0|4|0
da996a66-5ebe-44f8-8482-5cb5eea52c97|1448288301|756658|0|0|0|0|0|0|4|0
f697a256-bf6d-45d5-b483-4711c7515e18|1448288301|746519|0|0|0|0|0|0|4|0
4c169529-b24a-4f5c-9c91-05c4ed102a70|0|0|0|0|0|0|0|0|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0
da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0
4c169529-b24a-4f5c-9c91-05c4ed102a70|00000000-0000-0000-0000-000000000001|newf|/newf|0|0
>>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==
>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==

The files get promoted in the next promotion cycle, and this is what the bricks look like:

/rhs/brick1/mobi:
total 0

/rhs/brick2/mobi:
total 4.0K
-rw-r--r--. 2 root root 30 Nov 23 17:32 newf

/rhs/brick6/mobi_hot:
total 16K
-rw-r--r--. 2 root root 2.9K Nov 23 19:48 c1
-rw-r--r--. 2 root root 2.9K Nov 23 19:48 c3
-rw-r--r--. 2 root root 2.9K Nov 23 19:48 h1
-rw-r--r--. 2 root root 2.9K Nov 23 19:48 h2
---------T. 2 root root    0 Nov 23 19:39 newf

/rhs/brick7/mobi_hot:
total 4.0K
-rw-r--r--. 2 root root 2.9K Nov 23 19:48 c2

Observe that all the heated files are promoted.
And this is what the db entries look like:

===========Date=====================
Mon Nov 23 19:50:23 IST 2015
=============ColdBrick#1 =========
=============ColdBrick#2 =========
4c169529-b24a-4f5c-9c91-05c4ed102a70|0|0|0|0|0|0|0|0|0|0
4c169529-b24a-4f5c-9c91-05c4ed102a70|00000000-0000-0000-0000-000000000001|newf|/newf|0|0
>>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==
3aca3400-b6b1-495d-a433-1c094762358b|1448288400|105735|0|0|0|0|0|0|1|1
3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0
>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==
cd9d9c86-cd8e-46c9-98d6-9b21e9f97c4c|1448288400|145780|0|0|0|0|0|0|1|1
bf2accc6-035b-4af5-95eb-06e6c8106263|1448288400|184285|0|0|0|0|0|0|1|1
da996a66-5ebe-44f8-8482-5cb5eea52c97|1448288400|222641|0|0|0|0|0|0|1|1
f697a256-bf6d-45d5-b483-4711c7515e18|1448288400|259994|0|0|0|0|0|0|1|1
cd9d9c86-cd8e-46c9-98d6-9b21e9f97c4c|00000000-0000-0000-0000-000000000001|c3|/c3|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0
da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0

Now let's heat the files twice to see if they stay on the hot tier, using:

echo "heelo world" >> h1
echo "heelo world" >> h1
echo "heelo world" >> h1
echo "heelo world" >> h1
echo "heelo world" >> h2
echo "heelo world" >> h2
echo "heelo world" >> h2
echo "heelo world" >> h2
echo "heelo world" >> c1
echo "heelo world" >> c1
echo "heelo world" >> c1
echo "heelo world" >> c1
echo "heelo world" >> c2
echo "heelo world" >> c2
echo "heelo world" >> c2
echo "heelo world" >> c2
echo "heelo world" >> c3
echo "heelo world" >> c3
echo "heelo world" >> c3
echo "heelo world" >> c3

And this is what the db entries look like after the files are heated two times:

===========Date=====================
Mon Nov 23 19:50:45 IST 2015
=============ColdBrick#1 =========
=============ColdBrick#2 =========
4c169529-b24a-4f5c-9c91-05c4ed102a70|0|0|0|0|0|0|0|0|0|0
4c169529-b24a-4f5c-9c91-05c4ed102a70|00000000-0000-0000-0000-000000000001|newf|/newf|0|0
>>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==
3aca3400-b6b1-495d-a433-1c094762358b|1448288440|424985|0|0|0|0|0|0|9|1
3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0
>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==
cd9d9c86-cd8e-46c9-98d6-9b21e9f97c4c|1448288440|432357|0|0|0|0|0|0|9|1
bf2accc6-035b-4af5-95eb-06e6c8106263|1448288440|417364|0|0|0|0|0|0|9|1
da996a66-5ebe-44f8-8482-5cb5eea52c97|1448288440|409610|0|0|0|0|0|0|9|1
f697a256-bf6d-45d5-b483-4711c7515e18|1448288440|401995|0|0|0|0|0|0|9|1
cd9d9c86-cd8e-46c9-98d6-9b21e9f97c4c|00000000-0000-0000-0000-000000000001|c3|/c3|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0
da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0

We wait for the next demotion cycle and expect the files to stay on the hot tier with their write counters cleared. This is what the db entries look like:

===========Date=====================
Mon Nov 23 19:52:04 IST 2015
=============ColdBrick#1 =========
=============ColdBrick#2 =========
4c169529-b24a-4f5c-9c91-05c4ed102a70|0|0|0|0|0|0|0|0|0|0
4c169529-b24a-4f5c-9c91-05c4ed102a70|00000000-0000-0000-0000-000000000001|newf|/newf|0|0
>>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==
3aca3400-b6b1-495d-a433-1c094762358b|1448288440|424985|0|0|0|0|0|0|0|0
3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0
>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==
cd9d9c86-cd8e-46c9-98d6-9b21e9f97c4c|1448288440|432357|0|0|0|0|0|0|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|1448288440|417364|0|0|0|0|0|0|0|0
da996a66-5ebe-44f8-8482-5cb5eea52c97|1448288440|409610|0|0|0|0|0|0|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|1448288440|401995|0|0|0|0|0|0|0|0
cd9d9c86-cd8e-46c9-98d6-9b21e9f97c4c|00000000-0000-0000-0000-000000000001|c3|/c3|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0
da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0

To confirm that the files are still on the hot tier, let's have a look at the bricks:

Every 1.0s: ./mobi_ls.sh                                Mon Nov 23 19:52:16 2015

/rhs/brick1/mobi:
total 0

/rhs/brick2/mobi:
total 4.0K
-rw-r--r--. 2 root root 30 Nov 23 17:32 newf

/rhs/brick6/mobi_hot:
total 16K
-rw-r--r--. 2 root root 3.0K Nov 23 19:50 c1
-rw-r--r--. 2 root root 3.0K Nov 23 19:50 c3
-rw-r--r--. 2 root root 3.0K Nov 23 19:50 h1
-rw-r--r--. 2 root root 3.0K Nov 23 19:50 h2
---------T. 2 root root    0 Nov 23 19:39 newf

/rhs/brick7/mobi_hot:
total 4.0K
-rw-r--r--. 2 root root 3.0K Nov 23 19:50 c2

We don't heat the files any more and expect them to be demoted in the coming demotion cycle. This is what the db entries and bricks look like after the demotion:

===========Date=====================
Mon Nov 23 19:54:02 IST 2015
=============ColdBrick#1 =========
3aca3400-b6b1-495d-a433-1c094762358b|0|0|0|0|0|0|0|0|0|0
cd9d9c86-cd8e-46c9-98d6-9b21e9f97c4c|0|0|0|0|0|0|0|0|0|0
3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0
cd9d9c86-cd8e-46c9-98d6-9b21e9f97c4c|00000000-0000-0000-0000-000000000001|c3|/c3|0|0
=============ColdBrick#2 =========
4c169529-b24a-4f5c-9c91-05c4ed102a70|0|0|0|0|0|0|0|0|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|0|0|0|0|0|0|0|0|0|0
da996a66-5ebe-44f8-8482-5cb5eea52c97|0|0|0|0|0|0|0|0|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|0|0|0|0|0|0|0|0|0|0
4c169529-b24a-4f5c-9c91-05c4ed102a70|00000000-0000-0000-0000-000000000001|newf|/newf|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0
da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0
>>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==
>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==

Every 1.0s: ./mobi_ls.sh                                Mon Nov 23 19:54:09 2015

/rhs/brick1/mobi:
total 8.0K
-rw-r--r--. 2 root root 3.0K Nov 23 19:50 c2
-rw-r--r--. 2 root root 3.0K Nov 23 19:50 c3

/rhs/brick2/mobi:
total 16K
-rw-r--r--. 2 root root 3.0K Nov 23 19:50 c1
-rw-r--r--. 2 root root 3.0K Nov 23 19:50 h1
-rw-r--r--. 2 root root 3.0K Nov 23 19:50 h2
-rw-r--r--. 2 root root   30 Nov 23 17:32 newf

/rhs/brick6/mobi_hot:
total 0
---------T. 2 root root 0 Nov 23 19:39 newf

/rhs/brick7/mobi_hot:
total 0

So we saw that the counters are working as expected.
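The walkthrough above compares the counter column against the configured thresholds by eye. The sketch below shows one way to automate that comparison. It is a simplified model, not the tier translator's actual query: `promotion_candidates` is a hypothetical helper, and it assumes that a file qualifies when its write counter or its read counter reaches the corresponding threshold (consistent with the run above, where 4 writes against a threshold of 4 led to promotion), with the counters in the last two of 11 `|`-separated fields.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list gfids from a gf_file_tb dump (stdin) whose
# counters meet the given thresholds.
#   $1 = write frequency threshold, $2 = read frequency threshold
# Assumed (simplified) rule: write_counter >= $1 OR read_counter >= $2.
promotion_candidates() {
    local wthr="$1" rthr="$2"
    awk -F'|' -v w="$wthr" -v r="$rthr" '
        NF == 11 && ($10 + 0 >= w + 0 || $11 + 0 >= r + 0) { print $1 }'
}

# Usage (database path taken from this report, thresholds of 4):
#   echo "select * from gf_file_tb;" \
#       | sqlite3 /rhs/brick2/mobi/.glusterfs/mobi.db | promotion_candidates 4 4
```

Running the same filter against a hot-brick database right after promotion would list nothing for rows such as `...|1|1`, since both counters are below 4 at that point in the logs above.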
Following was what i had seen VOl details [root@zod ~]# gluster v info mobi Volume Name: mobi Type: Tier Volume ID: b2493bb9-08ee-4b18-b78c-04aeabe377c2 Status: Started Number of Bricks: 8 Transport-type: tcp Hot Tier : Hot Tier Type : Distributed-Replicate Number of Bricks: 2 x 2 = 4 Brick1: yarrow:/rhs/brick7/mobi_hot Brick2: zod:/rhs/brick7/mobi_hot Brick3: yarrow:/rhs/brick6/mobi_hot Brick4: zod:/rhs/brick6/mobi_hot Cold Tier: Cold Tier Type : Distributed-Replicate Number of Bricks: 2 x 2 = 4 Brick5: zod:/rhs/brick1/mobi Brick6: yarrow:/rhs/brick1/mobi Brick7: zod:/rhs/brick2/mobi Brick8: yarrow:/rhs/brick2/mobi Options Reconfigured: cluster.read-freq-threshold: 4 cluster.write-freq-threshold: 4 features.record-counters: on features.ctr-enabled: on performance.readdir-ahead: on ===========> tier stats readings before the cycle of promote is started [root@zod ~]# gluster v tier mobi status;gluster v rebal mobi status Node Promoted files Demoted files Status --------- --------- --------- --------- localhost 3 0 in progress yarrow 0 5 in progress Tiering Migration Functionality: mobi: success Node Rebalanced-files size scanned failures skipped status run time in secs --------- ----------- ----------- ----------- ----------- ----------- ------------ -------------- localhost 3 0Bytes 5 0 0 in progress 764.00 yarrow 5 0Bytes 5 0 0 in progress 761.00 volume rebalance: mobi: success [root@zod ~]# [root@zod ~]# [root@zod ~]# ############################################ [root@zod ~]# date Mon Nov 23 11:07:44 IST 2015 [root@zod ~]# ll /rhs/brick*/mobi* /rhs/brick1/mobi: total 4 -rw-r--r--. 2 root root 541 Nov 23 11:07 c2 /rhs/brick2/mobi: total 8 -rw-r--r--. 2 root root 541 Nov 23 11:04 c1 -rw-r--r--. 2 root root 46 Nov 23 11:04 h1 -rw-r--r--. 2 root root 0 Nov 23 2015 h2 /rhs/brick6/mobi_hot: total 0 /rhs/brick7/mobi_hot: total 0 ---------T. 2 root root 0 Nov 23 11:02 c2 =========>Heating file "c2" by appending files with cat command. 
Did that 5 times and can be seen in db

[root@zod ~]# echo "===========Date=====================";date; echo "=============ColdBrick#1 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick1/mobi/.glusterfs/mobi.db; echo "=============ColdBrick#2 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick2/mobi/.glusterfs/mobi.db; echo ">>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick7/mobi_hot/.glusterfs/mobi_hot.db;echo ">>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick6/mobi_hot/.glusterfs/mobi_hot.db
===========Date=====================
Mon Nov 23 11:07:50 IST 2015
=============ColdBrick#1 =========
3aca3400-b6b1-495d-a433-1c094762358b|1448257054|993447|0|0|0|0|0|0|5|0
3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0
=============ColdBrick#2 =========
da996a66-5ebe-44f8-8482-5cb5eea52c97|0|0|0|0|0|0|0|0|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|0|0|0|0|0|0|0|0|1|1
f697a256-bf6d-45d5-b483-4711c7515e18|0|0|0|0|0|0|0|0|1|1
da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0
>>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==
>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==
[root@zod ~]#
[root@zod ~]#
[root@zod ~]# echo "===========Date=====================";date; echo "=============ColdBrick#1 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick1/mobi/.glusterfs/mobi.db; echo "=============ColdBrick#2 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick2/mobi/.glusterfs/mobi.db; echo ">>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick7/mobi_hot/.glusterfs/mobi_hot.db;echo ">>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick6/mobi_hot/.glusterfs/mobi_hot.db
===========Date=====================
Mon Nov 23 11:07:56 IST 2015
=============ColdBrick#1 =========
3aca3400-b6b1-495d-a433-1c094762358b|1448257054|993447|0|0|0|0|0|0|5|0
3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0
=============ColdBrick#2 =========
da996a66-5ebe-44f8-8482-5cb5eea52c97|0|0|0|0|0|0|0|0|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|0|0|0|0|0|0|0|0|1|1
f697a256-bf6d-45d5-b483-4711c7515e18|0|0|0|0|0|0|0|0|1|1
da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0
>>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==
>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==
[root@zod ~]# echo "===========Date=====================";date; echo "=============ColdBrick#1 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick1/mobi/.glusterfs/mobi.db; echo "=============ColdBrick#2 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick2/mobi/.glusterfs/mobi.db; echo ">>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick7/mobi_hot/.glusterfs/mobi_hot.db;echo ">>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick6/mobi_hot/.glusterfs/mobi_hot.db
===========Date=====================
Mon Nov 23 11:07:57 IST 2015
=============ColdBrick#1 =========
3aca3400-b6b1-495d-a433-1c094762358b|1448257054|993447|0|0|0|0|0|0|5|0
3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0
=============ColdBrick#2 =========
da996a66-5ebe-44f8-8482-5cb5eea52c97|0|0|0|0|0|0|0|0|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|0|0|0|0|0|0|0|0|1|1
f697a256-bf6d-45d5-b483-4711c7515e18|0|0|0|0|0|0|0|0|1|1
da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0
>>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==
>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==
[root@zod ~]# echo "===========Date=====================";date; echo "=============ColdBrick#1 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick1/mobi/.glusterfs/mobi.db; echo "=============ColdBrick#2 =========" ; echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick2/mobi/.glusterfs/mobi.db; echo ">>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick7/mobi_hot/.glusterfs/mobi_hot.db;echo ">>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==";echo "select * from gf_file_tb; select * from gf_flink_tb;" | sqlite3 /rhs/brick6/mobi_hot/.glusterfs/mobi_hot.db ;ll /rhs/brick*/mobi*

============>NEW CYCLE STARTED AND FILE IS MOVED BACK TO COLD:

===========Date=====================
Mon Nov 23 11:08:10 IST 2015
=============ColdBrick#1 =========
3aca3400-b6b1-495d-a433-1c094762358b|0|0|0|0|0|0|0|0|1|1
3aca3400-b6b1-495d-a433-1c094762358b|00000000-0000-0000-0000-000000000001|c2|/c2|0|0
=============ColdBrick#2 =========
da996a66-5ebe-44f8-8482-5cb5eea52c97|0|0|0|0|0|0|0|0|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|0|0|0|0|0|0|0|0|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|0|0|0|0|0|0|0|0|0|0
da996a66-5ebe-44f8-8482-5cb5eea52c97|00000000-0000-0000-0000-000000000001|h2|/h2|0|0
bf2accc6-035b-4af5-95eb-06e6c8106263|00000000-0000-0000-0000-000000000001|c1|/c1|0|0
f697a256-bf6d-45d5-b483-4711c7515e18|00000000-0000-0000-0000-000000000001|h1|/h1|0|0
>>>>>>>>>>>> HOTBRICK#1 <<<<<<<<==
>>>>>>>>>>>> HOTBRICK#2 <<<<<<<<==
/rhs/brick1/mobi:
total 4
-rw-r--r--. 2 root root 541 Nov 23 11:07 c2

/rhs/brick2/mobi:
total 8
-rw-r--r--. 2 root root 541 Nov 23 11:04 c1
-rw-r--r--. 2 root root  46 Nov 23 11:04 h1
-rw-r--r--. 2 root root   0 Nov 23  2015 h2

/rhs/brick6/mobi_hot:
total 0

/rhs/brick7/mobi_hot:
total 0
[root@zod ~]#

See the tier status(it can be seen that the count promote and demote has been increased by one, as the same file got promoted and demoted)

[root@zod ~]# gluster v tier mobi status;gluster v rebal mobi status
Node                 Promoted files       Demoted files        Status
---------            ---------            ---------            ---------
localhost            4                    0                    in progress
yarrow               0                    6                    in progress
Tiering Migration Functionality: mobi: success

Node         Rebalanced-files   size     scanned   failures   skipped   status        run time in secs
---------    -----------        -------  -------   --------   -------   -----------   --------------
localhost    4                  0Bytes   6         0          0         in progress   844.00
yarrow       6                  0Bytes   6         0          0         in progress   841.00
volume rebalance: mobi: success
[root@zod ~]#
The logs I posted predate the ones Joseph published.
After further investigation, we found that the clocks of the servers participating in the cluster are not in sync. This causes the ping-pong effect. The issue is NOT particular to the frequency counter; it also occurs without it. It is important for all the gluster servers to be kept in sync via NTP. Requesting Nag to retest the above with the servers synced to an NTP server.
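A rough way to check for the clock skew described above before retesting could look like the sketch below. The helper names skew_seconds and within_tolerance are made up for illustration, the 1-second tolerance is an arbitrary assumption and not a documented gluster limit, and the commented usage assumes passwordless ssh to the peer and GNU date (for %N). The ssh round trip adds its own delay, so this is only a coarse check and no substitute for verifying NTP status on each node.

```shell
#!/bin/sh
# Hypothetical helper: absolute difference, in seconds, between two epoch
# timestamps (fractional seconds allowed), printed with millisecond precision.
skew_seconds() {
    awk -v a="$1" -v b="$2" 'BEGIN { d = a - b; if (d < 0) d = -d; printf "%.3f\n", d }'
}

# Hypothetical helper: exits 0 when skew ($1) is at or below tolerance ($2),
# and 1 otherwise.
within_tolerance() {
    awk -v s="$1" -v t="$2" 'BEGIN { exit !(s <= t) }'
}

# Example usage against the peer node (not executed here; assumes
# passwordless ssh to "yarrow" and GNU date):
#   local_now=$(date +%s.%N)
#   peer_now=$(ssh yarrow date +%s.%N)
#   skew=$(skew_seconds "$local_now" "$peer_now")
#   within_tolerance "$skew" 1 || echo "clock skew ${skew}s: sync the servers via NTP"
```

If the check reports a skew above the chosen tolerance, syncing all servers to the same NTP source, as requested above, is the fix to try first.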
QE reported at the 'RHGS 3.1.2 - Tiering Bugs - Triage & Status Check' meeting on 30 November 2015 that the issue is no longer reproducible on a re-test, provided the requirement that all RHGS servers be synced to an NTP server is followed. Based on this, the BZ is being CLOSED.