Description of problem:
No demotions are seen on a 16-node setup after a series of failures. See "Steps to Reproduce" below for the sequence that led to this state. No demotion-related errors are seen in the system.

[root@dhcp37-101 glusterfs]# ll /var/run/gluster/krk-vol-tier-dht/
total 508
-rw-r--r--. 1 root root 364593 Jan 29 14:56 demotequeryfile-krk-vol-tier-dht.err
-rw-r--r--. 1 root root 149210 Jan 30 14:30 promotequeryfile-krk-vol-tier-dht.err

Volume Name: krk-vol
Type: Tier
Volume ID: 192655ce-4ef6-4ada-8e0c-6f137e2721e1
Status: Started
Number of Bricks: 36
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distributed-Replicate
Number of Bricks: 6 x 2 = 12
Brick1: 10.70.37.101:/rhs/brick6/krkvol
Brick2: 10.70.35.163:/rhs/brick6/krkvol
Brick3: 10.70.35.173:/rhs/brick6/krkvol
Brick4: 10.70.35.232:/rhs/brick6/krkvol
Brick5: 10.70.35.176:/rhs/brick6/krkvol
Brick6: 10.70.35.231:/rhs/brick6/krkvol
Brick7: 10.70.35.44:/rhs/brick6/krkvol
Brick8: 10.70.37.195:/rhs/brick6/krkvol
Brick9: 10.70.37.202:/rhs/brick6/krkvol
Brick10: 10.70.37.120:/rhs/brick6/krkvol
Brick11: 10.70.37.60:/rhs/brick6/krkvol
Brick12: 10.70.37.69:/rhs/brick6/krkvol
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (8 + 4) = 24
Brick13: 10.70.35.176:/rhs/brick5/krkvol
Brick14: 10.70.35.232:/rhs/brick5/krkvol
Brick15: 10.70.35.173:/rhs/brick5/krkvol
Brick16: 10.70.35.163:/rhs/brick5/krkvol
Brick17: 10.70.37.101:/rhs/brick5/krkvol
Brick18: 10.70.37.69:/rhs/brick5/krkvol
Brick19: 10.70.37.60:/rhs/brick5/krkvol
Brick20: 10.70.37.120:/rhs/brick5/krkvol
Brick21: 10.70.37.202:/rhs/brick4/krkvol
Brick22: 10.70.37.195:/rhs/brick4/krkvol
Brick23: 10.70.35.155:/rhs/brick4/krkvol
Brick24: 10.70.35.222:/rhs/brick4/krkvol
Brick25: 10.70.35.108:/rhs/brick4/krkvol
Brick26: 10.70.35.44:/rhs/brick4/krkvol
Brick27: 10.70.35.89:/rhs/brick4/krkvol
Brick28: 10.70.35.231:/rhs/brick4/krkvol
Brick29: 10.70.35.176:/rhs/brick4/krkvol
Brick30: 10.70.35.232:/rhs/brick4/krkvol
Brick31: 10.70.35.173:/rhs/brick4/krkvol
Brick32: 10.70.35.163:/rhs/brick4/krkvol
Brick33: 10.70.37.101:/rhs/brick4/krkvol
Brick34: 10.70.37.69:/rhs/brick4/krkvol
Brick35: 10.70.37.60:/rhs/brick4/krkvol
Brick36: 10.70.37.120:/rhs/brick4/krkvol
Options Reconfigured:
cluster.tier-demote-frequency: 300
cluster.watermark-hi: 60
cluster.watermark-low: 50
cluster.min-free-disk: 20
performance.write-behind: off
performance.open-behind: off
performance.read-ahead: off
performance.io-cache: off
features.quota-deem-statfs: off
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
features.record-counters: on
cluster.write-freq-threshold: 1
cluster.read-freq-threshold: 1
cluster.tier-max-files: 10000
diagnostics.client-log-level: INFO
features.ctr-enabled: on
cluster.tier-mode: cache
features.uss: on

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-17.el7rhgs.x86_64

How reproducible:
Yet to determine

Steps to Reproduce:
1. On a tiered volume, promote enough files that the current disk usage is near the high watermark, and leave the system in that state with continuous file heating for 12+ hrs --> Started at approx Jan 30 15:30:00 IST
2. Kill all brick processes in the hot tier --> induced at approx Jan 31 09:30:00 IST 2016
3. Restart glusterd on all nodes hosting the hot tier --> induced at approx Jan 31 17:00:00 IST
4. Restart the tier volume
5. Reduce the high watermark to well below the current disk usage level

Actual results:
No demotions are seen

Expected results:
Demotions should happen immediately after the high watermark is breached

Additional info:
sosreports shall be attached
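
For reference, the expected result can be read as a rule over the two watermark options in the report (cluster.watermark-low: 50, cluster.watermark-hi: 60, cluster.tier-mode: cache). The sketch below is only an illustration of that rule as the option descriptions state it: below the low watermark only promotion happens, above the high watermark only demotion happens, and in between both can happen. It is not the tier daemon's code. The names tier_state and TierState, the usage percentages, and the handling of values exactly on a watermark are assumptions made for this example.

```python
from enum import Enum


class TierState(Enum):
    """Hypothetical labels for the three cache-mode regions."""
    BELOW_LOW = "promote only"
    BETWEEN = "promote and demote, weighted by hot tier fullness"
    ABOVE_HI = "demote only"


def tier_state(used_pct: float,
               watermark_low: float = 50.0,
               watermark_hi: float = 60.0) -> TierState:
    """Classify hot tier usage (percent) against the two watermarks.

    Defaults are the values from the report. Treating a usage exactly
    equal to a watermark as BETWEEN is an assumption of this sketch.
    """
    if not 0 <= watermark_low < watermark_hi <= 100:
        raise ValueError("need 0 <= watermark_low < watermark_hi <= 100")
    if used_pct > watermark_hi:
        return TierState.ABOVE_HI
    if used_pct < watermark_low:
        return TierState.BELOW_LOW
    return TierState.BETWEEN


if __name__ == "__main__":
    # Usage figures are made up; the report does not give the actual usage.
    for used in (40.0, 55.0, 75.0):
        print(f"{used:.0f}% used -> {tier_state(used).value}")
```

Step 5 corresponds to lowering watermark_hi until the current usage falls in the ABOVE_HI region, which is why demotions are expected there.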
After the volume is restarted, the tier daemon also restarts, which triggers a fix-layout. Depending on the volume size, fix-layout can take hours. Here the tier daemon is still doing the fix-layout. Once fix-layout finishes, promotion/demotion should start.
*** This bug has been marked as a duplicate of bug 1294790 ***