Bug 1335154

Summary: [Tiering]: The message 'Max cycle time reached..exiting migration' incorrectly displayed as an 'error' in the logs
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Sweta Anandpara <sanandpa>
Component: tier
Assignee: hari gowtham <hgowtham>
Status: CLOSED WONTFIX
QA Contact: krishnaram Karthick <kramdoss>
Severity: medium
Docs Contact:
Priority: unspecified
Version: rhgs-3.1
CC: nbalacha, rhs-bugs
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-08 19:04:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1335973, 1336470, 1336472

Description Sweta Anandpara 2016-05-11 12:44:35 UTC
Description of problem:
========================
The tiering process migrates files from the database in cycles, with a default cycle time of 2 minutes. If migration of all the files present in the database does not complete within those 120 seconds (for various reasons), the cycle/database gets reset. This is how it is designed to work.

So the message 'Max cycle time reached. Exiting migration.' should be logged as an INFO message, rather than being displayed as an 'ERROR'.


Version-Release number of selected component (if applicable):
=============================================================
3.7.9-3


How reproducible:
==================
Have hit it twice


Steps to Reproduce:
===================
1. Have about 10 large files (say, of size 2G each) present in the cold tier.
2. Access 5-10 of the files, so that they are triggered for migration to the hot tier.
3. Check /var/log/glusterfs/<volname>-tier.log for the above-mentioned error message.

Actual results:
=================

[2016-05-09 12:47:46.894803] I 
[glusterfsd-mgmt.c:1600:mgmt_getspec_cbk] 0-glusterfs: No change in 
volfile, continuing
The message "I [MSGID: 109103] [dht-shared.c:469:dht_reconfigure] 
0-DHT: conf->dthrottle: normal, conf->defrag->recon_thread_count: 2" 
repeated 5 times between [2016-05-09 12:47:29.711139] and [2016-05-09 
12:47:46.894411]
[2016-05-09 13:04:20.076182] E [MSGID: 109038] 
[tier.c:532:tier_migrate_using_query_file] 0-ozone-tier-dht: Max cycle 
time reached. Exiting migration.
[2016-05-09 22:02:01.298825] I [MSGID: 100011] 
[glusterfsd.c:1323:reincarnate] 0-glusterfsd: Fetching the volume file 
from server...


Expected results:
==================
The said message should be logged at INFO level.


Additional info:
=================

[2016-05-11 07:01:39.194473] W [MSGID: 122056] [ec-combine.c:866:ec_combine_check] 0-ozone-disperse-0: Mismatching xdata in answers of 'LOOKUP'
[2016-05-11 07:04:18.664244] E [MSGID: 109038] [tier.c:532:tier_migrate_using_query_file] 0-ozone-tier-dht: Max cycle time reached. Exiting migration.
[2016-05-11 07:04:19.682213] I [dht-rebalance.c:3616:gf_defrag_start_crawl] 0-DHT: crawling file-system completed

[root@dhcp47-188 ~]# 
[root@dhcp47-188 ~]# rpm -qa | grep gluster
glusterfs-libs-3.7.9-3.el7rhgs.x86_64
glusterfs-api-devel-3.7.9-3.el7rhgs.x86_64
vdsm-gluster-4.16.30-1.3.el7rhgs.noarch
glusterfs-cli-3.7.9-3.el7rhgs.x86_64
glusterfs-client-xlators-3.7.9-3.el7rhgs.x86_64
glusterfs-server-3.7.9-3.el7rhgs.x86_64
glusterfs-rdma-3.7.9-3.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-api-3.7.9-3.el7rhgs.x86_64
glusterfs-devel-3.7.9-3.el7rhgs.x86_64
glusterfs-debuginfo-3.7.9-3.el7rhgs.x86_64
gluster-nagios-addons-0.2.6-1.el7rhgs.x86_64
glusterfs-fuse-3.7.9-3.el7rhgs.x86_64
glusterfs-3.7.9-3.el7rhgs.x86_64
glusterfs-geo-replication-3.7.9-3.el7rhgs.x86_64
[root@dhcp47-188 ~]# 
[root@dhcp47-188 ~]# 
[root@dhcp47-188 ~]# gluster v info ozone
 
Volume Name: ozone
Type: Tier
Volume ID: ba0cfdd6-4d28-416b-b628-fdfb9ac317a4
Status: Started
Number of Bricks: 16
Transport-type: tcp
Hot Tier :
Hot Tier Type : Distribute
Number of Bricks: 4
Brick1: 10.70.46.193:/brick/brick4/ozone
Brick2: 10.70.46.187:/brick/brick4/ozone
Brick3: 10.70.46.215:/brick/brick4/ozone
Brick4: 10.70.47.188:/brick/brick4/ozone
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick5: 10.70.47.188:/brick/brick1/ozone
Brick6: 10.70.46.215:/brick/brick1/ozone
Brick7: 10.70.46.187:/brick/brick1/ozone
Brick8: 10.70.46.193:/brick/brick1/ozone
Brick9: 10.70.47.188:/brick/brick2/ozone
Brick10: 10.70.46.215:/brick/brick2/ozone
Brick11: 10.70.46.187:/brick/brick2/ozone
Brick12: 10.70.46.193:/brick/brick2/ozone
Brick13: 10.70.47.188:/brick/brick3/ozone
Brick14: 10.70.46.215:/brick/brick3/ozone
Brick15: 10.70.46.187:/brick/brick3/ozone
Brick16: 10.70.46.193:/brick/brick3/ozone
Options Reconfigured:
cluster.watermark-hi: 5
cluster.watermark-low: 2
cluster.tier-mode: cache
features.ctr-enabled: on
performance.readdir-ahead: on
[root@dhcp47-188 ~]# 
[root@dhcp47-188 ~]# 
[root@dhcp47-188 ~]# gluster pool list
UUID					Hostname    	State
1bb3d70d-dbb0-4dd7-9a4d-ae33564ef226	10.70.46.215	Connected 
60b85677-44a0-413f-9200-7516c9b88006	10.70.46.187	Connected 
34a7a230-1513-4244-92b6-47fd17cd7f37	10.70.46.193	Connected 
d8339859-b7e5-4683-9e53-00e34a3d090d	localhost   	Connected 
[root@dhcp47-188 ~]#

Comment 4 Nithya Balachandran 2016-05-16 08:30:43 UTC
Patch posted on upstream master: http://review.gluster.org/#/c/14336/

Comment 6 hari gowtham 2018-11-08 19:04:36 UTC
As tier is not being actively developed, I'm closing this bug. Feel free to reopen it if necessary.