Bug 1229270 - tiering: tier daemon not restarting during volume/glusterd restart
Summary: tiering: tier daemon not restarting during volume/glusterd restart
Status: CLOSED DUPLICATE of bug 1276245
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: tier
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Mohammed Rafi KC
QA Contact: Nag Pavan Chilakam
: 1229271 (view as bug list)
Depends On: 994405 1225330 1233151 1235202 1265890 1273354
Reported: 2015-06-08 10:53 UTC by Nag Pavan Chilakam
Modified: 2018-11-30 05:44 UTC
8 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 1225330
Last Closed: 2015-11-24 06:13:54 UTC
Target Upstream Version:

Attachments (Terms of Use)

Description Nag Pavan Chilakam 2015-06-08 10:53:10 UTC
+++ This bug was initially created as a clone of Bug #1225330 +++

Description of problem:

The tier daemon should always run on the node to promote/demote files. When the volume is stopped, the daemon is stopped as well, but when the volume is started again, the daemon should also start. The same applies to a glusterd restart after the tier daemon has gone offline.

Version-Release number of selected component (if applicable):

How reproducible:


Steps to Reproduce:
1. Create a tiered volume
2. Stop the volume
3. Start the volume
4. Check for the tier process
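The reproduce steps above can be sketched as a small script. The volume name "tiervol" and the process pattern "glusterfs.*tier" are illustrative assumptions, not taken from the report; the gluster commands need a live cluster, so they are shown commented out.

```shell
#!/bin/sh
# Sketch of the reproduce steps (names and patterns are placeholders).

# check_daemon PATTERN: return 0 if a running process matches PATTERN,
# non-zero otherwise.
check_daemon() {
    ps ax | grep -v grep | grep -q "$1"
}

# gluster volume stop tiervol          # step 2
# gluster volume start tiervol         # step 3
# check_daemon "glusterfs.*tier" \     # step 4
#     && echo "tier daemon running" || echo "BUG: tier daemon missing"
```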

Actual results:

The tier daemon was not running.

Expected results:

A volume restart should start the tier daemon (rebalance process) again.

Additional info:

--- Additional comment from Anand Avati on 2015-05-27 03:14:14 EDT ---

REVIEW: http://review.gluster.org/10933 (glusterd/tier: configure tier daemon during volume restart) posted (#1) for review on master by mohammed rafi  kc (rkavunga@redhat.com)

--- Additional comment from Anand Avati on 2015-05-27 03:17:59 EDT ---

REVIEW: http://review.gluster.org/10933 (glusterd/tier: configure tier daemon during volume restart) posted (#2) for review on master by mohammed rafi  kc (rkavunga@redhat.com)

--- Additional comment from Anand Avati on 2015-05-29 03:42:45 EDT ---

REVIEW: http://review.gluster.org/10933 (glusterd/tier: configure tier daemon during volume restart) posted (#3) for review on master by mohammed rafi  kc (rkavunga@redhat.com)

--- Additional comment from Mohammed Rafi KC on 2015-06-03 10:52:28 EDT ---

Apart from http://review.gluster.org/10933, this requires one more fix.

Comment 2 Joseph Elwin Fernandes 2015-06-10 09:12:27 UTC
*** Bug 1229271 has been marked as a duplicate of this bug. ***

Comment 3 Mohammed Rafi KC 2015-06-10 13:59:51 UTC
upstream patch : http://review.gluster.org/#/c/10933/

Comment 6 RamaKasturi 2015-11-20 06:33:11 UTC
I am seeing the above-mentioned issue with build glusterfs-3.7.5-6.el7rhgs.x86_64.

Following are the steps I performed:

1) Had a tiered volume in the system.

2) Stopped the volume.

3) Started the volume again.

4) When I check "gluster vol tier <vol_name> status", it displays the following output:

[root@rhs-client2 ~]# gluster vol tier vol_tier status
Node                 Promoted files       Demoted files        Status              
---------            ---------            ---------            ---------           
localhost            1                    0                    failed
                     0                    1                    in progress
Tiering Migration Functionality: vol_tier: success

The tier daemon fails to start on the node from which the volume was stopped.

I do not see a pid file under the directory "/var/lib/glusterd/vols/vol_tier/tier":

[root@rhs-client2 tier]# ls -l
total 0

Once the volume is started with force, I can see that the tier daemon starts running.
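The pid-file check described above can be scripted. The directory layout follows the comment ("/var/lib/glusterd/vols/<volname>/tier"); the "*.pid" file naming is an assumption for illustration, since the report only shows that the directory was empty.

```shell
#!/bin/sh
# tier_daemon_ok VOLDIR: return 0 if VOLDIR/tier contains a pidfile whose
# process is still alive, 1 otherwise.
tier_daemon_ok() {
    for pf in "$1"/tier/*.pid; do
        [ -f "$pf" ] || return 1                   # empty dir: the bug
        kill -0 "$(cat "$pf")" 2>/dev/null && return 0
    done
    return 1
}

# Example: tier_daemon_ok /var/lib/glusterd/vols/vol_tier
```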

So, reopening this bug.

Comment 7 RamaKasturi 2015-11-20 07:24:33 UTC
output of gluster volume info :

[root@rhs-client2 tier]# gluster vol info
Volume Name: vol_tier
Type: Tier
Volume ID: 0093a2a0-7ac1-4319-9a57-f125190db6a9
Status: Started
Number of Bricks: 14
Transport-type: tcp
Hot Tier :
Hot Tier Type : Replicate
Number of Bricks: 1 x 2 = 2
Brick1: rhs-client38.lab.eng.blr.redhat.com:/bricks/brick6/b14
Brick2: rhs-client2.lab.eng.blr.redhat.com:/bricks/brick6/b13
Cold Tier:
Cold Tier Type : Distributed-Disperse
Number of Bricks: 2 x (4 + 2) = 12
Brick3: rhs-client2.lab.eng.blr.redhat.com:/bricks/brick0/b1
Brick4: rhs-client38.lab.eng.blr.redhat.com:/bricks/brick0/b2
Brick5: rhs-client2.lab.eng.blr.redhat.com:/bricks/brick1/b3
Brick6: rhs-client38.lab.eng.blr.redhat.com:/bricks/brick1/b4
Brick7: rhs-client2.lab.eng.blr.redhat.com:/bricks/brick2/b5
Brick8: rhs-client38.lab.eng.blr.redhat.com:/bricks/brick2/b6
Brick9: rhs-client2.lab.eng.blr.redhat.com:/bricks/brick3/b7
Brick10: rhs-client38.lab.eng.blr.redhat.com:/bricks/brick3/b8
Brick11: rhs-client2.lab.eng.blr.redhat.com:/bricks/brick4/b9
Brick12: rhs-client38.lab.eng.blr.redhat.com:/bricks/brick4/b10
Brick13: rhs-client2.lab.eng.blr.redhat.com:/bricks/brick5/b11
Brick14: rhs-client38.lab.eng.blr.redhat.com:/bricks/brick5/b12
Options Reconfigured:
performance.readdir-ahead: on
features.ctr-enabled: on
cluster.tier-promote-frequency: 240
cluster.tier-demote-frequency: 240
features.bitrot: on
features.scrub: Active

output of gluster volume status:

[root@rhs-client2 tier]# gluster volume status
Status of volume: vol_tier
Gluster process                             TCP Port  RDMA Port  Online  Pid
Hot Bricks:
Brick rhs-client38.lab.eng.blr.redhat.com:/
bricks/brick6/b14                           49167     0          Y       19767
Brick rhs-client2.lab.eng.blr.redhat.com:/b
ricks/brick6/b13                            49169     0          Y       20074
Cold Bricks:
Brick rhs-client2.lab.eng.blr.redhat.com:/b
ricks/brick0/b1                             49163     0          Y       20092
Brick rhs-client38.lab.eng.blr.redhat.com:/
bricks/brick0/b2                            49161     0          Y       19785
Brick rhs-client2.lab.eng.blr.redhat.com:/b
ricks/brick1/b3                             49164     0          Y       20110
Brick rhs-client38.lab.eng.blr.redhat.com:/
bricks/brick1/b4                            49162     0          Y       19803
Brick rhs-client2.lab.eng.blr.redhat.com:/b
ricks/brick2/b5                             49165     0          Y       20128
Brick rhs-client38.lab.eng.blr.redhat.com:/
bricks/brick2/b6                            49163     0          Y       19821
Brick rhs-client2.lab.eng.blr.redhat.com:/b
ricks/brick3/b7                             49166     0          Y       20146
Brick rhs-client38.lab.eng.blr.redhat.com:/
bricks/brick3/b8                            49164     0          Y       19839
Brick rhs-client2.lab.eng.blr.redhat.com:/b
ricks/brick4/b9                             49167     0          Y       20164
Brick rhs-client38.lab.eng.blr.redhat.com:/
bricks/brick4/b10                           49165     0          Y       19857
Brick rhs-client2.lab.eng.blr.redhat.com:/b
ricks/brick5/b11                            49168     0          Y       20182
Brick rhs-client38.lab.eng.blr.redhat.com:/
bricks/brick5/b12                           49166     0          Y       19875
NFS Server on localhost                     2049      0          Y       20355
Self-heal Daemon on localhost               N/A       N/A        Y       20363
Bitrot Daemon on localhost                  N/A       N/A        Y       20371
Scrubber Daemon on localhost                N/A       N/A        Y       20383
NFS Server on                   2049      0          Y       20041
Self-heal Daemon on             N/A       N/A        Y       20049
Bitrot Daemon on                N/A       N/A        Y       20057
Scrubber Daemon on              N/A       N/A        Y       20068
Task Status of Volume vol_tier
Task                 : Tier migration      
ID                   : ab8e4cb8-b79b-4b85-b673-1e04e3af42b7
Status               : in progress

Comment 8 RamaKasturi 2015-11-20 07:32:18 UTC
sos reports can be found at the link below:


Comment 9 Mohammed Rafi KC 2015-11-20 12:08:16 UTC
The tier daemon tried to start during volume start, but failed because the brick was not up at that moment. A fix will be posted soon.

Comment 10 Mohammed Rafi KC 2015-11-24 06:13:54 UTC

*** This bug has been marked as a duplicate of bug 1276245 ***
