Bug 1511767 - After detach tier start glusterd log flooded with "0-transport: EPOLLERR - disconnecting now" messages
Summary: After detach tier start glusterd log flooded with "0-transport: EPOLLERR - disconnecting now" messages
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: tier
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: hari gowtham
QA Contact: Sweta Anandpara
Depends On:
Blocks: 1503137
Reported: 2017-11-10 05:14 UTC by Bala Konda Reddy M
Modified: 2018-09-04 06:40 UTC (History)
CC: 4 users

Fixed In Version: glusterfs-3.12.2-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2018-09-04 06:39:09 UTC
Target Upstream Version:


Red Hat Product Errata: RHSA-2018:2607 (last updated 2018-09-04 06:40:27 UTC)

Description Bala Konda Reddy M 2017-11-10 05:14:21 UTC
Description of problem:
After performing detach tier start, the glusterd log is flooded with "[socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now" messages every 3 seconds.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create and start a disperse volume
2. Mount the volume and write some data
3. Attach a replica 2 volume as the hot tier
4. Run detach tier start (see the command sketch below)
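
A minimal command sketch for the steps above. The hostnames (server1-server3), volume name (dispvol), brick paths and mount point are placeholders, not taken from the original report; only the disperse cold tier plus replica 2 hot tier matters.

# Hostnames, volume name and brick paths below are placeholders.
gluster volume create dispvol disperse 3 redundancy 1 \
    server1:/bricks/disp1 server2:/bricks/disp2 server3:/bricks/disp3
gluster volume start dispvol
mount -t glusterfs server1:/dispvol /mnt/dispvol
dd if=/dev/urandom of=/mnt/dispvol/testfile bs=1M count=100
gluster volume tier dispvol attach replica 2 \
    server1:/bricks/hot1 server2:/bricks/hot2
gluster volume tier dispvol detach start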

Actual results:
Functionality-wise everything works fine, but the glusterd log is flooded with these INFO messages every 3 seconds.
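
For reference, the flooding in the glusterd log looks like the following; the timestamps are illustrative, the message and source location are the ones quoted in the description:

[2017-11-10 05:20:01.000000] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-11-10 05:20:04.000000] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-11-10 05:20:07.000000] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now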

Expected results:
The continuous "EPOLLERR - disconnecting now" messages should not appear in the glusterd log.

Additional info:

Comment 2 hari gowtham 2017-11-10 10:41:42 UTC
Partial RCA:
The defrag variable being shared between the tier process and the detach process does not cause the issue, contrary to the first suspicion; it works fine in an older downstream version (3.8.0), where the tier process and the detach process share the defrag variable.

With the downstream code (3.8.4-51) I can see a disconnect but no subsequent connect; that may be why glusterd keeps trying to connect.

I need to look further into why this behaviour changed, and why we don't get an RPC connect with the current code (3.8.4-51).
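
To make the suspected pattern concrete, here is an illustrative C sketch. It is NOT the actual glusterd/rpc-transport code; it only mimics the behaviour described above: a connect attempt that never completes, retried on a 3-second timer, logging an EPOLLERR-style message on every iteration.

/* Illustrative sketch only; not the actual glusterd code. */
#include <stdio.h>
#include <stdbool.h>
#include <unistd.h>

/* Stand-in for the RPC connect: in the failing case the peer socket is
 * gone, so the connection never becomes established. */
static bool try_connect(void)
{
    return false;
}

int main(void)
{
    /* Bounded loop for the demo; the real daemon retries indefinitely. */
    for (int i = 0; i < 5; i++) {
        if (!try_connect()) {
            /* Corresponds to the message flooding the glusterd log. */
            printf("I [socket_event_handler] 0-transport: "
                   "EPOLLERR - disconnecting now\n");
        }
        sleep(3); /* the 3-second interval seen in the log */
    }
    return 0;
}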

Comment 3 hari gowtham 2017-11-13 12:55:18 UTC

The above issue is not reproducible with the downstream version 3.4.0; things work fine there. Is it necessary to look into this further, given that the issue is fixed in 3.4.0?


Comment 9 errata-xmlrpc 2018-09-04 06:39:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

