Description of problem:
After performing detach tier start, the glusterd log is flooded with "[socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now" every 3 seconds.

Version-Release number of selected component (if applicable):
3.8.4-51

How reproducible:
Always

Steps to Reproduce:
1. Create and start a disperse volume.
2. Mount the volume and write some data.
3. Attach an X2 replica as hot tier to the volume.
4. Perform detach tier start.

Actual results:
Functionality-wise it works fine, but the glusterd log is flooded with Info messages every 3 seconds.

Expected results:
Continuous "EPOLLERR - disconnecting now" messages should not be seen in the glusterd log.

Additional info:
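To confirm the 3-second cadence when reproducing, something like the following could be run over the glusterd log. This is only a rough sketch: the message text is taken from the report above, but the leading `[YYYY-MM-DD HH:MM:SS.ffffff]` timestamp layout is an assumption about the log format, and the sample lines are made up for illustration, not copied from a real log.

```python
import re
from datetime import datetime

# Message text comes from the report; the "[YYYY-MM-DD HH:MM:SS.ffffff]"
# timestamp prefix is an ASSUMPTION about the glusterd log layout.
PATTERN = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})(?:\.\d+)?\].*"
    r"EPOLLERR - disconnecting now"
)


def epollerr_gaps(lines):
    """Return the gaps, in whole seconds, between consecutive EPOLLERR messages."""
    stamps = []
    for line in lines:
        match = PATTERN.match(line)
        if match:
            # Sub-second digits are dropped; second resolution is enough here.
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


# HYPOTHETICAL sample lines, for illustration only.
sample = [
    "[2018-01-01 10:00:00.000001] I [socket.c:2465:socket_event_handler] "
    "0-transport: EPOLLERR - disconnecting now",
    "[2018-01-01 10:00:01.500000] I [other.c:1:other_fn] "
    "0-management: unrelated message",
    "[2018-01-01 10:00:03.000002] I [socket.c:2465:socket_event_handler] "
    "0-transport: EPOLLERR - disconnecting now",
    "[2018-01-01 10:00:06.000003] I [socket.c:2465:socket_event_handler] "
    "0-transport: EPOLLERR - disconnecting now",
]
gaps = epollerr_gaps(sample)
print(gaps)
```

On a real system, `epollerr_gaps` can be passed an open file handle for the glusterd log instead of the sample list. A steady run of gaps near 3 seconds would match the behaviour described in this report.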
Partial RCA: Sharing the defrag variable between the tier process and the detach process does not cause the issue, contrary to the first suspicion. It works fine in an older downstream version (3.8.0), where the tier process and the detach process also share the defrag variable. With the current downstream code (3.8.4-51) I can see a disconnect but I don't see a connect; maybe that is why it keeps trying to connect. I need to look further to understand why this changed and why we don't get an RPC connect with the current code (3.8.4-51).
Hi, The above issue is not reproducible with the downstream version 3.4.0; things work fine. Is it necessary to look into this further, given that the issue is fixed in 3.4.0? Regards, Hari.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607