Description of problem:
=======================
After rebalance completes (remove-brick or add-brick), the following info message is logged every 3 seconds:

[root@dhcp37-64 ~]# tailf /var/log/glusterfs/glusterd.log
[2017-08-22 08:54:55.763095] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-08-22 08:54:58.763920] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-08-22 08:55:01.764697] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-08-22 08:55:04.765471] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-08-22 08:55:07.766176] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-08-22 08:55:10.766886] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now

Currently, in about a day, we have around 23k lines, and the count keeps increasing. Eventually this would leave the system's /var partition out of space.

[root@dhcp37-64 ~]# grep -ri "EPOLLERR - disconnecting now" /var/log/glusterfs/glusterd.log | wc -l
23263
[root@dhcp37-64 ~]#

Version-Release number of selected component (if applicable):
=============================================================
mainline

How reproducible:
=================
Always

Steps to Reproduce:
===================
1. Create a 3x2 volume and write data to it
2. Start remove-brick to make it 2x2
3. Once rebalance is completed, commit the remove-brick
4. Monitor the glusterd log file

Actual results:
===============
The EPOLLERR message is logged every 3 seconds.
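The 3-second cadence in the log excerpt above can be checked directly from the timestamps. The following is a minimal sketch in C, not part of glusterd: the helper names `log_line_seconds` and `epollerr_stats` are hypothetical, and it assumes every line starts with the `[YYYY-MM-DD HH:MM:SS.uuuuuu]` prefix shown in the excerpt and that all lines fall on the same day.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse the "[YYYY-MM-DD HH:MM:SS.uuuuuu]" prefix of a
 * log line and return seconds since midnight, or -1.0 if it does not parse.
 * The date fields are read but ignored, so midnight rollover is not handled. */
static double log_line_seconds(const char *line)
{
    int year, month, day, hour, min, sec;
    long usec;

    if (sscanf(line, "[%d-%d-%d %d:%d:%d.%ld]",
               &year, &month, &day, &hour, &min, &sec, &usec) != 7)
        return -1.0;
    return hour * 3600.0 + min * 60.0 + sec + usec / 1e6;
}

/* Hypothetical helper: count lines containing the EPOLLERR message and store
 * the average gap in seconds between consecutive matches in *avg_gap
 * (0.0 when fewer than two lines match). Returns the number of matches. */
static int epollerr_stats(const char *const *lines, int n, double *avg_gap)
{
    int matches = 0;
    double first = 0.0, last = 0.0;

    assert(avg_gap != NULL);
    for (int i = 0; i < n; i++) {
        double t;

        if (strstr(lines[i], "EPOLLERR - disconnecting now") == NULL)
            continue;
        t = log_line_seconds(lines[i]);
        if (t < 0.0)
            continue; /* skip lines without a parsable timestamp */
        if (matches == 0)
            first = t;
        last = t;
        matches++;
    }
    *avg_gap = matches > 1 ? (last - first) / (matches - 1) : 0.0;
    return matches;
}
```

Fed the six lines from the excerpt, the average gap comes out at roughly 3 seconds. At that rate a full day would produce on the order of 86400 / 3 = 28800 lines, which is in line with the ~23k lines counted after about a day.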
REVIEW: https://review.gluster.org/18117 (glusterd: disable rpc_clnt_t after relalance process disconnection) posted (#1) for review on release-3.12 by Atin Mukherjee (amukherj)
COMMIT: https://review.gluster.org/18117 committed in release-3.12 by Shyamsundar Ranganathan (srangana)
------
commit bace1dd564f401f904dac6b965299f77228e4b1d
Author: Milind Changire <mchangir>
Date:   Thu Aug 24 12:39:47 2017 +0530

    glusterd: disable rpc_clnt_t after relalance process disconnection

    Problem:
    glusterd continues to connect to rebalance process even after the
    socket connection has disconnected.

    Solution:
    rpc_clnt_disable() disables the rpc_clnt_t object and disarms all
    relevant timers and drops refs to the rpc_clnt_t object and the
    transport as well.

    >Reviewed-on: https://review.gluster.org/18114
    >Reviewed-by: MOHIT AGRAWAL <moagrawa>
    >Tested-by: Atin Mukherjee <amukherj>
    >Reviewed-by: Atin Mukherjee <amukherj>
    >Smoke: Gluster Build System <jenkins.org>
    >CentOS-regression: Gluster Build System <jenkins.org>
    >(cherry picked from commit a894d44427649e99d4344a241dc2f9d584a9a691)

    Change-Id: I981d6f1cc0087037f1927062c2770a4d5026a619
    BUG: 1484885
    Signed-off-by: Milind Changire <mchangir>
    Reviewed-on: https://review.gluster.org/18117
    Tested-by: Atin Mukherjee <amukherj>
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Shyamsundar Ranganathan <srangana>
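The behaviour described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions and is not the glusterd or rpc-clnt source: `struct demo_clnt` and every `demo_*` function are hypothetical stand-ins. The model assumes that a disconnect arms a reconnect timer, that each timer expiry against the exited rebalance process fails and logs one message, and that disabling the client sets a flag, disarms the timer and drops a reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an RPC client object; NOT the real rpc_clnt_t. */
struct demo_clnt {
    bool disabled;        /* owner no longer wants reconnect attempts */
    bool timer_armed;     /* a reconnect attempt is scheduled */
    int  refs;            /* references held on the object */
    int  failed_attempts; /* each failure would log one EPOLLERR line */
};

static void demo_clnt_init(struct demo_clnt *c)
{
    c->disabled = false;
    c->timer_armed = false;
    c->refs = 1;
    c->failed_attempts = 0;
}

/* Disconnect event: schedule a reconnect unless the client is disabled. */
static void demo_on_disconnect(struct demo_clnt *c)
{
    if (!c->disabled)
        c->timer_armed = true;
}

/* One timer expiry. The peer process has exited, so the attempt always
 * fails and raises another disconnect event. */
static void demo_timer_tick(struct demo_clnt *c)
{
    if (!c->timer_armed)
        return;
    c->timer_armed = false;
    c->failed_attempts++;
    demo_on_disconnect(c);
}

/* Model of the fix: mark disabled, disarm the timer, drop one reference. */
static void demo_clnt_disable(struct demo_clnt *c)
{
    assert(c->refs > 0);
    c->disabled = true;
    c->timer_armed = false;
    c->refs--;
}

/* Simulate `ticks` timer expiries after the peer disconnects and return
 * how many failed reconnect attempts (log lines) were produced. */
static int demo_run(bool disable_on_disconnect, int ticks)
{
    struct demo_clnt c;

    demo_clnt_init(&c);
    demo_on_disconnect(&c);
    if (disable_on_disconnect)
        demo_clnt_disable(&c);
    for (int i = 0; i < ticks; i++)
        demo_timer_tick(&c);
    return c.failed_attempts;
}
```

In this model, `demo_run(false, 10)` produces one failed attempt per tick, mirroring the repeating log line, while `demo_run(true, 10)` produces none because the disabled client never re-arms its timer.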
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.0, please open a new bug report.

glusterfs-3.12.0 has been announced on the Gluster mailinglists [1], packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-September/000082.html
[2] https://www.gluster.org/pipermail/gluster-users/