REVIEW: http://review.gluster.org/15308 (glusterd: clean up old port and allocate new one on every restart) posted (#1) for review on release-3.8 by Avra Sengupta (asengupt)
Based on the discussion with Avra, it looks like the root cause of the scenario below is the same. I rebooted one of the cluster nodes hosting a 1x2 volume with snapd enabled; after the reboot, snapd did not come up on that node. The snapd log from the affected node shows:

=======================================================
[2016-08-24 10:16:26.486745] E [socket.c:771:__socket_server_bind] 0-tcp.Dis-server: binding to failed: Address already in use
[2016-08-24 10:16:26.486753] E [socket.c:774:__socket_server_bind] 0-tcp.Dis-server: Port is already in use
[2016-08-24 10:16:26.486762] W [rpcsvc.c:1630:rpcsvc_create_listener] 0-rpc-service: listening on transport failed
[2016-08-24 10:16:26.486770] W [MSGID: 115045] [server.c:1061:init] 0-Dis-server: creation of listener failed
[2016-08-24 10:16:26.486778] E [MSGID: 101019] [xlator.c:433:xlator_init] 0-Dis-server: Initialization of volume 'Dis-server' failed, review your volfile again
[2016-08-24 10:16:26.486785] E [MSGID: 101066] [graph.c:324:glusterfs_graph_init] 0-Dis-server: initializing translator failed
[2016-08-24 10:16:26.486791] E [MSGID: 101176] [graph.c:670:glusterfs_graph_activate] 0-graph: init failed
[2016-08-24 10:16:26.487110] W [glusterfsd.c:1286:cleanup_and_exit] (-->/usr/sbin/glusterfsd(mgmt_getspec_cbk+0x3c1) [0x7fd3f02bfd51] -->/usr/sbin/glusterfsd(glusterfs_process_volfp+0x172) [0x7fd3f02ba542] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x6b) [0x7fd3f02b9abb] ) 0-: received signum (98), shutting down
(END)
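The "Address already in use" failure in __socket_server_bind above is the standard EADDRINUSE condition: a second listener cannot bind a port that another process already holds. A minimal standalone sketch (plain Python sockets, not Gluster code) that reproduces the same kernel error:

```python
import errno
import socket

# First listener grabs a port (kernel-chosen, so the example is portable).
s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s1.bind(("127.0.0.1", 0))
port = s1.getsockname()[1]
s1.listen(1)

# A second bind to the same port fails with EADDRINUSE, exactly the
# condition snapd hit when the brick process was still squatting on
# the port snapd tried to reuse.
s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s2.bind(("127.0.0.1", port))
    reproduced = False
except OSError as e:
    reproduced = (e.errno == errno.EADDRINUSE)
finally:
    s2.close()
    s1.close()

print(reproduced)
```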
REVIEW: http://review.gluster.org/15308 (glusterd: clean up old port and allocate new one on every restart) posted (#2) for review on release-3.8 by Avra Sengupta (asengupt)
All 3.8.x bugs are now reported against version 3.8 (without .x). For more information, see http://www.gluster.org/pipermail/gluster-devel/2016-September/050859.html
COMMIT: http://review.gluster.org/15308 committed in release-3.8 by Niels de Vos (ndevos)
------
commit 394c654cd26f232ed493442a5858017be0518b28
Author: Atin Mukherjee <amukherj>
Date: Mon Jul 25 19:09:08 2016 +0530

glusterd: clean up old port and allocate new one on every restart

Backport of http://review.gluster.org/#/c/15005/9.

GlusterD until now blindly assumed that the brick port which had already been allocated would be available for reuse, and that assumption is simply wrong.

Solution: On the first attempt, we thought GlusterD should check whether the already allocated brick ports are free and, if not, allocate a new port and pass it to the daemon. But with that approach there is a possibility that, if a PMAP_SIGNOUT is missed, the stale port would be handed back to clients, where connections would keep failing. Given that port allocation always starts from base_port, even if a new port has to be allocated for the daemons on every restart, the port range stays under control. So this fix first cleans up the old port using pmap_registry_remove (), if any, and then calls pmap_registry_alloc ().

This patch is being ported to the 3.8 branch because a brick process blindly re-using its old port, without registering with the pmap server, causes the snapd daemon to fail to start properly, even though snapd itself registers with the pmap server. With this patch, all brick processes and snapd register with the pmap server to get either the same port or a new one, avoiding port collisions.
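The strategy the commit message describes can be sketched as follows. This is a hypothetical Python analogue, not the actual glusterd C source: PortMap, registry_remove, registry_alloc, and restart_brick here merely mirror the roles of glusterd's pmap registry, pmap_registry_remove (), and pmap_registry_alloc (). The assumed base port of 49152 is illustrative.

```python
BASE_PORT = 49152  # illustrative base_port; glusterd scans upward from its own base

class PortMap:
    """Toy stand-in for glusterd's portmap registry."""

    def __init__(self, base_port=BASE_PORT):
        self.base_port = base_port
        self.registry = {}  # port -> daemon name

    def registry_remove(self, name):
        # Analogue of pmap_registry_remove(): drop any stale entry for
        # this daemon so its old port becomes allocatable again.
        for port in [p for p, n in self.registry.items() if n == name]:
            del self.registry[port]

    def registry_alloc(self, name):
        # Analogue of pmap_registry_alloc(): hand out the first free
        # port at or above base_port.
        port = self.base_port
        while port in self.registry:
            port += 1
        self.registry[port] = name
        return port

def restart_daemon(pmap, name):
    # The fix's core idea: on every restart, clean up the old
    # registration first, then allocate fresh. Because allocation
    # always scans from base_port, the daemon gets back its old port
    # if it is genuinely free, or the next free one otherwise, and
    # the port range stays bounded.
    pmap.registry_remove(name)
    return pmap.registry_alloc(name)
```

Under this model, a restarted brick whose old registration is cleaned up simply reclaims the lowest free port, so nothing ends up squatting on a port it never re-registered, which is the situation that blocked snapd.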
> Reviewed-on: http://review.gluster.org/15005
> Smoke: Gluster Build System <jenkins.org>
> NetBSD-regression: NetBSD Build System <jenkins.org>
> CentOS-regression: Gluster Build System <jenkins.org>
> Reviewed-by: Avra Sengupta <asengupt>
(cherry picked from commit c3dee6d35326c6495591eb5bbf7f52f64031e2c4)
Change-Id: If54a055d01ab0cbc06589dc1191d8fc52eb2c84f
BUG: 1369766
Signed-off-by: Atin Mukherjee <amukherj>
Reviewed-on: http://review.gluster.org/15308
Tested-by: Avra Sengupta <asengupt>
NetBSD-regression: NetBSD Build System <jenkins.org>
CentOS-regression: Gluster Build System <jenkins.org>
Smoke: Gluster Build System <jenkins.org>
Reviewed-by: Niels de Vos <ndevos>
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.8.6, please open a new bug report.

glusterfs-3.8.6 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://www.gluster.org/pipermail/packaging/2016-November/000217.html
[2] https://www.gluster.org/pipermail/gluster-users/
REVIEW: http://review.gluster.org/16234 (glusterd: clean up old port and allocate new one on every restart) posted (#1) for review on release-3.8-fb by Kevin Vigor (kvigor)