Bug 1323564 - [scale] Brick process does not start after node reboot
Summary: [scale] Brick process does not start after node reboot
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: 3.7.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Atin Mukherjee
QA Contact:
URL:
Whiteboard:
Depends On: 1322306 1322805 1333711
Blocks:
 
Reported: 2016-04-04 05:57 UTC by Atin Mukherjee
Modified: 2016-06-28 12:13 UTC
CC: 5 users

Fixed In Version: glusterfs-3.7.12
Doc Type: Bug Fix
Doc Text:
Clone Of: 1322805
Environment:
Last Closed: 2016-06-28 12:13:55 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Atin Mukherjee 2016-04-04 05:57:04 UTC
+++ This bug was initially created as a clone of Bug #1322805 +++

+++ This bug was initially created as a clone of Bug #1322306 +++

Description of problem:
Brick process does not start automatically after reboot of a node.

Starting the bricks with 'start force' also failed; it worked only after running it two or three times.

Setup:

4 node cluster running 120 dist-rep [2 x 2] volumes.

I have seen this issue with only three volumes (vol4, vol31, vol89), though other volumes also have bricks running on the same node.

Version-Release number of selected component (if applicable):
3.7.5-19

How reproducible:
Not always; the issue did not appear until the volume count crossed 100.

Steps to Reproduce:
1. Reboot a Gluster node
2. Check the status of gluster volume 

Actual results:
Brick process is not running

Expected results:
Brick process should come up after node reboot

Additional info:
Will attach sosreport and setup details

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-03-30 06:38:39 EDT ---

This bug is automatically being proposed for the current z-stream release of Red Hat Gluster Storage 3 by setting the release flag 'rhgs-3.1.z' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from SATHEESARAN on 2016-03-30 08:14:52 EDT ---

Neha,

Could you attach sosreports from both the nodes for analysis ?

--- Additional comment from Neha on 2016-03-31 01:00:51 EDT ---

Atin has already checked the setup. After the reboot, another process is consuming the port assigned to the brick process.

Brick logs:

[2016-03-30 12:21:33.014717] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.7.5 (args: /usr/sbin/glusterfsd -s 10.70.36.3 --volfile-id vol6.10.70.36.3.var-lib-heketi-mounts-vg_1d43e80bda3b78bff5f2bd4e788c5bb3-brick_24db93a1ba7c2021d30db0bb7523f1d4-brick -p /var/lib/glusterd/vols/vol6/run/10.70.36.3-var-lib-heketi-mounts-vg_1d43e80bda3b78bff5f2bd4e788c5bb3-brick_24db93a1ba7c2021d30db0bb7523f1d4-brick.pid -S /var/run/gluster/4d936da2aa9f690e51cb86f5cf49740a.socket --brick-name /var/lib/heketi/mounts/vg_1d43e80bda3b78bff5f2bd4e788c5bb3/brick_24db93a1ba7c2021d30db0bb7523f1d4/brick -l /var/log/glusterfs/bricks/var-lib-heketi-mounts-vg_1d43e80bda3b78bff5f2bd4e788c5bb3-brick_24db93a1ba7c2021d30db0bb7523f1d4-brick.log --xlator-option *-posix.glusterd-uuid=b5b78ebd-94f4-4a96-a9ba-6621e730a411 --brick-port 49161 --xlator-option vol6-server.listen-port=49161)
[2016-03-30 12:21:33.022880] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-03-30 12:21:48.059423] I [graph.c:269:gf_add_cmdline_options] 0-vol6-server: adding option 'listen-port' for volume 'vol6-server' with value '49161'
[2016-03-30 12:21:48.059447] I [graph.c:269:gf_add_cmdline_options] 0-vol6-posix: adding option 'glusterd-uuid' for volume 'vol6-posix' with value 'b5b78ebd-94f4-4a96-a9ba-6621e730a411'
[2016-03-30 12:21:48.059643] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2016-03-30 12:21:48.059666] I [MSGID: 115034] [server.c:403:_check_for_auth_option] 0-/var/lib/heketi/mounts/vg_1d43e80bda3b78bff5f2bd4e788c5bb3/brick_24db93a1ba7c2021d30db0bb7523f1d4/brick: skip format check for non-addr auth option auth.login./var/lib/heketi/mounts/vg_1d43e80bda3b78bff5f2bd4e788c5bb3/brick_24db93a1ba7c2021d30db0bb7523f1d4/brick.allow
[2016-03-30 12:21:48.059687] I [MSGID: 115034] [server.c:403:_check_for_auth_option] 0-/var/lib/heketi/mounts/vg_1d43e80bda3b78bff5f2bd4e788c5bb3/brick_24db93a1ba7c2021d30db0bb7523f1d4/brick: skip format check for non-addr auth option auth.login.11c450ad-efc3-49a8-952f-68f8b37eb539.password
[2016-03-30 12:21:48.060695] I [rpcsvc.c:2215:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2016-03-30 12:21:48.060762] W [MSGID: 101002] [options.c:957:xl_opt_validate] 0-vol6-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
[2016-03-30 12:21:48.060850] E [socket.c:769:__socket_server_bind] 0-tcp.vol6-server: binding to  failed: Address already in use
[2016-03-30 12:21:48.060861] E [socket.c:772:__socket_server_bind] 0-tcp.vol6-server: Port is already in use
[2016-03-30 12:21:48.060871] W [rpcsvc.c:1604:rpcsvc_transport_create] 0-rpc-service: listening on transport failed
[2016-03-30 12:21:48.060877] W [MSGID: 115045] [server.c:1060:init] 0-vol6-server: creation of listener failed
[2016-03-30 12:21:48.060892] E [MSGID: 101019] [xlator.c:428:xlator_init] 0-vol6-server: Initialization of volume 'vol6-server' failed, review your volfile again
[2016-03-30 12:21:48.060898] E [graph.c:322:glusterfs_graph_init] 0-vol6-server: initializing translator failed
[2016-03-30 12:21:48.060902] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2016-03-30 12:21:48.061387] W [glusterfsd.c:1236:cleanup_and_exit] (-->/usr/sbin/glusterfsd(mgmt_getspec_cbk+0x331) [0x7f07cc09e801] -->/usr/sbin/glusterfsd(glusterfs_process_volfp+0x126) [0x7f07cc0991a6] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x69) [0x7f07cc098789] ) 0-: received signum (0), shutting down
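
The bind failure in the log above is a plain EADDRINUSE: by the time the brick comes up, some other process already owns port 49161. A minimal, self-contained C sketch of the same failure mode (the port number is taken from the log; everything else is illustrative):

/* The second bind() to a port that is already in use fails with EADDRINUSE,
 * which is exactly the "Address already in use" error logged by
 * __socket_server_bind above. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

static int bind_port (int port)
{
        int fd = socket (AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;

        memset (&addr, 0, sizeof (addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl (INADDR_ANY);
        addr.sin_port = htons (port);

        if (bind (fd, (struct sockaddr *) &addr, sizeof (addr)) < 0) {
                fprintf (stderr, "bind to %d failed: %s\n", port,
                         strerror (errno));
                close (fd);
                return -1;
        }
        return fd;
}

int main (void)
{
        int first  = bind_port (49161);   /* whoever grabbed the port first */
        int second = bind_port (49161);   /* the late-starting brick: EADDRINUSE */

        if (first >= 0)
                close (first);
        if (second >= 0)
                close (second);
        return 0;
}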

--- Additional comment from Vijay Bellur on 2016-03-31 07:15:11 EDT ---

REVIEW: http://review.gluster.org/13865 (glusterd: Do not persist brickinfo ports) posted (#2) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-03-31 08:56:58 EDT ---

REVIEW: http://review.gluster.org/13865 (glusterd: Allocate fresh port on brick (re)start) posted (#3) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-04-01 00:33:04 EDT ---

REVIEW: http://review.gluster.org/13865 (glusterd: Allocate fresh port on brick (re)start) posted (#4) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-04-01 02:37:02 EDT ---

REVIEW: http://review.gluster.org/13865 (glusterd: Allocate fresh port on brick (re)start) posted (#5) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-04-01 03:05:09 EDT ---

REVIEW: http://review.gluster.org/13865 (glusterd: Allocate fresh port on brick (re)start) posted (#6) for review on master by Atin Mukherjee (amukherj)

--- Additional comment from Vijay Bellur on 2016-04-01 16:39:01 EDT ---

COMMIT: http://review.gluster.org/13865 committed in master by Jeff Darcy (jdarcy) 
------
commit 34899d71f21fd2b4c523b68ffb2d7c655c776641
Author: Atin Mukherjee <amukherj>
Date:   Thu Mar 31 11:01:53 2016 +0530

    glusterd: Allocate fresh port on brick (re)start
    
    There is no point in using the same port for a particular brick process
    through the entire volume life cycle, since there is no guarantee that the
    same port will still be free, or that no other application will have
    consumed it, across a glusterd/volume restart.
    
    We hit a race where, on glusterd restart, the daemon services start first,
    followed by the brick processes; by the time a brick process tries to bind
    to the port which glusterd allocated before the restart, that port has
    already been consumed by some other client such as NFS/SHD/...
    
    Note: This is a short-term solution; it reduces the race window but does not
    eliminate it completely. As a long-term solution the port allocation has to
    be done by glusterfsd, and the allocated port should be communicated back to
    glusterd for bookkeeping.
    
    Change-Id: Ibbd1e7ca87e51a7cd9cf216b1fe58ef7783aef24
    BUG: 1322805
    Signed-off-by: Atin Mukherjee <amukherj>
    Reviewed-on: http://review.gluster.org/13865
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Jeff Darcy <jdarcy>
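
To illustrate the change in behaviour described in the commit message, here is a minimal sketch; brick_info_t and registry_alloc_port() are hypothetical stand-ins, not glusterd's actual structures. The point is that the brick's port is no longer treated as persistent state: every (re)start asks the allocator for a fresh port.

#include <stdio.h>

typedef struct {
        char name[64];
        int  port;      /* valid only while the brick process is running */
} brick_info_t;

/* hypothetical allocator: hands out the next unused port in the brick range */
static int registry_alloc_port (void)
{
        static int next = 49152;
        return next++;
}

static void start_brick (brick_info_t *brick)
{
        /* Old behaviour: reuse brick->port persisted from the previous run,
         * racing with whoever grabbed that port in the meantime.
         * New behaviour: always allocate afresh at (re)start time. */
        brick->port = registry_alloc_port ();
        printf ("starting brick %s on port %d\n", brick->name, brick->port);
}

int main (void)
{
        brick_info_t brick = { .name = "vol6-brick1", .port = 0 };

        start_brick (&brick);   /* initial start */
        start_brick (&brick);   /* restart after reboot: gets a fresh port */
        return 0;
}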

Comment 1 Vijay Bellur 2016-04-04 05:57:58 UTC
REVIEW: http://review.gluster.org/13896 (glusterd: Allocate fresh port on brick (re)start) posted (#1) for review on release-3.7 by Atin Mukherjee (amukherj)

Comment 2 Vijay Bellur 2016-04-29 13:14:15 UTC
REVIEW: http://review.gluster.org/14117 (glusterd: try to connect on GF_PMAP_PORT_FOREIGN aswell) posted (#1) for review on release-3.7 by Prasanna Kumar Kalever (pkalever)

Comment 3 Vijay Bellur 2016-05-01 08:28:21 UTC
REVIEW: http://review.gluster.org/14117 (glusterd: try to connect on GF_PMAP_PORT_FOREIGN aswell) posted (#2) for review on release-3.7 by Prasanna Kumar Kalever (pkalever)

Comment 4 Vijay Bellur 2016-05-02 03:53:08 UTC
COMMIT: http://review.gluster.org/14117 committed in release-3.7 by Atin Mukherjee (amukherj) 
------
commit dc6178714cd84d3de894d0972c20950b59d30017
Author: Prasanna Kumar Kalever <prasanna.kalever>
Date:   Sun May 1 13:53:47 2016 +0530

    glusterd: try to connect on GF_PMAP_PORT_FOREIGN aswell
    
    This patch fixes a couple of things, mentioned below:
    
    1. Previously we tried connecting only on GF_PMAP_PORT_FREE ports in
    pmap_registry_alloc(); a foreign process may have freed its port by this
    time, so it is worth trying GF_PMAP_PORT_FOREIGN ports as well instead of
    wasting them all.
    
    2. Fix pmap_registry_remove() to mark the port as GF_PMAP_PORT_FREE.
    
    3. Add useful comments on the gf_pmap_port_type enum members.
    
    Backport of:
    > Change-Id: Id2aa7ad55e76ae3fdece21bed15792525ae33fe1
    > BUG: 1322805
    > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever>
    > Reviewed-on: http://review.gluster.org/14080
    > Tested-by: Prasanna Kumar Kalever <pkalever>
    > Smoke: Gluster Build System <jenkins.com>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.com>
    > Reviewed-by: Atin Mukherjee <amukherj>
    
    Change-Id: Ib7852aa1c55611e81c78341aace4d374d516f439
    BUG: 1323564
    Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever>
    Reviewed-on: http://review.gluster.org/14117
    Tested-by: Prasanna Kumar Kalever <pkalever>
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Atin Mukherjee <amukherj>
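
A minimal, self-contained sketch of point 1 above. The enum values echo the patch's naming, but the registry and allocator below are illustrative only, not the real pmap_registry_alloc(): a port once marked FOREIGN is probed with connect() and reclaimed if nothing is listening on it any more.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

enum port_type { PORT_FREE, PORT_FOREIGN, PORT_BRICKSERVER };

#define BASE_PORT 49152
#define NPORTS    8                     /* tiny range, just for the example */

static enum port_type registry[NPORTS]; /* indexed by (port - BASE_PORT) */

/* returns 1 if some process is still listening on the port, 0 otherwise */
static int port_in_use (int port)
{
        int fd = socket (AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;
        int ret;

        memset (&addr, 0, sizeof (addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons (port);
        inet_pton (AF_INET, "127.0.0.1", &addr.sin_addr);

        ret = connect (fd, (struct sockaddr *) &addr, sizeof (addr));
        close (fd);
        return ret == 0;                /* connect succeeded => still occupied */
}

static int alloc_brick_port (void)
{
        int i;

        for (i = 0; i < NPORTS; i++) {
                int port = BASE_PORT + i;

                if (registry[i] == PORT_BRICKSERVER)
                        continue;       /* already handed to one of our bricks */
                /* FREE ports are usable directly; FOREIGN ports are worth a
                 * probe, since the foreign process may have exited by now. */
                if (registry[i] == PORT_FREE ||
                    (registry[i] == PORT_FOREIGN && !port_in_use (port))) {
                        registry[i] = PORT_BRICKSERVER;
                        return port;
                }
        }
        return -1;                      /* range exhausted */
}

int main (void)
{
        registry[0] = PORT_FOREIGN;     /* pretend 49152 was once taken */
        printf ("allocated brick port %d\n", alloc_brick_port ());
        return 0;
}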

Comment 5 Vijay Bellur 2016-05-04 07:57:02 UTC
REVIEW: http://review.gluster.org/14205 (rpc: define client port range) posted (#1) for review on release-3.7 by Prasanna Kumar Kalever (pkalever)

Comment 6 Vijay Bellur 2016-05-04 09:44:25 UTC
REVIEW: http://review.gluster.org/14208 (glusterd: add defence mechanism to avoid brick port clashes) posted (#1) for review on release-3.7 by Prasanna Kumar Kalever (pkalever)

Comment 7 Vijay Bellur 2016-05-05 03:23:22 UTC
COMMIT: http://review.gluster.org/14208 committed in release-3.7 by Raghavendra G (rgowdapp) 
------
commit f9c59e29ccd770ae212da76b5e6f6ce3d8d09e61
Author: Prasanna Kumar Kalever <prasanna.kalever>
Date:   Wed Apr 27 19:12:19 2016 +0530

    glusterd: add defence mechanism to avoid brick port clashes
    
    Intro:
    Currently glusterd maintains the portmap registry, which contains the ports
    between 49152 and 65535 that are free to use. This registry is initialized
    once and updated as and when glusterd sees that ports are in use.
    
    Glusterd first looks in the portmap registry for a port marked FREE, checks
    whether that port is currently free using a connect() call, and then passes
    it to the brick process, which has to bind on it.
    
    Problem:
    There is a time gap between glusterd checking the port with connect() and
    the brick process actually binding on it. In this gap any other process may
    occupy the port, in which case the brick fails to bind and exits.
    
    Case 1:
    To avoid a gluster client process occupying the port supplied by glusterd:
    
    we have separated the client port map range from the brick port map range at
    http://review.gluster.org/#/c/13998/
    
    Case 2: (Handled by this patch)
    To avoid any other foreign process occupying the port supplied by glusterd:
    
    To handle the above situation, this patch implements a mechanism that returns
    the EADDRINUSE error code to glusterd, upon which a new port is allocated and
    the brick process is restarted with the newly allocated port.
    
    Note: In case of glusterd restarts, i.e. runner_run_nowait(), there is no way
    to handle Case 2, because runner_run_nowait() does not wait for the
    return/exit code of the executed command (the brick process). Hence, as of
    now, in that case we cannot know with which error the brick failed.
    
    This patch also fixes runner_end() to perform some cleanup w.r.t. return
    values.
    
    Backport of:
    > Change-Id: Iec52e7f5d87ce938d173f8ef16aa77fd573f2c5e
    > BUG: 1322805
    > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever>
    > Reviewed-on: http://review.gluster.org/14043
    > Tested-by: Prasanna Kumar Kalever <pkalever>
    > Reviewed-by: Atin Mukherjee <amukherj>
    > Smoke: Gluster Build System <jenkins.com>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.com>
    > Reviewed-by: Raghavendra G <rgowdapp>
    > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever>
    
    Change-Id: Ief247b4d4538c1ca03e73aa31beb5fa99853afd6
    BUG: 1323564
    Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever>
    Reviewed-on: http://review.gluster.org/14208
    Tested-by: Prasanna Kumar Kalever <pkalever>
    Smoke: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
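
A minimal sketch of the Case 2 defence described above, using plain sockets rather than glusterd's runner framework; brick_bind() and alloc_fresh_port() are illustrative stand-ins. If the bind fails with EADDRINUSE, a fresh port is allocated and the attempt is retried instead of giving up on the first clash.

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

/* stand-in for the brick process trying to bind its listener */
static int brick_bind (int port)
{
        int fd = socket (AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr;

        memset (&addr, 0, sizeof (addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl (INADDR_ANY);
        addr.sin_port = htons (port);

        if (bind (fd, (struct sockaddr *) &addr, sizeof (addr)) < 0) {
                int err = errno;
                close (fd);
                return -err;            /* e.g. -EADDRINUSE */
        }
        close (fd);
        return 0;
}

/* hypothetical fresh-port allocator */
static int alloc_fresh_port (void)
{
        static int next = 49152;
        return next++;
}

int main (void)
{
        int attempts, port, ret = -1;

        for (attempts = 0; attempts < 5; attempts++) {
                port = alloc_fresh_port ();
                ret = brick_bind (port);
                if (ret == 0) {
                        printf ("brick bound to port %d\n", port);
                        break;
                }
                if (ret == -EADDRINUSE) {
                        fprintf (stderr, "port %d busy, retrying\n", port);
                        continue;       /* defence: allocate a new port */
                }
                fprintf (stderr, "bind failed: %s\n", strerror (-ret));
                break;
        }
        return ret == 0 ? 0 : 1;
}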

Comment 8 Vijay Bellur 2016-05-05 03:24:44 UTC
COMMIT: http://review.gluster.org/14205 committed in release-3.7 by Raghavendra G (rgowdapp) 
------
commit 0b7631b44b829f44c2ebb3a2f2d01d97987e1fd7
Author: Prasanna Kumar Kalever <prasanna.kalever>
Date:   Wed May 4 13:25:06 2016 +0530

    rpc: define client port range
    
    Problem:
    When bind-insecure is 'off', all clients bind to secure ports. If all the
    secure ports are exhausted, the client can no longer bind to a secure port
    and instead ends up on a random port, which is obviously insecure.
    
    We have seen clients obtain port numbers in the range 49152-65535, which is
    actually reserved for bricks as part of glusterd's pmap_registry; this leads
    to port clashes between client and brick processes.
    
    Solution:
    If we define a separate port range for clients to use when the secure ports
    are exhausted, we avoid most port clashes among gluster processes.
    
    We are still prone to clashes with other non-gluster processes, but the
    chances are very low; that is a separate issue which will be handled in
    upcoming patches.
    
    > Change-Id: Ib5ce05991aa1290ccb17f6f04ffd65caf411feaf
    > BUG: 1322805
    > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever>
    > Reviewed-on: http://review.gluster.org/13998
    > Smoke: Gluster Build System <jenkins.com>
    > NetBSD-regression: NetBSD Build System <jenkins.org>
    > CentOS-regression: Gluster Build System <jenkins.com>
    > Reviewed-by: Atin Mukherjee <amukherj>
    > Reviewed-by: Raghavendra G <rgowdapp>
    > Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever>
    
    Change-Id: I712676d3e79145d78a17f2c361525e6ef82a4732
    BUG: 1323564
    Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever>
    Reviewed-on: http://review.gluster.org/14205
    Tested-by: Prasanna Kumar Kalever <pkalever>
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Raghavendra G <rgowdapp>
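
A minimal sketch of the bind policy the commit describes; the exact range boundaries below are illustrative examples, not the ones defined by the patch. The client first tries the secure (privileged) ports and, only when those are exhausted, falls back to a dedicated client range kept below the brick range 49152-65535, so it never wanders into the ports glusterd reserves for bricks.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

static int try_bind (int fd, int port)
{
        struct sockaddr_in addr;

        memset (&addr, 0, sizeof (addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl (INADDR_ANY);
        addr.sin_port = htons (port);

        return bind (fd, (struct sockaddr *) &addr, sizeof (addr));
}

/* returns the port the client socket ended up bound to, or -1 */
static int client_bind (int fd)
{
        int port;

        /* first choice: secure ports (needs CAP_NET_BIND_SERVICE / root) */
        for (port = 1023; port > 0; port--)
                if (try_bind (fd, port) == 0)
                        return port;

        /* fallback: an example client range kept below the brick range */
        for (port = 49151; port >= 20000; port--)
                if (try_bind (fd, port) == 0)
                        return port;

        return -1;
}

int main (void)
{
        int fd   = socket (AF_INET, SOCK_STREAM, 0);
        int port = client_bind (fd);

        printf ("client bound to port %d\n", port);
        close (fd);
        return port > 0 ? 0 : 1;
}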

Comment 9 Prasanna Kumar Kalever 2016-05-05 06:43:48 UTC
List of patches which address this bug:

http://review.gluster.org/14205
http://review.gluster.org/14208
http://review.gluster.org/14117

These patches have already been merged into the code repository.

Comment 10 Kaushal 2016-06-28 12:13:55 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.12, please open a new bug report.

glusterfs-3.7.12 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-devel/2016-June/049918.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

