Bug 1185322 - Enabling SSL for glusterd will cause all nodes to crash when connection is interrupted
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: rpc
Version: 3.6.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Jeff Darcy
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-01-23 12:59 UTC by Michael Wikberg
Modified: 2016-08-16 13:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-16 13:11:31 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
glusterd log (5.08 KB, text/plain)
2015-01-23 12:59 UTC, Michael Wikberg
POC patch (1.53 KB, patch)
2015-01-23 13:00 UTC, Michael Wikberg

Description Michael Wikberg 2015-01-23 12:59:34 UTC
Created attachment 983340
glusterd log

Description of problem:

I enabled SSL for volumes (client.ssl on, server.ssl on) and for management (touch /var/lib/glusterd/secure-access). After the initial successful restart of all glusterd instances, they all crashed once a single connection was reset.

I tried several OpenSSL and GlusterFS version combinations; in each case the cause appears to be priv->ssl_ssl being NULL when it is passed to libssl functions.

Version-Release number of selected component (if applicable):

3.6.1, 3.6.2, 7.3dev

How reproducible:
Set up SSL everywhere, then wait until a single connection fails (or a glusterd instance is restarted).

Steps to Reproduce:
1.
2.
3.

Actual results:
If any glusterd instance becomes unavailable, the glusterd instances connected to it crash, which in turn causes all other connected instances to crash.

Expected results:
Nodes are reconnected after connection failure.


Additional info:
I managed to get the nodes online again by adding a check for priv->ssl_ssl == NULL to ssl_do and ssl_teardown_connection in socket.c.
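
For illustration only, here is a minimal sketch of the kind of NULL guard described above. It is not the attached POC patch and not the actual GlusterFS code: the demo_* types and functions are hypothetical stand-ins, and demo_ssl_read only imitates a libssl call that dereferences the session handle. The point is that both the I/O path and the teardown path return early when ssl_ssl is NULL instead of passing it on.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an SSL session handle. */
typedef struct {
    int bytes_pending;
} demo_ssl_t;

/* Hypothetical stand-in for the transport's private data; only the
 * ssl_ssl field name is taken from the report. */
typedef struct {
    demo_ssl_t *ssl_ssl; /* NULL before setup and after teardown */
} demo_priv_t;

/* Imitates a library call that dereferences the handle and would
 * crash if handed NULL. */
static int demo_ssl_read(demo_ssl_t *ssl)
{
    return ssl->bytes_pending;
}

/* Allocates a session for the demo; returns 0 on success, -1 on failure. */
int demo_ssl_setup(demo_priv_t *priv, int bytes_pending)
{
    if (priv == NULL)
        return -1;
    demo_ssl_t *ssl = malloc(sizeof(*ssl));
    if (ssl == NULL)
        return -1;
    ssl->bytes_pending = bytes_pending;
    priv->ssl_ssl = ssl;
    return 0;
}

/* I/O path: refuse to touch the session if it is gone. */
int demo_ssl_do(demo_priv_t *priv)
{
    if (priv == NULL || priv->ssl_ssl == NULL) {
        errno = ENOTCONN;
        return -1;
    }
    return demo_ssl_read(priv->ssl_ssl);
}

/* Teardown path: safe to call more than once, or before setup. */
void demo_ssl_teardown(demo_priv_t *priv)
{
    if (priv == NULL || priv->ssl_ssl == NULL)
        return;
    free(priv->ssl_ssl);
    priv->ssl_ssl = NULL; /* later calls take the early-return path */
}
```

With this shape, a connection reset that triggers teardown twice, or an I/O attempt after teardown, reports an error to the caller rather than crashing the process.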

Comment 1 Michael Wikberg 2015-01-23 13:00:37 UTC
Created attachment 983341
POC patch

A POC patch that fixed my issues

Comment 2 Jeff Darcy 2015-01-27 13:14:58 UTC
Looks like the same basic problem as was fixed by http://review.gluster.org/#/c/9059/ but that patch has languished in review since November 5.  I'll push on people to get it through.  Thanks for the report, and especially for the patch!

Comment 3 Michael Wikberg 2015-01-27 15:23:18 UTC
Ah. Yes. That mentioned patch is definitely better =) My hack was just to get the services up and running asap again. Wonder how I missed that one while searching for solutions to the issue.. Oh well =)

Hope it gets backported to 3.6 also.

Comment 4 Mike McCune 2016-03-28 22:45:31 UTC
This bug was accidentally moved from POST to MODIFIED by an error in automation. Please contact mmccune with any questions.

Comment 5 Kaushal 2016-08-16 13:11:31 UTC
This bug is being closed as GlusterFS-3.6 is nearing its End-Of-Life and only important security bugs will be fixed. This bug has been fixed in more recent GlusterFS releases (v3.7.0 and above). If you still face this bug with the newer GlusterFS versions, please open a new bug.

