Bug 1160900 - cli segmentation fault with remote ssl (3.6.0)
Summary: cli segmentation fault with remote ssl (3.6.0)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: cli
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On: 1211643
Blocks:
Reported: 2014-11-05 23:06 UTC by Patrick Hemmer
Modified: 2015-05-14 17:44 UTC
CC List: 5 users

Fixed In Version: glusterfs-3.7.0
Clone Of:
Environment:
Last Closed: 2015-05-14 17:28:19 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Patrick Hemmer 2014-11-05 23:06:36 UTC
Description of problem:
While attempting to set up SSL for remote management, I managed to get the CLI to segfault.

Version-Release number of selected component (if applicable):
3.6.0


How reproducible:
Every time


Steps to Reproduce:
1. On 2 hosts: add an SSL cert, key, and CA bundle as /etc/ssl/glusterfs.{key,pem,ca}
2. On host 1: add `option transport.socket.ssl on` to /etc/glusterfs/glusterd.vol & restart glusterd
3. On host 2: run `gluster --remote-host=host1 peer probe host2`.
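Step 1 assumes TLS material already exists at the paths GlusterFS reads. For a throwaway test setup, a self-signed key/cert can be generated with openssl; the CN value and output directory below are illustrative (real hosts use /etc/ssl), and with self-signed certs the CA file is just the concatenation of the peers' certificates:

```shell
#!/bin/sh
# Generate self-signed TLS material for a GlusterFS test setup.
# On real hosts these files live in /etc/ssl; DIR is overridable here so
# the sketch can run unprivileged.
set -e
DIR=${GLUSTER_TLS_DIR:-./gluster-tls}
mkdir -p "$DIR"

# Private key
openssl genrsa -out "$DIR/glusterfs.key" 2048

# Self-signed certificate; the CN should match the name peers connect to
openssl req -new -x509 -key "$DIR/glusterfs.key" \
    -subj "/CN=host1" -days 365 -out "$DIR/glusterfs.pem"

# With a single self-signed cert, the CA bundle is just a copy of it;
# with multiple hosts, concatenate every peer's glusterfs.pem into it
cp "$DIR/glusterfs.pem" "$DIR/glusterfs.ca"
```

Copy the same three files to each host (or generate per-host certs and concatenate them all into every host's glusterfs.ca).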

Actual results:
Segmentation fault (core dumped)


Expected results:
No segmentation fault

Additional info:

Comment 1 Patrick Hemmer 2014-11-05 23:08:11 UTC
Here's the backtrace from GDB:

(gdb) bt
#0  0x00007f03692c98b0 in SSL_read () from /lib64/libssl.so.10
#1  0x00007f03694f8602 in ssl_do (buf=0x7f036d6bdd14, len=4, func=0x7f03692c98b0 <SSL_read>, 
    this=0x7f036d6b84e0, this=0x7f036d6b84e0) at socket.c:281
#2  0x00007f03694f8885 in __socket_ssl_readv (this=this@entry=0x7f036d6b84e0, 
    opvector=opvector@entry=0x7f036d6bdcc0, opcount=opcount@entry=1) at socket.c:416
#3  0x00007f03694f8b69 in __socket_cached_read (opcount=1, opvector=0x7f036d6bdcc0, 
    this=0x7f036d6b84e0) at socket.c:504
#4  __socket_rwv (this=this@entry=0x7f036d6b84e0, vector=<optimized out>, count=count@entry=1, 
    pending_vector=pending_vector@entry=0x7f036d6bdd08, 
    pending_count=pending_count@entry=0x7f036d6bdd10, bytes=bytes@entry=0x0, write=write@entry=0)
    at socket.c:578
#5  0x00007f03694fc211 in __socket_readv (bytes=0x0, pending_count=0x7f036d6bdd10, 
    pending_vector=0x7f036d6bdd08, count=1, vector=<optimized out>, this=0x7f036d6b84e0)
    at socket.c:671
#6  __socket_proto_state_machine (pollin=<synthetic pointer>, this=0x7f036d6b84e0) at socket.c:2049
#7  socket_proto_state_machine (pollin=<synthetic pointer>, this=0x7f036d6b84e0) at socket.c:2205
#8  socket_event_poll_in (this=this@entry=0x7f036d6b84e0) at socket.c:2221
#9  0x00007f03694fece4 in socket_event_handler (fd=<optimized out>, idx=0, 
    data=data@entry=0x7f036d6b84e0, poll_in=1, poll_out=4, poll_err=16) at socket.c:2338
#10 0x00007f036c3bb322 in event_dispatch_epoll_handler (i=<optimized out>, events=0x7f036d6efb90, 
    event_pool=0x7f036d68d410) at event-epoll.c:384
#11 event_dispatch_epoll (event_pool=0x7f036d68d410) at event-epoll.c:445
#12 0x00007f036c814f66 in main (argc=<optimized out>, argv=0x7fffc38a82f8) at cli.c:724

Comment 2 Patrick Hemmer 2014-11-05 23:13:22 UTC
One more detail I forgot: on the client (host 2) I had run `touch /var/lib/glusterd/secure-access`, but I had not done this on the server (host 1). After creating the file on the server (and bouncing glusterd), glusterd itself segfaults.

Comment 3 Jeff Darcy 2014-11-06 03:37:32 UTC
As far as I can tell, this problem has to do with sending requests on connections that failed - either CLI to glusterd, or glusterd to glusterd in the case of a "peer probe" operation.  That's a bug somewhere above the transport layer - probably a new one, which would explain why this wasn't seen before.  Still, it shouldn't cause a segfault.  I've written a patch to address that part, so it doesn't segfault, but there are other "infelicities" involved that I'm still looking into.

Also, it's not advisable to change the transport.socket.ssl option directly in the glusterd volfile.  It shouldn't be harmful, because that should only affect the I/O path and thus be irrelevant for glusterd, but it's also possible that it could interfere with the way we make decisions about when to use SSL and when not to (because in the portmapper case we need to switch based on two different settings).  The correct way to enable SSL for the management layer is via the secure-access file.  See this doc (unfortunately still in review) for more details.

http://review.gluster.org/#/c/8961/2/doc/admin-guide/en-US/markdown/admin_ssl.md
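Concretely, the supported enablement from that document amounts to creating the (empty) secure-access marker file and restarting glusterd; the path is per the comments above, made overridable here only so the sketch can run outside a real glusterd host:

```shell
#!/bin/sh
# Enable TLS on the management path the supported way: the mere presence
# of this empty file switches glusterd and the CLI to secure connections.
# WORKDIR is /var/lib/glusterd on real hosts; overridable for testing.
set -e
WORKDIR=${GLUSTERD_WORKDIR:-/var/lib/glusterd}
mkdir -p "$WORKDIR"
touch "$WORKDIR/secure-access"
# Then restart glusterd on that host for the change to take effect.
```

This replaces editing `transport.socket.ssl` in glusterd.vol, which as noted above may interfere with the portmapper-path decision about when to use SSL.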

Comment 4 Anand Avati 2014-11-06 03:38:46 UTC
REVIEW: http://review.gluster.org/9059 (socket: fix segfaults when TLS management connections fail) posted (#1) for review on master by Jeff Darcy (jdarcy)

Comment 5 Anand Avati 2014-11-06 12:20:24 UTC
REVIEW: http://review.gluster.org/9059 (socket: fix segfaults when TLS management connections fail) posted (#2) for review on master by Jeff Darcy (jdarcy)

Comment 6 Patrick Hemmer 2014-11-06 15:15:30 UTC
Fix works. Used the RPMs from http://build.gluster.org/job/glusterfs-devrpms/3762/

Thanks for the link to the doc as well. Though some feedback: having to create an empty file to enable the feature feels dirty.

Comment 7 Jeff Darcy 2014-11-06 15:46:31 UTC
It feels dirty because it is dirty, but there aren't many alternatives that work for the CLI as well as all of the daemons.  The CLI doesn't use a config file for permanent options, so they have to be re-specified on every invocation.  That means anyone switching from insecure to secure management communications has to change all of their scripts that use the CLI.  Environment variables aren't as foolproof, and wrappers are even dirtier in their own way.

That said, perhaps it would be better to make this the start of a permanent CLI config file.  If we define a place and a format, then we could use it for future options with similar behavior, even if this is the only one for now.  Thanks for the feedback.

BTW, I'm working on a better fix that will pass all of our regression tests, and hopefully also eliminate the long delay before the CLI command fails.  Keep watching this space.  ;)

Comment 8 Anand Avati 2014-11-06 16:01:04 UTC
REVIEW: http://review.gluster.org/9059 (socket: fix segfaults when TLS management connections fail) posted (#3) for review on master by Jeff Darcy (jdarcy)

Comment 9 Patrick Hemmer 2014-11-06 16:48:49 UTC
The most recent build still works fine (http://build.gluster.org/job/glusterfs-devrpms/3768/).


I actually think a CLI config makes sense. Now that you can have full remote management over SSL, I think using it might become more common practice. Potential things I can think of that would go in the config:
* Path to glusterd unix domain socket or address of remote host.
* Whether to enable SSL
* Path to SSL certificate/key/CA

The docs even indicate that you can authenticate against bricks with a username & password (https://forge.gluster.org/glusterfs-core/glusterfs/blobs/master/doc/authentication.txt). I don't see a way to do that with the management connection, though; perhaps that would be a future feature, in which case you'd also have config params for username & password.
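Pulling those bullets together, such a CLI config might look like the following sketch. Everything here is hypothetical: the file path, section names, and keys are all invented for illustration, since no such file exists yet:

```ini
; /etc/glusterfs/cli.conf -- hypothetical; neither this path nor these keys exist
[connection]
remote-host = host1
; or, for local management:
; socket = /var/run/glusterd.socket

[ssl]
enabled = true
certificate = /etc/ssl/glusterfs.pem
key = /etc/ssl/glusterfs.key
ca = /etc/ssl/glusterfs.ca
```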

Comment 10 Jeff Darcy 2014-11-06 18:18:19 UTC
OK, this latest one looks good to go.  Not sure if that's the one you already had.

Comment 11 Anand Avati 2015-01-27 14:04:03 UTC
COMMIT: http://review.gluster.org/9059 committed in master by Vijay Bellur (vbellur) 
------
commit 0b9a6a63b50e0c4947233aee33fc86f603f77dd1
Author: Jeff Darcy <jdarcy>
Date:   Wed Nov 5 22:37:48 2014 -0500

    socket: fix segfaults when TLS management connections fail
    
    Change-Id: I1fd085b04ad1ee68c982d3736b322c19dd12e071
    BUG: 1160900
    Signed-off-by: Jeff Darcy <jdarcy>
    Reviewed-on: http://review.gluster.org/9059
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Harshavardhana <harsha>
    Reviewed-by: Vijay Bellur <vbellur>

Comment 12 Niels de Vos 2015-05-14 17:28:19 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
