1244415 – Enabling management SSL on a gluster cluster already configured can crash glusterd


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	core
Sub Component:
Version:	rhgs-3.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	unspecified
Target Milestone:	---
Target Release:	RHGS 3.1.1
Assignee:	Kaushal
QA Contact:	krishnaram Karthick
Docs Contact:
URL:
Whiteboard:
Depends On:	1243722
Blocks:	1251815

Reported:	2015-07-18 15:18 UTC by krishnaram Karthick
Modified:	2016-09-17 14:39 UTC
Fixed In Version:	glusterfs-3.7.1-12
Doc Type:	Bug Fix
Doc Text:	Previously, glusterd was not fully initializing its transports when using management encryption. As a consequence, an unencrypted incoming connection would cause glusterd to crash. With this fix, the transports are now fully initialized and additional checks have been added to handle unencrypted incoming connections. Now, glusterd no longer crashes on incoming unencrypted connections when using management encryption.
Clone Of:
Environment:
Last Closed:	2015-10-05 07:12:09 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2015:1845	0	normal	SHIPPED_LIVE	Moderate: Red Hat Gluster Storage 3.1 update	2015-10-05 11:06:22 UTC

Description krishnaram Karthick 2015-07-18 15:18:13 UTC

Description of problem:
On a gluster storage cluster whose nodes already have volumes configured and mounted on clients, enabling management SSL crashes glusterd. The backtrace from the core is the same as in bug 1243722.

Version-Release number of selected component (if applicable):
glusterfs-3.7.1-10.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Have a few gluster volumes already created on the system.
2. Stop the glusterd service.
3. Enable management SSL: touch /var/lib/glusterd/secure-access
4. Start the glusterd service.
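
The steps above can be sketched as a small shell script. This is only a sketch: it assumes a systemd-based host (the reported build is el7) where the service unit is named glusterd. The DRY_RUN variable and the run helper are hypothetical additions, not part of gluster, so that the commands can be previewed without touching a live node.

```shell
#!/bin/sh
# Sketch of the reproduction steps from this report.
# Assumptions: systemd host, service unit named "glusterd".
# DRY_RUN and run() are illustrative helpers, not gluster tooling.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

enable_mgmt_ssl_repro() {
    run systemctl stop glusterd                  # step 2
    run touch /var/lib/glusterd/secure-access    # step 3
    run systemctl start glusterd                 # step 4
}

enable_mgmt_ssl_repro
```

With the default DRY_RUN=1 the script only prints the three commands. Setting DRY_RUN=0 would run them for real, which on an affected build is expected to hit the crash described here.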


Actual results:
glusterd crashes. 

Expected results:
glusterd shouldn't crash. Management SSL should be enabled.

Additional info:

Backtrace from the core:

#0  list_del (old=0x7f7bc00024c8) at ../../../../libglusterfs/src/list.h:76
#1  glusterd_rpcsvc_notify (rpc=<optimized out>, xl=0x7f7be5ee40f0, event=<optimized out>, data=0x7f7bc0001a10)
    at glusterd.c:347
#2  0x00007f7be428b35f in rpcsvc_handle_disconnect (svc=0x7f7be5eef110, trans=trans@entry=0x7f7bc0001a10)
    at rpcsvc.c:754
#3  0x00007f7be428d718 in rpcsvc_notify (trans=0x7f7bc0001a10, mydata=<optimized out>, event=<optimized out>, 
    data=0x7f7bc0001a10) at rpcsvc.c:792
#4  0x00007f7be428f873 in rpc_transport_notify (this=this@entry=0x7f7bc0001a10, 
    event=event@entry=RPC_TRANSPORT_DISCONNECT, data=data@entry=0x7f7bc0001a10) at rpc-transport.c:543
#5  0x00007f7bd6d5de64 in socket_poller (ctx=0x7f7bc0001a10) at socket.c:2582
#6  0x00007f7be332ddf5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f7be2c741ad in clone () from /lib64/libc.so.6

Comment 3 Kaushal 2015-07-23 07:02:27 UTC

(In reply to krishnaram Karthick from comment #0)
> Description of problem:
> On a gluster storage which has already nodes configured with volumes and
> mounted on clients, enabling management SSL crashes glusterd. Backtrace of
> the core is same as 1243722
> 
> Version-Release number of selected component (if applicable):
> glusterfs-3.7.1-10.el7rhgs.x86_64
> 
> How reproducible:
> Always
> 
> Steps to Reproduce:
> 1. Have few gluster volumes already on the system
> 2. stop glusterd service

The documentation [1] states that all Gluster processes need to be stopped, not just GlusterD. Stopping just GlusterD is not enough in this case.

Though the crash is a bug, it wouldn't have happened if the steps had been followed correctly. I'd recommend closing this bug as invalid.

[1]: http://jenkinscat.gsslab.pnq.redhat.com:8080/view/Gluster/job/doc-Red_Hat_Gluster_Storage-3.1-Administration_Guide%20(html-single)/136/artifact/tmp/en-US/html-single/index.html#idm140449326918528
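
To illustrate the point about stopping every Gluster process, the following sketch checks for leftover processes before the secure-access file is created. It assumes the processes are named glusterd, glusterfs and glusterfsd. The function names filter_gluster_procs and safe_to_enable_mgmt_ssl are hypothetical and chosen for illustration only.

```shell
#!/bin/sh
# Reads process names (one per line) on stdin and prints only the
# gluster ones. The three names below are an assumption about how
# the gluster daemons, clients and bricks appear in the process table.
filter_gluster_procs() {
    grep -E '^(glusterd|glusterfs|glusterfsd)$' || true
}

# Returns 0 when no gluster process is left, 1 otherwise.
safe_to_enable_mgmt_ssl() {
    remaining="$(ps -e -o comm= | filter_gluster_procs)"
    if [ -n "$remaining" ]; then
        echo "gluster processes still running:" >&2
        echo "$remaining" >&2
        return 1
    fi
    return 0
}
```

A caller could then guard the change with something like `safe_to_enable_mgmt_ssl && touch /var/lib/glusterd/secure-access`, so the file is only created once nothing gluster-related is still running on the node.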

> 3. enable management ssl. touch /var/lib/glusterd/secure-access
> 4. start glusterd service

Comment 5 Atin Mukherjee 2015-08-14 06:47:57 UTC

The downstream patch https://code.engineering.redhat.com/gerrit/#/c/53141/ is now merged.

Comment 6 krishnaram Karthick 2015-09-11 16:23:16 UTC

The crash is no longer seen when enabling management SSL on an existing setup. The steps followed to verify were:

1. Have a few gluster volumes already created on the system.
2. Stop the glusterd service.
3. Enable management SSL: touch /var/lib/glusterd/secure-access
4. Start the glusterd service.

No gluster crash is seen. Mounting succeeds when the proper steps are followed as per the documentation. Marking the bug as verified.

Build used to verify: glusterfs-3.7.1-14.el7rhgs.x86_64
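
As a rough way to tell whether a given build should contain the fix, the version strings in this report can be compared against the "Fixed In Version" value. The sketch below assumes GNU sort with the -V (version sort) option, which only approximates RPM's own version comparison. The helper names version_ge and check_build are hypothetical.

```shell
#!/bin/sh
# Exit status 0 when version $1 sorts at or after version $2.
# Relies on GNU sort -V; this approximates, but is not identical to,
# RPM's version comparison rules.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

FIXED_IN="3.7.1-12"   # taken from the "Fixed In Version" field

check_build() {
    if version_ge "$1" "$FIXED_IN"; then
        echo "$1: contains the fix"
    else
        echo "$1: predates the fix"
    fi
}

check_build "3.7.1-10.el7rhgs"   # build the bug was reported against
check_build "3.7.1-14.el7rhgs"   # build used to verify
```

On a live node the installed version could be fed in with `check_build "$(rpm -q --qf '%{VERSION}-%{RELEASE}' glusterfs)"`, assuming the package is named glusterfs as in the version strings above.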

Comment 8 errata-xmlrpc 2015-10-05 07:12:09 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1845.html
