Bug 1398798 - [Ganesha+SSL] : Ganesha crashes on all nodes on volume restarts
Summary: [Ganesha+SSL] : Ganesha crashes on all nodes on volume restarts
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: libgfapi
Version: rhgs-3.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.2.0
Assignee: rjoseph
QA Contact: Ambarish
URL:
Whiteboard:
Depends On:
Blocks: 1351528 1404181
Reported: 2016-11-26 06:31 UTC by Ambarish
Modified: 2017-03-28 06:56 UTC (History)

Fixed In Version: glusterfs-3.8.4-10
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1404181 (view as bug list)
Environment:
Last Closed: 2017-03-23 05:51:25 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:0486 0 normal SHIPPED_LIVE Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update 2017-03-23 09:18:45 UTC

Description Ambarish 2016-11-26 06:31:31 UTC
Description of problem:
-----------------------

4-node cluster, 4 clients. All of them are TLS authenticated.

Both I/O and management encryption were enabled:

[2016-11-26 05:39:43.825465] I [socket.c:4021:socket_init] 0-testvol-client-3: SSL support on the I/O path is ENABLED
[2016-11-26 05:39:43.825503] I [socket.c:4024:socket_init] 0-testvol-client-3: SSL support for glusterd is ENABLED


When I tried to restart the volume after setting up the Ganesha cluster, the Ganesha process crashed on all 4 nodes and dumped core.

************************
BT from core(on 3 nodes)
************************

#0  0x00007efe2414e7ac in SSL_get_error () from /lib64/libssl.so.10
#1  0x00007efe1fded772 in ssl_do (buf=buf@entry=0x0, len=len@entry=0, func=0x7efe2414f830 <SSL_connect>, 
    this=0x7efe21b8efb0, this=0x7efe21b8efb0) at socket.c:238
#2  0x00007efe1fdf156f in ssl_setup_connection (this=this@entry=0x7efe21b8efb0, server=0) at socket.c:322
#3  0x00007efe1fdf1b8c in socket_poller (ctx=0x7efe21b8efb0) at socket.c:2433
#4  0x00007efe34e7ddc5 in start_thread () from /lib64/libpthread.so.0
#5  0x00007efe3454c73d in clone () from /lib64/libc.so.6

************************
BT from core(on 1 node)
************************
(gdb) 
#0  0x00007fc76ff7202c in EC_GROUP_get_degree () from /lib64/libcrypto.so.10
#1  0x00007fc762d31bfa in ssl3_send_client_key_exchange () from /lib64/libssl.so.10
#2  0x00007fc762d33cb0 in ssl3_connect () from /lib64/libssl.so.10
#3  0x00007fc75bded90a in ssl_do (buf=buf@entry=0x0, len=len@entry=0, func=0x7fc762d52830 <SSL_connect>, 
    this=0x7fc75db8f050, this=0x7fc75db8f050) at socket.c:236
#4  0x00007fc75bdf156f in ssl_setup_connection (this=this@entry=0x7fc75db8f050, server=0) at socket.c:322
#5  0x00007fc75bdf1b8c in socket_poller (ctx=0x7fc75db8f050) at socket.c:2433
#6  0x00007fc774281dc5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fc77395073d in clone () from /lib64/libc.so.6
(gdb) 


The "t a a bt" output from 256 threads is far too long to paste or attach here, so I've copied the cores to rhsqerepo. Exact location in description.


Version-Release number of selected component (if applicable):
-------------------------------------------------------------

nfs-ganesha-gluster-2.4.1-1.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-5.el7rhgs.x86_64

openssl-1.0.1e-60.el7.x86_64


How reproducible:
-----------------

Every time I try.

Steps to Reproduce:
-------------------

1. TLS authenticate all servers and clients. Enable management and I/O encryption.

2. After Ganesha cluster setup and export, restart the volume a couple of times.

Actual results:
---------------

Ganesha crashes and dumps core on all 4 nodes.

Expected results:
-----------------

No crashes.

Additional info:
----------------

OS : RHEL7.3

*Vol Config* :
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: 973991f6-8bdf-4b38-bef9-2abeaa829446
Status: Stopped
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gqas013.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick0
Brick2: gqas005.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick1
Brick3: gqas006.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick2
Brick4: gqas011.sbu.lab.eng.bos.redhat.com:/bricks/testvol_brick3
Options Reconfigured:
ganesha.enable: on
features.cache-invalidation: off
server.ssl: on
client.ssl: on
auth.ssl-allow: *
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
performance.stat-prefetch: off
server.allow-insecure: on
nfs-ganesha: enable
cluster.enable-shared-storage: enable

Comment 3 Soumya Koduri 2016-11-28 08:57:43 UTC
The issue seems to be with SSL connection setup. Adjusting the components accordingly. Since there is no SSL component, moving it to core.

Comment 4 Mohit Agrawal 2016-11-28 10:27:59 UTC
Hi,

As per the core dump shared at the above location, ssl3_send_client_key_exchange appears to be crashing because the SSL object is NULL.

>>>>>>>>>>>>>>>>>>>>


(gdb) bt
#0  0x00007fc76ff7202c in EC_GROUP_get_degree () from /lib64/libcrypto.so.10
#1  0x00007fc762d31bfa in ssl3_send_client_key_exchange () from /lib64/libssl.so.10
#2  0x00007fc762d33cb0 in ssl3_connect () from /lib64/libssl.so.10
#3  0x00007fc75bded90a in ssl_do (buf=buf@entry=0x0, len=len@entry=0, func=0x7fc762d52830 <SSL_connect>, 
    this=0x7fc75db8f050, this=0x7fc75db8f050) at socket.c:236
#4  0x00007fc75bdf156f in ssl_setup_connection (this=this@entry=0x7fc75db8f050, server=0) at socket.c:322
#5  0x00007fc75bdf1b8c in socket_poller (ctx=0x7fc75db8f050) at socket.c:2433
#6  0x00007fc774281dc5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007fc77395073d in clone () from /lib64/libc.so.6
(gdb) f 4
#4  0x00007fc75bdf156f in ssl_setup_connection (this=this@entry=0x7fc75db8f050, server=0) at socket.c:322
322			ret = ssl_connect_one(this);
(gdb) p priv
$7 = (socket_private_t *) 0x7fc75db15e40
(gdb) p priv->ssl_ssl
$8 = (SSL *) 0x0
(gdb) 

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Although the code already checks the object after SSL_new, we do not verify the value after the call to SSL_set_bio. I think we need to verify it after SSL_set_bio as well.

ssl_setup_connection() 
{
........

        priv->ssl_ssl = SSL_new(priv->ssl_ctx);
        if (!priv->ssl_ssl) {
                gf_log(this->name,GF_LOG_ERROR,"SSL_new failed");
                ssl_dump_error_stack(this->name);
                goto done;
        }
        priv->ssl_sbio = BIO_new_socket(priv->sock,BIO_NOCLOSE);
        if (!priv->ssl_sbio) {
                gf_log(this->name,GF_LOG_ERROR,"BIO_new_socket failed");
                ssl_dump_error_stack(this->name);
                goto free_ssl;
        }
        SSL_set_bio(priv->ssl_ssl,priv->ssl_sbio,priv->ssl_sbio);

..............
}

Upstream already has a vulnerability reported in which the ssl3_send_client_key_exchange function crashes on a NULL value:
http://www.cvedetails.com/cve/CVE-2014-3470/
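
The check proposed above can be sketched as follows. This is a minimal, self-contained illustration and not the GlusterFS code: fake_ssl_t, fake_priv_t, fake_connect and guarded_connect are hypothetical stand-ins for SSL, socket_private_t, SSL_connect and the calling path, so that the example builds without OpenSSL. It assumes the handshake path can simply refuse to proceed when the SSL pointer is NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the OpenSSL or GlusterFS types. */
typedef struct fake_ssl {
        int handshakes;         /* number of handshake attempts made */
} fake_ssl_t;

typedef struct fake_priv {
        fake_ssl_t *ssl_ssl;    /* mirrors priv->ssl_ssl in the excerpt above */
} fake_priv_t;

/* Stand-in for SSL_connect(): records the attempt and reports success. */
static int
fake_connect (fake_ssl_t *ssl)
{
        assert (ssl != NULL);   /* the real call would dereference ssl */
        ssl->handshakes++;
        return 1;
}

/* Returns the handshake result, or -1 when there is no SSL object to use. */
static int
guarded_connect (fake_priv_t *priv)
{
        if (!priv || !priv->ssl_ssl)
                return -1;
        return fake_connect (priv->ssl_ssl);
}
```

A guard like this only catches a pointer that is already NULL at the moment it is checked. It does not by itself protect against another thread releasing the object while the handshake is in progress.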


Regards
Mohit Agrawal

Comment 5 Mohit Agrawal 2016-11-29 09:02:15 UTC
Hi,


After debugging further in gdb why the SSL API is crashing, I found that the ctx used by the socket_poller thread is being destroyed by pub_glfs_fini in thread 281.


>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

(gdb) bt
#0  0x00007ff8eea4402c in EC_GROUP_get_degree () from /lib64/libcrypto.so.10
#1  0x00007ff8e164bbfa in ssl3_send_client_key_exchange () from /lib64/libssl.so.10
#2  0x00007ff8e164dcb0 in ssl3_connect () from /lib64/libssl.so.10
#3  0x00007ff8e209a90a in ssl_do (buf=buf@entry=0x0, len=len@entry=0, func=0x7ff8e166c830 <SSL_connect>, 
    this=0x7ff8ddb8f080, this=0x7ff8ddb8f080) at socket.c:236
#4  0x00007ff8e209e56f in ssl_setup_connection (this=this@entry=0x7ff8ddb8f080, server=0) at socket.c:322
#5  0x00007ff8e209eb8c in socket_poller (ctx=0x7ff8ddb8f080) at socket.c:2433
#6  0x00007ff97f382dc5 in start_thread () from /lib64/libpthread.so.0
#7  0x00007ff97ea5173d in clone () from /lib64/libc.so.6
(gdb) f 5
#5  0x00007ff8e209eb8c in socket_poller (ctx=0x7ff8ddb8f080) at socket.c:2433
2433	                cname = ssl_setup_connection(this,priv->connected);
(gdb) p this->ctx
$3 = (glusterfs_ctx_t *) 0x7ff8f0001250
(gdb) thread 281
[Switching to thread 281 (Thread 0x7ff8f58ae700 (LWP 23008))]
#0  0x00007ff97f386a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007ff97f386a82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ff8ef718199 in event_dispatch_destroy (event_pool=0x7ff8f00d75d0) at event.c:261
#2  0x00007ff8ef9c4611 in pub_glfs_fini (fs=0x7ff8f0000bc0) at glfs.c:1218
#3  0x00007ff8efdf0561 in export_release (exp_hdl=0x7ff8f0000a80)
    at /usr/src/debug/nfs-ganesha-2.4.1/src/FSAL/FSAL_GLUSTER/export.c:88
#4  0x00007ff980ebe58d in mdcache_exp_release (exp_hdl=0x7ff8f007f2e0)
    at /usr/src/debug/nfs-ganesha-2.4.1/src/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_export.c:170
#5  0x00007ff980e9e61b in free_export_resources (export=0x7ff8f0098b58)
    at /usr/src/debug/nfs-ganesha-2.4.1/src/support/exports.c:2067
#6  0x00007ff980eae339 in free_export (export=0x7ff8f0098b58)
    at /usr/src/debug/nfs-ganesha-2.4.1/src/support/export_mgr.c:252
#7  0x00007ff980eaf845 in put_gsh_export (export=export@entry=0x7ff8f0098b58)
    at /usr/src/debug/nfs-ganesha-2.4.1/src/support/export_mgr.c:632
#8  0x00007ff980eaff94 in gsh_export_removeexport (args=<optimized out>, reply=<optimized out>, error=0x7ff8f58ad2e0)
    at /usr/src/debug/nfs-ganesha-2.4.1/src/support/export_mgr.c:1096
#9  0x00007ff980ed1229 in dbus_message_entrypoint (conn=0x7ff98223d640, msg=0x7ff98223d920, user_data=<optimized out>)
    at /usr/src/debug/nfs-ganesha-2.4.1/src/dbus/dbus_server.c:512
#10 0x00007ff98076bc76 in _dbus_object_tree_dispatch_and_unlock () from /lib64/libdbus-1.so.3
#11 0x00007ff98075de49 in dbus_connection_dispatch () from /lib64/libdbus-1.so.3
#12 0x00007ff98075e0e2 in _dbus_connection_read_write_dispatch () from /lib64/libdbus-1.so.3
#13 0x00007ff980ed22f1 in gsh_dbus_thread (arg=<optimized out>)
    at /usr/src/debug/nfs-ganesha-2.4.1/src/dbus/dbus_server.c:737
#14 0x00007ff97f382dc5 in start_thread () from /lib64/libpthread.so.0
#15 0x00007ff97ea5173d in clone () from /lib64/libc.so.6
(gdb) f 2
#2  0x00007ff8ef9c4611 in pub_glfs_fini (fs=0x7ff8f0000bc0) at glfs.c:1218
1218	                if (event_dispatch_destroy (ctx->event_pool) != 0)
(gdb) p ctx
$4 = (glusterfs_ctx_t *) 0x7ff8f0001250


>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

I think the issue needs to be fixed in the nfs-ganesha code.
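
The pattern seen in the two backtraces above, where one thread tears down a context that another thread is still using, is commonly avoided by reference counting the shared object. The following is a minimal sketch of that general technique and not the patch for this bug: demo_ctx_t, demo_ctx_new, demo_ctx_ref and demo_ctx_unref are hypothetical names, and freed_flag exists only so the example can show when the release actually happens.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical context type; this is NOT glusterfs_ctx_t. */
typedef struct demo_ctx {
        atomic_int refs;        /* one reference per user, including the owner */
        bool      *freed_flag;  /* demo only: set when the context is released */
} demo_ctx_t;

/* Create a context holding a single reference for its owner. */
static demo_ctx_t *
demo_ctx_new (bool *freed_flag)
{
        demo_ctx_t *ctx = malloc (sizeof (*ctx));

        if (!ctx)
                return NULL;
        atomic_init (&ctx->refs, 1);
        ctx->freed_flag = freed_flag;
        *freed_flag = false;
        return ctx;
}

/* Taken by any thread (e.g. a poller) before it starts using the context. */
static demo_ctx_t *
demo_ctx_ref (demo_ctx_t *ctx)
{
        assert (ctx != NULL);
        atomic_fetch_add (&ctx->refs, 1);
        return ctx;
}

/* Dropped by every user when done; the last one out frees the context. */
static void
demo_ctx_unref (demo_ctx_t *ctx)
{
        /* atomic_fetch_sub returns the value held before the decrement. */
        if (atomic_fetch_sub (&ctx->refs, 1) == 1) {
                *ctx->freed_flag = true;
                free (ctx);
        }
}
```

With this arrangement, a teardown path that drops the owner's reference does not free the context while a poller still holds its own reference; the memory is released only when the poller drops it too.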

Regards
Mohit Agrawal

Comment 6 Atin Mukherjee 2016-11-29 09:36:54 UTC
Based on comment 5, moving this BZ back to nfs-ganesha.

Comment 7 Atin Mukherjee 2016-11-29 09:37:24 UTC
Soumya - could you check comment 5 please?

Comment 8 Soumya Koduri 2016-11-29 09:51:46 UTC
Poornima is looking at the core. We will update our findings.

Comment 12 Vivek Das 2016-12-01 14:19:29 UTC
Hi, I am facing the same crash with the steps mentioned below:

On an SSL setup with both I/O and management encryption enabled, mount and unmount the volume share on Windows at least 20 times.

(gdb) t a a bt

Thread 17 (Thread 0x7f40df7fe700 (LWP 23595)):
#0  0x00007f4137e8ad13 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f41209b73f0 in event_dispatch_epoll_worker (data=0x7f40e400b120) at event-epoll.c:664
#2  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 16 (Thread 0x7f40f8183700 (LWP 2736)):
#0  0x00007f413be6ba82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f411842990c in iot_worker (data=0x7f40d891ec30) at io-threads.c:180
#2  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 15 (Thread 0x7f40f8284700 (LWP 2735)):
#0  0x00007f413be6ba82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f411842990c in iot_worker (data=0x7f40d891ec30) at io-threads.c:180
#2  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 14 (Thread 0x7f40f8385700 (LWP 26814)):
#0  0x00007f413be6ba82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f411842990c in iot_worker (data=0x7f40d891ec30) at io-threads.c:180
#2  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 13 (Thread 0x7f413c2278c0 (LWP 23337)):
#0  0x00007f413be68ef7 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007f412096b32a in gf_timer_registry_destroy (ctx=0x7f413e3d5a20) at timer.c:264
#2  0x00007f41210679a0 in pub_glfs_fini (fs=0x7f413f05c1c0) at glfs.c:1228
#3  0x00007f412128a1a6 in vfs_gluster_disconnect () from /usr/lib64/samba/vfs/glusterfs.so
#4  0x00007f413b7c41f1 in close_cnum () from /usr/lib64/samba/libsmbd-base-samba4.so
#5  0x00007f413b7f2814 in smbXsrv_tcon_disconnect () from /usr/lib64/samba/libsmbd-base-samba4.so
#6  0x00007f413b7d9c2f in smbd_smb2_tdis_wait_done () from /usr/lib64/samba/libsmbd-base-samba4.so
#7  0x00007f4138158c34 in tevent_common_loop_immediate () from /lib64/libtevent.so.0
#8  0x00007f413973121c in run_events_poll () from /lib64/libsmbconf.so.0
#9  0x00007f4139731504 in s3_event_loop_once () from /lib64/libsmbconf.so.0
#10 0x00007f413815840d in _tevent_loop_once () from /lib64/libtevent.so.0
#11 0x00007f41381585ab in tevent_common_loop_wait () from /lib64/libtevent.so.0
#12 0x00007f413b7c1731 in smbd_process () from /usr/lib64/samba/libsmbd-base-samba4.so
#13 0x00007f413c2aa304 in smbd_accept_connection ()
#14 0x00007f413973134c in run_events_poll () from /lib64/libsmbconf.so.0
#15 0x00007f41397315a0 in s3_event_loop_once () from /lib64/libsmbconf.so.0
#16 0x00007f413815840d in _tevent_loop_once () from /lib64/libtevent.so.0
#17 0x00007f41381585ab in tevent_common_loop_wait () from /lib64/libtevent.so.0
#18 0x00007f413c2a5ad4 in main ()

Thread 12 (Thread 0x7f40bdad6700 (LWP 13081)):
#0  0x00007f413be6ebdd in nanosleep () from /lib64/libpthread.so.0
#1  0x00007f412096ace6 in gf_timer_proc (data=0x7f413f0115a0) at timer.c:176
#2  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 11 (Thread 0x7f40f8587700 (LWP 24222)):
#0  0x00007f413be6ba82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f411842990c in iot_worker (data=0x7f40d891ec30) at io-threads.c:180
#2  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 10 (Thread 0x7f412324c700 (LWP 24221)):
#0  0x00007f413be6ba82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f411842990c in iot_worker (data=0x7f40d891ec30) at io-threads.c:180
#2  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 9 (Thread 0x7f40de7fc700 (LWP 23718)):
#0  0x00007f4137e8ad13 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f41209b73f0 in event_dispatch_epoll_worker (data=0x7f40d9abd380) at event-epoll.c:664
#2  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x7f410a8b3700 (LWP 23717)):
#0  0x00007f413be6ba82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f411842990c in iot_worker (data=0x7f40d891ec30) at io-threads.c:180
#2  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x7f40dffff700 (LWP 23594)):
#0  0x00007f413be68ef7 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007f41209b7998 in event_dispatch_epoll (event_pool=0x7f413f0a4a40) at event-epoll.c:758
#2  0x00007f4121065fd4 in glfs_poller (data=<optimized out>) at glfs.c:610
#3  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 6 (Thread 0x7f40f8ff9700 (LWP 23593)):
#0  0x00007f413be6ebdd in nanosleep () from /lib64/libpthread.so.0
#1  0x00007f412096ace6 in gf_timer_proc (data=0x7f413e451730) at timer.c:176
#2  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7f40f97fa700 (LWP 23592)):
#0  0x00007f413be6ba82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f4120995fc8 in syncenv_task (proc=proc@entry=0x7f413f0a7ff0) at syncop.c:603
#2  0x00007f4120996e10 in syncenv_processor (thdata=0x7f413f0a7ff0) at syncop.c:695
#3  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7f4108ed6700 (LWP 23591)):
#0  0x00007f413be6ba82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f4120995fc8 in syncenv_task (proc=proc@entry=0x7f413f0a7c30) at syncop.c:603
#2  0x00007f4120996e10 in syncenv_processor (thdata=0x7f413f0a7c30) at syncop.c:695
#3  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f411f11e700 (LWP 23340)):
#0  0x00007f413be6ba82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f4120995fc8 in syncenv_task (proc=proc@entry=0x7f413e411440) at syncop.c:603
#2  0x00007f4120996e10 in syncenv_processor (thdata=0x7f413e411440) at syncop.c:695
#3  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f411f91f700 (LWP 23339)):
#0  0x00007f413be6ba82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f4120995fc8 in syncenv_task (proc=proc@entry=0x7f413e411080) at syncop.c:603
#2  0x00007f4120996e10 in syncenv_processor (thdata=0x7f413e411080) at syncop.c:695
#3  0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#4  0x00007f4137e8a73d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f40b3fff700 (LWP 13253)):
#0  0x00007f4137dc81d7 in raise () from /lib64/libc.so.6
#1  0x00007f4137dc98c8 in abort () from /lib64/libc.so.6
#2  0x00007f4139728b9b in dump_core () from /lib64/libsmbconf.so.0
#3  0x00007f413971bf97 in smb_panic_s3 () from /lib64/libsmbconf.so.0
#4  0x00007f413bc0e57f in smb_panic () from /lib64/libsamba-util.so.0
#5  0x00007f413bc0e796 in sig_fault () from /lib64/libsamba-util.so.0
#6  <signal handler called>
#7  0x00007f412cb027ac in SSL_get_error () from /lib64/libssl.so.10
#8  0x00007f4119993772 in ssl_do (buf=buf@entry=0x0, len=len@entry=0, func=0x7f412cb03830 <SSL_connect>, 
    this=0x7f40e1b008b0, this=0x7f40e1b008b0) at socket.c:238
#9  0x00007f411999756f in ssl_setup_connection (this=this@entry=0x7f40e1b008b0, server=0) at socket.c:322
#10 0x00007f4119997b8c in socket_poller (ctx=0x7f40e1b008b0) at socket.c:2433
#11 0x00007f413be67dc5 in start_thread () from /lib64/libpthread.so.0
#12 0x00007f4137e8a73d in clone () from /lib64/libc.so.6
(gdb)

Comment 13 Atin Mukherjee 2016-12-15 12:50:52 UTC
Upstream patch http://review.gluster.org/#/c/16141/ posted for review.

Comment 14 Atin Mukherjee 2016-12-22 05:21:03 UTC
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/93555

Comment 16 Ambarish 2017-01-11 09:42:48 UTC
Cannot reproduce on glusterfs-3.8.4-10 + Ganesha 2.4.1-3 on multiple tries.

Confirmed with Vivek that the issue he mentions in Comment #12 isn't reproducible on a Samba+SSL setup either. 
 
Verified by QE.

Comment 18 errata-xmlrpc 2017-03-23 05:51:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html

