Bug 504257

Summary: OpenSSL locking callbacks do not work with the OpenSSL included in Fedora 10
Product: [Fedora] Fedora
Reporter: Andris Pavenis <andris.pavenis>
Component: curl
Assignee: Kamil Dudka <kdudka>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 10
CC: kdudka, tmraz
Hardware: i386   
OS: Linux   
Last Closed: 2009-06-05 07:27:03 UTC
Attachments:
Source of OpenSSL locking callbacks test. (flags: none)

Description Andris Pavenis 2009-06-05 07:14:02 UTC
Created attachment 346611 [details]
Source of OpenSSL locking callbacks test.

Description of problem:

OpenSSL locking callbacks do not work with the OpenSSL included in Fedora 10.
If the attached program is compiled and run under Fedora 10 (with current updates as of 2009-06-05), the locking callbacks are not called at all. This makes OpenSSL
unusable from several threads.

For comparison, I tested on CentOS-5.2 and CentOS-5.3: there the locking callbacks
are called as expected.

I also built OpenSSL 1.0 beta 2, rebuilt curl against it, installed both into a separate prefix
(not /usr) and linked the test program with these. In that setup the locking callbacks
also work as expected.

Version-Release number of selected component (if applicable):

openssl-0.9.8g-13.fc10.i386

How reproducible:

Always.

Steps to Reproduce:
1. Compile the attached source (openssl, openssl-devel, curl and curl-devel must be installed):
gcc -Wall -Wextra openssl-locking-test.c $(curl-config --libs) -o openssl-locking-test -pthread

2. Run it. At the end the program shows how many times the locking callbacks have been called. It aborts if they were not called at all.

Actual results:

The program aborts when compiled and run on Fedora 10 because the locking callbacks are
not called at all.

Expected results:

OpenSSL locking callbacks should be called so that OpenSSL is usable in
a multithreaded application.
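
For readers without the attachment, here is a minimal sketch of the kind of counting lock callback such a test relies on. It is not the attached source. The callback has the signature OpenSSL 0.9.8 expects for CRYPTO_set_locking_callback(). The names locks_setup(), locks_teardown() and locks_call_count() are hypothetical, and the SKETCH_* constants are assumed to mirror CRYPTO_LOCK and CRYPTO_UNLOCK from <openssl/crypto.h>, so that the sketch compiles without the OpenSSL headers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Assumption: these mirror CRYPTO_LOCK / CRYPTO_UNLOCK in OpenSSL 0.9.8.
 * Real code should include <openssl/crypto.h> and use its constants. */
#define SKETCH_CRYPTO_LOCK   1
#define SKETCH_CRYPTO_UNLOCK 2

static pthread_mutex_t *locks;          /* one mutex per lock index */
static int lock_count;
static pthread_mutex_t calls_guard = PTHREAD_MUTEX_INITIALIZER;
static unsigned long callback_calls;    /* how often the callback ran */

/* Same signature as the locking callback used by OpenSSL 0.9.8. */
static void locking_callback(int mode, int n, const char *file, int line)
{
    (void)file;
    (void)line;
    assert(n >= 0 && n < lock_count);

    /* Count every invocation; the counter has its own guard. */
    pthread_mutex_lock(&calls_guard);
    callback_calls++;
    pthread_mutex_unlock(&calls_guard);

    if (mode & SKETCH_CRYPTO_LOCK)
        pthread_mutex_lock(&locks[n]);
    else
        pthread_mutex_unlock(&locks[n]);
}

/* Allocate n mutexes; with OpenSSL, n would come from CRYPTO_num_locks(). */
static int locks_setup(int n)
{
    int i;

    locks = malloc((size_t)n * sizeof *locks);
    if (locks == NULL)
        return -1;
    for (i = 0; i < n; i++)
        pthread_mutex_init(&locks[i], NULL);
    lock_count = n;
    /* With OpenSSL 0.9.8 one would now register the callbacks:
     *   CRYPTO_set_id_callback(...);
     *   CRYPTO_set_locking_callback(locking_callback);
     */
    return 0;
}

static void locks_teardown(void)
{
    int i;

    for (i = 0; i < lock_count; i++)
        pthread_mutex_destroy(&locks[i]);
    free(locks);
    locks = NULL;
    lock_count = 0;
}

static unsigned long locks_call_count(void)
{
    unsigned long n;

    pthread_mutex_lock(&calls_guard);
    n = callback_calls;
    pthread_mutex_unlock(&calls_guard);
    return n;
}
```

A test program would call locks_setup() before starting its threads, run the transfers, and then check locks_call_count(). A count of zero means the library doing the TLS work never invoked the callback.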

Comment 1 Tomas Mraz 2009-06-05 07:27:03 UTC
Curl does not link to OpenSSL anymore in Fedora.

Comment 2 Andris Pavenis 2009-06-05 07:45:39 UTC
ldd says otherwise.

[apavenis@callisto openssl-locking]$ ldd /usr/lib/libcurl.so.4.1.1 
	linux-gate.so.1 =>  (0x00904000)
	libidn.so.11 => /lib/libidn.so.11 (0x02581000)
	libssh2.so.1 => /usr/lib/libssh2.so.1 (0x00d98000)
	libldap-2.4.so.2 => /usr/lib/libldap-2.4.so.2 (0x00508000)
	librt.so.1 => /lib/librt.so.1 (0x0066d000)
	libgssapi_krb5.so.2 => /usr/lib/libgssapi_krb5.so.2 (0x00c8a000)
	libkrb5.so.3 => /usr/lib/libkrb5.so.3 (0x041c5000)
	libk5crypto.so.3 => /usr/lib/libk5crypto.so.3 (0x00cbb000)
	libcom_err.so.2 => /lib/libcom_err.so.2 (0x00c41000)
	libz.so.1 => /lib/libz.so.1 (0x00410000)
	libssl3.so => /lib/libssl3.so (0x001d3000)
	libsmime3.so => /lib/libsmime3.so (0x00d6f000)
	libnss3.so => /lib/libnss3.so (0x05110000)
	libplds4.so => /lib/libplds4.so (0x07fe4000)
	libplc4.so => /lib/libplc4.so (0x07fea000)
	libnspr4.so => /lib/libnspr4.so (0x07fa8000)
	libpthread.so.0 => /lib/libpthread.so.0 (0x003f4000)
	libdl.so.2 => /lib/libdl.so.2 (0x003ed000)
	libc.so.6 => /lib/libc.so.6 (0x0024c000)
	libssl.so.7 => /lib/libssl.so.7 (0x07e0f000)
	libcrypto.so.7 => /lib/libcrypto.so.7 (0x029db000)
	liblber-2.4.so.2 => /usr/lib/liblber-2.4.so.2 (0x009fb000)
	libresolv.so.2 => /lib/libresolv.so.2 (0x00be2000)
	libsasl2.so.2 => /usr/lib/libsasl2.so.2 (0x00b46000)
	/lib/ld-linux.so.2 (0x00227000)
	libkrb5support.so.0 => /usr/lib/libkrb5support.so.0 (0x00b7e000)
	libkeyutils.so.1 => /lib/libkeyutils.so.1 (0x00c49000)
	libnssutil3.so => /lib/libnssutil3.so (0x00af7000)
	libcrypt.so.1 => /lib/libcrypt.so.1 (0x07f4e000)
	libselinux.so.1 => /lib/libselinux.so.1 (0x00553000)

[apavenis@callisto openssl-locking]$ rpm -qf /usr/lib/libcurl.so.4.1.1 
libcurl-7.19.4-5.fc10.i386

How else can one use libcurl from several threads for https://... URLs?

I noticed the problem when I tried to use libcurl from several threads and got
random segfaults.

Comment 3 Kamil Dudka 2009-06-05 08:13:39 UTC
(In reply to comment #2)
> ldd says otherwise.

OpenSSL is most likely used by some of the libraries that libcurl uses for other protocols, but it is definitely not used for https by libcurl itself. In any case, it is not a good idea to mix libcurl API calls with direct OpenSSL API calls. Such code could stop working after a libcurl update even if we had not moved to NSS.

> How else one can use libcurl from several threads for https://... URLs?

By using libcurl callbacks only.

> Noticed the problem when tried to use libcurl from several threads and got
> random segfaults.

Then attach an example causing the segfault. But we can fix it only if you adhere to libcurl API, otherwise it's not a libcurl bug.

Comment 4 Andris Pavenis 2009-06-05 09:58:40 UTC
1) I tried to use libcurl callbacks (curl_share_init(), curl_share_setopt(), etc.) instead of OpenSSL callbacks. This works nicely on CentOS 5.2. It does not help on Fedora 10, however: the random crashes remain.

2) I ran another application, which uses OpenSSL through ACE (http://www.cs.wustl.edu/~schmidt/ACE.html), under GDB. Putting a breakpoint on ACE_SSL_locking_callback() shows that OpenSSL does call the locking callback after all. So the earlier result was indeed caused by libcurl not using OpenSSL for https://.

The libcurl instability remains, so it would perhaps be best to file a new bug report against curl. A good test example that crashes randomly is needed, however, so that somebody else can reproduce the problem.

Comment 5 Kamil Dudka 2009-06-05 10:22:32 UTC
libcurl itself should be thread-safe:
http://curl.haxx.se/docs/faq.html#Is_libcurl_thread_safe

If the application crashes, the cause is more likely the code that uses libcurl incorrectly. But feel free to open a new bug if you have a minimal example showing the crash.

Comment 6 Tomas Mraz 2009-06-05 10:29:35 UTC
If the problem appears only with https URLs and not http it might be a problem with the SSL implementation through NSS. Not within the NSS itself as it is threadsafe, but in the way libcurl calls it.

Comment 7 Andris Pavenis 2009-06-05 10:38:40 UTC
I'm not sure how useful this is, but here is a backtrace of one such random crash, taken from a core dump:

(gdb) l
86	    return rv;
87	}
88	
89	int ssl_DefRecv(sslSocket *ss, unsigned char *buf, int len, int flags)
90	{
91	    PRFileDesc *lower = ss->fd->lower;
92	    int rv;
93	
94	    rv = lower->methods->recv(lower, (void *)buf, len, flags, ss->rTimeout);
95	    if (rv < 0) {
(gdb) where
#0  ssl_DefRecv (ss=0xaf37788, buf=0xaf37a08 "", len=5, flags=0) at ssldef.c:91
#1  0x001c7b9c in ssl3_GatherData () at ssl3gthr.c:90
#2  ssl3_GatherCompleteHandshake (ss=0xaf37788, flags=0) at ssl3gthr.c:195
#3  0x001cad9b in ssl_GatherRecord1stHandshake (ss=0xaf37788) at sslcon.c:1258
#4  0x001d0955 in ssl_Do1stHandshake (ss=0xaf37788) at sslsecur.c:151
#5  0x001d2057 in SSL_ForceHandshake (fd=0xaadef98) at sslsecur.c:407
#6  0x001d2125 in SSL_ForceHandshakeWithTimeout (fd=0xaadef98, timeout=30000) at sslsecur.c:428
#7  0x00d1fa0b in Curl_nss_connect (conn=0x9dbe768, sockindex=0) at nss.c:1199
#8  0x00d1633f in Curl_ssl_connect (conn=0x9dbe768, sockindex=0) at sslgen.c:185
#9  0x00cf586a in Curl_http_connect (conn=0x9dbe768, done=0xb6b510ba) at http.c:1793
#10 0x00cfc9c1 in Curl_protocol_connect (conn=0x9dbe768, protocol_done=0xb6b510ba) at url.c:2996
#11 0x00d01daa in setup_conn () at url.c:4628
#12 Curl_connect (data=0xb66347d8, in_connect=0xb6b510b4, asyncp=0xb6b510bb, protocol_done=0xb6b510ba) at url.c:4704
#13 0x00d0aa79 in connect_host () at transfer.c:2414
#14 Curl_perform (data=0xb66347d8) at transfer.c:2495
#15 0x00d0b7a3 in curl_easy_perform (curl=0xb66347d8) at easy.c:540
#16 0x0807fd75 in HtmlLoadTestBase::HtmlClient::handleClient (this=0xb68024f8) at src/HtmlLoadTestBase.cpp:120
#17 0x0807a087 in LoadTestBase::TestClientBase::clientThreadProc (this=0xb68024f8) at src/LoadTestBase.cpp:162
#18 0x0807c14f in sigc::bound_mem_functor0<void, LoadTestBase::TestClientBase>::operator() (this=0xb6802904) at /usr/include/sigc++-2.0/sigc++/functors/mem_fun.h:1787
#19 0x0807c166 in sigc::adaptor_functor<sigc::bound_mem_functor0<void, LoadTestBase::TestClientBase> >::operator() (this=0xb6802900)
    at /usr/include/sigc++-2.0/sigc++/adaptors/adaptor_trait.h:251
#20 0x0807c182 in sigc::internal::slot_call0<sigc::bound_mem_functor0<void, LoadTestBase::TestClientBase>, void>::call_it (rep=0xb68028e8)
    at /usr/include/sigc++-2.0/sigc++/functors/slot.h:103
#21 0x07f05cf5 in sigc::slot0<void>::operator() () at /usr/include/sigc++-2.0/sigc++/functors/slot.h:440
#22 call_thread_entry_slot (data=0xb68028d8) at thread.cc:46
#23 0x002abcaf in g_thread_create_proxy (data=0xb6802918) at gthread.c:635
#24 0x003fa51f in start_thread (arg=0xb6b51b90) at pthread_create.c:297
#25 0x00a3004e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
(gdb) p ss->fd
$9 = (PRFileDesc *) 0x0

Some additional information is added.

Comment 8 Kamil Dudka 2009-06-05 11:08:29 UTC
It seems that NSS is trying to perform the SSL handshake on an invalid TCP socket. Is there any chance that another thread closes the connection in the meantime?

Comment 9 Andris Pavenis 2009-06-05 11:20:58 UTC
That should not be possible. curl_easy_init() is called from the constructor of an object, after that a thread is started to do the real work, and then the thread is joined from the destructor. There is more than one such object, but they all act in the same way (the application is intended for load testing a server ...).

So the connection is local to the C++ object, and no concurrent access is possible inside it.

Comment 10 Kamil Dudka 2009-06-05 12:21:18 UTC
(In reply to comment #9)
> So connection is local to C++ object and inside it no concurrent access is
> possible.  

I am not sure I understand this correctly. curl_easy_init() does not connect anything, curl_easy_perform() does (unless an established connection associated with the handle can be reused). If you call curl_easy_perform() from different threads on the same handle, you need to manage locking at the application level. Why don't you create one curl easy handle per thread?

Comment 11 Andris Pavenis 2009-06-05 12:39:36 UTC
In my test each CURL handle is used only once.

It is, however, closed from another thread, but only after the thread that
called curl_easy_perform() has reported its result (success or failure) and
has been joined.
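
The lifecycle described here (create the handle, run the transfer in a worker thread, join, and only then close the handle from the owning thread) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code of the load-test application. struct handle is a hypothetical stand-in for a CURL easy handle, and client_start() and client_finish() are made-up names. The comments mark where curl_easy_init(), curl_easy_perform() and curl_easy_cleanup() would go.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a libcurl easy handle (CURL *). */
struct handle {
    int transfers;          /* number of transfers done on this handle */
};

struct client {
    struct handle *h;       /* owned by exactly one client */
    pthread_t thread;
    int result;             /* written by the worker, read after join */
};

static void *client_thread(void *arg)
{
    struct client *c = arg;

    /* Between start and join only this thread touches c->h.
     * Real code: c->result = curl_easy_perform(c->h); */
    c->h->transfers++;
    c->result = 0;
    return NULL;
}

/* Counterpart of the constructor: create the handle, start the worker. */
static int client_start(struct client *c)
{
    c->h = calloc(1, sizeof *c->h);     /* real code: curl_easy_init() */
    if (c->h == NULL)
        return -1;
    c->result = -1;
    if (pthread_create(&c->thread, NULL, client_thread, c) != 0) {
        free(c->h);
        c->h = NULL;
        return -1;
    }
    return 0;
}

/* Counterpart of the destructor: join first, release the handle second. */
static int client_finish(struct client *c)
{
    int result;

    pthread_join(c->thread, NULL);      /* worker has finished with c->h */
    result = c->result;
    free(c->h);                         /* real code: curl_easy_cleanup() */
    c->h = NULL;
    return result;
}
```

With this ordering the handle is never shared between two running threads, so the handle needs no application-level locking. If cleanup ran before the join, the worker could still be using the handle while it is being freed.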

I really should write a simpler test case at some point and file it in a
new bug report if it still fails. The application, which generates randomly
distributed test events (in this case, making an HTTPS request and reporting
the result) at a given average frequency, is a bit too complicated
to serve as such a test example.

Comment 12 Kamil Dudka 2009-06-05 12:46:13 UTC
Sure, a simple test case would be pretty helpful. Thanks in advance!

Comment 13 Kamil Dudka 2009-11-12 11:21:59 UTC
(In reply to comment #7)
A backtrace like this has been reported again as bug 534176 and resolved.