Bug 960765 - cURL does not time out properly on SSL connections (fix backport request)
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: curl
Version: 17
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kamil Dudka
QA Contact: Fedora Extras Quality Assurance
 
Reported: 2013-05-07 21:58 UTC by David Strauss
Modified: 2013-05-25 12:14 UTC (History)

Fixed In Version: curl-7.24.0-9.fc17
Doc Type: Bug Fix
Last Closed: 2013-05-15 03:25:29 UTC
Type: Bug

Description David Strauss 2013-05-07 21:58:51 UTC
Description of problem:
Fedora builds cURL against NSS. The libcurl/NSS integration has a bug: libcurl asks NSS to receive data with no timeout, regardless of the timeout option set in libcurl.

Version-Release number of selected component (if applicable):
All (including upstream's master branch)

I'm leaving out reproduction data because we're already looking at a fix upstream: http://comments.gmane.org/gmane.comp.web.curl.library/39357

I'll post back once it's in libcurl's master branch.

Comment 1 Kamil Dudka 2013-05-08 12:06:01 UTC
I believe this is already fixed in rawhide and upstream:

https://github.com/bagder/curl/commit/9d0af301

How are you confirming that the recv/send functions provided by NSS can block?

Comment 2 David Strauss 2013-05-08 18:09:13 UTC
I think you're right. I'm using the F17 curl package, and that predates the fix for properly setting the non-blocking status for NSS.

I just looked at the relevant NSS source. It seems to always assume that the sockets it uses could be non-blocking, and it simulates blocking behavior by looping internally until the timeout is hit whenever fd->secret->nonblocking is false for the connection. In F18 and earlier, I'm guessing it thinks that nonblocking variable is false right now, based on the behavior in my traces.

But the timeout is set to a PRIntervalTime of PR_INTERVAL_NO_TIMEOUT. So, in F18 and earlier, because libcurl (1) sets the actual socket to non-blocking and (2) does *not* properly set the non-blocking status for NSS, NSS just polls for at least eight hours.

In any case, I'll update upstream and still push for using a timeout of PR_INTERVAL_NO_WAIT. It will help unmask future regressions around non-blocking status in NSS and increase code clarity around how the send and receive functions behave in potentially blocking scenarios.
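To illustrate the difference between the two timeout values, here is a minimal sketch in Python using only the standard library. It is not NSS or NSPR code: layered_recv, NO_TIMEOUT, NO_WAIT, and the layer_nonblocking parameter are hypothetical stand-ins for the behavior described above, where a layer that believes the socket is blocking emulates blocking by polling until its own timeout expires.

```python
import select
import socket
import time

NO_TIMEOUT = None  # hypothetical stand-in for PR_INTERVAL_NO_TIMEOUT
NO_WAIT = 0.0      # hypothetical stand-in for PR_INTERVAL_NO_WAIT


def layered_recv(sock, nbytes, timeout, layer_nonblocking=False):
    """Hypothetical wrapper over a non-blocking socket (not NSS/NSPR code).

    If the layer believes the socket is blocking, it emulates blocking by
    polling until data arrives or `timeout` seconds have elapsed.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            if layer_nonblocking or timeout == NO_WAIT:
                raise  # report "would block" to the caller right away
        if deadline is None:
            wait = None  # no upper bound: the caller never regains control
        else:
            wait = deadline - time.monotonic()
            if wait <= 0:
                raise TimeoutError("emulated blocking recv timed out")
        select.select([sock], [], [], wait)


a, b = socket.socketpair()
a.setblocking(False)  # the real socket is non-blocking, as with libcurl

try:
    layered_recv(a, 16, NO_WAIT)
except BlockingIOError:
    print("NO_WAIT: returned immediately, caller keeps control")

try:
    layered_recv(a, 16, 0.2)
except TimeoutError:
    print("finite timeout: layer gave up on its own schedule")

# layered_recv(a, 16, NO_TIMEOUT) would never return on this idle socket.
```

With NO_WAIT, a wrong blocking-mode flag in the wrapper can no longer turn into an unbounded wait, which is the regression-unmasking effect argued for above.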

Would you be amenable to back-porting the fix in setting the non-blocking status to F17 and F18 packages? It would be very helpful to have HTTPS timeouts work properly in these existing releases.

Comment 3 David Strauss 2013-05-08 18:10:29 UTC
Setting version to F19. In all cases where I mention F17 and F18, I believe I actually mean "F19 and earlier."

Comment 4 David Strauss 2013-05-08 18:13:36 UTC
Also, let me directly respond to your question.

> How are you confirming that the recv/send functions provided by NSS can block?

I never saw them blocking, just looping around a poll() call for hours even with a three-minute timeout set in libcurl. I think this is because libcurl sets up the socket as non-blocking, but NSS loops around a poll() unless it's also set properly to do non-blocking.
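For context on why the libcurl-level timeout had no effect, here is a rough sketch of a caller-side deadline loop, in Python with only the standard library. It is not libcurl's implementation; read_with_deadline is a hypothetical name, and the three-minute timeout is scaled down to a fraction of a second. The point it shows is that a caller can enforce its deadline only if every receive call on the lower layer returns promptly.

```python
import select
import socket
import time


def read_with_deadline(sock, nbytes, deadline_s):
    """Hypothetical caller-side loop; not libcurl's implementation.

    The deadline holds only because each recv() on the non-blocking socket
    returns promptly, so this loop regains control to check the clock.
    """
    end = time.monotonic() + deadline_s
    while True:
        remaining = end - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("transfer deadline exceeded")
        readable, _, _ = select.select([sock], [], [], remaining)
        if not readable:
            continue  # nothing arrived; the loop re-checks the deadline
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            continue  # spurious wakeup; re-check the deadline


a, b = socket.socketpair()
a.setblocking(False)

start = time.monotonic()
try:
    read_with_deadline(a, 16, 0.2)  # idle peer: the deadline fires
except TimeoutError:
    print(f"timed out after {time.monotonic() - start:.2f} s")
```

If the receive call inside this loop were replaced by one that polls internally with no timeout, the deadline check would never run again, which matches the hours of poll() calls seen in the traces.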

Comment 5 David Strauss 2013-05-08 21:51:05 UTC
I'm trying to test out this change on F17, but I keep running into pycurl issues. After rebuilding curl, libcurl-devel, libcurl, and curl-debuginfo and then installing them, pycurl ceases to work with this error:

ImportError: build/lib.linux-x86_64-2.7/pycurl.so: undefined symbol: CRYPTO_num_locks

I also can't rebuild pycurl with my rebuilds of curl in place. It's funny because pycurl seems to expect OpenSSL resources (like the CRYPTO_num_locks symbol), but curl builds with NSS in the Fedora packages. It's not clear how to make pycurl happy.

And, of course, a broken pycurl means a broken Yum.

Comment 6 David Strauss 2013-05-08 21:53:39 UTC
> I also can't rebuild pycurl with my rebuilds of curl in place. It's funny because pycurl seems to expect OpenSSL resources (like the CRYPTO_num_locks symbol), but curl builds with NSS in the Fedora packages. It's not clear how to make pycurl happy.

Maybe it's because I'm disabling libssh2 in my curl build? Keeping libssh2 breaks the curl build with a Valgrind test failure, though.

Comment 7 David Strauss 2013-05-08 23:11:31 UTC
Okay, things work now with libssh2 enabled in the build and test582 disabled. I can't get the build to work with test582 enabled, even for the stock SRPM.

Comment 8 Kamil Dudka 2013-05-09 09:46:27 UTC
(In reply to comment #4)
> Also, let me directly respond to your question.
> 
> > How are you confirming that the recv/send functions provided by NSS can block?
> 
> I never saw them blocking, just looping around a poll() call for hours even
> with a three-minute timeout set in libcurl. I think this is because libcurl
> sets up the socket as non-blocking, but NSS loops around a poll() unless
> it's also set properly to do non-blocking.

If you saw poll() being called repeatedly by strace, it was most likely happening at the NSS level, which means the PR_Recv/PR_Send() calls were seen as blocking by libcurl.  This should not happen in rawhide and will be fixed in stable Fedora releases.

(In reply to comment #7)
> Okay, things work now with libssh2 enabled in the build and test582
> disabled. I can't get the build to work with test582 enabled, even for the
> stock SRPM.

I suspect you are hitting bug #821440 -- you can either upgrade nss-softokn to a version that contains the fix, or create a valgrind suppression for that memory leak.

Comment 9 Kamil Dudka 2013-05-09 11:14:42 UTC
already fixed in curl-7.29.0-3.fc19

Comment 10 Fedora Update System 2013-05-09 12:07:39 UTC
curl-7.24.0-9.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/curl-7.24.0-9.fc17

Comment 11 Fedora Update System 2013-05-09 12:08:18 UTC
curl-7.27.0-10.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/curl-7.27.0-10.fc18

Comment 12 David Strauss 2013-05-09 17:39:08 UTC
> If you saw poll() being called repeatedly by strace, it was most likely happening at the NSS level, which means the PR_Recv/PR_Send() calls were seen as blocking by libcurl.

That is correct. It's the effect of giving NSS a non-blocking socket fd but not telling it to treat it as non-blocking.
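A small sketch of that mismatch, in Python with only the standard library (Unix only, since it uses fcntl). WrappedSocket and its nonblocking attribute are hypothetical and only loosely analogous to the library-level flag discussed in this bug; fd_is_nonblocking reads the real O_NONBLOCK flag from the kernel.

```python
import fcntl
import os
import socket


def fd_is_nonblocking(fd):
    """Return True if the kernel-level O_NONBLOCK flag is set on fd."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFL) & os.O_NONBLOCK)


class WrappedSocket:
    """Hypothetical library-level wrapper with its own blocking-mode flag."""

    def __init__(self, sock):
        self.sock = sock
        self.nonblocking = False  # library default; must be set explicitly

    def in_sync(self):
        """True if the wrapper's flag agrees with the fd's real mode."""
        return self.nonblocking == fd_is_nonblocking(self.sock.fileno())


a, b = socket.socketpair()
wrapped = WrappedSocket(a)

a.setblocking(False)         # caller changes the fd without telling the wrapper
print(wrapped.in_sync())     # the two layers now disagree

wrapped.nonblocking = True   # analogous to informing the library
print(wrapped.in_sync())     # the two layers agree again
```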

> This should not happen in rawhide and will be fixed in stable Fedora releases.

Thank you. You're awesome. This is why I love free, open-source software.

Comment 13 Fedora Update System 2013-05-10 04:53:09 UTC
Package curl-7.24.0-9.fc17:
* should fix your issue,
* was pushed to the Fedora 17 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing curl-7.24.0-9.fc17'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-7797/curl-7.24.0-9.fc17
then log in and leave karma (feedback).

Comment 14 Fedora Update System 2013-05-15 03:25:29 UTC
curl-7.27.0-10.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 15 Fedora Update System 2013-05-25 12:14:30 UTC
curl-7.24.0-9.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.
