Bug 501138 - Tracker: bugs introduced by migration to NSS
Summary: Tracker: bugs introduced by migration to NSS
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: curl
Version: 10
Hardware: All
OS: Linux
medium
medium
Target Milestone: ---
Assignee: Kamil Dudka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 453612 454151 464862 485032 500180 500815 501058 501080 501118 501191 509705 512019 516056 603783
Blocks: CryptoConsolidation 517000
TreeView+ depends on / blocked
 
Reported: 2009-05-16 22:46 UTC by Kamil Dudka
Modified: 2010-06-14 15:30 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-12-18 09:26:52 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
patch using UnloadUserModule()/DestroyModule() to free internal objects (5.67 KB, patch)
2009-05-17 00:11 UTC, Kamil Dudka
no flags Details | Diff
spec files and patches used for scratch builds (20.55 KB, application/x-gzip)
2009-06-01 17:50 UTC, Kamil Dudka
no flags Details
Simple pycurl test case (366 bytes, application/octet-stream)
2009-06-01 19:32 UTC, Toshio Ernie Kuratomi
no flags Details
more efficient test case (377 bytes, text/x-python)
2009-06-01 21:47 UTC, Kamil Dudka
no flags Details
the original Toshio's test including c.close() and pycurl.global_cleanup() (393 bytes, text/x-python)
2009-06-02 15:27 UTC, Kamil Dudka
no flags Details

Description Kamil Dudka 2009-05-16 22:46:19 UTC
Description of problem:
The curl package is broken by migration to NSS. This bug consolidates all bugs need to be resolved to make curl working.

Version-Release number of selected component (if applicable):
All current Fedora releases are affected.

Actual results:
Users need to compile OpenSSL based curl by hand.

Expected results:
NSS based curl fully equivalent to OpenSSL based curl

Additional info:
There are patches for almost all bugs. They just need to be reviewed and applied.

Comment 1 Kamil Dudka 2009-05-17 00:11:23 UTC
Created attachment 344307 [details]
patch using UnloadUserModule()/DestroyModule() to free internal objects

The attached patch makes curl leak-less in most cases. The PEM module is loaded on new connection and unloaded/destroyed on connection close. Unloading the module is now AFAIK the only way to free internal objects (now means Bug 501080 is resolved). I've conducted some testing against curl/nss with all proposed patches applied. It gives me zero memory leaks when using CA certificate only.

Rob, do you think the patch could break anything? Do we have another way to free internal objects?

Comment 2 Rob Crittenden 2009-05-18 14:12:45 UTC
Well, I would contend that this bug is a bit misleading since curl is not completely broken with the NSS/pem code and most users do in fact not recompile with OpenSSL (or if they are they aren't filing bugs against curl to this effect).

As to freeing objects, it depends. Why do you need to free them, because you're done with the certificate/key? I suppose a good enhancement would be to run through the list of certs and keys when trying to add new ones to be sure it isn't already loaded. I doubt it does this currently.

Comment 3 Kamil Dudka 2009-05-18 14:38:44 UTC
(In reply to comment #2)
> Well, I would contend that this bug is a bit misleading since curl is not
> completely broken with the NSS/pem code and most users do in fact not recompile
> with OpenSSL (or if they are they aren't filing bugs against curl to this
> effect).

I didn't say "completely" :-) It is sufficient for one-shot applications. But each application performing SSL transfers in a loop crashes on memory exhaustion sooner or later. I am not talking now about memory leaks on application's exit, it's minor problem. I am talking about the fast memory allocation during application's run-time (IMO caused primarily by the bug 501118).

> As to freeing objects, it depends. Why do you need to free them, because you're
> done with the certificate/key? I suppose a good enhancement would be to run
> through the list of certs and keys when trying to add new ones to be sure it
> isn't already loaded. I doubt it does this currently.  

I can't see any code freeing internal objects and/or the global array for pointers to them in the current version of PEM. Is it intentionally? I think the patch attached to bug 501080 does the right think anyway. I was only not sure with the patch for curl for loading/unloading the module per each connection. Do we have another way to free certificates which are no more used?

Comment 4 Rob Crittenden 2009-05-18 14:47:23 UTC
I think the right thing to do would be to remove the token. That should trigger all the appropriate PKCS#11 calls to free objects. Not sure it will actually do the rights thing though.

Comment 5 Kamil Dudka 2009-05-18 15:00:13 UTC
Does it mean the test case in bug 501080 is missing something? It gives me zero memory leaks with the patch applied. (not counting undestroyed mutex resolved by bug 501058 or by adding params="" to configstring)

Comment 6 Kamil Dudka 2009-05-18 15:22:09 UTC
I've prepared scratch builds of leak-less curl/nss so it can be easily tested together. Note the bug fix for #501191 is not included yet.

rawhide:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1358256
http://koji.fedoraproject.org/koji/taskinfo?taskID=1358338

F-10:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1358343
http://koji.fedoraproject.org/koji/taskinfo?taskID=1358251

Comment 7 Elio Maldonado Batiz 2009-05-22 16:50:08 UTC
This tracking bug depends on 500636. It looks like a circular dependency to me. I think 500636, which is a libcurl bug and refers to nss work which this bug tracks, should depend on this bug instead. What do you think?

Comment 8 Kamil Dudka 2009-05-22 16:59:30 UTC
Yes, it makes sense. Fixing now...

Comment 9 Elio Maldonado Batiz 2009-05-22 17:23:50 UTC
Shall we make scratch builds which include the 501191 patches as well?

Comment 10 Kamil Dudka 2009-05-22 17:55:45 UTC
It depends on how long will it take to release the update. Since I have no feedback for the current scratch builds yet, I am in no hurry to create a new one, but feel free to build it :-)

Just for summary. All nss' patches are reviewed already and now waiting for Rob's r+, am I right?

Comment 11 Toshio Ernie Kuratomi 2009-06-01 16:28:15 UTC
Do you have the SRPMs for your scratch builds somewhere koji's garbage collected the scratch builds and I missed seeing this bug to test before.

Comment 12 Kamil Dudka 2009-06-01 17:50:35 UTC
Created attachment 346105 [details]
spec files and patches used for scratch builds

It sounds like good enough reason to create new scratch builds (including the bug fixes which came afterwards):

http://koji.fedoraproject.org/koji/taskinfo?taskID=1387836
http://koji.fedoraproject.org/koji/taskinfo?taskID=1387880

The SRPMs are too big to attach, so I am only attaching the spec files and patches used for them. Maybe I can place the SRPMs somewhere on web if I find a proper place for them. Note there is still present a SIGSEGV bug while using a client certificate stored on a smartcard. I'll fix it later this week:

http://curl.haxx.se/mail/lib-2009-05/0305.html

Any feedback is welcome. Thanks in advance!

Comment 13 Toshio Ernie Kuratomi 2009-06-01 19:31:39 UTC
The packages work for a simple C test case but fails with a simple pycurl test case after one successful iteration with the following traceback:

[badger@clingman curl]$ ./break-pycurl.py
1 1
2
Traceback (most recent call last):
  File "./break-pycurl.py", line 12, in <module>
    c.perform()
pycurl.error: (60, 'Peer certificate cannot be authenticated with known CA certificates')

This is the same symptom as https://bugzilla.redhat.com/show_bug.cgi?id=488791

Comment 14 Toshio Ernie Kuratomi 2009-06-01 19:32:57 UTC
Created attachment 346129 [details]
Simple pycurl test case

Comment 15 Kamil Dudka 2009-06-01 19:51:56 UTC
Toshio, thanks for testing it! I can confirm your results with the py test case. I'll try to solve it properly this time. Meanwhile I suggest you using the new (testing) build of nss with the stable build of curl. The test passes in that case and the memory leak situation should be much better.

Comment 16 Toshio Ernie Kuratomi 2009-06-01 20:13:35 UTC
Thanks Kamil!  I can confirm that the new build of nss and current stable build of curl works and leaks *much* less than before.

Comment 17 Kamil Dudka 2009-06-01 21:47:55 UTC
Created attachment 346145 [details]
more efficient test case

Well, the patch for curl was a really bad idea. It is not thread-safe enough, that's why pycurl is broken by the patch. We need to implement a thread-safe way to free PEM internal objects and maybe also reduce their count if possible (as Rob proposed).

I am attaching a test case which gives me better performance since it allows curl to reuse already opened connections. It also appends a call of the global_cleanup(). The test case gives me not so bad memory leak summary (per 2000 connections) with valgrind:

LEAK SUMMARY:
   definitely lost: 120 bytes in 2 blocks.
     possibly lost: 13,872 bytes in 49 blocks.
   still reachable: 936,942 bytes in 478 blocks.
        suppressed: 0 bytes in 0 blocks.

Note it also reports 575 occurrences of invalid memory access, but it seems to be caused by the py binding. The py equipment can be a bit out-dated on my box.

Comment 18 Toshio Ernie Kuratomi 2009-06-02 14:12:59 UTC
Yeah, reusing the curl object can be nice.  I coded my test case to use multiple curl objects, however, because I'm using this in a library which is called by a multithreaded app.  Keeping a single curl object to do all the requests would necessitate locking around the object to prevent c.perform() being called in one thread while a different thread is changing the parameters sent via c.setopt().  Instead, the library creates a curl object whenever it needs to make a request.  Less efficient but simpler code for this application.

Comment 19 Kamil Dudka 2009-06-02 15:27:20 UTC
Created attachment 346271 [details]
the original Toshio's test including c.close() and pycurl.global_cleanup()

In that case the original test case will be closer to reality. I am only changing it to call c.close() and pycurl.global_cleanup(). It gives me the same leak summary on application exit.

I know it's still a bit hungry on memory during the run time, but it will take same additional time to fix it. So I'll ask NSS guys to release the update including only these 6 bug fixes for now.

Comment 20 Kamil Dudka 2009-07-05 12:38:02 UTC
Toshio, ready to test yet another improvement of pem/nss?

http://koji.fedoraproject.org/koji/taskinfo?taskID=1454959

Comment 21 Toshio Ernie Kuratomi 2009-07-06 21:53:48 UTC
Looks good.  All the test cases I've tried have worked.  Memory usage stays constant and https requests are being made as expected.

Comment 22 Kamil Dudka 2009-07-06 22:07:24 UTC
Toshio, thanks for testing it!

The patch review process will follow at bug 509705 if anybody interested. I am sure this is not the last patch for pem/nss from me. I can still see a lot of opened questions there.

Comment 23 Bug Zapper 2009-11-18 09:13:35 UTC
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 24 Bug Zapper 2009-12-18 09:26:52 UTC
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.


Note You need to log in before you can comment on or make changes to this bug.