Bug 905013

Summary: NSPR pthread_key_t leak and memory corruption
Product: Red Hat Enterprise Linux 6 Reporter: Aleš Mareček <amarecek>
Component: nspr Assignee: Elio Maldonado Batiz <emaldona>
Status: CLOSED ERRATA QA Contact: Alicja Kario <hkario>
Severity: high Docs Contact:
Priority: high    
Version: 6.4 CC: ablum, amarecek, azelinka, cww, dpal, emaldona, hkario, jbastian, jorton, kengert, ksrot, nalin, rcritten, rrelyea, sforsber
Target Milestone: alpha   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: nspr-4.9.5-2.el6 Doc Type: Bug Fix
Doc Text:
Cause: The NSPR POSIX threads library did not delete keys it had allocated and failed to perform proper checks at cleanup time. Consequence: If the NSPR shared libraries were repeatedly loaded and unloaded during the lifetime of a single process, NSPR could allocate a key again and again without ever freeing it. Eventually the memory available for thread-specific keys would be exhausted and the application would fail. Fix: NSPR now keeps track of whether a key should be deleted. At finalization time it checks whether it has been successfully initialized, whether cleanup has already been done, and whether a key has been created and must be deleted. Result: Processes no longer experience memory exhaustion or corruption when they have to repeatedly load and unload the NSPR shared libraries.
Story Points: ---
Clone Of: 633519 Environment:
Last Closed: 2013-11-21 06:10:41 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 633519    
Bug Blocks: 817178, 835616    
Attachments:
Description Flags
Patch made from what I picked up from the upstream bug
none
The original bug reproducer attached to Bug 633519
none
Log file produced by executing the test
none

Comment 3 Elio Maldonado Batiz 2013-08-09 04:36:50 UTC
Created attachment 784716 [details]
Patch made from what I picked up from the upstream bug

Comment 4 Kai Engert (:kaie) (inactive account) 2013-08-12 20:47:26 UTC
I refreshed my memory by rereading... the upstream bug.

The good news is:
  Upstream has applied a basic fix, and that fix has been included
  in NSPR since version 4.9.3.

I believe that RHEL 6.4 has already been updated to NSPR 4.9.5, and therefore the basic fix should already be available.

I'd like to ask that you please repeat the testing of the server issue, to check that it has been sufficiently fixed.


However, there are more details: Bob identified another scenario where we might still leak. There has been disagreement and stalling upstream over which approach would be the correct fix for the remaining issue.

The patch that Elio has attached in comment 3 was my proposal to fix the remaining issue upstream. That patch already got some review from Bob, but not yet a final review.

In other words, here's my recommendation:


(A) Please test if existing NSPR 4.9.5 is sufficient to fix the issue.

    If it is, we should mark this bug as resolved and wait for upstream
    to complete and pick up the remaining fixes later.

(B) If your testing with NSPR 4.9.5 shows we still have this bug,
    we will provide a scratch RPM with the suggested remaining fix.

Comment 5 Elio Maldonado Batiz 2013-08-14 16:50:36 UTC
Created attachment 786613 [details]
The original bug reproducer attached to Bug 633519

Comment 6 Elio Maldonado Batiz 2013-08-14 16:53:33 UTC
Created attachment 786615 [details]
Log file produced by executing the test

Produced by following the instructions at the top of the source file:

$ make tsdleak CFLAGS="-Wall -O2 -I/usr/include/nspr4" LDFLAGS="-ldl -lpthread"
cc -Wall -O2 -I/usr/include/nspr4  -ldl -lpthread  tsdleak.c   -o tsdleak
$ MALLOC_CHECK_=2 ./tsdleak 1> tsdleak.log 2>&1
$ grep fail tsdleak.log

Neither of the printf calls with 'dlopen failed:' or 'dlsym failed:' was hit.

Comment 12 errata-xmlrpc 2013-11-21 06:10:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1558.html