Bug 975755
Comment 4
Kamil Dudka
2013-06-25 14:33:13 UTC
Waht seems new (relative to either to the nss.13.6 backtrace or 3.14.0.0 success) are SEC_Module lines in the backtarce .... ==14107== by 0x44C2B0F: SECMOD_GetModuleSpecList (pk11pars.c:914) ==14107== by 0x44C4D5F: SECMOD_LoadModule (pk11pars.c:1028) ==14107== by 0x44C4DB3: SECMOD_LoadModule (pk11pars.c:1045) ==14107== by 0x44C4DB3: SECMOD_LoadModule (pk11pars.c:1045) ==14107== by 0x4486C2B: nss_Init (nssinit.c:438) ==14107== by 0x44874D3: NSS_InitContext (nssinit.c:837) .... Comparing 3.14.0.0 against 3.14.3 and found very little changes in lib/nss/nssinit.c. Where I see changes is in lib/util/utilmod.c and lib/util/pk11pars.c (they belong to nss-util). They don't strike me as problematic. I'll attach that difference file for what it may be worth. Actually, the trace goes through pk11pars.c and that code hasn't changed. Created attachment 768434 [details]
changes in lib/nss from 3.14.0.0 to 3.14.3 in patch format
I don't see anything signifficance as far as memory goes.
Created attachment 768440 [details]
changes in lib/pk11wrap from 3.14.0.0 to 3.14.3 in patch format
This is not a patch as sucj, the format helps studying the changes which may be of relevance to the problem at hand.
Bob, The pk11auth.c and chapk11slot.c changes are ones that called my attention. Could they in any way have a bearing on this problem?
Created attachment 768442 [details]
changes in lib/util from 3.14.0.0 to 3.14.3 in patch format
Here the changes on utilmod.c and utilpars.c are ones that call my attention.
By inspection, I don't see anything in mod util that could be an issue. pk11auth and pk11slot are your best bests. What you should do is reproduce the curl leak on your machine and do a binary check to see what patch created the leak. It's possible the leak is caused by a slot reference leak, preventing a slot from being freed, and thus the stuff related to the module. bob Since I don't have Beaker on my own system I studied the test and I see that it invokes Kamil's original program, invokes valgrind on it, and collect results. So I did something not as soffisticated but that it is still effectively the same. I couldn't reproduce the problem. I fact. valgrind reports better results on 3.14.3. Attachments coming next. Created attachment 769019 [details]
Kamil's program fom the original bug report.
This is the basis for both the Beaker regression test and my own.
I compile the test with this Makefile: ---------------------- all: bz772053-test-01 bz772053-test-01: bz772053-test-01.c gcc -Werror -Wfatal-errors -g $^ -o $@ `pkg-config --cflags --libs nss` `pkg-config --cflags --libs libcurl` chmod 755 $@ clean: rm -f ./bz772053-test-01 valgrind-test: all ./runtest.sh -------------------------- and runtest.sh is simply #!/bin/bash valgrind --leak-check=full --show-reachable=yes ./bz772053-test-01 > ./bz772053-test-01-$(rpm -q nss).log 2>&1 --------------------------------- The version of nss being used gets embedded on the log file name. Created attachment 769020 [details]
Valgrind results with nss-3.14.0.0 as currently ships
Created attachment 769021 [details]
Valgrind results with nss-3.14.3 as we plan for rhel-6.4.z
Created attachment 769022 [details] the results diiferences in patch format for viewing convenience Generated with diff -u bz772053-test-01-nss-3.14.0.0-12.el6.x86_64.log bz772053-test-01-nss-3.14.3-4.el6_4.x86_64.log > results-diffs.patch Towards the end are the summaries. For nss-3.14.0.0: ==9937== LEAK SUMMARY: ==9937== definitely lost: 320 bytes in 16 blocks ==9937== indirectly lost: 0 bytes in 0 blocks ==9937== possibly lost: 176 bytes in 4 blocks ==9937== still reachable: 24,421 bytes in 101 blocks ==9937== suppressed: 0 bytes in 0 blocks ==9937== ==9937== For counts of detected and suppressed errors, rerun with: -v ==9937== ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 86 from 9) For nss-3.14.3-4: ==10109== LEAK SUMMARY: ==10109== definitely lost: 0 bytes in 0 blocks ==10109== indirectly lost: 0 bytes in 0 blocks ==10109== possibly lost: 176 bytes in 4 blocks ==10109== still reachable: 24,421 bytes in 101 blocks ==10109== suppressed: 0 bytes in 0 blocks ==10109== ==10109== For counts of detected and suppressed errors, rerun with: -v ==10109== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 86 from 9) It actually looks better with nss-3.14.3-4.el6_4! I'll rerun with the -v option. Created attachment 769036 [details]
results differences - verbose version - in patch format
Lines starting with '-' are with 3.14.0.0 and those starting with '+' are 3.14.3.
The leak is in nssutil_ReadSecmodDB() -- the moduleList array is going to leak at this point: 357│ /* old one exists */ 358│ status = PR_Access(olddbname, PR_ACCESS_EXISTS); 359│ if (status == PR_SUCCESS) { 360│ PR_smprintf_free(olddbname); 361│ PORT_SetError(SEC_ERROR_LEGACY_DATABASE); 362├> return NULL; 363│ } Created attachment 770598 [details] Plug memory leak in nssutil_ReadSecmodDB A simple, and perhaps naive, patch. A scratch build based on it is at https://brewweb.devel.redhat.com/taskinfo?taskID=6005880 The leak is still in the latest upstream code. I tested a new build from: https://brewweb.devel.redhat.com/taskinfo?taskID=6005880 and it seems it had solved the problem. Log: :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: :: [ LOG ] :: Test :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: Stopping httpd: [ OK ] Starting httpd: [ OK ] :: [ PASS ] :: Starting httpd server (with SSL) :: [ PASS ] :: Running the reproducer :: [ PASS ] :: NSS memory leak check ==23264== Memcheck, a memory error detector ==23264== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al. ==23264== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info ==23264== Command: /home/nsstestuser/nss_test ==23264== test_instance() succeeded 1/16 test_instance() succeeded 2/16 test_instance() succeeded 3/16 test_instance() succeeded 4/16 test_instance() succeeded 5/16 test_instance() succeeded 6/16 test_instance() succeeded 7/16 test_instance() succeeded 8/16 test_instance() succeeded 9/16 test_instance() succeeded 10/16 test_instance() succeeded 11/16 test_instance() succeeded 12/16 test_instance() succeeded 13/16 test_instance() succeeded 14/16 test_instance() succeeded 15/16 test_instance() succeeded 16/16 ==23264== ==23264== HEAP SUMMARY: ==23264== in use at exit: 24,085 bytes in 104 blocks ==23264== total heap usage: 438,057 allocs, 437,953 frees, 164,042,265 bytes allocated ==23264== ==23264== 40 bytes in 1 blocks are possibly lost in loss record 38 of 88 ==23264== at 0x4A069EE: malloc (vg_replace_malloc.c:270) ==23264== by 0x37C760D78B: _PR_Getfd (prfdcach.c:112) ==23264== by 0x37C7625014: pt_SetMethods (ptio.c:3301) ==23264== by 0x37C7625181: PR_ImportTCPSocket (ptio.c:4575) ==23264== by 0x4C5E048: Curl_nss_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C55481: Curl_ssl_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C33ECA: Curl_http_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C3A681: Curl_protocol_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C40B3B: Curl_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C48BAF: Curl_perform (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x40095A: test_session (in /home/nsstestuser/nss_test) ==23264== by 0x4009A4: test_instance (in /home/nsstestuser/nss_test) ==23264== ==23264== 40 bytes in 1 blocks are possibly lost in loss record 39 of 88 ==23264== at 0x4A069EE: malloc (vg_replace_malloc.c:270) ==23264== by 0x37C760D78B: _PR_Getfd (prfdcach.c:112) ==23264== by 0x37C7625014: pt_SetMethods (ptio.c:3301) ==23264== by 0x37C7626210: PR_Socket (ptio.c:3503) ==23264== by 0x4C5D637: Curl_nss_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C55481: Curl_ssl_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C33ECA: Curl_http_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C3A681: Curl_protocol_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C40B3B: Curl_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C48BAF: Curl_perform (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x40095A: test_session (in /home/nsstestuser/nss_test) ==23264== by 0x4009A4: test_instance (in /home/nsstestuser/nss_test) ==23264== ==23264== 48 bytes in 1 blocks are possibly lost in loss record 43 of 88 ==23264== at 0x4A069EE: malloc (vg_replace_malloc.c:270) ==23264== by 0x37C760D779: _PR_Getfd (prfdcach.c:109) ==23264== by 0x37C7625014: pt_SetMethods (ptio.c:3301) ==23264== by 0x37C7625181: PR_ImportTCPSocket (ptio.c:4575) ==23264== by 0x4C5E048: Curl_nss_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C55481: Curl_ssl_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C33ECA: Curl_http_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C3A681: Curl_protocol_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C40B3B: Curl_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C48BAF: Curl_perform (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x40095A: test_session (in /home/nsstestuser/nss_test) ==23264== by 0x4009A4: test_instance (in /home/nsstestuser/nss_test) ==23264== ==23264== 48 bytes in 1 blocks are possibly lost in loss record 44 of 88 ==23264== at 0x4A069EE: malloc (vg_replace_malloc.c:270) ==23264== by 0x37C760D779: _PR_Getfd (prfdcach.c:109) ==23264== by 0x37C7625014: pt_SetMethods (ptio.c:3301) ==23264== by 0x37C7626210: PR_Socket (ptio.c:3503) ==23264== by 0x4C5D637: Curl_nss_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C55481: Curl_ssl_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C33ECA: Curl_http_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C3A681: Curl_protocol_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C40B3B: Curl_connect (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x4C48BAF: Curl_perform (in /usr/lib64/libcurl.so.4.1.1) ==23264== by 0x40095A: test_session (in /home/nsstestuser/nss_test) ==23264== by 0x4009A4: test_instance (in /home/nsstestuser/nss_test) ==23264== ==23264== LEAK SUMMARY: ==23264== definitely lost: 0 bytes in 0 blocks ==23264== indirectly lost: 0 bytes in 0 blocks ==23264== possibly lost: 176 bytes in 4 blocks ==23264== still reachable: 23,909 bytes in 100 blocks ==23264== suppressed: 0 bytes in 0 blocks ==23264== Reachable blocks (those to which a pointer was found) are not shown. ==23264== To see them, rerun with: --leak-check=full --show-reachable=yes ==23264== ==23264== For counts of detected and suppressed errors, rerun with: -v ==23264== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 134 from 9) Comment on attachment 770598 [details]
Plug memory leak in nssutil_ReadSecmodDB
r+ egad, that function could use a rewrite (says the original author).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-1558.html |