1395803 – (rhel7-libcurl-unbounded-mem-issue) provide a create object API that manages the object references.

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1395803 (rhel7-libcurl-unbounded-mem-issue) - provide a create object API that manages the object references.

Summary: provide a create object API that manages the object references.

Keywords:
Status:	CLOSED ERRATA
Alias:	rhel7-libcurl-unbounded-mem-issue
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	nss
Sub Component:
Version:	7.3
Hardware:	All
OS:	All
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Bob Relyea
QA Contact:	Alicja Kario
Docs Contact:	Mirek Jahoda
URL:
Whiteboard:
Depends On:
Blocks:	1057388 1377248 1420851 1476743 1510247
TreeView+	depends on / blocked

Reported:	2016-11-16 17:51 UTC by Alicja Kario
Modified:	2021-06-10 11:40 UTC (History)
CC List:	23 users (show)
Fixed In Version:	nss-3.34.0-0.1.beta1.el7
Doc Type:	Enhancement
Doc Text:	`PK11_CreateManagedGenericObject()` has been added to NSS to prevent memory leaks in applications The `PK11_DestroyGenericObject()` function does not destroy objects allocated by `PK11_CreateGenericObject()` properly, but some applications depend on a function for creating objects that persist after the use of the object. For this reason, the Network Security Services (NSS) libraries now include the `PK11_CreateManagedGenericObject()` function. If you create objects with `PK11_CreateManagedGenericObject()`, the `PK11_DestroyGenericObject()` function also properly destroys underlying associated objects. Applications, such as the curl utility, can now use `PK11_CreateManagedGenericObject()` to prevent memory leaks.
Clone Of:	1057388
Clones:	1510247 (view as bug list)
Environment:
Last Closed:	2018-04-10 09:23:57 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Mozilla Foundation	1202413	0	P2	RESOLVED	because PK11_CreateGenericObject() leaks by design, add non-leaking PK11_CreateManagedGenericObject() API	2021-01-15 07:35:46 UTC
Red Hat Product Errata	RHEA-2018:0679	0	None	None	None	2018-04-10 09:25:17 UTC

Description Alicja Kario 2016-11-16 17:51:09 UTC

Same issue as described in Bug #1057388 exists when libcurl-devel-7.29.0-35.el7.x86_64 is used together with nss-3.21.3-2.el7_3.x86_64

+++ This bug was initially created as a clone of Bug #1057388 +++

Description of problem:

s3backer is a FUSE filesystem that exposes an Amazon S3 bucket as a file in your local filesystem. It relies on libcurl for communication with Amazon. RHEL's libcurl relies on nss as its TLS provider. Subjecting s3backer to prolonged stress on RHEL/CentOS 6.4 with the stock libcurl will result in unbounded memory consumption. Analysis with valgrind's memcheck tool shows no significant memory leaks. However, recompiling libcurl to use OpenSSL as its TLS provider instead of NSS makes the issue go away.

Version-Release number of selected component (if applicable):
nss-3.15.3-3.el6_5.x86_64
libcurl-7.19.7-37.el6_4.x86_64

How reproducible:

It happens every time.

Steps to Reproduce:

1. Spin up a RHEL/CentOS 6.x AWS instance and create an Amazon S3 bucket. In my case, the bucket has a 1M object size. Note that the only thing that is probably necessary here is a S3-like service, but to ensure reproducibility, I am providing precise parameters.
2. Install s3backer.
3. Mount the S3 bucket with a command such as the following; note sensitive details such as keys have been sanitized:

sudo s3backer --readAhead=0 --blockSize=1M --size=500G --listBlocks --vhost --baseURL=https://s3.amazonaws.com/ --timeout=15 --blockCacheThreads=10 --accessId=<accessId> --accessKey=<accessKey> --block
CacheSize=10 <bucket> <mountpoint>

4. Run dd on the mounted s3 bucket:

sudo dd if=/dev/urandom of=<mountpoint>/file bs=1M count=102400 iflag=nonblock,noatime oflag=nonblock,noatime

The workload where I originally observed this involved running a file system on top of s3backer that is not supported by Redhat. However, the same behavior can be reproduced by running dd on top of s3backer. It becomes increasingly obvious when using /proc/$PID/status to view memory usage over the span of a 2-4 hours. While the system running s3backer need not be on AWS, access to S3 from outside Amazon is quite slow and I highly recommend placing it there. A long running program that exercises the same code paths in libcurl and nss as s3backer should also exhibit this problem.

Actual results:

VmRSS slowly grows until either the dd stops or the OOM killer triggers.

Expected results:

VmRSS should stabilize with VmHWM not changing after the first hour or so.

Additional info:

Analysis with Valgrind's massif tool suggests shows multiple backtraces containing NSS in the same snapshot and the share of their memory usage grows as a function of instructions. Here is an excerpt of one of the backtraces:

| ->21.93% (13,056,000B) 0xE91C59A: ???
| | ->21.93% (13,056,000B) 0xE9218C7: ???
| |   ->21.93% (13,056,000B) 0xE928520: ???
| |     ->21.93% (13,056,000B) 0x37C56470C8: PK11_CreateNewObject (pk11obj.c:378)
| |       ->21.93% (13,056,000B) 0x37C5647361: PK11_CreateGenericObject (pk11obj.c:1415)
| |         ->21.93% (13,056,000B) 0x37C823F49E: nss_create_object (nss.c:349)
| |           ->21.93% (13,056,000B) 0x37C823F625: nss_load_cert (nss.c:383)
| |             ->21.93% (13,056,000B) 0x37C8240E0E: Curl_nss_connect (nss.c:1095)
| |               ->21.93% (13,056,000B) 0x37C8238480: Curl_ssl_connect (sslgen.c:185)
| |                 ->21.93% (13,056,000B) 0x37C8216EC9: Curl_http_connect (http.c:1796)
| |                   ->21.93% (13,056,000B) 0x37C821D680: Curl_protocol_connect (url.c:3077)
| |                     ->21.93% (13,056,000B) 0x37C8223B3A: Curl_connect (url.c:4743)
| |                       ->21.93% (13,056,000B) 0x37C822BBAE: Curl_perform (transfer.c:2523)
| |                         ->21.93% (13,056,000B) 0x409E30: http_io_perform_io (http_io.c:1437)
| |                           ->18.49% (11,009,280B) 0x40CB36: http_io_write_block (http_io.c:1348)
| |                           | ->18.49% (11,009,280B) 0x407422: ec_protect_write_block (ec_protect.c:433)
| |                           |   ->18.49% (11,009,280B) 0x404CFD: block_cache_worker_main (block_cache.c:1090)
| |                           |     ->18.49% (11,009,280B) 0x379120784F: start_thread (pthread_create.c:301)
| |                           |       ->18.49% (11,009,280B) 0x3790AE894B: clone (clone.S:115)

Recompiling libcurl with OpenSSL as its TLS provider makes the issue go away, which strongly suggests a problem either in NSS or in how libcurl uses NSS.

I am not familiar with the NSS codebase and I am concerned about the possibility of introducing a security hole should I attempt to patch this myself. Redhat employees in #rhel on freenode informed me that Redhat employs multiple NSS developers and suggested that I file a report here for them.

Comment 5 Bob Relyea 2017-03-27 21:55:17 UTC

There are 2 fixes needed for this problem. The nsspem and nss. The nss portion needs an upstream patch as it changes the ABI semantics. Moving to 7.5

Comment 9 Bob Relyea 2017-08-28 20:48:25 UTC

We better fix it for 7.5. I'll dev ack it .

Comment 10 Bob Relyea 2017-11-07 16:50:37 UTC

Fixed in  nss-3.34.0-0.1.beta1.el7
The new api is PK11_CreateManagedObject.

Comment 19 errata-xmlrpc 2018-04-10 09:23:57 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0679

Note You need to log in before you can comment on or make changes to this bug.

ccheney
charles.stanley
cww
dueno
hkario
jaskalnik
jch
john.haxby
kdudka
kengert
mgrepl
mjahoda
nkinder
nmavrogi
pandrade
rakesh.petkar
rmetrich
rrelyea
ryao
sandis.neilands
serge.savard
szidek
zhoujianxiong2