Bug 1331973 - memory leak in erroneous derived-metrics
Summary: memory leak in erroneous derived-metrics
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: pcp
Version: rawhide
Hardware: Unspecified
OS: Unspecified
unspecified
urgent
Target Milestone: ---
Assignee: Mark Goodwin
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-04-30 17:54 UTC by Frank Ch. Eigler
Modified: 2016-07-09 20:19 UTC (History)
8 users (show)

Fixed In Version: pcp-3.11.3-1 pcp-3.11.3-1.fc24 pcp-3.11.3-1.fc22 pcp-3.11.3-1.fc23 pcp-3.11.3-1.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-06-27 18:28:10 UTC
Type: Bug


Attachments (Terms of Use)
patch to fix memory leak in derived metric error handling (1.96 KB, patch)
2016-06-15 07:53 UTC, Mark Goodwin
no flags Details | Diff

Description Frank Ch. Eigler 2016-04-30 17:54:55 UTC
In pcp 3.11.2, derived metrics are automatically added as a part of context initialization.  However, if there is an error, the derive.c code does not clean up fully.

% mkdir foo
% echo 'foo.bar = 100 * delta(non.existent)' > foo/derived.conf
% PCP_DERIVED_CONFIG=`pwd`/foo valgrind --leak-check=full pminfo -f foo.bar

==20471== 144 (64 direct, 80 indirect) bytes in 1 blocks are definitely lost in loss record 35 of 40
==20471==    at 0x4C28C50: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20471==    by 0x4E7F751: newnode (derive.c:532)
==20471==    by 0x4E80D13: bind_expr (derive.c:569)
==20471==    by 0x4E80D27: bind_expr (derive.c:571)
==20471==    by 0x4E8329A: __dmopencontext (derive.c:2130)
==20471==    by 0x4E4A612: pmNewContext (context.c:1151)
==20471==    by 0x4017AD: main (pminfo.c:646)

derive.c bind_expr() is clearly implicated.  If the left or right bind_expr()
subcalls fail, the new newnode() is never freed.  But one should audit all
the related code to guarantee a proper cleanup on an error.

Comment 1 Frank Ch. Eigler 2016-05-22 21:46:14 UTC
Please note that this memory leak affects every pm$CLIENT that, e.g., deals with archives that lack some of the input metrics to satisfy any derived metric, even:

% valgrind --leak-check=full pminfo -f -a $PCP/qa/archives/19970807.09.54

pmwebd (graphite charting) is particularly severely affected.

Comment 2 Frank Ch. Eigler 2016-06-14 15:43:24 UTC
If you have no plan to correct this regression, would you at least accept a patch to remove the iostat.conf file from the default distribution?

Comment 3 Nathan Scott 2016-06-14 23:26:49 UTC
I have no plans, but chatting to mgoodwin last week I understand he is planning to tackle this one at some point.

A workaround would not be helpful, no - its 100% reproducible, easily regression tested - it should just be fixed properly.

Comment 4 Mark Goodwin 2016-06-15 07:53:10 UTC
Created attachment 1168237 [details]
patch to fix memory leak in derived metric error handling

Error handling needs to recursively free the current node, since it may have been built recursively, so call free_expr() instead of free() in the appropriate places where a derived expression fails. The patch also fortifies free_expr() itself a bit.

Passes qa for group 'derive', and Frank's valgrind repro script passes now too. A new QA test should probably be added.

BTW, this is not a regression per-se - the leak has always been there - it's just more noticeable now that global derived metrics defs are loaded by default.

Comment 5 Mark Goodwin 2016-06-15 08:08:50 UTC
Posted upstream patch for review http://oss.sgi.com/pipermail/pcp/2016-June/010756.html

Comment 6 Frank Ch. Eigler 2016-06-16 19:00:21 UTC
Thanks, Mark, that looks good.  (I haven't had the chance yet to try it against a larger sustained pmwebd load.)

Comment 7 Fedora Update System 2016-06-17 09:16:10 UTC
pcp-3.11.3-1.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-0b45289bc4

Comment 8 Fedora Update System 2016-06-17 09:17:48 UTC
pcp-3.11.3-1.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2016-331bd3e9bf

Comment 9 Fedora Update System 2016-06-17 09:18:21 UTC
pcp-3.11.3-1.el5 has been submitted as an update to Fedora EPEL 5. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-4745f3e292

Comment 10 Fedora Update System 2016-06-17 10:56:26 UTC
pcp-3.11.3-1.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-aad44ac639

Comment 11 Fedora Update System 2016-06-18 05:23:49 UTC
pcp-3.11.3-1.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-331bd3e9bf

Comment 12 Fedora Update System 2016-06-18 05:24:18 UTC
pcp-3.11.3-1.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-aad44ac639

Comment 13 Fedora Update System 2016-06-18 06:17:21 UTC
pcp-3.11.3-1.el5 has been pushed to the Fedora EPEL 5 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-4745f3e292

Comment 14 Fedora Update System 2016-06-18 16:26:04 UTC
pcp-3.11.3-1.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-0b45289bc4

Comment 15 Fedora Update System 2016-06-27 18:27:55 UTC
pcp-3.11.3-1.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

Comment 16 Fedora Update System 2016-06-27 22:52:55 UTC
pcp-3.11.3-1.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.

Comment 17 Fedora Update System 2016-06-27 22:56:41 UTC
pcp-3.11.3-1.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

Comment 18 Fedora Update System 2016-07-09 20:18:50 UTC
pcp-3.11.3-1.el5 has been pushed to the Fedora EPEL 5 stable repository. If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.