Bug 1331973 - memory leak in erroneous derived-metrics
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: pcp
Version: rawhide
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: urgent
Assigned To: Mark Goodwin
QA Contact: Fedora Extras Quality Assurance
Reported: 2016-04-30 13:54 EDT by Frank Ch. Eigler
Modified: 2016-07-09 16:19 EDT
Fixed In Version: pcp-3.11.3-1 pcp-3.11.3-1.fc24 pcp-3.11.3-1.fc22 pcp-3.11.3-1.fc23 pcp-3.11.3-1.el5
Doc Type: Bug Fix
Last Closed: 2016-06-27 14:28:10 EDT
Type: Bug


Attachments
patch to fix memory leak in derived metric error handling (1.96 KB, patch)
2016-06-15 03:53 EDT, Mark Goodwin

Description Frank Ch. Eigler 2016-04-30 13:54:55 EDT
In pcp 3.11.2, derived metrics are automatically added as a part of context initialization.  However, if there is an error, the derive.c code does not clean up fully.

% mkdir foo
% echo 'foo.bar = 100 * delta(non.existent)' > foo/derived.conf
% PCP_DERIVED_CONFIG=`pwd`/foo valgrind --leak-check=full pminfo -f foo.bar

==20471== 144 (64 direct, 80 indirect) bytes in 1 blocks are definitely lost in loss record 35 of 40
==20471==    at 0x4C28C50: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20471==    by 0x4E7F751: newnode (derive.c:532)
==20471==    by 0x4E80D13: bind_expr (derive.c:569)
==20471==    by 0x4E80D27: bind_expr (derive.c:571)
==20471==    by 0x4E8329A: __dmopencontext (derive.c:2130)
==20471==    by 0x4E4A612: pmNewContext (context.c:1151)
==20471==    by 0x4017AD: main (pminfo.c:646)

derive.c bind_expr() is clearly implicated.  If the left or right bind_expr()
subcall fails, the node just allocated by newnode() is never freed.  All of
the related code should also be audited to guarantee proper cleanup on error.
Comment 1 Frank Ch. Eigler 2016-05-22 17:46:14 EDT
Please note that this memory leak affects every pm$CLIENT that, e.g., deals with archives that lack some of the input metrics to satisfy any derived metric, even:

% valgrind --leak-check=full pminfo -f -a $PCP/qa/archives/19970807.09.54

pmwebd (graphite charting) is particularly severely affected.
Comment 2 Frank Ch. Eigler 2016-06-14 11:43:24 EDT
If you have no plan to correct this regression, would you at least accept a patch to remove the iostat.conf file from the default distribution?
Comment 3 Nathan Scott 2016-06-14 19:26:49 EDT
I have no plans, but from chatting with mgoodwin last week, I understand he is planning to tackle this one at some point.

A workaround would not be helpful, no - it's 100% reproducible and easily regression tested - it should just be fixed properly.
Comment 4 Mark Goodwin 2016-06-15 03:53 EDT
Created attachment 1168237 [details]
patch to fix memory leak in derived metric error handling

Error handling needs to recursively free the current node, since it may have been built recursively, so call free_expr() instead of free() in the appropriate places where a derived expression fails. The patch also fortifies free_expr() itself a bit.

Passes qa for group 'derive', and Frank's valgrind repro script passes now too. A new QA test should probably be added.

BTW, this is not a regression per se - the leak has always been there - it's just more noticeable now that global derived metric definitions are loaded by default.
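
The approach described in this comment, releasing a partially built tree recursively rather than with a single free(), might look roughly like the sketch below. It is not the attached patch: node_t, expr_free(), bind_fixed() and nodes_outstanding() are hypothetical names chosen for illustration, and the "resolvable" flag only models a metric that may or may not exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an expression tree node. */
typedef struct node {
    struct node *left;
    struct node *right;
    int          resolvable;   /* 0 models an operand that cannot be bound */
} node_t;

static int live_nodes;         /* heap nodes allocated and not yet freed */

static node_t *node_new(void)
{
    node_t *np = calloc(1, sizeof(*np));

    if (np != NULL) {
        np->resolvable = 1;
        live_nodes++;
    }
    return np;
}

/* Recursive release in the spirit of free_expr(); tolerates NULL. */
static void expr_free(node_t *np)
{
    if (np == NULL)
        return;
    expr_free(np->left);
    expr_free(np->right);
    assert(live_nodes > 0);
    live_nodes--;
    free(np);
}

/* Bind a copy of src; on any failure, release everything built so far. */
static node_t *bind_fixed(const node_t *src)
{
    node_t *np;

    if (!src->resolvable)
        return NULL;
    if ((np = node_new()) == NULL)
        return NULL;
    if (src->left != NULL && (np->left = bind_fixed(src->left)) == NULL) {
        expr_free(np);
        return NULL;
    }
    if (src->right != NULL && (np->right = bind_fixed(src->right)) == NULL) {
        expr_free(np);         /* also releases the bound left subtree */
        return NULL;
    }
    return np;
}

/*
 * Bind a tree shaped like 100 * delta(metric), free it if the bind
 * succeeded, and return how many nodes remain allocated afterwards.
 */
int nodes_outstanding(int metric_exists, int *bound_ok)
{
    node_t metric  = { NULL, NULL, metric_exists };
    node_t delta   = { &metric, NULL, 1 };
    node_t hundred = { NULL, NULL, 1 };
    node_t mul     = { &hundred, &delta, 1 };
    int    before  = live_nodes;
    node_t *bound  = bind_fixed(&mul);

    *bound_ok = (bound != NULL);
    expr_free(bound);          /* no-op when the bind failed */
    return live_nodes - before;
}
```

Because each failing level frees its own node together with any subtree it has already attached, the caller never needs to know how deep the recursion got before the error occurred.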
Comment 5 Mark Goodwin 2016-06-15 04:08:50 EDT
Posted upstream patch for review http://oss.sgi.com/pipermail/pcp/2016-June/010756.html
Comment 6 Frank Ch. Eigler 2016-06-16 15:00:21 EDT
Thanks, Mark, that looks good.  (I haven't had the chance yet to try it against a larger sustained pmwebd load.)
Comment 7 Fedora Update System 2016-06-17 05:16:10 EDT
pcp-3.11.3-1.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-0b45289bc4
Comment 8 Fedora Update System 2016-06-17 05:17:48 EDT
pcp-3.11.3-1.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2016-331bd3e9bf
Comment 9 Fedora Update System 2016-06-17 05:18:21 EDT
pcp-3.11.3-1.el5 has been submitted as an update to Fedora EPEL 5. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-4745f3e292
Comment 10 Fedora Update System 2016-06-17 06:56:26 EDT
pcp-3.11.3-1.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-aad44ac639
Comment 11 Fedora Update System 2016-06-18 01:23:49 EDT
pcp-3.11.3-1.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-331bd3e9bf
Comment 12 Fedora Update System 2016-06-18 01:24:18 EDT
pcp-3.11.3-1.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-aad44ac639
Comment 13 Fedora Update System 2016-06-18 02:17:21 EDT
pcp-3.11.3-1.el5 has been pushed to the Fedora EPEL 5 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-4745f3e292
Comment 14 Fedora Update System 2016-06-18 12:26:04 EDT
pcp-3.11.3-1.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-0b45289bc4
Comment 15 Fedora Update System 2016-06-27 14:27:55 EDT
pcp-3.11.3-1.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
Comment 16 Fedora Update System 2016-06-27 18:52:55 EDT
pcp-3.11.3-1.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
Comment 17 Fedora Update System 2016-06-27 18:56:41 EDT
pcp-3.11.3-1.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.
Comment 18 Fedora Update System 2016-07-09 16:18:50 EDT
pcp-3.11.3-1.el5 has been pushed to the Fedora EPEL 5 stable repository. If problems still persist, please make note of it in this bug report.
