Bug 1262721 - pcp -a archive uptime core dumps
Summary: pcp -a archive uptime core dumps
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pcp
Version: 7.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Nathan Scott
QA Contact: Miloš Prchlík
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-09-14 08:00 UTC by Marko Myllynen
Modified: 2016-11-04 04:22 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-04 04:22:36 UTC
Target Upstream Version:
Embargoed:


Attachments
pcp uptime core dump (1.01 MB, application/x-gzip)
2015-09-14 08:00 UTC, Marko Myllynen


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2016:2344 | 0 | normal | SHIPPED_LIVE | pcp bug fix and enhancement update | 2016-11-03 13:46:51 UTC

Description Marko Myllynen 2015-09-14 08:00:56 UTC
Created attachment 1073121 [details]
pcp uptime core dump

Description of problem:
$ pcp -a ~/20150823.10.41 uptime
zsh: segmentation fault (core dumped)  pcp -a ~/20150823.10.41 uptime

gzipped core dump attached. The archive can be processed just fine with other tools.

Version-Release number of selected component (if applicable):
pcp-3.10.6-2.el7.x86_64
pcp-system-tools-3.10.6-2.el7.x86_64

Comment 2 Mark Goodwin 2015-09-14 13:55:51 UTC
Looks like a bad pmValue was passed into pmExtractValue():

[mgoodwin@goblin ~]$ ulimit -c unlimited
[mgoodwin@goblin ~]$ pcp -a /var/log/pcp/pmlogger/goblin/20150914.19.31 uptime
Segmentation fault (core dumped)
[mgoodwin@goblin ~]$ file core.14442 
core.14442: ELF 64-bit LSB core file x86-64, version 1 (SYSV), too many program headers (222)
[mgoodwin@goblin ~]$ file -Pelf_phnum=10000 core.14442
core.14442: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python3 /usr/libexec/pcp/bin/pcp-uptime'
[mgoodwin@goblin ~]$ gdb `which python3` core.14442 
GNU gdb (GDB) Fedora 7.9.1-17.fc22
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/python3...Reading symbols from /usr/bin/python3...(no debugging symbols found)...done.
(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 14442]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `python3 /usr/libexec/pcp/bin/pcp-uptime'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f00d1544254 in pmExtractValue (valfmt=<optimized out>, 
    ival=0xee7570, itype=4, oval=0x7f00ce21cbc0, otype=4) at units.c:945
945			if (ival->value.pval->vlen != PM_VAL_HDR_SIZE + sizeof(float)
Missing separate debuginfos, use: dnf debuginfo-install python3-3.4.2-6.fc22.x86_64
(gdb) where
#0  0x00007f00d1544254 in pmExtractValue (valfmt=<optimized out>, 
    ival=0xee7570, itype=4, oval=0x7f00ce21cbc0, otype=4) at units.c:945
#1  0x00007f00cefb5db0 in ffi_call_unix64 () from /lib64/libffi.so.6
#2  0x00007f00cefb5818 in ffi_call () from /lib64/libffi.so.6
#3  0x00007f00cf1c877f in _ctypes_callproc ()
   from /usr/lib64/python3.4/lib-dynload/_ctypes.cpython-34m.so
#4  0x00007f00cf1c26c9 in PyCFuncPtr_call ()
   from /usr/lib64/python3.4/lib-dynload/_ctypes.cpython-34m.so
#5  0x00007f00d901df31 in PyObject_Call () from /lib64/libpython3.4m.so.1.0
#6  0x00007f00d90cde0a in PyEval_EvalFrameEx ()
   from /lib64/libpython3.4m.so.1.0
#7  0x00007f00d90d5162 in PyEval_EvalFrameEx ()
   from /lib64/libpython3.4m.so.1.0
#8  0x00007f00d90d5162 in PyEval_EvalFrameEx ()
   from /lib64/libpython3.4m.so.1.0
#9  0x00007f00d90d5de6 in PyEval_EvalCodeEx () from /lib64/libpython3.4m.so.1.0
#10 0x00007f00d90d5e8b in PyEval_EvalCode () from /lib64/libpython3.4m.so.1.0
#11 0x00007f00d90f1ef4 in run_mod () from /lib64/libpython3.4m.so.1.0
#12 0x00007f00d90f4135 in PyRun_FileExFlags () from /lib64/libpython3.4m.so.1.0
#13 0x00007f00d90f51b3 in PyRun_SimpleFileExFlags ()
   from /lib64/libpython3.4m.so.1.0
#14 0x00007f00d910bd74 in Py_Main () from /lib64/libpython3.4m.so.1.0
#15 0x0000000000400ad7 in main ()
(gdb) p ival
$1 = (const pmValue *) 0xee7570
(gdb) p *ival
$2 = {inst = -649913344, value = {pval = 0x51, lval = 81}}
(gdb) p ival->value
$3 = {pval = 0x51, lval = 81}
(gdb) p ival->value.pval
$4 = (pmValueBlock *) 0x51
(gdb) p ival->value.pval->vlen
Cannot access memory at address 0x51
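
The `{pval = 0x51, lval = 81}` output above shows two views of the same storage: gdb prints `value` as a union, so the integer 81 and the pointer 0x51 are the same bytes, and dereferencing that "pointer" faults. A minimal sketch of the overlap using Python's `ctypes` follows. `ValueUnion` is a hypothetical stand-in rather than the PCP definition, and the check assumes a little-endian machine such as the x86-64 host in the trace.

```python
import ctypes
import sys


class ValueUnion(ctypes.Union):
    """Hypothetical stand-in for the value union that gdb printed."""
    _fields_ = [
        ("pval", ctypes.c_void_p),  # pointer view
        ("lval", ctypes.c_int),     # integer view of the same bytes
    ]


value = ValueUnion()   # ctypes zero-initialises the storage
value.lval = 81        # store a small integer ...

if sys.byteorder == "little":
    # ... and the pointer view reads back as address 0x51,
    # which is not a mapped address in a normal process.
    print(hex(value.pval))
```

Nothing in the union records which member is meaningful, so code that reads `pval` from a slot that never held a valid value gets whatever bytes happen to be there.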

Comment 5 Frank Ch. Eigler 2015-09-14 20:19:10 UTC
% pcp -Dall -a 20150823.10.41.meta uptime 2>&1 | tail -10
__pmLogFetchInterp: log reads: forward 3 (+1 cached) backwards 1 (+1 cached)
pmFetch returns ...
pmResult dump from 0x12c4bf0 timestamp: 1440315703.207307 03:41:43.207 numpmid: 3
  60.26.0 (kernel.all.uptime): No values returned!
  60.25.0 (kernel.all.nusers): No values returned!
  60.2.0 (kernel.all.load): No values returned!
pmUseContext(0) -> 0
pmExtractValue:  145 [U32] -> 145 [U32]
pmExtractValue:  33 [U32] -> 33 [U32]
pmExtractValue: [1]    17781 segmentation fault (core dumped)

i.e., the pmFetch in src/pcp/uptime/pcp-uptime.py returns a pmResult structure with numpmid=3, but each pmValueSet has numval=0.  Calling get_vlist() on the resulting empty pmValue[] then reads memory that holds no valid value, which gives us the crash.

Chances are get_vlist() in src/python/pcp/pmapi.py should check vset_idx against get_numval() and raise an exception when get_numval() <= vset_idx, i.e. when the index is out of range.
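
A minimal sketch of the kind of range check suggested here, written against plain Python stand-ins so it runs without PCP installed. `FakeValueSet` and `FakeResult` are hypothetical, the method signatures are assumptions, and this is not the code in pmapi.py.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeValueSet:
    """Hypothetical stand-in for one pmValueSet."""
    values: List[int] = field(default_factory=list)


@dataclass
class FakeResult:
    """Hypothetical stand-in for a pmResult wrapper."""
    vsets: List[FakeValueSet]

    def get_numval(self, vset_idx: int) -> int:
        return len(self.vsets[vset_idx].values)

    def get_vlist(self, vset_idx: int, vlist_idx: int) -> int:
        # Refuse any index outside [0, numval) instead of reading past the end.
        numval = self.get_numval(vset_idx)
        if not 0 <= vlist_idx < numval:
            raise IndexError(
                f"value index {vlist_idx} out of range for value set "
                f"{vset_idx} (numval={numval})"
            )
        return self.vsets[vset_idx].values[vlist_idx]


result = FakeResult([FakeValueSet([145]), FakeValueSet([])])
print(result.get_vlist(0, 0))      # value set 0 holds one value
try:
    result.get_vlist(1, 0)         # value set 1 is empty, as in the trace
except IndexError as err:
    print("refused:", err)
```

With a check like this, the empty value sets in the trace above would surface as a Python exception rather than a segmentation fault inside the C library.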

Comment 6 Nathan Scott 2015-09-15 06:35:04 UTC
(In reply to Marko Myllynen from comment #0)
> Created attachment 1073121 [details]
> pcp uptime core dump
> 
> Description of problem:
> $ pcp -a ~/20150823.10.41 uptime
> zsh: segmentation fault (core dumped)  pcp -a ~/20150823.10.41 uptime
> 

The problem is related to the offset of the initial fetch - if you use:

$ pcp -O@+1 -a ~/20150823.10.41 uptime

it works.  The reason is there is a pmResult record at the head of the log, recorded before any record containing the metrics pcp-uptime is looking for.  So when pcp-uptime fetches at offset zero it gets no data (even though the metrics exist in the log, and if it did a second fetch it would find 'em).
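
One way a client could cope with an empty first record is to keep fetching until every wanted metric has a value. The sketch below illustrates that idea only: it is not the fix that went into pcp-uptime, `fetch_first_complete` is a hypothetical helper, and the archive is made-up sample data modelled as plain dictionaries.

```python
from typing import Callable, Dict, List, Optional, Sequence

Record = Dict[str, List[float]]


def fetch_first_complete(
    fetch: Callable[[], Optional[Record]],
    metrics: Sequence[str],
    max_fetches: int = 10,
) -> Optional[Record]:
    """Return the first record with a value for every metric, else None."""
    for _ in range(max_fetches):
        record = fetch()
        if record is None:          # end of archive
            return None
        # A missing key or an empty list both mean "no values returned".
        if all(record.get(name) for name in metrics):
            return record
    return None


# Made-up archive: the first record has no values, the second has them all.
archive = iter([
    {"kernel.all.uptime": [], "kernel.all.nusers": [], "kernel.all.load": []},
    {"kernel.all.uptime": [145.0], "kernel.all.nusers": [33.0],
     "kernel.all.load": [0.1, 0.2, 0.3]},
])
wanted = ["kernel.all.uptime", "kernel.all.nusers", "kernel.all.load"]

record = fetch_first_complete(lambda: next(archive, None), wanted)
print(record["kernel.all.uptime"] if record else "no usable record")
```

The `max_fetches` cap is a design choice for the sketch: it keeps a tool from scanning an entire archive that never recorded the metrics at all.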

The existing qa/742 happens to test with an archive that does not exhibit this characteristic (first pmResult record contains all the data needed), so the initial fetch works just fine there and the test passes.

I have a trivial fix for pcp-uptime ready, and have updated that test to exercise both cases now.

Since there is an easy workaround (via -O option), I'll mark this one for 7.3.

thanks Marko!

Comment 7 Frank Ch. Eigler 2015-09-15 10:49:35 UTC
(In reply to Nathan Scott from comment #6)

> I have a trivial fix for pcp-uptime ready, and have updated that test to
> exercise both cases now.

That check is worthwhile, but the addition of python-pcp-binding range checking seems even more important (to protect against future bugs).

Comment 9 Miloš Prchlík 2016-08-20 11:56:47 UTC
Verified for build pcp-3.11.3-3.el7.

Comment 11 errata-xmlrpc 2016-11-04 04:22:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2344.html

