Bug 1373590
| Summary: | pcp-atop killed by SIGFPE | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Deepu K S <dkochuka> |
| Component: | pcp | Assignee: | Nathan Scott <nathans> |
| Status: | CLOSED ERRATA | QA Contact: | Miloš Prchlík <mprchlik> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.8 | CC: | brolley, dkochuka, fche, lberk, mbenitez, mcermak, mgoodwin, mprchlik |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-03-21 11:20:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Deepu K S
2016-09-06 16:29:21 UTC
Created attachment 1198357 [details]
ABRT captured problem directory (coredump included)

Hi Deepu,

What does

$ pminfo -f hinv.ncpu

report on this system? (I'm expecting some kind of error, just curious as to which one)

So far, I've been unable to reproduce the problem locally (with/without pmcd running, with/without pmdalinux running).

Thanks!

(In reply to Deepu K S from comment #0)
> Description of problem:
> Process /usr/libexec/pcp/bin/pcp-atop was killed by signal 8 (SIGFPE)
> It looks like pcp-atop crashed due to a divide by zero condition

Were you able to collect $PCP_DEBUG level traces?

% env PCP_DEBUG=2 pcp atop 2>/tmp/LOGFILE

(In reply to Nathan Scott from comment #3)
> Hi Deepu,
>
> What does
>
> $ pminfo -f hinv.ncpu
>
> report on this system? (I'm expecting some kind of error, just curious as
> to which one)
>
> So far, I've been unable to reproduce the problem locally (with/without pmcd
> running, with/without pmdalinux running).
>
> Thanks!

Sorry for the delay. I now have the output collected.

# pminfo -f hinv.ncpu
hinv.ncpu: pmLookupDesc: No PMCD agent for domain of request

# service pmcd status
Checking for pmcd: running

Output of
# env PCP_DEBUG=10 pcp atop 2>pcp-atop.log
is attached.

The crash happens whenever the command is run. It also happens right away.
Most lines from logfile show
PM_ID_NULL (<noname>): No PMCD agent for domain of request

Created attachment 1200232 [details]
pcp atop log

pmFetch returns ...
pmResult dump from 0x83c2e0 timestamp: 1473338279.564024 14:37:59.564 numpmid: 11
  PM_ID_NULL (<noname>): No PMCD agent for domain of request
  PM_ID_NULL (<noname>): No PMCD agent for domain of request
  PM_ID_NULL (<noname>): No PMCD agent for domain of request

Oh, dear. That suggests that pmdalinux and/or pmdaproc crashed or were taken out of service, and that automatic restarting (if any) was not successful. (What version of PCP was this?) A

# service pmcd restart

should bring them back to life. It is a bug in pcp-atop that it fails to report the problem and advise the user.

Thanks Deepu, I understand what's happening now and know how to reproduce it; a fix will follow shortly.

This is fixed in upstream PCP via git commit 7157edb93 and will make its way into the next available RHEL6 PCP update from there.

Verified with build pcp-3.10.9-8.el6.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0735.html
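
For anyone hitting this on a build that predates the fix, the diagnosis in this thread can be turned into a small pre-flight check. The sketch below is not taken from the upstream commit. It only looks for the error string quoted in this report in the output of `pminfo -f hinv.ncpu`, and suggests the `service pmcd restart` mentioned above. The function name `has_agent_error` is hypothetical, and the check assumes the error text matches this report exactly.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the text on stdin contains the
# error string quoted in this bug report, i.e. pmcd has no agent serving the
# requested metric.
has_agent_error() {
    grep -q 'No PMCD agent for domain of request'
}

# Pre-flight check before starting "pcp atop".
# Skipped entirely when pminfo is not installed.
if command -v pminfo >/dev/null 2>&1; then
    if pminfo -f hinv.ncpu 2>&1 | has_agent_error; then
        echo 'hinv.ncpu is unavailable; try: service pmcd restart' >&2
    fi
fi
```

If the warning is printed, running `pcp atop` on an affected build can be expected to crash as described in this report until the agents are back in service.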