Bug 1235962 - [abrt] cockpit-pcp: __pmFindProfile(): cockpit-pcp killed by SIGSEGV
Summary: [abrt] cockpit-pcp: __pmFindProfile(): cockpit-pcp killed by SIGSEGV
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: cockpit
Version: 27
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Martin Pitt
QA Contact: Fedora Extras Quality Assurance
URL: https://retrace.fedoraproject.org/faf...
Whiteboard: abrt_hash:be910f191aa1ec647dfd10f2452...
Depends On:
Blocks: 1550995
Reported: 2015-06-26 08:27 UTC by Juan Antonio Clavero García
Modified: 2018-11-30 23:47 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-30 23:47:35 UTC
Type: ---
Embargoed:


Attachments
File: backtrace (20.61 KB, text/plain), 2015-06-26 08:28 UTC, Juan Antonio Clavero García
File: cgroup (333 bytes, text/plain), 2015-06-26 08:28 UTC, Juan Antonio Clavero García
File: core_backtrace (4.36 KB, text/plain), 2015-06-26 08:28 UTC, Juan Antonio Clavero García
File: dso_list (4.09 KB, text/plain), 2015-06-26 08:28 UTC, Juan Antonio Clavero García
File: environ (341 bytes, text/plain), 2015-06-26 08:28 UTC, Juan Antonio Clavero García
File: limits (1.29 KB, text/plain), 2015-06-26 08:28 UTC, Juan Antonio Clavero García
File: maps (19.68 KB, text/plain), 2015-06-26 08:28 UTC, Juan Antonio Clavero García
File: mountinfo (3.99 KB, text/plain), 2015-06-26 08:28 UTC, Juan Antonio Clavero García
File: namespaces (85 bytes, text/plain), 2015-06-26 08:28 UTC, Juan Antonio Clavero García
File: open_fds (836 bytes, text/plain), 2015-06-26 08:28 UTC, Juan Antonio Clavero García
File: proc_pid_status (933 bytes, text/plain), 2015-06-26 08:28 UTC, Juan Antonio Clavero García
File: var_log_messages (1.38 KB, text/plain), 2015-06-26 08:28 UTC, Juan Antonio Clavero García
Fedora 25 backtrace (4.98 KB, text/plain), 2017-03-15 11:22 UTC, Martin Pitt


Links
System: Github
ID: https://github.com/cockpit-project cockpit issues 6108
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2020-06-08 10:09:03 UTC

Description Juan Antonio Clavero García 2015-06-26 08:27:58 UTC
Version-Release number of selected component:
cockpit-pcp-0.60-1.fc22

Additional info:
reporter:       libreport-2.6.0
backtrace_rating: 4
cmdline:        /usr/libexec/cockpit-pcp
crash_function: __pmFindProfile
executable:     /usr/libexec/cockpit-pcp
global_pid:     12215
kernel:         4.0.5-300.fc22.x86_64
runlevel:       N 5
type:           CCpp
uid:            0

Truncated backtrace:
Thread no. 1 (10 frames)
 #0 __pmFindProfile at profile.c:144
 #1 __pmInProfile at profile.c:163
 #2 __pmdaNextInst at callback.c:148
 #3 pmdaFetch at callback.c:514
 #4 linux_fetch at pmda.c:5714
 #5 __pmFetchLocal at fetchlocal.c:131
 #6 pmFetch at fetch.c:147
 #7 cockpit_pcp_metrics_tick at src/bridge/cockpitpcpmetrics.c:349
 #8 on_timeout_tick at src/bridge/cockpitmetrics.c:178
 #13 g_main_context_iteration at gmain.c:3869

Comment 1 Juan Antonio Clavero García 2015-06-26 08:28:01 UTC
Created attachment 1043431
File: backtrace

Comment 2 Juan Antonio Clavero García 2015-06-26 08:28:03 UTC
Created attachment 1043432
File: cgroup

Comment 3 Juan Antonio Clavero García 2015-06-26 08:28:04 UTC
Created attachment 1043433
File: core_backtrace

Comment 4 Juan Antonio Clavero García 2015-06-26 08:28:05 UTC
Created attachment 1043434
File: dso_list

Comment 5 Juan Antonio Clavero García 2015-06-26 08:28:06 UTC
Created attachment 1043435
File: environ

Comment 6 Juan Antonio Clavero García 2015-06-26 08:28:07 UTC
Created attachment 1043436
File: limits

Comment 7 Juan Antonio Clavero García 2015-06-26 08:28:08 UTC
Created attachment 1043437
File: maps

Comment 8 Juan Antonio Clavero García 2015-06-26 08:28:10 UTC
Created attachment 1043438
File: mountinfo

Comment 9 Juan Antonio Clavero García 2015-06-26 08:28:11 UTC
Created attachment 1043439
File: namespaces

Comment 10 Juan Antonio Clavero García 2015-06-26 08:28:12 UTC
Created attachment 1043440
File: open_fds

Comment 11 Juan Antonio Clavero García 2015-06-26 08:28:13 UTC
Created attachment 1043441
File: proc_pid_status

Comment 12 Juan Antonio Clavero García 2015-06-26 08:28:14 UTC
Created attachment 1043442
File: var_log_messages

Comment 13 Fedora End Of Life 2016-07-19 15:03:36 UTC
Fedora 22 changed to end-of-life (EOL) status on 2016-07-19. Fedora 22 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 14 Martin Pitt 2017-03-15 11:22:21 UTC
Created attachment 1263283
Fedora 25 backtrace

We still see this crash during cockpit tests on Fedora 25. I am attaching a current backtrace; it looks almost the same as the original one. Relevant package versions:

pcp-libs-3.11.8-2.fc25.x86_64
pcp-selinux-3.11.8-2.fc25.x86_64
pcp-3.11.8-2.fc25.x86_64
pcp-conf-3.11.8-2.fc25.x86_64
cockpit-pcp-134-1.fc25.x86_64
pcp-debuginfo-3.11.8-2.fc25.x86_64

Comment 15 Martin Pitt 2017-03-15 11:34:00 UTC
It crashes in this loop:

__pmInDomProfile *
__pmFindProfile(pmInDom indom, const __pmProfile *prof)
{
    __pmInDomProfile    *p, *p_end;

    if (prof != NULL && prof->profile_len > 0)
        /* search for the profile entry for this instance domain */
        for (p=prof->profile, p_end=p+prof->profile_len; p < p_end; p++) {
            if (p->indom == indom)
                /* found : an entry for this instance domain already exists */
                return p;
        }

    /* not found */
    return NULL;
}
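
The lookup can also be exercised outside libpcp with stand-in types. What follows is a minimal sketch, not the PCP source: every `mock_` name is hypothetical, and the fields `instances_len` and `instances` are assumptions added only to give the entry a plausible shape. It shows that the loop advances `p` one whole entry at a time, so the code treats `profile_len` as a count of entries rather than a size in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types for illustration only; the real definitions live in the
 * PCP headers and may differ. */
typedef unsigned int mock_indom_t;

typedef struct {
    mock_indom_t indom;         /* instance domain this entry belongs to */
    int          state;         /* assumed field */
    int          instances_len; /* assumed field */
    int         *instances;     /* assumed field */
} mock_indom_profile;

typedef struct {
    int                 state;
    int                 profile_len; /* used below as a number of entries */
    mock_indom_profile *profile;
} mock_profile;

/* Same control flow as the quoted __pmFindProfile(), on the mock types. */
mock_indom_profile *
mock_find_profile(mock_indom_t indom, const mock_profile *prof)
{
    mock_indom_profile *p, *p_end;

    if (prof != NULL && prof->profile_len > 0) {
        /* p_end comes from pointer arithmetic: profile_len whole entries. */
        for (p = prof->profile, p_end = p + prof->profile_len; p < p_end; p++) {
            if (p->indom == indom)
                return p;   /* found */
        }
    }
    return NULL;            /* not found */
}

/* Small self-check: three entries, looked up by instance domain. */
void mock_demo(void)
{
    mock_indom_profile entries[3] = {
        { 10, 0, 0, NULL }, { 20, 0, 0, NULL }, { 30, 0, 0, NULL }
    };
    mock_profile prof = { 0, 3, entries };

    assert(mock_find_profile(20, &prof) == &entries[1]);
    assert(mock_find_profile(99, &prof) == NULL);
    assert(mock_find_profile(10, NULL) == NULL);
}
```

Calling `mock_demo()` from a `main()` runs the checks. With a correct `profile_len` the loop stays inside the array; nothing in it can detect a length that is too large.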

"prof" looks valid, but it appears that `profile_len` has some bogus value and thus the iteration goes far beyond the real array:

(gdb) p prof
$16 = (const __pmProfile *) 0x5604723756e0
(gdb) p *prof
$17 = {state = 1916235728, profile_len = 22020, profile = 0x560472332a60}
(gdb) p p
$18 = (__pmInDomProfile *) 0x560472398010
(gdb) p *p
Cannot access memory at address 0x560472398010
(gdb) p sizeof(__pmProfile)
$19 = 16
(gdb) p (p - prof->profile)
$20 = 17298

I don't know the pcp code and thus I'm not sure how to interpret `profile_len`. Usually such a field is either the number of entries (in which case 22020 is implausibly high) or the total array size in bytes. Each entry has 16 bytes, but 22020/16 == 1376.25, which is not a whole number of entries. So it looks like some housekeeping error on the length?
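
The gdb figures can be cross-checked with a little arithmetic. This is a rough sketch only, not anything taken from the PCP sources: the macro and function names are made up for illustration, and it assumes the little-endian x86_64 layout with `state` stored immediately before `profile_len`, as the gdb field order and the 16-byte struct size suggest.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the gdb session quoted above. */
#define PROF_STATE        1916235728u        /* prof->state */
#define PROF_PROFILE_LEN  22020u             /* prof->profile_len */
#define PROFILE_BASE      0x560472332a60ULL  /* prof->profile */
#define FAULT_ADDR        0x560472398010ULL  /* p when the read faulted */
#define ENTRIES_WALKED    17298u             /* gdb: p - prof->profile */

/* Bytes between the start of the array and the faulting entry. */
uint64_t bytes_walked(void) { return FAULT_ADDR - PROFILE_BASE; }

/* gdb reports pointer differences in elements, so this is the entry size. */
uint64_t implied_entry_size(void) { return bytes_walked() / ENTRIES_WALKED; }

/* state and profile_len read together as one little-endian 64-bit word. */
uint64_t first_eight_bytes(void)
{
    return ((uint64_t)PROF_PROFILE_LEN << 32) | PROF_STATE;
}

void check_gdb_arithmetic(void)
{
    assert(bytes_walked() == 415152);
    assert(bytes_walked() % ENTRIES_WALKED == 0);
    assert(implied_entry_size() == 24);
    assert(PROF_PROFILE_LEN % 16 == 4);   /* 22020 / 16 == 1376.25 */
    assert(PROF_PROFILE_LEN % 24 == 12);  /* not a multiple of 24 either */
    assert(first_eight_bytes() == 0x560472376fd0ULL);
}
```

If this arithmetic holds, the entries are 24 bytes apart, so the 16 printed by gdb would be the size of the `__pmProfile` header rather than of one array entry. The combined value 0x560472376fd0 also resembles a heap address close to `prof` (0x5604723756e0), which might mean the structure was overwritten or reused rather than that its length was miscounted. That is a guess, not a diagnosis.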

Can you please reopen this bug? I don't immediately see how this could be influenced from outside by cockpit-pcp, but we don't currently have a standalone reproducer at hand.

Comment 16 Fedora End Of Life 2017-11-16 14:14:04 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 17 Martin Pitt 2017-11-16 14:59:39 UTC
Still happening on Fedora 26, moving version.

Comment 18 Martin Pitt 2018-03-02 14:20:11 UTC
Reassigning to PCP. Version 4.x fixed a bunch of crashes, maybe it got fixed by now.

Comment 19 Nathan Scott 2018-03-05 03:01:34 UTC
| We still see this crash during cockpit tests on Fedora 25.

Martin, can you describe what the test is doing when this happens?  From what I can tell, it looks like Cockpit is extracting live stats using the "local context" mode of operation in libpcp.  Do you know which metrics would be fetched?  Does it happen immediately, or only after some time?  Do you know if the Cockpit code has loaded multiple DSO (local context) PMDAs here, or does it only use pmda_linux.so in this mode?

And does anyone have a core dump I can examine in more detail?  Thanks.

I don't think this issue has been reported by others or observed by PCP maintainers, so at this stage I suspect pcp v4 will not resolve it unfortunately.

Comment 20 Martin Pitt 2018-03-07 23:13:36 UTC
> Martin, can you describe what the test is doing when this happens?

Nothing special, actually. This happens randomly, spread out across tests that cover e.g. IPA (https://fedorapeople.org/groups/cockpit/logs/pull-8769-20180307-130707-dbbabf0f-verify-ubuntu-stable/log.html#35) or SSL connection settings (https://fedorapeople.org/groups/cockpit/logs/pull-8726-20180307-115757-31aa0ed6-verify-rhel-7/log.html#103). In all of those it doesn't actually fail the tests, as none of them cover PCP; the crashes just get spotted because after each test we check the journal for unexpected messages.

As Cockpit's front page shows the host's performance metrics for CPU, memory, IO, and network, PCP is more or less involved in the cockpit startup of every test. But I'm afraid there's no particular action that triggers it; it just happens randomly.

I'm following the development of bug 1550995, which most probably is closely related.

Comment 21 Nathan Scott 2018-03-26 22:54:50 UTC
Reassigning, as discussed in BZ 1550995

Comment 23 Fedora End Of Life 2018-05-03 08:49:17 UTC
This message is a reminder that Fedora 26 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 26. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora 'version'
of '26'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 26 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 24 Martin Pitt 2018-05-03 10:51:00 UTC
Moving to Fedora 27 for now, as there we definitely still see the crashes. We don't have these naughty overrides for Fedora 28, so it remains to be seen if it's still actually an issue there.

Comment 25 Ben Cotton 2018-11-27 13:29:49 UTC
This message is a reminder that Fedora 27 is nearing its end of life.
On 2018-Nov-30 Fedora will stop maintaining and issuing updates for
Fedora 27. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 27 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 26 Ben Cotton 2018-11-30 23:47:35 UTC
Fedora 27 changed to end-of-life (EOL) status on 2018-11-30. Fedora 27 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

