Bug 1015126

Summary: crm_mon crashed with segfault
Product: Red Hat Enterprise Linux 7 Reporter: michal novacek <mnovacek>
Component: pacemaker Assignee: Andrew Beekhof <abeekhof>
Status: CLOSED CURRENTRELEASE QA Contact: Cluster QE <mspqa-list>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.0 CC: cluster-maint, dvossel, fdinitto, mnovacek
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pacemaker-1.1.10-21.el7 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-13 12:51:29 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description                          Flags
crm_report                           none
/var/tmp/abrt tarball                none
another abrt dump of crm_mon crash   none
patch fixes crm_mon crash            none

Description michal novacek 2013-10-03 13:31:13 UTC
Created attachment 807085
crm_report

Description of problem:
crm_mon crashed with segfault.

Version-Release number of selected component (if applicable):
pacemaker-1.1.10-15.el7
RHEL-7.0-20130823.n.1

I'm attaching crm_report tarball and /var/tmp/abrt tarball.

Comment 1 michal novacek 2013-10-03 13:31:46 UTC
Created attachment 807086
/var/tmp/abrt tarball

Comment 3 Andrew Beekhof 2013-10-07 02:38:33 UTC
I can see there is a problem from the system logs: 

Oct  3 15:17:38 marathon-03 kernel: [ 1473.929893] abrt-handle-eve[2857]: segfault at 0 ip 00007f01c583c6c4 sp 00007fffc9c41590 error 4 in libsatyr.so.1.0.0[7f01c57e5000+13b000]
Oct  3 15:17:38 marathon-03 abrt[2890]: Saved core dump of pid 2857 (/usr/libexec/abrt-handle-event) to /var/tmp/abrt/abrt-handle-event-coredump (1769472 bytes)
Oct  3 15:17:38 marathon-03 abrtd: 'post-create' on '/var/tmp/abrt/ccpp-2013-10-03-15:17:37-2854' killed by signal 11
Oct  3 15:17:38 marathon-03 abrtd: Deleting problem directory '/var/tmp/abrt/ccpp-2013-10-03-15:17:37-2854'
Oct  3 15:17:39 marathon-03 kernel: [ 1475.179342] crm_mon[2893]: segfault at 0 ip 00000000004065d2 sp 00007fff34a140a0 error 4 in crm_mon[400000+c000]
Oct  3 15:17:39 marathon-03 abrt[2894]: Saved core dump of pid 2893 (/usr/sbin/crm_mon) to /var/tmp/abrt/ccpp-2013-10-03-15:17:39-2893 (2535424 bytes)
Oct  3 15:17:39 marathon-03 abrtd: Directory 'ccpp-2013-10-03-15:17:39-2893' creation detected
Oct  3 15:17:40 marathon-03 abrtd: Generating core_backtrace
Oct  3 15:17:40 marathon-03 abrtd: Generating backtrace


but apart from a couple of kernel oopses, a Python crash, and a crash in NetworkManager, there is nothing from crm_mon in attachment #807086.
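
For anyone reading the kernel lines quoted above, here is a minimal Python sketch of how such a "segfault at ..." message can be decoded. It is not part of pacemaker or abrt. The names parse_segfault, SEGFAULT_RE and LOG_LINE are hypothetical, and the regular expression assumes the x86 message layout seen in this log. The error value is read as the x86 page-fault error code (bit 0: protection violation rather than page not present, bit 1: write rather than read, bit 2: user mode).

```python
import re

# Assumed layout of the kernel's x86 user-space segfault message,
# matching the lines quoted in this report.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

# The crm_mon line from the system log quoted above.
LOG_LINE = (
    "Oct  3 15:17:39 marathon-03 kernel: [ 1475.179342] crm_mon[2893]: "
    "segfault at 0 ip 00000000004065d2 sp 00007fff34a140a0 error 4 "
    "in crm_mon[400000+c000]"
)


def parse_segfault(line):
    """Return a dict describing a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"], 16)  # the kernel prints the error code in hex
    return {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "object": m["obj"],
        # Offset of the faulting instruction inside the mapped object.
        "offset": ip - base,
        # x86 page-fault error code bits.
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


if __name__ == "__main__":
    info = parse_segfault(LOG_LINE)
    print(info["comm"], hex(info["offset"]), info)
```

Read this way, the crm_mon line describes a user-mode read of address 0 (a page that is not mapped), which is consistent with a NULL pointer dereference, at offset 0x65d2 into the crm_mon mapping. The same function applies to the abrt-handle-event line, where the faulting object is libsatyr.so.1.0.0.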

Comment 4 michal novacek 2013-11-07 12:19:29 UTC
Created attachment 821052
another abrt dump of crm_mon crash


The problem appeared again in normal use. 

After the first crash, crm_mon crashed on every other run on both nodes of the cluster, reporting 'There is no cluster running on this node' although pacemaker was running.

After restarting pacemaker with 'systemctl restart pacemaker', it started working again.

Comment 5 David Vossel 2014-01-20 17:41:37 UTC
Created attachment 852809
patch fixes crm_mon crash

This patch fixes the crash.

Comment 7 michal novacek 2014-03-28 12:58:15 UTC
Marking SanityOnly as there is no reliable reproducer.

Comment 8 Ludek Smid 2014-06-13 12:51:29 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.