Bug 1299339

Summary: Pacemaker's lrmd crashes after certain systemd errors
Product: Red Hat Enterprise Linux 7
Reporter: Jan Kurik <jkurik>
Component: pacemaker
Assignee: Ken Gaillot <kgaillot>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 7.2
CC: 252150241, abeekhof, cfeist, cluster-maint, kgaillot, mjuricek, mnavrati, royoung, tlavigne, vlad.socaciu
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: pacemaker-1.1.13-10.el7_2.1
Doc Type: Bug Fix
Doc Text:
Previously, Pacemaker's Local Resource Management Daemon (lrmd) used an invalid format string when logging certain rare systemd errors. As a consequence, lrmd could terminate unexpectedly with a segmentation fault. A patch has been applied to fix the format string. As a result, lrmd no longer crashes and logs the aforementioned rare error messages as intended.
Story Points: ---
Clone Of: 1284069
: 1394068
Environment:
Last Closed: 2016-02-16 11:17:51 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1284069
Bug Blocks: 1394068
Attachments:
Description Flags
Crash Data none
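
The Doc Text above attributes the crash to an invalid format string in a logging call. This report does not show the actual Pacemaker patch, so the following is only a minimal sketch of that general class of bug in C. The function name format_unit_error, its parameters, and the message text are hypothetical and are not taken from the Pacemaker source.

```c
#include <stdio.h>

/*
 * Hypothetical helper, NOT Pacemaker code: build a log message for a
 * failed systemd request. It illustrates how each printf-style conversion
 * specifier has to match the type of its corresponding argument.
 */
int format_unit_error(char *buf, size_t len, const char *unit,
                      const char *error_name, int rc)
{
    /* Passing NULL to "%s" is undefined behavior, so substitute text. */
    const char *safe_unit = unit ? unit : "(unknown unit)";
    const char *safe_name = error_name ? error_name : "(unnamed error)";

    /*
     * Correct form: "%s" for each string and "%d" for the integer.
     *
     * A mismatched call such as
     *     snprintf(buf, len, "... %s: %s", safe_unit, rc);
     * would make the library treat the int as a char pointer and read
     * from an arbitrary address. That is undefined behavior and commonly
     * ends in SIGSEGV, but only when this rarely used code path runs.
     */
    return snprintf(buf, len, "Could not issue start for %s: %s (%d)",
                    safe_unit, safe_name, rc);
}
```

Because a mismatch of this kind sits on an error path that is seldom exercised, it can go unnoticed in normal testing. GCC and Clang can report such mismatches at compile time through -Wformat (part of -Wall) when the format string is a literal.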

Description Jan Kurik 2016-01-18 07:45:45 UTC
This bug has been copied from bug #1284069 and has been proposed
to be backported to 7.2 z-stream (EUS).

Comment 7 errata-xmlrpc 2016-02-16 11:17:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0216.html

Comment 8 vlad.socaciu 2016-11-10 23:46:30 UTC
lrmd still crashes quite consistently in pacemaker-1.1.13-10.el7_2.4.x86_64. Here are just two examples:

/opt/unisys_ccg_gid/cluster_scripts #abrt-cli list --since 1478651531
id b10052d2f8f15628c3343c336a446ed40078a16b
reason:         lrmd killed by SIGSEGV
time:           Tue 08 Nov 2016 09:33:42 PM PST
cmdline:        /usr/libexec/pacemaker/lrmd
package:        pacemaker-1.1.13-10.el7_2.4
uid:            0 (root)
count:          1
Directory:      /var/spool/abrt/ccpp-2016-11-08-21:33:42-1563

/var/log/cluster #abrt-cli list
id b01b01d4708d3664f696ae9bea577f5c73dab35d
reason:         lrmd killed by SIGSEGV
time:           Tue 08 Nov 2016 09:36:28 PM PST
cmdline:        /usr/libexec/pacemaker/lrmd
package:        pacemaker-1.1.13-10.el7_2.4
uid:            0 (root)
count:          2
Directory:      /var/spool/abrt/ccpp-2016-11-08-21:36:28-1714

The content of /var/spool/abrt/ccpp-2016-11-08-21:36:28-1714 will be uploaded if the upload works.

Comment 9 vlad.socaciu 2016-11-10 23:48:24 UTC
Created attachment 1219568
Crash Data

The content of /var/spool/abrt/ccpp-2016-11-08-21:36:28-1714 is attached.

Comment 10 Ken Gaillot 2016-11-11 00:00:17 UTC
(In reply to vlad.socaciu from comment #8)
> lrmd still crashes quite consistently in pacemaker-1.1.13-10.el7_2.4.x86_64.

The core backtrace in the abrt instance shows this is a separate issue, so I have opened Bug 1394068 for it.