Bug 1312094 - crmd can crash after unexpected remote connection takeover
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pacemaker
Version: 7.2
Hardware: All
OS: All
Priority: high
Severity: medium
Target Milestone: rc
Target Release: 7.3
Assigned To: Ken Gaillot
QA Contact: cluster-qe@redhat.com
Depends On: 1304771
Blocks: CVE-2016-7797
Reported: 2016-02-25 12:45 EST by Ken Gaillot
Modified: 2016-11-03 14:58 EDT (History)
Fixed In Version: pacemaker-1.1.15-1.2c148ac.git.el7
Doc Type: No Doc Update
Clone Of: 1312092
Last Closed: 2016-11-03 14:58:51 EDT
Type: Bug


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Cluster Labs | 5269 | None | None | None | 2016-02-25 12:45 EST
Red Hat Product Errata | RHSA-2016:2578 | normal | SHIPPED_LIVE | Moderate: pacemaker security, bug fix, and enhancement update | 2016-11-03 08:07:24 EDT

Comment 3 Mike McCune 2016-03-28 18:52:26 EDT
This bug was accidentally moved from POST to MODIFIED by an error in automation. Please contact mmccune@redhat.com with any questions.
Comment 5 Patrik Hagara 2016-09-08 10:28:34 EDT
Setup: 3-node cluster + 1 pacemaker_remote node

Before the fix:

> Sep 08 14:26:11 [27822] virt-166       crmd:    error: remote_lrm_op_callback:	Unexpected pacemaker_remote client takeover. Disconnecting
> Sep 08 14:26:11 [27822] virt-166       crmd:     info: lrmd_api_disconnect:	Disconnecting from 3 lrmd service
> Sep 08 14:26:11 [27822] virt-166       crmd:     info: lrmd_api_disconnect:	Disconnecting from 3 lrmd service
> Sep 08 14:26:11 [27822] virt-166       crmd:     info: lrmd_tls_connection_destroy:	TLS connection destroyed
> Sep 08 14:26:11 [27816] virt-166 pacemakerd:    error: child_waitpid:	Managed process 27822 (crmd) dumped core
> Sep 08 14:26:11 [27816] virt-166 pacemakerd:    error: pcmk_child_exit:	The crmd process (27822) terminated with signal 6 (core=1)

The pacemaker_remote node was disconnected from the cluster, and crmd on the cluster node hosting the pacemaker_remote connection crashed and was restarted. The cluster returned to a fully operational state shortly thereafter.
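
The crash signature in the log excerpt above can be checked for mechanically when repeating this verification. The following is a minimal sketch, not part of Pacemaker: the helper name find_crmd_crashes and the regular expression are assumptions derived only from the pacemakerd lines quoted in this comment, and other Pacemaker versions may word these messages differently.

```python
import re

# Pattern derived from the pcmk_child_exit line quoted in this comment.
# This is an assumption about the message wording, not an official log format.
CRASH_RE = re.compile(
    r"pacemakerd:\s+error:\s+pcmk_child_exit:\s+"
    r"The (?P<child>\S+) process \((?P<pid>\d+)\) "
    r"terminated with signal (?P<signal>\d+) \(core=(?P<core>\d+)\)"
)


def find_crmd_crashes(lines):
    """Return (pid, signal, core_dumped) for each crmd exit-by-signal line."""
    crashes = []
    for line in lines:
        match = CRASH_RE.search(line)
        if match and match.group("child") == "crmd":
            crashes.append(
                (
                    int(match.group("pid")),
                    int(match.group("signal")),
                    match.group("core") == "1",
                )
            )
    return crashes


# The two pacemakerd lines from the "Before the fix" excerpt.
sample = [
    "Sep 08 14:26:11 [27816] virt-166 pacemakerd:    error: child_waitpid:\tManaged process 27822 (crmd) dumped core",
    "Sep 08 14:26:11 [27816] virt-166 pacemakerd:    error: pcmk_child_exit:\tThe crmd process (27822) terminated with signal 6 (core=1)",
]
print(find_crmd_crashes(sample))
```

An empty result on the cluster node's log after a takeover attempt would match the "After the fix" behaviour reported below.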


After the fix:

> Sep  8 16:23:46 virt-055 pacemaker_remoted[17977]:  notice: LRMD client connection established. 0xd8e120 id: f93cb6a1-a321-4ff5-8c75-398190f50b28
> Sep  8 16:23:56 virt-055 pacemaker_remoted[17977]:  notice: LRMD client disconnecting remote client - name: <unknown> id: f93cb6a1-a321-4ff5-8c75-398190f50b28
> Sep  8 16:23:56 virt-055 pacemaker_remoted[17977]:   error: Remote client authentication timed out

The cluster remained fully operational without service disruption. No messages were logged on the cluster node hosting the pacemaker_remote connection; only the remote node itself logged an authentication time-out error.
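
The two pacemaker_remoted notices in the excerpt above share a client id, so the time between the connection being established and the client being dropped can be read from the log. The sketch below shows one way to do that; client_lifetimes is a hypothetical helper, the matching strings are taken only from the quoted lines, and the year is supplied by hand because syslog timestamps omit it. Month parsing with %b assumes an English locale.

```python
import re
from datetime import datetime

# Client ids in the quoted lines look like UUIDs (36 hex-and-dash characters).
ID_RE = re.compile(r"id: (?P<id>[0-9a-f-]{36})")


def _timestamp(line, year=2016):
    """Parse the leading 'Mon D HH:MM:SS' of a syslog line.

    The year is not in the log; 2016 is assumed from the comment date.
    """
    month, day, clock = line.lstrip("> ").split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def client_lifetimes(lines):
    """Map client id -> seconds between 'connection established' and 'disconnecting'."""
    started = {}
    lifetimes = {}
    for line in lines:
        match = ID_RE.search(line)
        if not match:
            continue
        client_id = match.group("id")
        if "connection established" in line:
            started[client_id] = _timestamp(line)
        elif "disconnecting" in line and client_id in started:
            delta = _timestamp(line) - started[client_id]
            lifetimes[client_id] = delta.total_seconds()
    return lifetimes


# The first two lines from the "After the fix" excerpt.
sample = [
    "Sep  8 16:23:46 virt-055 pacemaker_remoted[17977]:  notice: LRMD client connection established. 0xd8e120 id: f93cb6a1-a321-4ff5-8c75-398190f50b28",
    "Sep  8 16:23:56 virt-055 pacemaker_remoted[17977]:  notice: LRMD client disconnecting remote client - name: <unknown> id: f93cb6a1-a321-4ff5-8c75-398190f50b28",
]
print(client_lifetimes(sample))
```

For the quoted lines the gap is 10 seconds, which is consistent with the authentication time-out error logged right after the disconnect.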

Marking as verified in pacemaker-1.1.15-1.2c148ac.git.el7
Comment 7 errata-xmlrpc 2016-11-03 14:58:51 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2578.html
