Bug 809390 - fenced segfaults on empty method in cluster.conf
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.8
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Assignee: David Teigland
QA Contact: Cluster QE
Reported: 2012-04-03 08:59 UTC by Fabio Massimo Di Nitto
Modified: 2013-01-08 03:37 UTC (History)

Fixed In Version: cman-2.0.115-103.el5
Doc Type: Bug Fix
Last Closed: 2013-01-08 03:37:14 UTC


Links
System: Red Hat Product Errata
ID: RHBA-2013:0076
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: cman bug fix and enhancement update
Last Updated: 2013-01-08 08:27:31 UTC

Description Fabio Massimo Di Nitto 2012-04-03 08:59:04 UTC
Description of problem:

While simulating a customer setup, I found that fenced crashes when an agent fails to fence. The cluster.conf in use:

<cluster name="fabbione" config_version="3">
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="rhel5-node1" votes="1" nodeid="1">
      <fence>
        <method name="single">
          <device name="xvm" delay="30" domain="rhel5-node1"/>
        </method>
        <method name="2"/>
      </fence>
    </clusternode>
    <clusternode name="rhel5-node2" votes="1" nodeid="2">
      <fence>
        <method name="single">
          <device name="xvm" domain="rhel5-node2"/>
        </method>
        <method name="2"/>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="xvm" agent="fence_xvm"/>
  </fencedevices>
</cluster>


Version-Release number of selected component (if applicable):

cman-2.0.115-96.el5_8.1


How reproducible:

always

Steps to Reproduce:
1. Start the two-node cluster and let the nodes join (in this case, the nodes are VMs).
2. Stop fence_xvmd on the host (or replace the agent with /bin/false).
3. Run killall -9 aisexec on one of the nodes.
  
Actual results:

Apr  3 10:53:01 rhel5-node1 openais[2694]: [CPG  ] got joinlist message from node 1 
Apr  3 10:53:31 rhel5-node1 fenced[2713]: agent "fence_xvm" reports: Could not read /etc/cluster/fence_xvm.key; trying without authentication Timed out waiting for response 
Apr  3 10:53:31 rhel5-node1 kernel: fenced[2713]: segfault at 0000000000000018 rip 00002aeb6ecdc53b rsp 00007fffe7e4be40 error 4
Apr  3 10:53:31 rhel5-node1 groupd[2706]: fence daemon appears to be dead
Apr  3 10:53:32 rhel5-node1 openais[2694]: [SERV ] Unloading all openais components 
[SNIP]

Comment 1 Fabio Massimo Di Nitto 2012-04-03 09:02:33 UTC
Extra info: the crash is caused by the empty <method name="2"/> element.
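
A configuration can be checked for this condition before the cluster is started. The following is a minimal sketch in Python, not part of cman: find_empty_methods and SAMPLE are hypothetical names, and the sample is a cut-down version of the configuration from the description. It lists every <method> under a <clusternode> that has no <device> child.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the cluster.conf from the description (one node only).
SAMPLE = """<cluster name="fabbione" config_version="3">
  <clusternodes>
    <clusternode name="rhel5-node1" votes="1" nodeid="1">
      <fence>
        <method name="single">
          <device name="xvm" delay="30" domain="rhel5-node1"/>
        </method>
        <method name="2"/>
      </fence>
    </clusternode>
  </clusternodes>
</cluster>"""


def find_empty_methods(conf_xml):
    """Return (node name, method name) pairs for methods without any <device>."""
    root = ET.fromstring(conf_xml)
    empty = []
    for node in root.iter("clusternode"):
        for method in node.iter("method"):
            if method.find("device") is None:
                empty.append((node.get("name"), method.get("name")))
    return empty


for node_name, method_name in find_empty_methods(SAMPLE):
    print("empty fence method %r on node %r" % (method_name, node_name))
```

For a real file, the text of /etc/cluster/cluster.conf could be read and passed to the same function.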

Comment 2 RHEL Program Management 2012-04-03 09:17:03 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux release.  Product Management has
requested further review of this request by Red Hat Engineering, for
potential inclusion in a Red Hat Enterprise Linux release for currently
deployed products.  This request is not yet committed for inclusion in
a release.

Comment 3 David Teigland 2012-07-23 22:05:12 UTC
Pushed to the RHEL59 branch:

http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=77bb92ad9dbad9c8755e45211214ab1e099cdb0e

Tested this by killing node-03 with this config:

<clusternode name="node-03" nodeid="3">
  <fence>
    <method name="1">
      <device name="f"/>
    </method>
    <method name="2"/>
  </fence>
</clusternode>

<fencedevices>
  <fencedevice name="t" agent="/root/fence_test0"/>
  <fencedevice name="f" agent="/root/fence_test1"/>
</fencedevices>

fence_test1 does exit(1)

Without the fix, fenced segfaults; with the fix, it does not.
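
The actual fence_test1 script is not shown in this report. A stand-in agent that always fails, like the one described here, could look like the sketch below. It assumes Python 3 and assumes that the agent receives its arguments on standard input as name=value lines; the function names and the argument keys in the demo (agent, nodename) are illustrative only.

```python
import io
import sys


def parse_agent_args(stream):
    """Parse name=value lines into a dict, skipping blank lines and comments."""
    args = {}
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        args[key] = value
    return args


def run(stream):
    """Pretend to fence: log the target to stderr and always report failure."""
    args = parse_agent_args(stream)
    target = args.get("nodename", "?")  # key name is an assumption
    sys.stderr.write("stub agent: refusing to fence %s\n" % target)
    return 1  # non-zero exit status means the fence attempt failed


# Demo with in-memory input instead of real stdin.
demo_status = run(io.StringIO("agent=/root/fence_test1\nnodename=node-03\n"))
print("exit status would be", demo_status)
```

A script used as a real agent would end with sys.exit(run(sys.stdin)) in place of the demo lines.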

Comment 7 errata-xmlrpc 2013-01-08 03:37:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0076.html

