Bug 745526 - pacemaker+cman fencing is unreliable
Summary: pacemaker+cman fencing is unreliable
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: pacemaker
Version: 6.2
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: 6.2
Assignee: Andrew Beekhof
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 748554
 
Reported: 2011-10-12 15:25 UTC by Jaroslav Kortus
Modified: 2012-04-03 09:56 UTC
CC List: 1 user

Fixed In Version: pacemaker-1.1.6-3.el6
Doc Type: Technology Preview
Doc Text:
Prior to this update, an error in the interaction between Pacemaker and CMAN's fencing subsystem prevented reliable fencing operation. This update applies a patch that corrects this error so that such fencing operations are now reliable.
Clone Of:
Environment:
Last Closed: 2011-12-06 16:50:49 UTC
Target Upstream Version:
Embargoed:


Attachments: none


Links:
Red Hat Product Errata RHBA-2011:1669 (normal, SHIPPED_LIVE): pacemaker bug fix and enhancement update, last updated 2011-12-06 00:50:15 UTC

Description Jaroslav Kortus 2011-10-12 15:25:48 UTC
Description of problem:
When Pacemaker is configured together with CMAN, fencing does not work reliably.

The manual says that Pacemaker should take over fencing responsibility; for that purpose the fence_pcmk fence device is provided as a replacement (a redirect to Pacemaker fencing).

The problem is that it fakes the response too early in the process: the fencing is acknowledged even before any attempt is made (!).

To illustrate the problem, set up a pacemaker+cman cluster and mount a GFS2 filesystem. Then run pkill -9 corosync on one of the nodes and watch the recovery.

Relevant snips:
Oct 12 10:07:15 marathon-01 corosync[18840]:   [TOTEM ] A processor failed, forming new configuration.
Oct 12 10:07:27 marathon-01 corosync[18840]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 12 10:07:27 marathon-01 fenced[19023]: fencing node marathon-05
Oct 12 10:07:27 marathon-01 fence_pcmk: Requesting Pacemaker fence marathon-05 (reset)
Oct 12 10:07:27 marathon-01 fenced[19023]: fence marathon-05 success
Oct 12 10:07:28 marathon-01 stonith-ng: [19198]: info: make_args: reboot-ing node 'marathon-05' as 'port=5'
Oct 12 10:07:28 marathon-01 kernel: GFS2: fsid=marathon:vedder0.0: jid=4: Looking at journal...
Oct 12 10:07:28 marathon-01 kernel: GFS2: fsid=marathon:vedder0.0: jid=4: Acquiring the transaction lock...
Oct 12 10:07:28 marathon-01 kernel: GFS2: fsid=marathon:vedder0.0: jid=4: Replaying journal...
Oct 12 10:07:28 marathon-01 kernel: GFS2: fsid=marathon:vedder0.0: jid=4: Replayed 0 of 0 blocks
Oct 12 10:07:28 marathon-01 kernel: GFS2: fsid=marathon:vedder0.0: jid=4: Found 1 revoke tags
Oct 12 10:07:28 marathon-01 kernel: GFS2: fsid=marathon:vedder0.0: jid=4: Journal replayed in 1s
Oct 12 10:07:28 marathon-01 kernel: GFS2: fsid=marathon:vedder0.0: jid=4: Done
Oct 12 10:07:32 marathon-01 stonith-ng: [19198]: info: log_operation: Operation 'reboot' [19798] (call 0 from (null)) for host 'marathon-05' with device 'apc-fencing' returned: 0
Oct 12 10:07:32 marathon-01 stonith-ng: [19198]: info: log_operation: apc-fencing: Parse error: Ignoring unknown option 'nodename=marathon-05'
Oct 12 10:07:32 marathon-01 stonith-ng: [19198]: info: log_operation: apc-fencing: Success: Rebooted

It is clearly visible that the recovery took place well before the surviving node could actually confirm that the failed node can no longer touch the device.
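The race can be sketched with a toy Python model (purely illustrative; the event names are assumptions for the sketch, not Pacemaker internals):

```python
# Toy model of the race (not Pacemaker code): the buggy fence_pcmk
# acknowledges success to fenced before the reboot has happened, so
# GFS2 journal recovery starts while the victim may still be alive.
events = []

def ack_fencing():
    events.append("ack")               # fence_pcmk reports success to fenced

def reboot_completes():
    events.append("reboot done")       # the actual power cycle finishes

def gfs2_recovery():
    events.append("journal replayed")  # fenced/GFS2 proceed with recovery

# Buggy ordering, matching the timestamps above (ack and journal replay
# at 10:07:27-28, reboot confirmed only at 10:07:32):
ack_fencing()
gfs2_recovery()
reboot_completes()
buggy = list(events)

# Safe ordering: acknowledge only after the fence operation completes.
events.clear()
reboot_completes()
ack_fencing()
gfs2_recovery()
safe = list(events)

assert buggy.index("journal replayed") < buggy.index("reboot done")  # unsafe
assert safe.index("reboot done") < safe.index("journal replayed")    # safe
```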

Version-Release number of selected component (if applicable):
cman-3.0.12.1-23.el6.x86_64
pacemaker-1.1.6-2.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. Set up cman+pacemaker.
2. Create and mount a GFS2 filesystem.
3. Run pkill -9 corosync on one of the nodes and observe the recovery.
  
Actual results:
- fencing is faked as successful via fence_pcmk
- recovery happens before the node is actually fenced
- fencing may fail while recovery is still performed (journals replayed); this is VERY dangerous and should not happen

Expected results:
one of:
- fencing is not faked, and the reply is sent only after Pacemaker finishes the fencing operation
- fencing is disabled in Pacemaker and CMAN handles it as it used to (plus a documentation fix to reflect this)

Additional info:
cluster.conf:
<?xml version="1.0"?>
<cluster name="marathon" config_version="1">
  <cman/>
  <fence_daemon post_join_delay="20" clean_start="0"/>
  <clusternodes>
    <clusternode name="marathon-01" votes="1" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="marathon-01"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="marathon-02" votes="1" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="marathon-02"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="marathon-03" votes="1" nodeid="3">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="marathon-03"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="marathon-04" votes="1" nodeid="4">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="marathon-04"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="marathon-05" votes="1" nodeid="5">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="marathon-05"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
</cluster>
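As a sketch of the second expected-results option above (CMAN handling fencing directly rather than redirecting to Pacemaker), the fencedevices section might instead reference a real fence agent. The device name "apc-fencing" appears in the logs above; the address and credentials below are purely hypothetical placeholders:

```xml
<fencedevices>
  <!-- hypothetical example: a real power-switch agent instead of the
       fence_pcmk redirect; address and credentials are illustrative -->
  <fencedevice agent="fence_apc" name="apc-fencing"
               ipaddr="apc.example.com" login="apc" passwd="apc"/>
</fencedevices>
```

Each clusternode's fence method would then point its device at "apc-fencing" with the appropriate port attribute (the logs above show 'port=5' for marathon-05).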

Comment 2 Andrew Beekhof 2011-10-17 21:54:58 UTC
Would you be able to run crm_report for the time period covered by the test?  (Remember to quote the date/time string).

I did check that fence_pcmk was running synchronously; apparently not hard enough. My apologies.

Comment 3 Andrew Beekhof 2011-10-19 00:24:03 UTC
A related patch has been committed upstream: https://github.com/ClusterLabs/pacemaker/commit/2d8fad5

Comment 4 Andrew Beekhof 2011-10-19 00:40:12 UTC
In fact it is worse: the additional logging I added after testing actually prevents the agent from passing the request on to pacemaker at all.

I have since tested the above patch and had it reviewed by Lon.

Without the patch, running:
  /usr/sbin/fence_pcmk -n east-01 < /dev/null
results in no additional logs from stonith-ng in /var/log/messages (because stonith_admin is not being invoked)

With the patch, at the very minimum, there should be a log similar to:

Oct 18 20:00:48 east-03 stonith-ng: [18764]: info: initiate_remote_stonith_op: Initiating remote operation off for east-01: c5111dd8-8a1c-4b6a-aaf0-5a793dc2ed79

Additionally, when trying to fence an unknown node the command can now be seen to (correctly) wait and receive the error:

[root@east-03 ~]# /usr/sbin/fence_pcmk -n unknown-node < /dev/null
Command failed: Operation timed out
failed: unknown-node 248

Comment 10 Jaromir Hradilek 2011-10-26 09:32:19 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Prior to this update, an error in the interaction between Pacemaker and CMAN's fencing subsystem prevented reliable fencing operation. This update applies a patch that corrects this error so that such fencing operations are now reliable.

Comment 13 errata-xmlrpc 2011-12-06 16:50:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1669.html

