Bug 1319070 - adding a node on RHEL6 may crash due to hardcoded fencing names in pcs
Summary: adding a node on RHEL6 may crash due to hardcoded fencing names in pcs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: pcs
Version: 6.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Tomas Jelinek
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-03-18 15:11 UTC by Radek Steiger
Modified: 2017-03-21 11:03 UTC

Fixed In Version: pcs-0.9.154-1.el6
Doc Type: Bug Fix
Doc Text:
Cause: User adds a node into a cluster. Consequence: Pcs exits with an error, leaving the cluster configuration in an inconsistent state (the node is half added), if the cluster configuration has been updated outside of pcs and the fence devices have been changed. Fix: Pcs reads the configuration and makes sure a required fence device exists, creating it if it does not. Result: The node is added successfully.
Clone Of:
Environment:
Last Closed: 2017-03-21 11:03:21 UTC
Target Upstream Version:
Embargoed:


Attachments
proposed fix (6.13 KB, patch)
2016-08-26 06:47 UTC, Tomas Jelinek


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2017:0707 | 0 | normal | SHIPPED_LIVE | pcs bug fix update | 2017-03-21 12:40:33 UTC

Description Radek Steiger 2016-03-18 15:11:39 UTC
> Description of problem:

Fencing-related names in cluster.conf, like those in <method>, <device> and <fencedevice> tags, can cause pcs to crash when adding a node to a cluster if the cluster configuration has been created or modified outside pcs.

The reason is that pcs creates the config file with "pcmk-method" and "pcmk-redirect" as name identifiers and presumes they are always there when adding further nodes. If, however, the configuration has been created or changed manually to use custom identifiers, the node add procedure fails.

Example cluster.conf:

<cluster config_version="4" name="STSRHTS30477">
  <cman/>
  <totem token="3000"/>
  <fence_daemon clean_start="0" post_join_delay="20"/>
  <clusternodes>
    <clusternode name="virt-006" nodeid="1" votes="1">
      <fence>
        <method name="mymethod">
          <device name="mypcmk" port="virt-006"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="virt-007" nodeid="2" votes="1">
      <fence>
        <method name="mymethod">
          <device name="mypcmk" port="virt-007"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="virt-008" nodeid="3" votes="1">
      <fence>
        <method name="mymethod">
          <device name="mypcmk" port="virt-008"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="mypcmk"/>
  </fencedevices>
</cluster>

Now when I try to add a node:

[root@virt-007 ententyky]# pcs cluster node add virt-009
Error: unable to add virt-009 on virt-006 - Error connecting to virt-006 - (HTTP error: 400)
Error: unable to add virt-009 on virt-007 - Error connecting to virt-007 - (HTTP error: 400)
Error: unable to add virt-009 on virt-008 - Error connecting to virt-008 - (HTTP error: 400)
Error: Unable to update any nodes

An incomplete section, missing the device details and the number of votes, is added into cluster.conf:

    ...
    <clusternode name="virt-009" nodeid="4">
      <fence>
        <method name="pcmk-method"/>
      </fence>
    </clusternode>
    ...
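
A half-added node like the one above can be spotted mechanically. The following is a minimal sketch, not part of pcs: the function name `incomplete_nodes` and the trimmed sample config are made up for illustration, and it only assumes the cluster.conf layout shown in this report.

```python
import xml.etree.ElementTree as ET


def incomplete_nodes(root):
    """Return names of clusternodes that have a fence method with no device."""
    bad = []
    for node in root.findall("./clusternodes/clusternode"):
        methods = node.findall("./fence/method")
        # A <method> without any <device> child is the "half added" symptom.
        if any(method.find("device") is None for method in methods):
            bad.append(node.get("name"))
    return bad


# Trimmed, hypothetical config: one complete node and one half-added node.
sample_conf = """<cluster config_version="5" name="example">
  <clusternodes>
    <clusternode name="virt-008" nodeid="3" votes="1">
      <fence>
        <method name="mymethod">
          <device name="mypcmk" port="virt-008"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="virt-009" nodeid="4">
      <fence>
        <method name="pcmk-method"/>
      </fence>
    </clusternode>
  </clusternodes>
</cluster>"""

print(incomplete_nodes(ET.fromstring(sample_conf)))  # prints ['virt-009']
```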

This is probably because pcs runs ccs internally to do the job; ccs fails, but pcs silently ignores the error and only fails later with HTTP 400. This is what pcs runs in the background:

[root@virt-007 ~]# ccs -i -f /etc/cluster/cluster.conf --addfenceinst "pcmk-redirect" virt-009 "pcmk-method"  "port=virt-009"
Fence device 'pcmk-redirect' not found.
[root@virt-017 ententyky]# echo $?
1
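
For illustration only, this is roughly what checking the helper's exit status could look like instead of ignoring it. It is a sketch under assumptions, not the pcs code: `run_checked` and `CommandFailed` are hypothetical names, and a small Python child process stands in for the failing ccs call so the example runs without a cluster.

```python
import subprocess
import sys


class CommandFailed(Exception):
    """Raised when an external helper exits with a non-zero status."""


def run_checked(argv):
    """Run argv and return its output; raise CommandFailed on a non-zero exit."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    if proc.returncode != 0:
        raise CommandFailed(
            "{} exited with {}: {}".format(argv[0], proc.returncode, proc.stdout.strip())
        )
    return proc.stdout


# Stand-in for the failing ccs call: prints the same message and exits with 1.
fake_ccs = [
    sys.executable,
    "-c",
    "import sys; print(\"Fence device 'pcmk-redirect' not found.\"); sys.exit(1)",
]

try:
    run_checked(fake_ccs)
except CommandFailed as err:
    # Stop here instead of continuing with a half-written configuration.
    print("aborting node add:", err)
```

Failing at this point would surface the ccs message directly rather than a later, unrelated HTTP 400.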



> Version-Release number of selected component (if applicable):

pcs-0.9.148-5.el6.x86_64



> Additional info:

We could either:
 - read cluster.conf beforehand to figure out what name has been used for the pcmk fencing device and use that one automatically
 - read cluster.conf beforehand and add our own secondary device if not present under the expected name
 - read cluster.conf beforehand and error out properly
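
The first two options can be sketched as follows. This is only an illustration assuming the cluster.conf layout shown above; the helper names `find_pcmk_device_name` and `ensure_device_named` are hypothetical and do not claim to match the attached patch.

```python
import xml.etree.ElementTree as ET

PCMK_AGENT = "fence_pcmk"


def find_pcmk_device_name(root):
    """Option 1: return the name of the first fence_pcmk device, or None."""
    for dev in root.findall("./fencedevices/fencedevice"):
        if dev.get("agent") == PCMK_AGENT:
            return dev.get("name")
    return None


def ensure_device_named(root, name):
    """Option 2: add a fence_pcmk device called `name` unless it already exists.

    Returns True if a device was added, False if it was already present.
    """
    fencedevices = root.find("fencedevices")
    if fencedevices is None:
        fencedevices = ET.SubElement(root, "fencedevices")
    for dev in fencedevices.findall("fencedevice"):
        if dev.get("name") == name:
            return False
    ET.SubElement(fencedevices, "fencedevice", {"agent": PCMK_AGENT, "name": name})
    return True


# Trimmed, hypothetical config with a custom device name.
conf = """<cluster config_version="4" name="example">
  <clusternodes/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="mypcmk"/>
  </fencedevices>
</cluster>"""

root = ET.fromstring(conf)
print(find_pcmk_device_name(root))                 # prints mypcmk
print(ensure_device_named(root, "pcmk-redirect"))  # prints True
```

Option 1 keeps the configuration untouched but ties pcs to whatever name the administrator chose; option 2 leaves the custom device alone and adds a second one under the name pcs expects.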

Comment 2 Tomas Jelinek 2016-08-26 06:47:47 UTC
Created attachment 1194217
proposed fix

Comment 3 Ivan Devat 2016-10-19 07:06:19 UTC
Setup:
> modify /etc/cluster/cluster.conf:
> in tag method attribute name: pcmk-method -> mymethod
> in tag device attribute name: pcmk-redirect -> mypcmk
> in tag fencedevice attribute name: pcmk-redirect -> mypcmk
> something like:

<cluster config_version="23" name="devcluster6">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="vm-rhel67-1" nodeid="1">
      <fence>
        <method name="mymethod">
          <device name="mypcmk" port="vm-rhel67-1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="vm-rhel67-2" nodeid="2">
      <fence>
        <method name="mymethod">
          <device name="mypcmk" port="vm-rhel67-2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="vm-rhel67-3" nodeid="3">
      <fence>
        <method name="pcmk-method"/>
      </fence>
    </clusternode>
  </clusternodes>
  <cman broadcast="no" expected_votes="1" transport="udp" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="mypcmk"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>


Before Fix:

[vm-rhel67-1 ~] $ rpm -q pcs
pcs-0.9.148-7.el6_8.1.x86_64

[vm-rhel67-1 ~] $ pcs status | grep Online:
Online: [ vm-rhel67-1 vm-rhel67-2 ]
[vm-rhel67-1 ~] $ pcs status |grep "2 nodes"
2 nodes and 1 resource configured
[vm-rhel67-1 ~] $ pcs cluster localnode add vm-rhel67-3
Fence device 'pcmk-redirect' not found.

Error: error adding fence instance: vm-rhel67-3


After Fix:

[vm-rhel67-1 ~] $ rpm -q pcs
pcs-0.9.154-1.el6.x86_64

[vm-rhel67-1 ~] $ pcs status | grep Online:
Online: [ vm-rhel67-1 vm-rhel67-2 ]
[vm-rhel67-1 ~] $ pcs status |grep "2 nodes"
2 nodes and 1 resource configured

[vm-rhel67-1 ~] $ pcs cluster localnode add vm-rhel67-3
vm-rhel67-3: successfully added!

Comment 7 errata-xmlrpc 2017-03-21 11:03:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0707.html

