Bug 728230 - cman crashes on startup if cluster name is too long or is not set at all
Summary: cman crashes on startup if cluster name is too long or is not set at all
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: cluster
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Fabio Massimo Di Nitto
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: GSS_6_2_PROPOSED
Reported: 2011-08-04 13:07 UTC by Christine Caulfield
Modified: 2018-11-14 11:12 UTC
CC List: 8 users

Fixed In Version: cluster-3.0.12.1-9.el6
Doc Type: Bug Fix
Doc Text:
Cause: Two sanity checks on the cluster name were missing, which caused cman to crash at startup.
Consequence: cman crashed when starting up.
Fix: The missing sanity checks were implemented, and a proper error is reported as necessary.
Result: cman no longer crashes and informs the user of the incorrect cluster name value.
Clone Of:
Environment:
Last Closed: 2011-12-06 14:52:43 UTC
Target Upstream Version:
Embargoed:


Attachments
Patch to add a check on the cluster name length (665 bytes, patch)
2011-08-04 13:58 UTC, Christine Caulfield


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Legacy) | 62056 | 0 | None | None | None | Never
Red Hat Product Errata | RHBA-2011:1516 | 0 | normal | SHIPPED_LIVE | cluster and gfs2-utils bug fix update | 2011-12-06 00:51:09 UTC

Description Christine Caulfield 2011-08-04 13:07:02 UTC
Description of problem:
Cluster names can be a maximum of 15 characters but there seems to be no useful checking in cman in RHEL6. Starting a cluster with an invalid cluster name causes corosync/cman to crash with signal 6.

Version-Release number of selected component (if applicable):
RHEL 6.1

How reproducible:
Every time

Steps to Reproduce:
1. Create a cluster.conf with a long cluster name
2. start cman
  
Actual results:
cman crashes

Expected results:
cman should not crash.

Additional info:
In RHEL5 an error message was printed if the cluster name was too long. This appears not to be the case in RHEL6.

# cman_tool join
corosync died with signal: 6

or:

# cman_tool join -d
Validating configuration
calling '/usr/sbin/ccs_config_validate  '
Configuration validates
Starting /usr/sbin/corosync corosync -f
CMAN_DEBUG=255
COROSYNC_DEFAULT_CONFIG_IFACE=xmlconfig:cmanpreconfig
CMAN_PIPE=4
Aug 04 14:05:56 corosync [MAIN  ] Corosync Cluster Engine ('1.2.3'): started and ready to provide service.
Aug 04 14:05:56 corosync [MAIN  ] Corosync built-in features: nss rdma
Aug 04 14:05:56 corosync [MAIN  ] Successfully read config from /etc/cluster/cluster.conf
Aug 04 14:05:56 corosync [MAIN  ] Successfully parsed cman config
Aug 04 14:05:56 corosync [TOTEM ] Initializing transport (UDP/IP).
Aug 04 14:05:56 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
*** buffer overflow detected ***: corosync terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x7fd47460a6a7]
/lib64/libc.so.6(+0xfe5a0)[0x7fd4746085a0]
/usr/libexec/lcrso/service_cman.lcrso(read_cman_config+0xae)[0x7fd46bbf78de]
/usr/libexec/lcrso/service_cman.lcrso(+0x4087)[0x7fd46bbf2087]
corosync(corosync_service_link_and_init+0xf7)[0x408177]
corosync(corosync_service_defaults_link_and_init+0xf1)[0x4084c1]
corosync[0x405e18]
/usr/lib64/libtotem_pg.so.4(main_iface_change_fn+0x10f)[0x7fd4752e2aff]
/usr/lib64/libtotem_pg.so.4(+0xa07a)[0x7fd4752dc07a]
/usr/lib64/libtotem_pg.so.4(poll_run+0x29d)[0x7fd4752d875d]
corosync(main+0x6cb)[0x4056ab]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7fd474528c9d]
corosync[0x404219]
======= Memory map: ========
<snip>

Aug 04 14:05:56 corosync [TOTEM ] The network interface [192.168.1.201] is now up.
Aug 04 14:05:56 corosync [QUORUM] Using quorum provider quorum_cman
Aug 04 14:05:56 corosync [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
forked process ID is 1706
corosync died with signal: 6
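
Until a fixed package is installed, the cluster name can be checked before starting cman. The following is a minimal sketch, not part of cman or ccs_config_validate: the function name check_cluster_name and its return messages are made up for illustration, and the only facts taken from this report are the 15-character limit and the name attribute on the <cluster> element. It counts characters with len(), which may differ from the byte count that cman actually enforces for non-ASCII names.

```python
import xml.etree.ElementTree as ET

# Limit stated in this bug report; treated here as an assumption about cman.
MAX_CLUSTER_NAME_LEN = 15


def check_cluster_name(conf_xml):
    """Return None if the cluster name looks acceptable, else a short reason.

    conf_xml is the text of a cluster.conf file. This helper is hypothetical
    and only looks at the name attribute of the root <cluster> element.
    """
    root = ET.fromstring(conf_xml)
    if root.tag != "cluster":
        return "root element is not <cluster>"

    name = root.get("name")
    if not name:
        # Covers both a missing attribute and an empty string.
        return "cluster name is not set"

    if len(name) > MAX_CLUSTER_NAME_LEN:
        return "cluster name is %d characters; the maximum is %d" % (
            len(name),
            MAX_CLUSTER_NAME_LEN,
        )
    return None
```

For example, calling check_cluster_name(open("/etc/cluster/cluster.conf").read()) on each node before "cman_tool join" would return a reason string instead of letting corosync abort.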

Comment 2 Christine Caulfield 2011-08-04 13:58:58 UTC
Created attachment 516709
Patch to add a check on the cluster name length

Comment 3 Christine Caulfield 2011-08-04 14:00:13 UTC
I should add that a customer has seen this problem; it is not 'internal' or theoretical.

Comment 5 Fabio Massimo Di Nitto 2011-08-04 14:36:25 UTC
http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=03e9af7db105bcfbb7a013974084d2ed171fb258

commit exists upstream, ACK for rhel6.

Comment 6 Fabio Massimo Di Nitto 2011-08-05 08:11:21 UTC
http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=ac195524d4a520b7f5bbd25e01715f4e0aa1ab19

little amendment to the original patch.

Unit test results:

<cluster name="fabbionefabbionefabbionefabbionefabbionefabbionefabbionefabbione" config_version="1">

/etc/init.d/cman start

*** buffer overflow detected ***: corosync terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x7fd7b1767127]
/lib64/libc.so.6(+0xf8100)[0x7fd7b1765100]
[yadayada]

apply patches

[root@rhel6-node2 ~]# /etc/init.d/cman start
Starting cluster:
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman... Invalid cluster name. It must be 15 characters or fewer

Unable to get the configuration
Invalid cluster name. It must be 15 characters or fewer

cman_tool: corosync daemon didn't start Check cluster logs for details
                                                           [FAILED]

Comment 7 Fabio Massimo Di Nitto 2011-08-05 09:27:32 UTC
Further testing showed another problem: a missing cluster name also causes a crash.

This is the final patch set and unit test results:

http://git.fedorahosted.org/git?p=cluster.git;a=commitdiff;h=1f345b45a5eeaedfcf5c48ac328c32d32d30ac26
http://git.fedorahosted.org/git?p=cluster.git;a=commitdiff;h=79aafcef1dafff42afcc085d55188f495ee3cc54
http://git.fedorahosted.org/git?p=cluster.git;a=commitdiff;h=eecdcabac84dd93abf026fbfdb6f1c850c98fa5b

old packages:

<cluster config_version="1" >

[root@rhel6-node2 ~]# /etc/init.d/cman start
Starting cluster:
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman... /usr/sbin/ccs_config_validate: line 186:  2007 Segmentation fault      (core dumped) ccs_config_dump > $tempfile

Unable to get the configuration

<cluster name="fabbionefabbionefabbionefabbionefabbionefabbione" config_version="1" >

[root@rhel6-node2 ~]# /etc/init.d/cman start
[snip]
*** buffer overflow detected ***: corosync terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x7f1edb35b427]
/lib64/libc.so.6(+0xfd310)[0x7f1edb359310]
/usr/libexec/lcrso/service_cman.lcrso(read_cman_config+0xae)[0x7f1ed6a108fe]
/usr/libexec/lcrso/service_cman.lcrso(+0x4077)[0x7f1ed6a0b077]
corosync(corosync_service_link_and_init+0xf7)[0x408e97]
corosync(corosync_service_defaults_link_and_init+0xf1)[0x4091e1]
[snip]

new packages:

<cluster config_version="1" >

[root@rhel6-node2 ~]# /etc/init.d/cman start
Starting cluster:
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman... Unable to determine cluster name.

Unable to get the configuration
Unable to determine cluster name.

cman_tool: corosync daemon didn't start Check cluster logs for details
                                                           [FAILED]

<cluster name="fabbionefabbionefabbionefabbionefabbionefabbione" config_version="1" >

[root@rhel6-node2 ~]# /etc/init.d/cman start
Starting cluster:
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman... Invalid cluster name. It must be 15 characters or fewer

Unable to get the configuration
Invalid cluster name. It must be 15 characters or fewer

cman_tool: corosync daemon didn't start Check cluster logs for details
                                                           [FAILED]

Comment 9 Martin Juricek 2011-08-08 11:20:22 UTC
Verified in version cman-3.0.12.1-9.el6, kernel 2.6.32-131.0.15.el6


1) Cluster name longer than 15 characters:
...
<cluster config_version="1" name="Z_ClusterZ_ClusterZ_Cluster">
...

[root@z2 /]# service cman start
Starting cluster: 
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman... Invalid cluster name. It must be 15 characters or fewer

Unable to get the configuration
Invalid cluster name. It must be 15 characters or fewer

cman_tool: corosync daemon didn't start Check cluster logs for details
                                                           [FAILED]



2) Cluster name not set:
...
<cluster config_version="1">
...

[root@z2 /]# service cman start
Starting cluster: 
   Checking if cluster has been disabled at boot...        [  OK  ]
   Checking Network Manager...                             [  OK  ]
   Global setup...                                         [  OK  ]
   Loading kernel modules...                               [  OK  ]
   Mounting configfs...                                    [  OK  ]
   Starting cman... Unable to determine cluster name.

Unable to get the configuration
Unable to determine cluster name.

cman_tool: corosync daemon didn't start Check cluster logs for details
                                                           [FAILED]
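
The manual verification above could be automated by classifying the captured output of "service cman start". This is only a sketch under stated assumptions: the function classify_cman_output and the labels "crash", "rejected", and "unknown" are hypothetical, and the marker strings are copied from the console output quoted in this bug, so they may not match other cman or corosync versions.

```python
# Substrings seen in this report when the unpatched packages crash.
CRASH_MARKERS = (
    "buffer overflow detected",
    "Segmentation fault",
    "died with signal",
)

# Messages printed by the patched packages, as quoted in this report.
CHECK_MESSAGES = (
    "Invalid cluster name.",
    "Unable to determine cluster name.",
)


def classify_cman_output(output):
    """Classify captured cman startup output.

    Returns "crash" if a crash marker is present, "rejected" if the
    configuration was refused with one of the new error messages,
    and "unknown" otherwise. Crash markers take precedence.
    """
    if any(marker in output for marker in CRASH_MARKERS):
        return "crash"
    if any(message in output for message in CHECK_MESSAGES):
        return "rejected"
    return "unknown"
```

With the fixed packages, both test cases above (name too long, name not set) would be expected to classify as "rejected"; with the old packages they classify as "crash".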

Comment 10 Fabio Massimo Di Nitto 2011-10-27 08:22:11 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: Two sanity checks on the cluster name were missing, which caused cman to crash at startup.
Consequence: cman crashed when starting up.
Fix: The missing sanity checks were implemented, and a proper error is reported as necessary.
Result: cman no longer crashes and informs the user of the incorrect cluster name value.

Comment 11 errata-xmlrpc 2011-12-06 14:52:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1516.html

