Bug 731041 - ricci returns empty cluster.conf file when /etc/cluster/cluster.conf file is invalid
Summary: ricci returns empty cluster.conf file when /etc/cluster/cluster.conf file is ...
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: clustermon
Version: 6.1
Hardware: All
OS: Linux
Target Milestone: rc
: ---
Assignee: Ryan McCabe
QA Contact: Cluster QE
Depends On:
TreeView+ depends on / blocked
Reported: 2011-08-16 14:44 UTC by Christine Caulfield
Modified: 2011-09-29 19:20 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 731113 (view as bug list)
Last Closed: 2011-09-15 14:06:30 UTC

Attachments (Terms of Use)

Description Christine Caulfield 2011-08-16 14:44:29 UTC
Description of problem:

If a user makes a mistake when manually editting cluster.conf, subsequent use of the 'ccs' command can destroy all the information in the file

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

during cluster configuration I modified a line in cluster.conf. But the xml syntax was wrong because closing tag was missing for quroumd

<quorumd label="qdisk_cbgc_cluster" min_score="0">

Then I want to increase the version by 

ccs -h <hostname> --incversion

and the whole configuration was reseted to 

<?xml version="1.0"?>
<cluster config_version="2" name="cluster">

see case: https://access.redhat.com/support/cases/00515395 - the salesforce version contains a patch :-)

If I used the command ccs -f cluster.conf --incversion then a python xml.parsers.expat.ExpatError occurs.

Comment 2 Christine Caulfield 2011-08-16 15:00:07 UTC
Patch from Shane Bradley:

Reviewing the source code for the error that was created in comment #9 when -f is given: 

$ ccs -h localhost --incversion --file /root/sdfc/00515395/cluster.conf 
Traceback (most recent call last):
  File "/usr/sbin/ccs", line 1974, in <module>
  File "/usr/sbin/ccs", line 240, in main
    if (incrementversion): increment_version()
  File "/usr/sbin/ccs", line 699, in increment_version
    new_version = int(get_version()) + 1
  File "/usr/sbin/ccs", line 690, in get_version
    dom = minidom.parseString(get_cluster_conf_xml())
  File "/usr/sbin/ccs", line 715, in get_cluster_conf_xml
    dom = minidom.parseString(xml)
  File "/usr/lib64/python2.6/xml/dom/minidom.py", line 1928, in parseString
    return expatbuilder.parseString(string)
  File "/usr/lib64/python2.6/xml/dom/expatbuilder.py", line 940, in parseString
    return builder.parseString(string)
  File "/usr/lib64/python2.6/xml/dom/expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
xml.parsers.expat.ExpatError: mismatched tag: line 37, column 2


Here is patch for fixing the above error from src.rpm: 

$ cat ../SOURCES/ccs-catchparseerror.patch 
diff -up ricci-0.16.2/ccs/ccs.org ricci-0.16.2/ccs/ccs
--- ricci-0.16.2/ccs/ccs.org	2011-08-15 15:13:22.076064502 -0400
+++ ricci-0.16.2/ccs/ccs	2011-08-15 15:14:07.454047969 -0400
@@ -3,6 +3,7 @@
 import getopt, sys
 import socket, ssl
 from xml.dom import minidom
+from xml.parsers import expat
 import logging
 import os
 import os.path
@@ -712,7 +713,13 @@ def get_cluster_conf_xml():
         xml = send_ricci_command("cluster", "get_cluster.conf")
-    dom = minidom.parseString(xml)
+    dom = None
+    try: 
+        dom = minidom.parseString(xml)
+    except expat.ExpatError:
+        print "Cluster.conf file specified is not a valid cluster.conf file and contains invalid tags which cannot be parsed."
+        sys.exit(1)
     if dom.getElementsByTagName('cluster').length > 0:
         xml =  dom.getElementsByTagName('cluster')[0].toxml()
         if verifyconf and verify_cluster_conf(xml) != 0:


Still does not fix error when not using -f

Comment 3 Chris Feist 2011-08-16 18:16:59 UTC
There are two separate bugs here:

1.  Ricci will respond to get_cluster_conf with a blank cluster.conf file if it doesn't understand the cluster.conf file present.  Which is what happened in the bug description.  However, this has been how ricci worked in RHEL6 in 6.0 & 6.1 (as well as all of RHEL5), so this isn't a regression.  Also note, that this will cause luci to overwrite a bad configuration as well.

2.  ccs should not segfault when encountering a bad configuration, it should just display an error message and exit.  But it's important to note that current ccs does not create a blank configuration if it doesn't understand the config, it just gives a less than helpful error.

I'm duping this bug to split out the two bugs that need to be fixed.

Comment 4 Chris Feist 2011-08-16 18:31:29 UTC
Not this bug is not a regression, should this still get an exception or blocker flag?

Comment 7 Chris Feist 2011-08-22 22:16:00 UTC
Easy reproducer, start ricci & clustermon on node (in this example it's ask-03)

[cfeist@gold ~]$ ccs -h ask-03 --createcluster testcluster
[cfeist@gold ~]$ ccs -h ask-03 --getconf
<cluster config_version="1" name="testcluster">  

... Now on ask-03 remove the very first character of the file '<' ...

[cfeist@gold ~]$ ccs -h ask-03 --getconf
<cluster config_version="1" name="cluster">  

... Notice how it responds with the cluster name 'cluster' instead of test cluster, because it has auto generated a new config file ...

Note You need to log in before you can comment on or make changes to this bug.