Bug 731041

Summary: ricci returns empty cluster.conf file when /etc/cluster/cluster.conf file is invalid
Product: Red Hat Enterprise Linux 6 Reporter: Christine Caulfield <ccaulfie>
Component: clustermonAssignee: Ryan McCabe <rmccabe>
Status: CLOSED NOTABUG QA Contact: Cluster QE <mspqa-list>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.1CC: ccaulfie, cluster-maint, djansa, jwest, lhh, rdassen, rpeterso, teigland
Target Milestone: rcKeywords: Patch
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 731113 (view as bug list) Environment:
Last Closed: 2011-09-15 14:06:30 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Christine Caulfield 2011-08-16 14:44:29 UTC
Description of problem:

If a user makes a mistake when manually editting cluster.conf, subsequent use of the 'ccs' command can destroy all the information in the file


Version-Release number of selected component (if applicable):
RHEL6.1

How reproducible:
Easily

Steps to Reproduce:

during cluster configuration I modified a line in cluster.conf. But the xml syntax was wrong because closing tag was missing for quroumd

...
</rm>
<quorumd label="qdisk_cbgc_cluster" min_score="0">
<cman>
...

Then I want to increase the version by 

ccs -h <hostname> --incversion

and the whole configuration was reseted to 

<?xml version="1.0"?>
<cluster config_version="2" name="cluster">
        <fence_daemon/>
        <clusternodes/>
        <cman/>
        <fencedevices/>
        <rm>
                <failoverdomains/>
                <resources/>
        </rm>
</cluster>

see case: https://access.redhat.com/support/cases/00515395 - the salesforce version contains a patch :-)


If I used the command ccs -f cluster.conf --incversion then a python xml.parsers.expat.ExpatError occurs.

Comment 2 Christine Caulfield 2011-08-16 15:00:07 UTC
Patch from Shane Bradley:

Reviewing the source code for the error that was created in comment #9 when -f is given: 

$ ccs -h localhost --incversion --file /root/sdfc/00515395/cluster.conf 
Traceback (most recent call last):
  File "/usr/sbin/ccs", line 1974, in <module>
    main(sys.argv[1:])
  File "/usr/sbin/ccs", line 240, in main
    if (incrementversion): increment_version()
  File "/usr/sbin/ccs", line 699, in increment_version
    new_version = int(get_version()) + 1
  File "/usr/sbin/ccs", line 690, in get_version
    dom = minidom.parseString(get_cluster_conf_xml())
  File "/usr/sbin/ccs", line 715, in get_cluster_conf_xml
    dom = minidom.parseString(xml)
  File "/usr/lib64/python2.6/xml/dom/minidom.py", line 1928, in parseString
    return expatbuilder.parseString(string)
  File "/usr/lib64/python2.6/xml/dom/expatbuilder.py", line 940, in parseString
    return builder.parseString(string)
  File "/usr/lib64/python2.6/xml/dom/expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
xml.parsers.expat.ExpatError: mismatched tag: line 37, column 2

------------------

Here is patch for fixing the above error from src.rpm: 
  ~/redhat/TMP/ricci-0.16.2-35.el6.src.rpm

$ cat ../SOURCES/ccs-catchparseerror.patch 
diff -up ricci-0.16.2/ccs/ccs.org ricci-0.16.2/ccs/ccs
--- ricci-0.16.2/ccs/ccs.org	2011-08-15 15:13:22.076064502 -0400
+++ ricci-0.16.2/ccs/ccs	2011-08-15 15:14:07.454047969 -0400
@@ -3,6 +3,7 @@
 import getopt, sys
 import socket, ssl
 from xml.dom import minidom
+from xml.parsers import expat
 import logging
 import os
 import os.path
@@ -712,7 +713,13 @@ def get_cluster_conf_xml():
     else:
         xml = send_ricci_command("cluster", "get_cluster.conf")
 
-    dom = minidom.parseString(xml)
+    dom = None
+    try: 
+        dom = minidom.parseString(xml)
+    except expat.ExpatError:
+        print "Cluster.conf file specified is not a valid cluster.conf file and contains invalid tags which cannot be parsed."
+        sys.exit(1)
+    
     if dom.getElementsByTagName('cluster').length > 0:
         xml =  dom.getElementsByTagName('cluster')[0].toxml()
         if verifyconf and verify_cluster_conf(xml) != 0:

-----

Still does not fix error when not using -f

Comment 3 Chris Feist 2011-08-16 18:16:59 UTC
There are two separate bugs here:

1.  Ricci will respond to get_cluster_conf with a blank cluster.conf file if it doesn't understand the cluster.conf file present.  Which is what happened in the bug description.  However, this has been how ricci worked in RHEL6 in 6.0 & 6.1 (as well as all of RHEL5), so this isn't a regression.  Also note, that this will cause luci to overwrite a bad configuration as well.

2.  ccs should not segfault when encountering a bad configuration, it should just display an error message and exit.  But it's important to note that current ccs does not create a blank configuration if it doesn't understand the config, it just gives a less than helpful error.

I'm duping this bug to split out the two bugs that need to be fixed.

Comment 4 Chris Feist 2011-08-16 18:31:29 UTC
Not this bug is not a regression, should this still get an exception or blocker flag?

Comment 7 Chris Feist 2011-08-22 22:16:00 UTC
Easy reproducer, start ricci & clustermon on node (in this example it's ask-03)

[cfeist@gold ~]$ ccs -h ask-03 --createcluster testcluster
[cfeist@gold ~]$ ccs -h ask-03 --getconf
<cluster config_version="1" name="testcluster">  
  <fence_daemon/>  
  <clusternodes/>  
  <cman/>  
  <fencedevices/>  
  <rm>    
    <failoverdomains/>    
    <resources/>    
  </rm>  
</cluster>

... Now on ask-03 remove the very first character of the file '<' ...

[cfeist@gold ~]$ ccs -h ask-03 --getconf
<cluster config_version="1" name="cluster">  
	  <fence_daemon/>  
	  <clusternodes/>  
	  <cman/>  
	  <fencedevices/>  
	  <rm>    
		    <failoverdomains/>    
		    <resources/>    
	  </rm>  
</cluster>


... Notice how it responds with the cluster name 'cluster' instead of test cluster, because it has auto generated a new config file ...