Bug 1506864
| Summary: | pcs cannot load large xml files | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Andrew Beekhof <abeekhof> | ||||||
| Component: | pcs | Assignee: | Ivan Devat <idevat> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | cluster-qe <cluster-qe> | ||||||
| Severity: | high | Docs Contact: | |||||||
| Priority: | high | ||||||||
| Version: | 7.4 | CC: | abeekhof, cfeist, cluster-maint, idevat, omular, rsteiger, tojeline | ||||||
| Target Milestone: | rc | Keywords: | EasyFix | ||||||
| Target Release: | --- | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | pcs-0.9.162-1.el7 | Doc Type: | Bug Fix | ||||||
| Doc Text: |
Cause:
pcs used an xml parser with an implicit size limitation setting
Consequence:
pcs failed when it tried to process a CIB over the size limit
Fix:
pcs uses an xml parser with the explicit unlimited size setting
Result:
pcs works with the large CIB file
|
Story Points: | --- | ||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2018-04-10 15:40:54 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
|
I cannot reproduce the problem. I tried it with a cib larger than 10404672 bytes and with a cib larger than 20944671 bytes too. Command `pcs resource create` succeeded. Command `pcs resource enable` failed because the command `crm_mon` failed:
[ant ~] $ rpm -q pcs python-libs python-lxml libxml2
pcs-0.9.158-6.el7.x86_64
python-libs-2.7.5-58.el7.x86_64
python-lxml-3.2.1-4.el7.x86_64
libxml2-2.9.1-6.el7_2.3.x86_64
[ant ~] $ wc testing.xml
220012 480014 10404672 testing.xml
[ant ~] $ pcs resource create A ocf:heartbeat:Dummy -f testing.xml
[ant ~] $ echo $?
0
[ant ~] $ wc testing.xml
200017 480037 11185048 testing.xml
[ant ~] $ pcs resource enable A -f testing.xml --debug > testing.log
Error: error running crm_mon, is pacemaker running?
[ant ~] $ cat testing.log|grep "Running: /usr/sbin/crm_mon" -A6
Running: /usr/sbin/crm_mon --one-shot --as-xml --inactive
Environment:
CIB_file=/tmp/tmpfilNbQ.pcs
LC_ALL=C
Finished running: /usr/sbin/crm_mon --one-shot --as-xml --inactive
Return value: -9
[ant ~] $ CIB_file=testing.xml crm_mon --one-shot --as-xml --inactive
zsh: killed CIB_file=testing.xml crm_mon --one-shot --as-xml --inactive
[ant ~] $ echo $?
137
Could you please provide your cib and versions of python-lxml and libxml2?

Interesting. I don't get that far.
--debug just shows the cibadmin being requested and the parsing failing.
versions:
[root@c09-h05-r630 pcs]# rpm -qa python-lxml libxml2
libxml2-2.9.1-6.el7_2.3.x86_64
python-lxml-3.2.1-4.el7.x86_64

(In reply to Andrew Beekhof from comment #3)
> Interesting. I don't get that far.
> --debug just shows the cibadmin being requested and the parsing failing.
>
Could you please provide your cib and output of --debug?
We found out that for a 20 MB cib, cibadmin allocated about 600 MB of memory. When there was not enough memory, cibadmin was killed. We tried to set the amount of memory in such a way that cibadmin would finish and pcs would fail, but we did not manage it.
How much free memory do you have when you run the failing command?

Will attach the whole cib.
[root@c09-h05-r630 pcs]# pcs --debug resource disable scale49-bundle-4
Running: /usr/sbin/cibadmin --local --query
Environment:
LC_ALL=C
Finished running: /usr/sbin/cibadmin --local --query
Return value: 0
--Debug Stdout Start--
<cib crm_feature_set="3.0.14" validate-with="pacemaker-2.8" epoch="4648" num_updates="23" admin_epoch="0" cib-last-written="Mon Oct 30 05:42:59 2017" update-origin="c09-h06-r630" update-client="crmd" update-user="hacluster" have-quorum="1" dc-uuid="2">
<configuration>
<crm_config>
...
</node_state>
</status>
</cib>
--Debug Stdout End--
--Debug Stderr Start--
--Debug Stderr End--
Error: unable to get cib, xml does not conform to the schema
Created attachment 1345677 [details]
Full output of pcs resource disable
Also, putting the xml from that output into a file and running:
[12:05 PM] beekhof@fedora ~/Development/sources/pacemaker/devel ☺ # xmllint --relaxng /usr/share/pacemaker/pacemaker-2.10.rng full.xml
Results in: "full.xml validates"

The issue is reproducible with the cib extracted from the given log.
The following script shows the cause of the problem:
import sys

from lxml import etree

with open(sys.argv[1], "r") as xml_file:
    etree.fromstring(xml_file.read().encode("utf-8"))
[ant ~] $ python reproducer.py cib.big.xml
Traceback (most recent call last):
File "reproducer.py", line 6, in <module>
etree.fromstring(xml_file.read().encode("utf-8"))
File "lxml.etree.pyx", line 2993, in lxml.etree.fromstring (src/lxml/lxml.etree.c:63070)
File "parser.pxi", line 1617, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:93194)
File "parser.pxi", line 1495, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:92003)
File "parser.pxi", line 1011, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:88660)
File "parser.pxi", line 577, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:84385)
File "parser.pxi", line 676, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:85488)
File "parser.pxi", line 616, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:84811)
lxml.etree.XMLSyntaxError: internal error: Huge input lookup, line 87985, column 184
It is possible to fix it by:
etree.fromstring(xml_file.read().encode("utf-8"), etree.XMLParser(huge_tree=True))
For some reason, the lxml parsing succeeds with the huge cib that I generated.
Note that the error message
"Error: unable to get cib, xml does not conform to the schema"
should be corrected.
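The two points above (parsing with `huge_tree=True` and reporting a parser failure as something other than a schema violation) could be combined roughly as follows. This is only a minimal sketch, not the patch attached to this bug: the function names `parse_cib` and `describe_cib_problem` and the wording of the message are hypothetical, chosen for illustration.

```python
from lxml import etree


def parse_cib(xml_text):
    """Parse CIB XML text with libxml2's default size limits lifted.

    Hypothetical helper for illustration; not the actual pcs code.
    """
    # huge_tree=True disables libxml2's security restrictions on very
    # large or deep documents, so use it only for trusted input such as
    # a locally produced CIB.
    parser = etree.XMLParser(huge_tree=True)
    return etree.fromstring(xml_text.encode("utf-8"), parser)


def describe_cib_problem(xml_text):
    """Return None if the text parses, else a message naming the parse error."""
    try:
        parse_cib(xml_text)
    except etree.XMLSyntaxError as e:
        # A parser failure is not a schema violation, so report it as such
        # (the message text here is an assumption, not what pcs prints).
        return "unable to parse cib: {0}".format(e)
    return None
```

With a helper like this, a "Huge input lookup" failure would surface with the parser's own message instead of being reported as a schema problem.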
(In reply to Ivan Devat from comment #8)
> For some reason, the lxml parsing succeeds with the huge cib that I
> generated.

Quite weird that!
Thanks for tracking this one down, glad there's an easy fix

Created attachment 1348965 [details]
proposed fix
After fix:
[root@rhel75-node1 ~]# rpm -q pcs
pcs-0.9.162-1.el7.x86_64
> cib from attachment 1345677 [details] is in file provided-cib.xml
[root@rhel75-node1 ~]# wc provided-cib.xml
87994 720882 19545180 provided-cib.xml
[root@rhel75-node1 ~]# pcs -f provided-cib.xml resource disable c09-h11-r630
[root@rhel75-node1 ~]# echo $?
0
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2018:0866
Description of problem:
# cibadmin -Q > pcs-too-big.xml
# wc pcs-too-big.xml
46806 391283 10607376 pcs-too-big.xml
# crm_verify --xml-file pcs-too-big.xml
# echo $?
0
# N=1 pcs -f pcs-too-big.xml resource bundle create scale$N-bundle container docker image=beekhof:remote replicas=20 run-command=/usr/sbin/pacemaker_remoted network control-port=313$N storage-map id=dev-log-$N source-dir=/dev/log target-dir=/dev/log --disabled
Error: unable to get cib, xml does not conform to the schema

The error seems to come out of parse_cib_xml() which eventually calls:
return etree.fromstring(xml.encode("utf-8"))

To work out the exact threshold, I ran:
[root@c09-h05-r630 pcs]# for N in $(seq 50 150); do pcs -f testing.xml resource bundle create scale$N-bundle container docker image=beekhof:remote replicas=20 options="--user=root --log-driver=journald" run-command="/usr/sbin/pacemaker_remoted" network control-port=32$N storage-map id=dev-log-$N source-dir=/dev/log target-dir=/dev/log --disabled; pcs -f testing.xml resource create dummy$N ocf:pacemaker:Dummy bundle scale$N-bundle; echo $N; wc testing.xml ; done
...
96
44282 367065 9992820 testing.xml
97
44299 367112 9993786 testing.xml
98
44316 367159 9994752 testing.xml
99
44333 367206 9995718 testing.xml
100
44350 367253 9996692 testing.xml
101
44367 367300 9997666 testing.xml
102
44384 367347 9998640 testing.xml
103
44401 367394 9999614 testing.xml
parse_cib_xml()
Error: unable to get cib, xml does not conform to the schema
104
44411 367418 10000168 testing.xml
Error: unable to get cib, xml does not conform to the schema
Error: unable to get cib, xml does not conform to the schema

Version-Release number of selected component (if applicable):
python-libs-2.7.5-58.el7.x86_64
pcs-0.9.158-6.el7.x86_64

How reproducible: 100%

Steps to Reproduce:
1. Create a cib that is larger than ~10000168 bytes
2. Run a command like pcs resource enable [foo]
3.

Actual results:
Error: unable to get cib, xml does not conform to the schema

Expected results:
Command succeeds

Additional info: