Bug 545334

Summary: Invalid XML can cause hard to decipher entity errors
Product: [Community] Publican
Reporter: Darrin Mison <dmison>
Component: publican
Assignee: Jeff Fearn <jfearn>
Status: CLOSED NEXTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 2.3
CC: anross, dlackey, jfearn, mmcallis, publican-list, r.landmann
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
See Also: https://bugzilla.redhat.com/show_bug.cgi?id=664360
Doc Type: Bug Fix
Last Closed: 2011-11-28 20:39:16 EST

Description Darrin Mison 2009-12-08 03:07:31 EST
Description of problem:
You can't represent a literal XML entity reference in a document.
It appears that the entities are being expanded more than once.

Version-Release number of selected component (if applicable):
publican-ovirt-1.0-0.el5
publican-gimp-1.0-0.el5
publican-redhat-1.1-0.el5
publican-1.3-0.el5
publican-redhat-internal-1.0-1.el5
publican-jboss-1.1-0.el5
perl-Publican-WebSite-1.2-1.el5
publican-WebSite-obsoletes-1.14-1.el5
publican-fedora-1.0-0.el5
publican-doc-1.3-0.el5


How reproducible:


Steps to Reproduce:
1. Add the following to a book:
<para>The entity to use is: <code>&amp;allproperties;</code></para>
2. Build the book.

Actual results:
Release_Notes.xml:32: parser error : Entity 'allproperties' not defined
			<code>&allproperties;</code>
			                     ^


Expected results:
The book contains the text:
The entity to use is: &allproperties;

Additional info:
Comment 1 Jeff Fearn 2009-12-08 17:34:20 EST
The problem here is that the wrong error message is being generated.

When you feed Publican invalid XML that matches the above format, the entity parsing code chokes on it before the XML validation can flag it as invalid. This is because the entities are expanded prior to validation to ensure the full tree is being validated.

I'm looking into how to get the correct error message to be generated.

The workaround is to use valid XML.

e.g.

<code>&amp;allproperties&semi;</code>
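
The double-expansion symptom from the description can be illustrated with a small sketch. This is not Publican's or XML::TreeBuilder's code; it assumes Python's standard xml.etree.ElementTree parser as a stand-in, and the variable names are made up for illustration. It shows how text that has been expanded once fails on a second parse unless it is escaped again first.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The reporter's markup: a literal "&allproperties;" written with &amp;.
source = "<para>The entity to use is: <code>&amp;allproperties;</code></para>"

# First parse: &amp; is expanded to "&", so the text now looks like an entity reference.
para = ET.fromstring(source)
expanded = para.find("code").text

# Hypothetical buggy step: write the expanded text back without re-escaping it.
unescaped = "<code>" + expanded + "</code>"
try:
    ET.fromstring(unescaped)
    second_parse_error = None
except ET.ParseError as exc:
    second_parse_error = str(exc)  # the parser rejects the undefined entity

# Re-escaping before the second parse keeps the literal text intact.
reescaped = "<code>" + escape(expanded) + "</code>"
round_tripped = ET.fromstring(reescaped).text

print(expanded)
print(second_parse_error)
print(round_tripped)
```

In this sketch the second parse only succeeds when the expanded text is escaped again, which matches the "Entity 'allproperties' not defined" error in the report.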
Comment 2 Jeff Fearn 2009-12-10 20:33:49 EST
*** Bug 546488 has been marked as a duplicate of this bug. ***
Comment 3 Darrin Mison 2009-12-10 20:41:54 EST
ah &semi; .. face->palm

thanks :-)
Comment 4 Jeff Fearn 2010-01-18 17:25:11 EST
I have tracked this to a bug in HTML::Element::_xml_escape. The regex in this function seems too aggressive.
Comment 5 Jeff Fearn 2010-01-18 19:43:26 EST
(In reply to comment #4)
> I have tracked this to a bug in HTML::Element::_xml_escape. The regex in this
> function seems too aggressive.    

This is wrong. The issue is in the way the XML parser is parsing the XML when you have particular kinds of entities or invalid XML. I'm not sure there is any way to properly flag this, since the XML cannot be validated before it is parsed ... will put this on the shelf for a bit.
Comment 6 Jeff Fearn 2010-05-10 02:59:33 EDT
Finally tracked this down and worked out how to overcome this bug. I am pestering upstream to get a fix for XML::TreeBuilder.
Comment 7 Bug Zapper 2010-11-03 23:57:07 EDT
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 8 Jeff Fearn 2010-12-17 00:30:16 EST
Bumped requirement for XML::TreeBuilder to 4.0, which contains fixes for this issue.

Fixed in revision 1697
Comment 9 Jeff Fearn 2011-11-28 20:39:16 EST
Required modules were updated some time ago.