Bug 1036402 - Incorrect XML validation error
Summary: Incorrect XML validation error
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: PressGang CCMS
Classification: Community
Component: Web-UI
Version: 1.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 1.3
Assignee: Matthew Casperson
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2013-12-01 21:33 UTC by Matthew Casperson
Modified: 2014-08-04 22:27 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-12-03 22:16:20 UTC



Description Matthew Casperson 2013-12-01 21:33:52 UTC
Firefox reports the following XML validation error:

topic.xml:2: parser error : StartTag: invalid element name
<!ENTITY euro "&#38;euro;">
 ^

This does not appear to happen in Chrome.

Comment 1 Matthew Casperson 2013-12-01 21:40:06 UTC
The entity declaration most likely comes from dummy-docbook.ent.

Comment 2 Matthew Casperson 2013-12-01 21:51:54 UTC
The issue looks to be in the removeXmlPreamble() method:

public static String removeXmlPreamble(@NotNull final String xml) {
    final RegExp regExp = RegExp.compile("^\\s*<\\?[\\s\\S]*?\\?>");
    return regExp.replace(xml, "");
}

This has been changed to add the global flag:

public static String removeXmlPreamble(@NotNull final String xml) {
    final RegExp regExp = RegExp.compile("^\\s*<\\?[\\s\\S]*?\\?>", "g");
    return regExp.replace(xml, "");
}

Comment 3 Lee Newson 2013-12-02 00:57:03 UTC
With Firefox 25.0, the following error now appears instead:

topic.xml:981: parser error : Start tag expected, '<' not found
]>

When using Chrome 31.0.1650.57, no errors appear at all.

Tested with version 1.3-SNAPSHOT build 201312020954; all tests were done in Private/Incognito windows.

Comment 4 Matthew Casperson 2013-12-02 01:58:24 UTC
The issue was actually the removeDoctypePreamble() method. It now reads:

    public static String removeDoctypePreamble(@NotNull final String xml) {
        final RegExp regExp = RegExp.compile("<\\s*!DOCTYPE[\\s\\S]*?\\]>");
        return regExp.replace(xml, "").trim();
    }

The old regex was not matching the doctype, so the dummy entities inside it were not being removed. This meant that two sets of entities were being added to the XML file, which caused the error.
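
To illustrate what this regex is expected to strip, here is a minimal sketch using java.util.regex as a stand-in for the RegExp class used in the method above. The class name DoctypeStripSketch and the sample topic XML are made up for illustration, and replaceFirst is assumed to correspond to a replace call made without the "g" flag.

```java
import java.util.regex.Pattern;

final class DoctypeStripSketch {
    // Same pattern text as in removeDoctypePreamble() above.
    private static final Pattern DOCTYPE =
            Pattern.compile("<\\s*!DOCTYPE[\\s\\S]*?\\]>");

    static String removeDoctypePreamble(final String xml) {
        // replaceFirst is assumed to mirror a non-global replace.
        return DOCTYPE.matcher(xml).replaceFirst("").trim();
    }

    public static void main(final String[] args) {
        // Hypothetical topic with a doctype that carries a dummy entity.
        final String topic = "<!DOCTYPE section [\n"
                + "<!ENTITY euro \"&#38;euro;\">\n"
                + "]>\n"
                + "<section>&euro;</section>";
        // Prints the topic without the doctype and its entity declaration.
        System.out.println(removeDoctypePreamble(topic));
    }
}
```

With the doctype and its internal subset removed, only one set of entity declarations should remain once the dummy entities are added back.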

Comment 5 Lee Newson 2013-12-02 07:21:40 UTC
I've made some additional changes, as the above doesn't handle cases where no internal entity definitions are present. The above regex will also remove any sample content that is wrapped in CDATA. As such, this is the new remove method:

    public static String removeDoctypePreamble(@NotNull final String xml) {
        final RegExp regExp = RegExp.compile("^\\s*(<\\?[\\s\\S]*?\\?>)?\\s*(<\\s*!DOCTYPE[\\s\\S]*?(\\[[\\s\\S]*?\\])?>)", "gm");
        return regExp.replace(xml, "$1").trim();
    }
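
The difference between the two patterns can be sketched with java.util.regex, used here as a stand-in for the RegExp class in the method above. The class name DoctypeRegexComparison, the helper names removeOld and removeNew, and the sample XML (including the example.dtd system identifier) are all hypothetical. Pattern.MULTILINE and replaceAll are assumed to correspond to the "m" and "g" flags.

```java
import java.util.regex.Pattern;

final class DoctypeRegexComparison {
    // Earlier pattern: the match only ends at a "]>" sequence.
    private static final Pattern OLD =
            Pattern.compile("<\\s*!DOCTYPE[\\s\\S]*?\\]>");
    // Revised pattern: the internal subset "[...]" is optional.
    private static final Pattern NEW = Pattern.compile(
            "^\\s*(<\\?[\\s\\S]*?\\?>)?\\s*(<\\s*!DOCTYPE[\\s\\S]*?(\\[[\\s\\S]*?\\])?>)",
            Pattern.MULTILINE);

    static String removeOld(final String xml) {
        return OLD.matcher(xml).replaceFirst("").trim();
    }

    static String removeNew(final String xml) {
        // "$1" keeps the XML declaration when one was captured.
        return NEW.matcher(xml).replaceAll("$1").trim();
    }

    public static void main(final String[] args) {
        // Hypothetical topic: no internal subset, sample content in CDATA.
        final String topic = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE section SYSTEM \"example.dtd\">\n"
                + "<section><![CDATA[sample]]></section>";
        // Old pattern runs on to the "]>" that closes the CDATA section.
        System.out.println(removeOld(topic));
        // New pattern stops at the ">" that closes the doctype.
        System.out.println(removeNew(topic));
    }
}
```

In this sketch the old pattern swallows everything from the doctype through the end of the CDATA section, while the new one removes only the doctype and leaves the XML declaration and the sample content in place.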

