Bug 1016338 - Detect invalid UTF-8 in XML
Summary: Detect invalid UTF-8 in XML
Keywords:
Status: NEW
Alias: None
Product: PressGang CCMS
Classification: Community
Component: CCMS-Core
Version: 1.1
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Assignee: Nobody
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1012194
Reported: 2013-10-08 00:43 UTC by Ruediger Landmann
Modified: 2023-02-21 23:20 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed:
Embargoed:


Description Ruediger Landmann 2013-10-08 00:43:07 UTC
Description of problem:
If a topic contains valid XML entities that are invalid UTF-8, PressGang doesn't report any problem, but the builds in DocBuilder and elsewhere fail for no obvious reason.

Version-Release number of selected component (if applicable):
1.1

How reproducible:
100%

Steps to Reproduce:
1. Create a topic
2. Insert a carriage return character reference (for example, &#13;) somewhere
3. Take a look in DocBuilder

Actual results:
PressGang reports no problem with the topic, but the topic does not build in DocBuilder.

Expected results:
PressGang warns the user that there's a UTF-8 problem.
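
A rough sketch of the kind of check that could drive such a warning, written in Python purely for illustration. It is not PressGang code. The function name find_suspect_refs and the SUSPECT set are hypothetical, and which code points actually break the build is an assumption: only the carriage return has been observed to cause trouble in this report.

```python
import re

# Matches decimal (&#13;) and hexadecimal (&#xD;) numeric character references.
CHAR_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")

# Assumption: code points worth warning about. Carriage return (13) is the one
# from this report; the other C0 controls (except tab and line feed) are a guess.
SUSPECT = {cp for cp in range(0x20) if cp not in (0x09, 0x0A)}


def find_suspect_refs(xml_text):
    """Return (line_number, reference_text, code_point) for each suspect reference."""
    hits = []
    for line_no, line in enumerate(xml_text.splitlines(), start=1):
        for match in CHAR_REF.finditer(line):
            hex_digits, dec_digits = match.group(1), match.group(2)
            code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
            if code_point in SUSPECT:
                hits.append((line_no, match.group(0), code_point))
    return hits
```

A check like this only reports; it leaves the topic untouched, so the XML can still be stored as written.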

Additional info:
PressGang should probably still allow users to write and store valid XML that's not UTF-8 compliant. That will never build in Publican, but we should remain open to the possibility that users might want to transform their XML with some other tool that might not require UTF-8 compliance.

Only adding this one as a blocker because of its pure nuisance value; it's easily worked around with sed before doing a mass upload. This particular CR is the only offending one I've hit so far.
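
For anyone who prefers a script to sed, the same kind of cleanup might look like the sketch below. This is an illustration under assumptions, not the sed command actually used: the function name strip_cr_refs is hypothetical, and deleting the reference outright (rather than replacing it with a space) is a judgment call.

```python
import re

# Matches numeric character references for carriage return in decimal or hex,
# with optional leading zeros: &#13; &#013; &#xD; &#x0D; &#x0d; ...
CR_REF = re.compile(r"&#(?:0*13|[xX]0*[dD]);")


def strip_cr_refs(xml_text):
    """Return (cleaned_text, number_of_references_removed)."""
    return CR_REF.subn("", xml_text)
```

Running this over each topic file before a mass upload and checking the returned count shows which files were affected.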

Comment 1 Matthew Casperson 2014-01-12 21:34:44 UTC
Is there some documentation on character codes that are not valid UTF-8?

