Description of problem:
If a topic contains valid XML entities that are invalid UTF-8, PressGang doesn't report any problem, but the builds in DocBuilder and elsewhere fail for no obvious reason.

Version-Release number of selected component (if applicable):
1.1

How reproducible:
100%

Steps to Reproduce:
1. Create a topic
2. Insert a carriage return entity somewhere in the topic
3. Take a look in DocBuilder

Actual results:
PressGang reports no problem with the topic, but it doesn't build.

Expected results:
PressGang warns the user that there's a UTF-8 problem.

Additional info:
PressGang should probably still allow users to write and store valid XML that's not UTF-8 compliant. That will never build in Publican, but we should remain open to the possibility that users might want to transform their XML with some other tool that might not require UTF-8 compliance.

I'm only marking this one as a blocker because of its pure nuisance value; it's easily worked around with sed before doing a mass upload. This particular CR is the only offending one I've hit so far.
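
For illustration, the warning asked for under "Expected results" and the sed-style workaround could be sketched in Python roughly as follows. This is not PressGang code: the function names, the `SUSPICIOUS` set, and the choice to flag every C0 control character other than tab and line feed are assumptions made for the example. The report itself only names the carriage return.

```python
import re

# Numeric character references: decimal (&#13;) or hexadecimal (&#xD;).
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")

# Assumption: code points worth warning about. Carriage return (13) is the
# one named in the report; the other C0 controls are included as a guess.
SUSPICIOUS = frozenset(cp for cp in range(0x20) if cp not in (0x09, 0x0A))


def _code_point(match):
    """Return the integer code point of a matched character reference."""
    hex_digits, dec_digits = match.groups()
    return int(hex_digits, 16) if hex_digits else int(dec_digits)


def find_suspicious_refs(xml_text):
    """Return (line_number, reference, code_point) for each flagged reference."""
    hits = []
    for match in CHAR_REF.finditer(xml_text):
        code_point = _code_point(match)
        if code_point in SUSPICIOUS:
            line_number = xml_text.count("\n", 0, match.start()) + 1
            hits.append((line_number, match.group(0), code_point))
    return hits


def strip_refs(xml_text, code_points=frozenset({0x0D})):
    """Remove references to the given code points, leaving all others intact."""
    def replace(match):
        return "" if _code_point(match) in code_points else match.group(0)
    return CHAR_REF.sub(replace, xml_text)


if __name__ == "__main__":
    topic = "<para>first line&#13;</para>\n<para>second&#xD; and &#233;</para>\n"
    for line_number, ref, code_point in find_suspicious_refs(topic):
        print(f"warning: line {line_number}: {ref} (U+{code_point:04X})")
    print(strip_refs(topic), end="")
```

A check like `find_suspicious_refs` only warns and leaves the stored XML untouched, which fits the point above that such topics should still be allowed. `strip_refs` is the destructive variant and would only make sense as a pre-upload cleanup step.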
Is there some documentation on character codes that are not valid UTF-8?
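
One quick way to test a given character code, sketched here in Python (the helper name `is_utf8_encodable` is made up for this example), is simply to try encoding it. UTF-8 can represent every Unicode code point from U+0000 to U+10FFFF except the surrogate range U+D800 to U+DFFF, so a check along these lines may help narrow down whether a particular character is really the encoding problem.

```python
def is_utf8_encodable(code_point):
    """Return True if the integer code point can be encoded as UTF-8."""
    try:
        # chr() rejects values outside 0..0x10FFFF with ValueError;
        # encode() rejects surrogates with UnicodeEncodeError.
        chr(code_point).encode("utf-8")
    except (ValueError, UnicodeEncodeError):
        return False
    return True


if __name__ == "__main__":
    for cp in (0x0D, 0xE9, 0xD800, 0x110000):
        print(f"U+{cp:04X}: {is_utf8_encodable(cp)}")
```

By this test the carriage return (U+000D) does encode as UTF-8, so the build failure described above may come from how the toolchain handles that character rather than from the encoding itself.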