Description of problem:
DocBook 5 books contain multiple stray "Â" characters in html and html-single output

Version-Release number of selected component (if applicable):
publican-3.9.9-0.fc19.t6.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create a DocBook 5 book
2. Build html, html-single, pdf, and epub versions

Actual results:
html and html-single versions contain stray "Â" characters around where DocBook gentext has been used, for example:

1. Document Conventions
1.1. Typographic Conventions
Copyright © 2013 |

Expected results:
No stray "Â" characters

Additional info:
This doesn't affect the PDF or EPUB
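For anyone triaging this, a stray "Â" next to generated text is the usual signature of a UTF-8 no-break space being read as ISO-8859-1. The sketch below only illustrates that mechanism. It assumes the gentext separator is U+00A0, which this report does not confirm, and the heading string is just the example from above.

```python
# Assumption: the separator emitted by the gentext templates is U+00A0
# (NO-BREAK SPACE); this report does not confirm that.
NBSP = "\u00a0"

heading = f"1.{NBSP}Document Conventions"

# Serialised as UTF-8, the no-break space becomes the two bytes C2 A0.
raw = heading.encode("utf-8")

# A reader that assumes ISO-8859-1 maps each byte to its own character:
# C2 becomes "Â" and A0 stays a no-break space.
misread = raw.decode("iso-8859-1")

# A reader that knows the document is UTF-8 recovers the original text.
correct = raw.decode("utf-8")

print(repr(misread))
print(repr(correct))
```

If that is what is happening here, the bytes on disk would be fine and only the encoding the browser picks would be wrong.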
Link to source I can test with?
(In reply to Jeff Fearn from comment #1)
> Link to source I can test with?

This appeared in a brand-new book; so:

$ publican create_book --title "DB5 Test Book" --dtdver 5
$ cd DB5_Test_Book/
$ publican build --formats html-single,html,pdf,epub --langs en-US
On Fedora 19 using the 1.78.1 DocBook styles, the combination of xhtml5 output and setting the html.ext param triggers an issue. If html.ext is set to '.html', the '.' appears to cause a broken UTF-8 character to be introduced into the output stream.

Switching from the xhtml5 styles to the xhtml-1_1 styles does not change the output; it's still broken.

Switching from the xhtml5 styles to the xhtml styles does change the output: it now renders correctly, but the output is HTML4 instead of HTML5.

When using xhtml5, if you set html.ext to 'html', i.e. simply drop the '.', the output renders correctly, but the file names are invalid.

On RHEL6 using the 1.78.1 stylesheets this issue is not present, and the combination of xhtml5 and setting html.ext to '.html' works correctly.

This is likely caused somewhere in the libxslt stack on Fedora; it could take considerable time to debug.
HSS-QE has reviewed and declined this request. QE for this bug will be handled by IED.
It appears that setting html.ext to anything besides '.html' or '.htm' will avoid triggering this issue on Fedora 19. Tested successfully with all of these: '.xhtml' '.gtml' '.ht' '.htmll' '.1234' '.wtf'
Retested with publican-3.9.9-0.fc19.t11.noarch with perl-XML-TreeBuilder-5.0_1-0.fc19.noarch -- stray characters still appear :(
Setting Content-Type in a meta tag resolves this ...

To ssh://git.fedorahosted.org/git/publican.git
   181eacd..f8b7701  devel -> devel
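For retesting, a quick way to confirm that a built page actually carries the declaration is to parse it and report the charset from its meta tags. This is a minimal sketch using only the Python standard library. The helper name `declared_charset` is hypothetical, and the exact markup Publican emits after this commit is not shown in this bug, so both common forms are handled.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Record the first charset declared by a <meta> tag, if any."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if "charset" in attr:
            # Short form: <meta charset="UTF-8">
            self.charset = attr["charset"].strip().lower()
        elif attr.get("http-equiv", "").lower() == "content-type":
            # Long form: content="text/html; charset=UTF-8"
            for part in attr.get("content", "").split(";"):
                key, _, value = part.strip().partition("=")
                if key.strip().lower() == "charset":
                    self.charset = value.strip().lower()


def declared_charset(html_text):
    """Return the declared charset in lower case, or None if absent."""
    finder = CharsetFinder()
    finder.feed(html_text)
    return finder.charset
```

Feeding it the text of a generated index page should return "utf-8" on a fixed build and None on a build that has no declaration at all.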
Fixed in publican-3.9.9-0.fc19.t14.noarch