Red Hat Bugzilla – Bug 461375
Publican's xsl stylesheet adds a weird ASCII character to section headings
Last modified: 2010-11-23 23:18:03 EST
Description of problem:
Publican's XSL stylesheet adds a stray character to section headings in HTML documents, and the browser renders it visibly.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Render an HTML page from DocBook using Publican (in my case, I made an article)
2. Go to a section heading in the document
3. Notice that the character between the section number and the section title is not a space but some other character. In vim, I type "ga" on the character and find that it is character code decimal 160, hex 00a0, octal 240 (so not actually in the ASCII range). According to Google, that should be a non-breaking space. (I'm still not sure why the browser is rendering it as a capital A with a hat over it.) Shouldn't that be converted to " " anyway?
Actual results:
The stray character shows up in the HTML document, making it look bad.
Expected results:
No stray character in the section headings.
On further investigation, I see that line 1011 of pdf.xsl says it's using a "dirty, dirty hack" by setting a character, but I still haven't figured out how this comes into play with the HTML version.
Looks like the 0xC2 before the 0xA0 is not getting clipped out by the self-admittedly ugly hack being used in pdf.xsl.
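For anyone following along, here is a minimal Python sketch of why a 0xC2 0xA0 pair could show up as a capital A with a hat. It assumes the browser falls back to ISO-8859-1 when no encoding is declared, which this report has not confirmed; the function name misdecode is made up for illustration.

```python
def misdecode(text: str, wrong_encoding: str = "iso-8859-1") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong charset.

    Assumption: this mimics a browser that guesses ISO-8859-1 because the
    page does not declare its encoding.
    """
    return text.encode("utf-8").decode(wrong_encoding)


# U+00A0 (NO-BREAK SPACE) is decimal 160, hex a0, octal 240.
nbsp = "\u00a0"

# UTF-8 stores U+00A0 as the two bytes 0xC2 0xA0.
utf8_bytes = nbsp.encode("utf-8")

# Read byte by byte as ISO-8859-1, 0xC2 becomes U+00C2 (capital A with
# circumflex) and 0xA0 stays a non-breaking space.
misread = misdecode(nbsp)

print(utf8_bytes.hex())                 # c2a0
print([hex(ord(c)) for c in misread])   # ['0xc2', '0xa0']
```

If that assumption holds, the extra visible character is the 0xC2 lead byte, which would fit the observation above that it is not being clipped.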
I don't know if this makes a difference, but this is the first line of the temporary output in the document's tmp/$LANG/xml/DocumentName.xml:
Should there be an encoding there? Does this have anything to do with /usr/bin/xmlClean?
Looks like Publican doesn't add an encoding by default. I'll file that bug separately.
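A rough way to check whether a generated file declares an encoding is to inspect its first line. The Python sketch below is only an illustration: the helper declared_encoding and the sample declaration strings are hypothetical and are not taken from Publican or from the actual temporary file.

```python
import re

# Matches an XML declaration at the start of a line and captures the
# optional encoding pseudo-attribute, if one is present.
_XML_DECL = re.compile(
    r"""^<\?xml\s+version\s*=\s*["'][^"']*["']"""
    r"""(?:\s+encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["'])?"""
)


def declared_encoding(first_line: str):
    """Return the declared encoding, or None if none is declared.

    None is also returned when the line is not an XML declaration at all.
    """
    match = _XML_DECL.match(first_line)
    return match.group(1) if match else None


# Hypothetical sample lines, for illustration only.
print(declared_encoding('<?xml version="1.0"?>'))                   # None
print(declared_encoding('<?xml version="1.0" encoding="UTF-8"?>'))  # UTF-8
```

To try it on a real file, read the first line of the temporary XML output and pass it to declared_encoding.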
This was very odd: chunked HTML worked fine for me, but html-single had this error.
Saxon with the 1.72.0 style sheets did not produce this error in any format.
xsltproc did not produce this error with any style sheets, in any format.
Adding omit-xml-declaration="no" to xsl:output in html-single.xsl "fixed" this problem for Saxon with html-single on the current style sheets.