Bug 756756 - Incorrect rendering of indexterms
Summary: Incorrect rendering of indexterms
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Publican
Classification: Community
Component: publican
Version: 2.8
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: 3.0
Assignee: Jeff Fearn 🐞
QA Contact: Ruediger Landmann
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-11-24 13:02 UTC by Jaromir Hradilek
Modified: 2012-10-31 03:11 UTC (History)

Fixed In Version: 3.0.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-10-31 03:11:06 UTC
Embargoed:


Attachments
An extra space before a comma (40.12 KB, image/png)
2011-11-24 13:02 UTC, Jaromir Hradilek

Description Jaromir Hradilek 2011-11-24 13:02:00 UTC
Created attachment 535896
An extra space before a comma

Description of problem:
When an opening <primary> tag is followed by a new line, the generated index entry is rendered with an extra space between the term and comma. This is incorrect, because the official DocBook reference explicitly claims that “under no circumstances is the actual content of IndexTerm rendered in the primary flow.” [1]

Version-Release number of selected component (if applicable):
publican-2.8-1.fc16.noarch

How reproducible:
100%

Steps to Reproduce:
1. Add the following code snippet to one of the XML files:

   <indexterm>
     <primary>
       <filename>.fetchmailrc</filename>
     </primary>
   </indexterm>

2. Build a preview of the book with Publican:

   publican build --lang en-US --format html-desktop

3. Open the tmp/en-US/html-desktop/index.html file in a web browser:

   firefox tmp/en-US/html-desktop/index.html

4. Click “Index” in the table of contents and find the “.fetchmailrc” entry.
  
Actual results:
There is an extra space between the ".fetchmailrc" string and the comma as shown in the attached image.

Expected results:
The index entry is rendered without a space between the term and the comma.
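
To illustrate where the extra space originates, here is a minimal sketch in Python. It uses the standard xml.etree.ElementTree module rather than Publican's own Perl code, so it only shows what the source snippet contains, not how Publican processes it. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The snippet from the steps to reproduce, indentation included.
SNIPPET = """\
<indexterm>
  <primary>
    <filename>.fetchmailrc</filename>
  </primary>
</indexterm>"""

primary = ET.fromstring(SNIPPET).find("primary")
filename = primary.find("filename")

leading = primary.text    # whitespace between <primary> and <filename>
trailing = filename.tail  # whitespace between </filename> and </primary>

# Text content of <primary> exactly as a processor receives it.
raw_term = "".join(primary.itertext())

print(repr(leading), repr(trailing), repr(raw_term))
```

The term is surrounded by whitespace-only text nodes, so any tool that keeps or merely shortens that whitespace will emit a space before the comma in the index.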

Additional info:
Generated PDF and EPUB files are affected as well.

References:
[1] http://docbook.org/tdg/en/html/indexterm.html

Comment 1 Jeff Fearn 🐞 2011-11-28 22:56:29 UTC
(In reply to comment #0)
> Created attachment 535896
> An extra space before a comma
> 
> Description of problem:
> When an opening <primary> tag is followed by a new line, the generated index
> entry is rendered with an extra space between the term and comma. This is
> incorrect, because the official DocBook reference explicitly claims that “under
> no circumstances is the actual content of IndexTerm rendered in the primary
> flow.” [1]

The generated index is not the primary flow. The primary flow is where that section appears in the body text.

'primary' is not a block-level tag, so XmlClean minimises trailing white space but does not delete it. This is correct behaviour for a non-verbatim inline tag.
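
The difference between minimising and deleting white space can be sketched as follows. This is a Python illustration only, not XmlClean's actual code; both helper functions are hypothetical.

```python
import re


def minimise_whitespace(text: str) -> str:
    """Collapse every run of whitespace to a single space (inline handling)."""
    return re.sub(r"\s+", " ", text)


def delete_edge_whitespace(text: str) -> str:
    """Collapse runs and also drop leading/trailing whitespace (block handling)."""
    return minimise_whitespace(text).strip()


# Term followed by the newline and indentation that precede </primary>.
raw = ".fetchmailrc\n     "

inline_result = minimise_whitespace(raw)    # trailing run survives as one space
block_result = delete_edge_whitespace(raw)  # trailing run is removed

print(repr(inline_result + ","))
print(repr(block_result + ","))
```

With inline handling the single remaining space ends up between the term and the comma; with block handling it does not.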

If XmlClean is removed, as is being discussed on the list, then treating primary as a block this way will become a larger issue as all white space will be retained.

Comment 2 Jaromir Hradilek 2011-11-28 23:03:36 UTC
Thanks for the explanation, Jeff. However, note that another manifestation of this error is as follows:

1. Add the following code snippet to one of the XML files:

   <filename>/etc/shadow</filename><indexterm>
     <primary><filename>/etc/shadow</filename></primary>
   </indexterm>, which is readable only by the root user.

2. Build a preview of the book with Publican.
3. Open the tmp/en-US/html-desktop/index.html file in a web browser.
4. Search for the "/etc/shadow" string.

Actual results:
There is an extra space between the "/etc/shadow" string and the comma.

Expected results:
There should not be a space, because <indexterm> is not supposed to be rendered in the primary text flow and should be treated just like a comment:

   <filename>/etc/shadow</filename><!--
     this is a comment
   -->, which is readable only by the root user.
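
The expectation above can be sketched in Python with the standard xml.etree.ElementTree module. This is an illustration of the desired result, not Publican's implementation; the body_text helper is hypothetical, and the <para> wrapper is added only so that the fragment parses.

```python
import xml.etree.ElementTree as ET

PARA = """\
<para><filename>/etc/shadow</filename><indexterm>
  <primary><filename>/etc/shadow</filename></primary>
</indexterm>, which is readable only by the root user.</para>"""


def body_text(elem):
    """Concatenate text in document order, skipping <indexterm> subtrees."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != "indexterm":
            parts.append(body_text(child))
        # Text that follows the child still belongs to the body flow.
        parts.append(child.tail or "")
    return "".join(parts)


rendered = body_text(ET.fromstring(PARA))
print(rendered)
```

Because the white space sits inside <indexterm>, dropping the element from the body flow leaves the comma directly after the file name.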

Anyway, this should be documented somewhere.

Comment 3 Jeff Fearn 🐞 2011-11-28 23:28:42 UTC
This is now an excellent reason to get rid of the white space munging in XmlClean!

Comment 4 Jeff Fearn 🐞 2011-11-28 23:53:15 UTC
Removed custom XML output, using XML::TreeBuilder default code.

Applied changes to branches/publican-2x and trunk.

Committed revision 1961.

Comment 5 Dayle Parker 2012-04-27 04:32:13 UTC
The fix does not seem to work for me in either case.

Tested both code snippets on Fedora 16 with Publican 3.0-0.fc16.t166 and the white space before the comma is still present.

   <indexterm>
     <primary>
       <filename>.fetchmailrc</filename>
     </primary>
   </indexterm>

This snippet (above) from comment 0 renders the same as in attachment 535896, with white space between the filename and the comma:

.fetchmailrc ,


----
And this snippet from Comment 2 (note, I added <para> tags around this snippet so it would build):

   <filename>/etc/shadow</filename><indexterm>
     <primary><filename>/etc/shadow</filename></primary>
   </indexterm>, which is readable only by the root user.

...appears with an extra white space before the comma in the published document:

/etc/shadow , which is readable only by the root user. 


...but without an extra white space in the Index:

/etc/shadow, Section

Comment 6 Jeff Fearn 🐞 2012-05-17 04:00:42 UTC
Fixed mixed_mode being applied to nested block elements. Added mixed_mode to index elements.

To ssh://git.fedorahosted.org/git/publican.git
   2e04033..bd8335a  master -> master

Comment 7 Michael Hideo 2012-06-08 01:50:10 UTC
(In reply to comment #0)
Execute the steps to reproduce given in comment #0.

Comment 8 Jaromir Hradilek 2012-06-08 14:08:44 UTC
Verified in commit bd8335a8f4ba73df816274783c62ff74b9fb8353.

