Bug 756756

Summary: Incorrect rendering of indexterms
Product: [Community] Publican
Component: publican
Version: 2.8
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Target Milestone: 3.0
Fixed In Version: 3.0.0
Doc Type: Bug Fix
Keywords: Reopened
Reporter: Jaromir Hradilek <jhradile>
Assignee: Jeff Fearn <jfearn>
QA Contact: Ruediger Landmann <rlandman>
CC: dayleparker, mhideo, rlandman
Last Closed: 2012-10-30 23:11:06 EDT
Attachments: An extra space before a comma (flags: none)

Description Jaromir Hradilek 2011-11-24 08:02:00 EST
Created attachment 535896
An extra space before a comma

Description of problem:
When an opening <primary> tag is followed by a new line, the generated index entry is rendered with an extra space between the term and comma. This is incorrect, because the official DocBook reference explicitly claims that “under no circumstances is the actual content of IndexTerm rendered in the primary flow.” [1]

Version-Release number of selected component (if applicable):
publican-2.8-1.fc16.noarch

How reproducible:
100%

Steps to Reproduce:
1. Add the following code snippet to one of the XML files:

   <indexterm>
     <primary>
       <filename>.fetchmailrc</filename>
     </primary>
   </indexterm>

2. Build a preview of the book with Publican:

   publican build --lang en-US --format html-desktop

3. Open the tmp/en-US/html-desktop/index.html file in a web browser:

   firefox tmp/en-US/html-desktop/index.html

4. Click “Index” in the table of contents and find the “.fetchmailrc” entry.
  
Actual results:
There is an extra space between the ".fetchmailrc" string and the comma as shown in the attached image.

Expected results:
The index entry is rendered without a space between the term and the comma.

Additional info:
Generated PDF and EPUB files are affected as well.
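
To illustrate where the stray whitespace originates, the following minimal Python sketch parses the snippet from step 1 and prints the raw text content of <primary>. It uses only the standard xml.etree.ElementTree module and is not Publican code; the names SPLIT_MARKUP, COMPACT_MARKUP, and primary_text are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# The markup from step 1: <primary> is followed by a new line.
SPLIT_MARKUP = (
    "<indexterm>\n"
    "  <primary>\n"
    "    <filename>.fetchmailrc</filename>\n"
    "  </primary>\n"
    "</indexterm>"
)

# The same term with no white space inside <primary>.
COMPACT_MARKUP = (
    "<indexterm>\n"
    "  <primary><filename>.fetchmailrc</filename></primary>\n"
    "</indexterm>"
)


def primary_text(xml_source: str) -> str:
    """Return the text content of <primary>, white space included."""
    root = ET.fromstring(xml_source)
    primary = root.find("primary")
    return "".join(primary.itertext())


print(repr(primary_text(SPLIT_MARKUP)))    # '\n    .fetchmailrc\n  '
print(repr(primary_text(COMPACT_MARKUP)))  # '.fetchmailrc'
```

The new line and indentation after the closing </filename> tag are part of the content of <primary>, so any processing that keeps at least one of those characters leaves white space in front of whatever the index appends after the term.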

References:
[1] http://docbook.org/tdg/en/html/indexterm.html
Comment 1 Jeff Fearn 2011-11-28 17:56:29 EST
(In reply to comment #0)
> Created attachment 535896
> An extra space before a comma
> 
> Description of problem:
> When an opening <primary> tag is followed by a new line, the generated index
> entry is rendered with an extra space between the term and comma. This is
> incorrect, because the official DocBook reference explicitly claims that “under
> no circumstances is the actual content of IndexTerm rendered in the primary
> flow.” [1]

The generated index is not the primary flow. The primary flow is where that section appears in the body text.

'primary' is not a block-level tag. XmlClean will minimise trailing white space, but it won't delete it. This is correct behaviour for a non-verbatim in-line tag.

If XmlClean is removed, as is being discussed on the list, then treating primary as a block this way will become a larger issue as all white space will be retained.
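
The difference between minimising and deleting white space can be sketched in a few lines of Python. This is only an illustration of the two policies described above, not XmlClean's actual implementation; the function names and the ", Section" suffix that stands in for the text the index appends are assumptions.

```python
import re


def minimise_whitespace(text: str) -> str:
    """In-line policy: collapse each white space run to one space, keep the ends."""
    return re.sub(r"\s+", " ", text)


def trim_whitespace(text: str) -> str:
    """Block policy: collapse runs and also drop leading and trailing white space."""
    return minimise_whitespace(text).strip()


# Content of <primary> when the opening tag is followed by a new line.
content = "\n    .fetchmailrc\n  "

inline_entry = minimise_whitespace(content) + ", Section"
block_entry = trim_whitespace(content) + ", Section"

print(repr(inline_entry))  # ' .fetchmailrc , Section'
print(repr(block_entry))   # '.fetchmailrc, Section'
```

Under the in-line policy one space survives at the end of the term, which is the space that shows up before the comma in the generated index.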
Comment 2 Jaromir Hradilek 2011-11-28 18:03:36 EST
Thanks for the explanation, Jeff. However, note that another manifestation of this error is as follows:

1. Add the following code snippet to one of the XML files:

   <filename>/etc/shadow</filename><indexterm>
     <primary><filename>/etc/shadow</filename></primary>
   </indexterm>, which is readable only by the root user.

2. Build a preview of the book with Publican.
3. Open the tmp/en-US/html-desktop/index.html file in a web browser.
4. Search for the "/etc/shadow" string.

Actual results:
There is an extra space between the "/etc/shadow" string and the comma.

Expected results:
There should not be a space, because <indexterm> is not supposed to be rendered in the primary text flow and should be treated just like a comment:

   <filename>/etc/shadow</filename><!--
     this is a comment
   -->, which is readable only by the root user.

Anyway, this should be documented somewhere.
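
The expected behaviour can be sketched as follows: when the body text is assembled, the content of <indexterm> is skipped entirely and only the text that follows its closing tag is kept. This is a minimal Python illustration using the standard xml.etree.ElementTree module, not Publican's or the DocBook stylesheets' code. The flow_text helper is hypothetical, and the surrounding <para> element is an assumption added so that the snippet is well-formed.

```python
import xml.etree.ElementTree as ET

# The snippet from step 1, wrapped in <para> so that it parses on its own.
PARA = (
    "<para><filename>/etc/shadow</filename><indexterm>\n"
    "  <primary><filename>/etc/shadow</filename></primary>\n"
    "</indexterm>, which is readable only by the root user.</para>"
)


def flow_text(element: ET.Element, skip=("indexterm",)) -> str:
    """Collect body text, ignoring the content of skipped elements."""
    parts = []
    if element.tag not in skip:
        parts.append(element.text or "")
        for child in element:
            parts.append(flow_text(child, skip))
    # Text after the closing tag belongs to the flow even for skipped elements.
    parts.append(element.tail or "")
    return "".join(parts)


print(flow_text(ET.fromstring(PARA)))
# /etc/shadow, which is readable only by the root user.
```

Because the new lines sit inside <indexterm> rather than around it, skipping the element leaves no white space between the file name and the comma.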
Comment 3 Jeff Fearn 2011-11-28 18:28:42 EST
This is now an excellent reason to get rid of the white space munging in XmlClean!
Comment 4 Jeff Fearn 2011-11-28 18:53:15 EST
Removed custom XML output, using XML::TreeBuilder default code.

Applied changes to branches/publican-2x and trunk.

Committed revision 1961.
Comment 5 Dayle Parker 2012-04-27 00:32:13 EDT
The fix does not seem to work for me in either case.

Tested both code snippets on Fedora 16 with Publican 3.0-0.fc16.t166, and the white space before the comma is still present.

   <indexterm>
     <primary>
       <filename>.fetchmailrc</filename>
     </primary>
   </indexterm>

The snippet above, from comment 0, renders the same as in attachment 535896, with a white space between the filename and the comma:

.fetchmailrc ,


----
And this snippet from Comment 2 (note, I added <para> tags around this snippet so it would build):

   <filename>/etc/shadow</filename><indexterm>
     <primary><filename>/etc/shadow</filename></primary>
   </indexterm>, which is readable only by the root user.

...appears with an extra space before the comma in the published document:

/etc/shadow , which is readable only by the root user. 


...but without an extra white space in the Index:

/etc/shadow, Section
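
A check like the one performed by hand above could be scripted. The following Python sketch flags any word that is followed by white space and then a comma; the helper name and the pattern are assumptions for illustration, and the sample strings are the three rendered lines quoted in this comment.

```python
import re
from typing import List

# A run of non-space characters, then white space, then a comma.
SPACE_BEFORE_COMMA = re.compile(r"\S+\s+,")


def find_space_before_comma(text: str) -> List[str]:
    """Return every occurrence of a word followed by white space and a comma."""
    return SPACE_BEFORE_COMMA.findall(text)


rendered_lines = [
    ".fetchmailrc ,",
    "/etc/shadow , which is readable only by the root user.",
    "/etc/shadow, Section",
]

for line in rendered_lines:
    print(repr(line), "->", find_space_before_comma(line))
```

The first two lines are reported and the third is not, which matches the observations above.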
Comment 6 Jeff Fearn 2012-05-17 00:00:42 EDT
Fixed mixed_mode being applied to nested block elements. Added mixed_mode to index elements.

To ssh://git.fedorahosted.org/git/publican.git
   2e04033..bd8335a  master -> master
Comment 7 Michael Hideo 2012-06-07 21:50:10 EDT
(In reply to comment #0)
Execute the steps from comment #0.
Comment 8 Jaromir Hradilek 2012-06-08 10:08:44 EDT
Verified in commit bd8335a8f4ba73df816274783c62ff74b9fb8353.