Bug 756756

Summary: Incorrect rendering of indexterms
Product: [Community] Publican
Reporter: Jaromir Hradilek <jhradile>
Component: publican
Assignee: Jeff Fearn 🐞 <jfearn>
Status: CLOSED CURRENTRELEASE
QA Contact: Ruediger Landmann <rlandman+disabled>
Severity: unspecified
Priority: unspecified
Version: 2.8
CC: dayleparker, mhideo, rlandman+disabled
Target Milestone: 3.0
Keywords: Reopened
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Fixed In Version: 3.0.0
Doc Type: Bug Fix
Last Closed: 2012-10-31 03:11:06 UTC
Attachments:
  Description: An extra space before a comma
  Flags: none

Description Jaromir Hradilek 2011-11-24 13:02:00 UTC
Created attachment 535896 [details]
An extra space before a comma

Description of problem:
When an opening <primary> tag is followed by a new line, the generated index entry is rendered with an extra space between the term and comma. This is incorrect, because the official DocBook reference explicitly claims that “under no circumstances is the actual content of IndexTerm rendered in the primary flow.” [1]

Version-Release number of selected component (if applicable):
publican-2.8-1.fc16.noarch

How reproducible:
100%

Steps to Reproduce:
1. Add the following code snippet to one of the XML files:

   <indexterm>
     <primary>
       <filename>.fetchmailrc</filename>
     </primary>
   </indexterm>

2. Build a preview of the book with Publican:

   publican build --lang en-US --format html-desktop

3. Open the tmp/en-US/html-desktop/index.html file in a web browser:

   firefox tmp/en-US/html-desktop/index.html

4. Click “Index” in the table of contents and find the “.fetchmailrc” entry.
  
Actual results:
There is an extra space between the ".fetchmailrc" string and the comma as shown in the attached image.

Expected results:
The index entry is rendered with no space between the term and the comma.

Additional info:
Generated PDF and EPUB files are affected as well.

References:
[1] http://docbook.org/tdg/en/html/indexterm.html

Comment 1 Jeff Fearn 🐞 2011-11-28 22:56:29 UTC
(In reply to comment #0)
> Created attachment 535896 [details]
> An extra space before a comma
> 
> Description of problem:
> When an opening <primary> tag is followed by a new line, the generated index
> entry is rendered with an extra space between the term and comma. This is
> incorrect, because the official DocBook reference explicitly claims that “under
> no circumstances is the actual content of IndexTerm rendered in the primary
> flow.” [1]

The generated index is not the primary flow. The primary flow is where that section appears in the body text.

'primary' is not a block-level tag, so XmlClean will minimise trailing white space but won't delete it. This is correct behaviour for a non-verbatim inline tag.

If XmlClean is removed, as is being discussed on the list, then treating primary as a block this way will become a larger issue as all white space will be retained.
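
The difference between minimising and deleting white space can be illustrated with a short Python sketch. This is not Publican's XmlClean code (Publican is written in Perl); the helper names `primary_text` and `minimise` are hypothetical and only model the behaviour described above, assuming that "minimise" means collapsing each run of white space to a single space.

```python
import re
import xml.etree.ElementTree as ET

# The snippet from comment 0: <primary> is followed by a new line.
SNIPPET = """<indexterm>
  <primary>
    <filename>.fetchmailrc</filename>
  </primary>
</indexterm>"""


def primary_text(xml_source: str) -> str:
    """Return the text content of <primary>, white space included."""
    primary = ET.fromstring(xml_source).find("primary")
    return "".join(primary.itertext())


def minimise(text: str) -> str:
    """Collapse each run of white space to one space (assumed behaviour)."""
    return re.sub(r"\s+", " ", text)


minimised = minimise(primary_text(SNIPPET))

# Minimised white space survives, so the index entry gets a space before the comma.
print(repr(minimised + ","))          # ' .fetchmailrc ,'

# Stripping the term entirely is what the reporter expects.
print(repr(minimised.strip() + ","))  # '.fetchmailrc,'
```

The leading space is usually invisible in HTML output, while the trailing one shows up directly in front of the comma.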

Comment 2 Jaromir Hradilek 2011-11-28 23:03:36 UTC
Thanks for the explanation, Jeff. However, note that another manifestation of this error is as follows:

1. Add the following code snippet to one of the XML files:

   <filename>/etc/shadow</filename><indexterm>
     <primary><filename>/etc/shadow</filename></primary>
   </indexterm>, which is readable only by the root user.

2. Build a preview of the book with Publican.
3. Open the tmp/en-US/html-desktop/index.html file in a web browser.
4. Search for the "/etc/shadow" string.

Actual results:
There is an extra space between the "/etc/shadow" string and the comma.

Expected results:
There should not be a space, because <indexterm> is not supposed to be rendered in the primary text flow and should be treated just like a comment:

   <filename>/etc/shadow</filename><!--
     this is a comment
   -->, which is readable only by the root user.

Anyway, this should be documented somewhere.
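
The expected behaviour described above, where <indexterm> contributes nothing to the text flow, can be sketched in Python. This is only an illustration under stated assumptions, not Publican's implementation: `flow_text` is a hypothetical helper, and the snippet is wrapped in <para> so that it parses as a single element.

```python
import xml.etree.ElementTree as ET

# The snippet from this comment, wrapped in <para> so it is well-formed.
PARA = """<para><filename>/etc/shadow</filename><indexterm>
  <primary><filename>/etc/shadow</filename></primary>
</indexterm>, which is readable only by the root user.</para>"""


def flow_text(xml_source: str) -> str:
    """Concatenate the text of the paragraph, skipping <indexterm> content."""

    def walk(element):
        parts = []
        if element.tag != "indexterm":
            # Ordinary elements contribute their own text and their children.
            parts.append(element.text or "")
            for child in element:
                parts.extend(walk(child))
        # Text following the closing tag belongs to the surrounding flow,
        # so it is kept even when the element itself is skipped.
        parts.append(element.tail or "")
        return parts

    return "".join(walk(ET.fromstring(xml_source)))


print(flow_text(PARA))
```

Because the new lines and indentation sit inside <indexterm>, skipping the element also discards them, and the comma follows the file name directly.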

Comment 3 Jeff Fearn 🐞 2011-11-28 23:28:42 UTC
This is now an excellent reason to get rid of the white space munging in XmlClean!

Comment 4 Jeff Fearn 🐞 2011-11-28 23:53:15 UTC
Removed custom XML output, using XML::TreeBuilder default code.

Applied changes to branches/publican-2x and trunk.

Committed revision 1961.

Comment 5 Dayle Parker 2012-04-27 04:32:13 UTC
The fix does not seem to work for me in either case.

Tested both code snippets on Fedora 16 with Publican 3.0-0.fc16.t166 and the white space before the comma is still present.

   <indexterm>
     <primary>
       <filename>.fetchmailrc</filename>
     </primary>
   </indexterm>

This snippet (above) from comment 0 appears the same as in attachment 535896 [details] with a white space between the filename and comma:

.fetchmailrc ,


----
And this snippet from Comment 2 (note, I added <para> tags around this snippet so it would build):

   <filename>/etc/shadow</filename><indexterm>
     <primary><filename>/etc/shadow</filename></primary>
   </indexterm>, which is readable only by the root user.

...appears with an extra white space before the comma in the published document:

/etc/shadow , which is readable only by the root user. 


...but without an extra white space in the Index:

/etc/shadow, Section
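
A check like the one performed by hand above could be scripted. The following Python sketch extracts the visible text from an HTML fragment and reports every word that is followed by a space and a comma. The helper `spaces_before_commas` and the sample markup are hypothetical; the markup is not copied from Publican's actual output.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the character data of an HTML document, ignoring tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def spaces_before_commas(html: str) -> list:
    """Return each 'word ,' occurrence found in the visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse white space the way a browser would before matching.
    text = re.sub(r"\s+", " ", "".join(parser.chunks))
    return re.findall(r"\S+ ,", text)


# Hypothetical index entries resembling the two results reported above.
broken = '<dt><code class="filename">.fetchmailrc</code> , <a href="#">Section</a></dt>'
fixed = '<dt><code class="filename">/etc/shadow</code>, <a href="#">Section</a></dt>'

print(spaces_before_commas(broken))
print(spaces_before_commas(fixed))
```

An empty list for a given page means no term on it is separated from its comma by white space.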

Comment 6 Jeff Fearn 🐞 2012-05-17 04:00:42 UTC
Fixed mixed_mode being applied to nested block elements. Added mixed_mode to index elements.

To ssh://git.fedorahosted.org/git/publican.git
   2e04033..bd8335a  master -> master

Comment 7 Michael Hideo 2012-06-08 01:50:10 UTC
(In reply to comment #0)

execute the above.

Comment 8 Jaromir Hradilek 2012-06-08 14:08:44 UTC
Verified in commit bd8335a8f4ba73df816274783c62ff74b9fb8353.