Bug 552356 - Publican repeats text in PDFS with certain text content in 'screen' and 'programlisting' tags
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Publican
Classification: Community
Component: publican
Version: 1.6
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Jeff Fearn 🐞
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-01-04 19:36 UTC by Jared Smith
Modified: 2010-11-24 04:16 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-02-12 06:35:28 UTC
Embargoed:


Attachments
Sample Chapter.xml file (1.64 KB, application/xml)
2010-01-04 19:36 UTC, Jared Smith
Output PDF file showing repeated text. (137.83 KB, application/pdf)
2010-01-04 19:37 UTC, Jared Smith
Tarball containing test book (1.91 KB, application/x-tar)
2010-01-20 19:11 UTC, Jared Smith
Output of publican build --lang=en-US --formats=pdf (136.77 KB, application/x-download)
2010-01-21 12:07 UTC, Ruediger Landmann

Description Jared Smith 2010-01-04 19:36:06 UTC
Created attachment 381621 [details]
Sample Chapter.xml file

Description of problem:

Publican seems to have a problem with certain content (specifically, large numbers of asterisks) in 'screen' and/or 'programlisting' tags.  It repeats the content several times when the output format is PDF.

Version-Release number of selected component (if applicable):

publican-1.3-0.fc12.noarch

How reproducible:

Always

Steps to Reproduce:
1. pushd /tmp
2. publican create --name=ScreenTest --brand=common --lang=en-US
3. pushd ScreenTest
4. Add the following to the end of en-US/Chapter.xml, right before the closing 'chapter' tag.

<section>
  <title>Test of <literal>screen</literal> and <literal>programlisting</literal> tags</title>
  <para>Over the next few paragraphs, we'll be testing the <literal>screen</literal> and <literal>programlisting</literal> tags, as
  rendered by Publican.

<screen>
*****************************************************
******* TEST PROGRAM
*****************************************************
Installing for RedHat Enterprise 5...
Would you like to install support for TEST PROGRAM? (Y/n) <userinput>y</userinput>
</screen>

<screen>
*****************************************************
******* TEST PROGRAM 2
*****************************************************
</screen>

<programlisting>
*****************************************************
******* TEST PROGRAM 3
*****************************************************
Installing for RedHat Enterprise 5...
Would you like to install support for TEST PROGRAM? (Y/n) <userinput>y</userinput>
</programlisting>

<programlisting>
*****************************************************
******* TEST PROGRAM 4
*****************************************************
</programlisting>
  </para>

</section>

5. publican build --formats=pdf,html --langs=en-US

Actual results:

The HTML output looks correct, but the PDF output does not.  The PDF output has duplicated the text several times in each 'screen' or 'programlisting' section.

Expected results:

The PDF output should match the HTML output.

Additional info:

I will attach the resulting PDF, as well as the entire Chapter.xml file.  This should make it very obvious how easy it is to trigger the bug, and what the erroneous PDF output looks like.

Comment 1 Jared Smith 2010-01-04 19:37:40 UTC
Created attachment 381622 [details]
Output PDF file showing repeated text.

Comment 2 Jeff Fearn 🐞 2010-01-18 01:08:35 UTC
This seems to be a bug in the way the PDF XSLT handles mixed-mode content. The workaround is to avoid mixed-mode content where possible.

In the above example, if you move the closing para tag above the first screen it will work correctly, will not change the expected display, and will still be valid DocBook.
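
As a rough sketch of that workaround (not a tested fix), the start of the section from the original report would look something like the following, with the closing para tag moved up so that the first screen becomes a sibling of the paragraph rather than a child. The XML comment is a placeholder added for illustration; it stands in for the remaining blocks from the report, which would stay unchanged apart from dropping the old closing para tag after them.

```xml
<section>
  <title>Test of <literal>screen</literal> and <literal>programlisting</literal> tags</title>
  <para>Over the next few paragraphs, we'll be testing the <literal>screen</literal> and <literal>programlisting</literal> tags, as
  rendered by Publican.</para>

<screen>
*****************************************************
******* TEST PROGRAM
*****************************************************
Installing for RedHat Enterprise 5...
Would you like to install support for TEST PROGRAM? (Y/n) <userinput>y</userinput>
</screen>

<!-- The remaining screen and programlisting blocks follow here, unchanged -->

</section>
```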

I am looking into what the problem with mixed-mode content handling is and how to fix it.

Cheers, Jeff.

Comment 3 Jeff Fearn 🐞 2010-01-20 04:28:33 UTC
I can not duplicate this behavior ... I'm _sure_ I did the other day, but I can not duplicate it now on either my RHEL 5 or rawhide machines. I hate it when that happens :(

Jared, can you try this again and, if it still fails, can you please tar up the whole book and send it to me?

I talked to Rudi who tested this on some boxes as well and he can't duplicate it either ... very weird.

Cheers, Jeff.

Comment 4 Jared Smith 2010-01-20 19:11:09 UTC
Created attachment 385755 [details]
Tarball containing test book

I'm attaching the tarball containing the sample book.

Comment 5 Jared Smith 2010-01-20 21:29:25 UTC
For what it's worth, all my testing has been on Fedora 12 64-bit, with all updates applied.

If I had easy access to a RHEL box, I'd try it there.  I did attempt to try on a 32-bit CentOS 5.4 box, but the version of Publican in the EPEL repo is only 0.33, so that won't help us here.  (As a side note, who do I have to prod/bribe to get the version in EPEL bumped to something modern?)

Comment 6 Jeff Fearn 🐞 2010-01-21 03:58:08 UTC
(In reply to comment #4)
> Created an attachment (id=385755) [details]
> Tarball containing test book
> 
> I'm attaching the tarball containing the sample book.    

I tested this book on my rawhide machine and it built fine o_O

Rudi and Murray, could you guys try this and report back your results?

Cheers, Jeff.

Comment 7 Jeff Fearn 🐞 2010-01-21 04:03:21 UTC
(In reply to comment #5)
> For what it's worth, all my testing has been on Fedora 12 64-bit, with all
> updates applied.
> 
> If I had easy access to a RHEL box, I'd try it there.  I did attempt to try on
> a 32-bit CentOS 5.4 box, but the version of Publican in the EPEL repo is only
> 0.33, so that won't help us here.  (As a side note, who do I have to prod/bribe
> to get the version in EPEL bumped to something modern?)    

Hey Rudi, we need to kill the EPEL packages. Publican 1.x requires a newer version of libxml2 than RHEL ships and EPEL can not carry that package, so Publican 1.x can not work for EPEL.

Comment 8 Ruediger Landmann 2010-01-21 12:07:01 UTC
Created attachment 385909 [details]
Output of publican build --lang=en-US --formats=pdf

Sorry Jared, the book built as expected on my machine (64-bit F12)

How annoying! :(

Comment 9 Ruediger Landmann 2010-01-21 12:28:28 UTC
>  (As a side note, who do I have to prod/bribe
> to get the version in EPEL bumped to something modern?)    

Just to expand very slightly on Jeff's reply: EPEL only ships packages that are not available for Red Hat Enterprise Linux at all, not newer versions of packages.

As for killing the EPEL packages, apparently they don't like to remove packages either:

http://fedoraproject.org/wiki/EPEL/FAQ#What_do_I_have_to_do_to_get_a_package_removed_from_EPEL.3F

Comment 10 Jared Smith 2010-01-25 18:14:18 UTC
Sorry for the false alarm here... I dug a bit deeper, and found that the repetition wasn't present in the FO file, so I figured it couldn't be Publican's fault.

It turns out it was the hyphenation that was causing the problem.  I had added fop-hyph.jar to my classpath a few weeks ago when playing with hyphenation, and that seems to be what's causing the problem.  When I remove that, the output looks fine, but I get no hyphenation support.

May I ask what you folks do for hyphenation?

Comment 11 Jeff Fearn 🐞 2010-01-27 05:17:10 UTC
(In reply to comment #10)
> May I ask what you folks do for hyphenation?    

AIUI the default "whole word wrap" is being used.

Cheers, Jeff.

