Bug 552356 - Publican repeats text in PDFs with certain text content in 'screen' and 'programlisting' tags
Status: CLOSED NOTABUG
Product: Publican
Classification: Community
Component: publican
Version: 1.6
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Jeff Fearn
QA Contact: Fedora Extras Quality Assurance
Reported: 2010-01-04 14:36 EST by Jared Smith
Modified: 2010-11-23 23:16 EST (History)
Last Closed: 2010-02-12 01:35:28 EST
Attachments
Sample Chapter.xml file (1.64 KB, application/xml)
2010-01-04 14:36 EST, Jared Smith
Output PDF file showing repeated text. (137.83 KB, application/pdf)
2010-01-04 14:37 EST, Jared Smith
Tarball containing test book (1.91 KB, application/x-tar)
2010-01-20 14:11 EST, Jared Smith
Output of publican build --lang=en-US --formats=pdf (136.77 KB, application/x-download)
2010-01-21 07:07 EST, Ruediger Landmann

Description Jared Smith 2010-01-04 14:36:06 EST
Created attachment 381621 [details]
Sample Chapter.xml file

Description of problem:

Publican seems to have a problem with certain content (specifically, large numbers of asterisks) in 'screen' and/or 'programlisting' tags.  It repeats the content several times if the output format is PDF.

Version-Release number of selected component (if applicable):

publican-1.3-0.fc12.noarch

How reproducible:

Always

Steps to Reproduce:
1. pushd /tmp
2. publican create --name=ScreenTest --brand=common --lang=en-US
3. pushd ScreenTest
4. Add the following to the end of en-US/Chapter.xml, right before the closing 'chapter' tag.

<section>
  <title>Test of <literal>screen</literal> and <literal>programlisting</literal> tags</title>
  <para>Over the next few paragraphs, we'll be testing the <literal>screen</literal> and <literal>programlisting</literal> tags, as
  rendered by Publican.

<screen>
*****************************************************
******* TEST PROGRAM
*****************************************************
Installing for RedHat Enterprise 5...
Would you like to install support for TEST PROGRAM? (Y/n) <userinput>y</userinput>
</screen>

<screen>
*****************************************************
******* TEST PROGRAM 2
*****************************************************
</screen>

<programlisting>
*****************************************************
******* TEST PROGRAM 3
*****************************************************
Installing for RedHat Enterprise 5...
Would you like to install support for TEST PROGRAM? (Y/n) <userinput>y</userinput>
</programlisting>

<programlisting>
*****************************************************
******* TEST PROGRAM 4
*****************************************************
</programlisting>
  </para>

</section>

5. publican build --formats=pdf,html --langs=en-US

Actual results:

The HTML output looks correct, but the PDF output does not.  The PDF output has duplicated the text several times in each 'screen' or 'programlisting' section.

Expected results:

The PDF output should match the HTML output.

Additional info:

I will attach the resulting PDF, as well as the entire Chapter.xml file.  This should make it very obvious how easy it is to trigger the bug, and what the erroneous PDF output looks like.
Comment 1 Jared Smith 2010-01-04 14:37:40 EST
Created attachment 381622 [details]
Output PDF file showing repeated text.
Comment 2 Jeff Fearn 2010-01-17 20:08:35 EST
This seems to be a bug in the way the PDF XSLT is handling mixed-mode content. Workaround is to avoid mixed-mode content where possible.

In the above example, if you move the closing para tag above the first screen it will work correctly, will not change the expected display, and will still be valid DocBook.
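To find other places in a book where 'screen' or 'programlisting' sits directly inside a 'para', something like the following could be used. It is only a minimal sketch: the function name `find_nested_blocks` and the two sample fragments are made up for illustration and are not part of Publican. It also assumes the XML has no undefined entity references, so a real Chapter.xml that uses entities may need those resolved before it will parse.

```python
import xml.etree.ElementTree as ET

# Elements that the workaround suggests keeping outside of <para>.
BLOCK_TAGS = {"screen", "programlisting"}

def find_nested_blocks(xml_text):
    """Return the tags of block elements that are direct children of a para."""
    root = ET.fromstring(xml_text)
    hits = []
    for para in root.iter("para"):
        for child in para:
            if child.tag in BLOCK_TAGS:
                hits.append(child.tag)
    return hits

# Mixed content, as in the report: <screen> is nested inside <para>.
MIXED = """<section>
  <title>Test</title>
  <para>Intro text.
<screen>
******* TEST PROGRAM
</screen>
  </para>
</section>"""

# Workaround: the closing para tag is moved above the first <screen>.
SEPARATED = """<section>
  <title>Test</title>
  <para>Intro text.</para>
<screen>
******* TEST PROGRAM
</screen>
</section>"""

print(find_nested_blocks(MIXED))
print(find_nested_blocks(SEPARATED))
```

The first call reports the nested 'screen'; the second reports nothing, because the 'para' is closed before the 'screen' starts.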

I am looking into what the problem with mixed-mode content handling is and how to fix it.

Cheers, Jeff.
Comment 3 Jeff Fearn 2010-01-19 23:28:33 EST
I can not duplicate this behavior ... I'm _sure_ I did the other day, but I can not duplicate it now on either my RHEL 5 or rawhide machines. I hate it when that happens :(

Jared, can you try this again and, if it still fails, can you please tar up the whole book and send it to me?

I talked to Rudi who tested this on some boxes as well and he can't duplicate it either ... very weird.

Cheers, Jeff.
Comment 4 Jared Smith 2010-01-20 14:11:09 EST
Created attachment 385755 [details]
Tarball containing test book

I'm attaching the tarball containing the sample book.
Comment 5 Jared Smith 2010-01-20 16:29:25 EST
For what it's worth, all my testing has been on Fedora 12 64-bit, with all updates applied.

If I had easy access to a RHEL box, I'd try it there.  I did attempt to try on a 32-bit CentOS 5.4 box, but the version of Publican in the EPEL repo is only 0.33, so that won't help us here.  (As a side note, who do I have to prod/bribe to get the version in EPEL bumped to something modern?)
Comment 6 Jeff Fearn 2010-01-20 22:58:08 EST
(In reply to comment #4)
> Created an attachment (id=385755) [details]
> Tarball containing test book
> 
> I'm attaching the tarball containing the sample book.    

I tested this book on my rawhide machine and it built fine o_O

Rudi and Murray, could you guys try this and report back your results?

Cheers, Jeff.
Comment 7 Jeff Fearn 2010-01-20 23:03:21 EST
(In reply to comment #5)
> For what it's worth, all my testing has been on Fedora 12 64-bit, with all
> updates applied.
> 
> If I had easy access to a RHEL box, I'd try it there.  I did attempt to try on
> a 32-bit CentOS 5.4 box, but the version of Publican in the EPEL repo is only
> 0.33, so that won't help us here.  (As a side note, who do I have to prod/bribe
> to get the version in EPEL bumped to something modern?)    

Hey Rudi, we need to kill the EPEL packages. Publican 1.x requires a newer version of libxml2 than RHEL ships and EPEL can not carry that package, so Publican 1.x can not work for EPEL.
Comment 8 Ruediger Landmann 2010-01-21 07:07:01 EST
Created attachment 385909 [details]
Output of publican build --lang=en-US --formats=pdf

Sorry Jared, the book built as expected on my machine (64-bit F12)

How annoying! :(
Comment 9 Ruediger Landmann 2010-01-21 07:28:28 EST
>  (As a side note, who do I have to prod/bribe
> to get the version in EPEL bumped to something modern?)    

Just to expand very slightly on Jeff's reply: EPEL only ships packages that are not available for Red Hat Enterprise Linux at all, not newer versions of packages.

As for killing the EPEL packages, apparently they don't like to remove packages either:

http://fedoraproject.org/wiki/EPEL/FAQ#What_do_I_have_to_do_to_get_a_package_removed_from_EPEL.3F
Comment 10 Jared Smith 2010-01-25 13:14:18 EST
Sorry for the false alarm here... I dug a bit deeper, and found that the repetition wasn't present in the FO file, so I figured it couldn't be Publican's fault.
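One way to make that comparison repeatable is to count how often each distinctive line shows up in the FO file and in text extracted from the PDF. The sketch below is only an illustration: `repeated_markers` is a made-up helper, the sample strings stand in for real file contents, and how the PDF text is extracted (for instance with a tool such as pdftotext) is left as an assumption.

```python
def repeated_markers(text, markers):
    """Return {marker: count} for every marker that occurs more than once."""
    counts = {m: text.count(m) for m in markers}
    return {m: n for m, n in counts.items() if n > 1}

# Markers that appear exactly once in the sample Chapter.xml.
MARKERS = ["TEST PROGRAM 2", "TEST PROGRAM 4"]

# Stand-ins for the FO file contents and for text extracted from the PDF.
fo_text = "******* TEST PROGRAM 2\n******* TEST PROGRAM 4\n"
pdf_text = "******* TEST PROGRAM 2\n******* TEST PROGRAM 2\n******* TEST PROGRAM 4\n"

print(repeated_markers(fo_text, MARKERS))
print(repeated_markers(pdf_text, MARKERS))
```

If the FO text shows no repeats but the PDF text does, the duplication is being introduced after Publican has written the FO file, which points at the FO-to-PDF step rather than at Publican's XSLT.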

It turns out it was the hyphenation that was causing the problem.  I had added fop-hyph.jar to my classpath a few weeks ago when playing with hyphenation, and that seems to be what's causing the problem.  When I remove that, the output looks fine but I get no hyphenation support.
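For anyone else who hits this, a quick check for a stray fop-hyph.jar might look like the sketch below. It assumes the jar was added through the CLASSPATH environment variable, which is a guess; a jar dropped into FOP's own library directory would not be found this way. The helper name `hyphenation_jars` is made up for illustration.

```python
import os

def hyphenation_jars(classpath, sep=os.pathsep):
    """Return the classpath entries whose file name is fop-hyph.jar."""
    entries = [e for e in classpath.split(sep) if e]
    return [e for e in entries if os.path.basename(e) == "fop-hyph.jar"]

# Inspect the current environment; an empty list means no match was found.
print(hyphenation_jars(os.environ.get("CLASSPATH", "")))
```

An empty list only rules out the environment variable, not other ways the jar could be picked up.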

May I ask what you folks do for hyphenation?
Comment 11 Jeff Fearn 2010-01-27 00:17:10 EST
(In reply to comment #10)
> May I ask what you folks do for hyphenation?    

AIUI the default "whole word wrap" is being used.

Cheers, Jeff.
