Bug 524832

Summary: Publican erroneously changes xml in pot file
Product: [Community] Publican
Reporter: John J. McDonough <wb8rcr>
Component: publican
Assignee: Ruediger Landmann <rlandman+disabled>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: medium
Version: 1.6
CC: jfearn, mmcallis, publican-list
Hardware: All
OS: Linux
Fixed In Version: 1.2-0.fc12
Doc Type: Bug Fix
Last Closed: 2009-11-20 05:19:19 UTC
Bug Depends On: 525587

Description John J. McDonough 2009-09-22 12:42:51 UTC
Description of problem:

When generating a .pot file, Publican changes ulinks of the form:

   <ulink yadayadayada />

to

   <ulink yadayadayada></ulink>

When the resulting .po is used to build a document, the original string can no longer be found, causing the build to fail. Publican also removes any whitespace within the ulink, which likewise causes the build to fail.
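
To illustrate why this matters for a string-based lookup, here is a minimal sketch using Python's standard xml.etree.ElementTree (not Publican's own code). The URL is a placeholder, and same_element is a hypothetical helper that only compares a single element's tag, attributes, and text, not child elements. The two forms are the same element to an XML parser, yet they differ as plain strings, which is all an exact msgid match sees.

```python
import xml.etree.ElementTree as ET

# Placeholder fragments for illustration; the URL is not from the report.
self_closing = '<ulink url="http://example.com/" />'
expanded = '<ulink url="http://example.com/"></ulink>'


def same_element(a: str, b: str) -> bool:
    """Compare two single-element XML fragments by tag, attributes, and text."""
    ea, eb = ET.fromstring(a), ET.fromstring(b)
    return (ea.tag, ea.attrib, ea.text or "") == (eb.tag, eb.attrib, eb.text or "")


# An exact string comparison treats the two forms as different...
strings_match = self_closing == expanded
# ...while a parser sees the same empty element with the same attribute.
elements_match = same_element(self_closing, expanded)

print("strings match: ", strings_match)
print("elements match:", elements_match)
```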


Version-Release number of selected component (if applicable):
0.44

How reproducible:
Not clear this happens 100% of the time.

Steps to Reproduce:
1. Build a Publican document containing lots of <ulink yadayadayada />
2. Translate the document
3. Attempt to build the non-English document
  
Actual results:
merge failed

Expected results:
translated document

Additional info:
Encountered in fedora-release-notes for F12, specifically commit d504e0baef24fe1a513b787196847966d3a27c91 demonstrates the problem.

Comment 1 Jeff Fearn 2009-09-22 21:24:34 UTC
This is caused by the KDE tool xml2pot, which Publican 0.4x uses to create POT files.

The beta uses a different tool to create pot files.

Assigning to Rudi to verify that this is fixed in the beta.

Cheers, Jeff.

Comment 2 Ruediger Landmann 2009-09-23 02:08:42 UTC
Well, both Publican 0.44 and the Beta change the string between the XML and the POT, but it's not an issue, because both versions understand that the two strings are equivalent:

Original string in XML:

  <para>
    This is a test paragraph and here's a ulink: <ulink url="http://www.google.com" />
  </para>

Publican 0.44 POT:

#. Tag: para
#: Chapter.xml:7
#, no-c-format
msgid ""
"This is a test paragraph and here's a ulink: <ulink url=\"http://www.google."
"com\"></ulink>"
msgstr ""

(Verified that running xml2pot from the command line produces the same output, as expected)

Publican Beta POT:

#. Tag: para
#, no-c-format
msgid "This is a test paragraph and here&#39;s a ulink: <ulink url=\"http://www.google.com\"></ulink>"
msgstr ""

So, the Beta not only changes the ulink, but it also transforms the apostrophe.

However, I created PO files and used Google's translator to put this sentence into Swedish:

#. Tag: para
#, no-c-format
msgid "This is a test paragraph and here&#39;s a ulink: <ulink url=\"http://www.google.com\"></ulink>"
msgstr "Detta är en test punkt och här är en ulink: <ulink url=\"http://www.google.com\"></ulink> "

The Beta correctly found and used this string when I ran "publican build --formats=html-single --langs=sv-SE"

Then I tried the same msgstr with Publican 0.44:

#. Tag: para
#: Chapter.xml:7
#, no-c-format
msgid ""
"This is a test paragraph and here's a ulink: <ulink url=\"http://www.google."
"com\"></ulink>"
msgstr "Detta är en test punkt och här är en ulink: <ulink url=\"http://www.google.com\"></ulink> "

Publican 0.44 also correctly found and used the string when I ran "make html-single-sv-SE"

So in both cases, XML -> POT -> PO -> HTML worked, despite discrepancies in the strings.

To round out the experiment, I found that the Beta could not use the POT/PO files generated by 0.44, nor could 0.44 use the POT/PO files generated by Beta. However, this was only because of the apostrophe, not the ulinks. When I changed the string in the original XML to:

  <para>
    This is a test paragraph and here is a ulink: <ulink url="http://www.google.com" />
  </para>

0.44 and Beta could use each other's POT/PO files.
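
The differences observed in comment 2 can be sketched in Python. This is only an illustration under stated assumptions, not how either Publican version matches strings: po_string and normalize are hypothetical helpers. po_string joins the quoted fragments of a wrapped msgid (adjacent strings in a PO entry are concatenated), and normalize undoes the two transformations seen above, the &#39; character reference and the expansion of the empty ulink element.

```python
import re


def po_string(lines):
    """Join the quoted fragments of a (possibly wrapped) PO msgid into one string."""
    parts = []
    for line in lines:
        line = line.strip()
        if line.startswith("msgid"):
            line = line[len("msgid"):].strip()
        # Drop the surrounding quotes and undo the \" escape used in PO files.
        parts.append(line[1:-1].replace('\\"', '"'))
    return "".join(parts)


def normalize(text):
    """Hypothetical normalizer: decode &#39; and expand self-closing tags."""
    text = text.replace("&#39;", "'")
    # <x a="b" /> becomes <x a="b"></x>
    return re.sub(r"<(\w+)([^<>]*?)\s*/>", r"<\1\2></\1>", text)


# msgid as written by Publican 0.44 (wrapped across lines).
V044 = r'''
msgid ""
"This is a test paragraph and here's a ulink: <ulink url=\"http://www.google."
"com\"></ulink>"
'''.strip().splitlines()

# msgid as written by the Beta (single line, apostrophe escaped).
BETA = r'''msgid "This is a test paragraph and here&#39;s a ulink: <ulink url=\"http://www.google.com\"></ulink>"'''.splitlines()

# Text content of the original <para>.
SOURCE = '''This is a test paragraph and here's a ulink: <ulink url="http://www.google.com" />'''

print(po_string(V044) == po_string(BETA))                        # raw msgids
print(normalize(po_string(V044)) == normalize(po_string(BETA)))  # normalized msgids
print(normalize(po_string(BETA)) == normalize(SOURCE))           # msgid vs. source
```

The raw msgids differ only by the apostrophe, which matches the finding that 0.44 and the Beta could not share POT/PO files until the apostrophe was removed from the source.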

Comment 3 Jeff Fearn 2009-09-23 05:04:16 UTC
There is an issue with tags with optional content in one of the modules Publican uses to parse XML files.

I opened a bug upstream with a patch. Maybe we need to get this applied to Fedora so we can fix this behavior ASAP?

https://rt.cpan.org/Ticket/Display.html?id=49932

The munging of the ' is due to Publican XML-escaping the text content. This prevents some breakage in certain situations, so turning it off would require some testing.

FYI if you run clean_ids the same escaping happens in the source XML.

Cheers, Jeff.
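
The apostrophe escaping described in comment 3 can be reproduced with Python's standard xml.sax.saxutils module. This is a sketch of the general technique only; Publican is written in Perl and its actual escaping code is not shown in this report. The entity mapping passed in is an assumption chosen to match the &#39; seen in the Beta's POT output.

```python
from xml.sax.saxutils import escape, unescape

text = "here's a ulink"

# escape() always handles &, < and >; the extra mapping adds the apostrophe.
escaped = escape(text, {"'": "&#39;"})

# unescape() with the inverse mapping restores the original text.
restored = unescape(escaped, {"&#39;": "'"})

print(escaped)
print(restored == text)
```

Escaping is reversible, so the strings stay equivalent as XML, but a consumer that compares msgids byte for byte will treat the escaped and unescaped forms as different.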

Comment 4 Jeff Fearn 2009-09-30 02:43:59 UTC
I've checked in a fix for both the changing of the XML and the escaping of the apostrophe.

The fix requires updated perl-HTML-Tree and perl-XML-TreeBuilder packages, both of which have updates making their way through Bodhi.

Once those changes have propagated we will spin up a BETA2 which will fix this behaviour.

Cheers, Jeff.

Comment 5 Fedora Update System 2009-11-18 02:17:55 UTC
publican-1.2-0.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/publican-1.2-0.fc12

Comment 6 Bug Zapper 2009-11-18 08:02:51 UTC
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 7 Fedora Update System 2009-11-20 05:16:47 UTC
publican-1.2-0.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 8 Fedora Update System 2009-11-25 14:52:03 UTC
publican-1.2-0.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.