Bug 576771

Summary: Corrupted translated Common Content in 1.6.1
Product: [Community] Publican
Reporter: Ruediger Landmann <rlandman+disabled>
Component: publican
Assignee: Jeff Fearn 🐞 <jfearn>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 1.6
CC: jfearn, mmcallis, publican-list, rlandman
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Fixed In Version: publican-1.6.2-0.fc13
Doc Type: Bug Fix
Last Closed: 2010-04-03 04:40:00 UTC

Description Ruediger Landmann 2010-03-25 05:50:03 UTC
Description of problem:

UTF-8 characters in the translations of the Common Content sections are not rendering properly. For example, the German word "für" ("for") appears as "fΓΌr".
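
For readers unfamiliar with this kind of corruption: it is the typical result of UTF-8 bytes being decoded with a single-byte codec somewhere in the pipeline. The sketch below is only an illustration in Python, not Publican's actual code; the function name misdecode is hypothetical, and which codec (if any) was involved in this bug is not stated in the report.

```python
def misdecode(text: str, wrong_codec: str) -> str:
    """Encode text as UTF-8, then decode the raw bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)


word = "für"

# "ü" is two bytes in UTF-8 (0xC3 0xBC), so a single-byte codec
# turns it into two unrelated characters.
print(misdecode(word, "latin-1"))    # ISO-8859-1 reading of the bytes
print(misdecode(word, "iso8859_7"))  # ISO-8859-7 (Greek) reading of the bytes

# The damage is reversible as long as the wrong codec is known.
repaired = misdecode(word, "latin-1").encode("latin-1").decode("utf-8")
print(repaired == word)
```

The Greek reading reproduces the garbled form quoted in this description, but that only shows one decoding that yields it, not where the corruption actually happened.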

This problem affects the Common Content for Publican itself ("Document Conventions") and brands (Fedora "Feedback") but not the content of documents themselves.

Version-Release number of selected component (if applicable): 1.6.1


How reproducible:
100%

Steps to Reproduce:
1. Create a book: publican create --name=Test

2. In the book directory, run:
publican update_pot
publican update_po --langs=de-DE
publican build --formats html-single --langs de-DE

3. Open the built content and look at the Document Conventions section of the preface (compare it with the PO file).
  
Actual results:
Accented letters are replaced by other kinds of characters.

Expected results:
Accented letters are rendered correctly.

Additional info:

Comment 1 Ruediger Landmann 2010-03-26 04:45:52 UTC
I just tried the fix in commit 1124 (1.6.1-0.t13) -- common content is still corrupted, but now, the body text gets corrupted too! :)

Comment 2 Jeff Fearn 🐞 2010-03-26 07:17:53 UTC
After much stuffing around trying to work out which collection of UTF8 combinations allows wide characters and umlauts to work at the same time, I believe commit 1126 is correct.

Requires more testing kthksbye!

Comment 3 Ruediger Landmann 2010-03-30 02:12:16 UTC
Tested commit 1126 (build t22) and unfortunately, we're back to the problem where strings that contain an em dash or en dash don't get replaced, as described in Bug 568201.

I have tested other characters from the same Unicode block as the en dash and em dash ("General Punctuation", 2000–206F) and they also seem to prevent strings from being matched. Apart from these two dashes, there are a few more characters in that block that seem likely to turn up in English XML, for example, the single and double "curly quotes".
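
To check whether a given source string contains any character from that block, something like the following Python sketch could be used. It is only a helper for finding candidate strings to test, under the assumption that the block boundaries are U+2000 to U+206F as stated above; the function name and sample sentence are made up for illustration and have nothing to do with Publican's own matching code.

```python
import unicodedata

# "General Punctuation" block: U+2000 through U+206F inclusive.
GENERAL_PUNCTUATION = range(0x2000, 0x2070)


def general_punctuation_chars(text):
    """Return (code point, Unicode name) for each character in the block."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) in GENERAL_PUNCTUATION
    ]


# Hypothetical sample: en dash, curly apostrophe, em dash, curly double quotes.
sample = "Wait \u2013 or don\u2019t \u2014 she said, \u201cnow\u201d."
for code, name in general_punctuation_chars(sample):
    print(code, name)
```

Accented Latin letters such as "ü" (U+00FC) fall outside this range, so the helper returns an empty list for them.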

Accented Latin characters and non-Latin characters are both handled correctly otherwise.

Comment 4 Ruediger Landmann 2010-03-30 05:30:13 UTC
Tested commit 1128 (build t25)

* UTF8 characters in English XML are matched correctly
* Common content is built correctly 

but now we have a different regression... :(

* Publican includes fuzzy strings when building translated content. This happens whether there are UTF8 characters involved or not. To reproduce this:

1. create a book
2. run update_pot and update_po for some (any) language
3. add some text to one of the msgstr entries, and mark the translation fuzzy; for example:

#. Tag: para
#, fuzzy, no-c-format
msgid "This is a test paragraph"
msgstr "Lorem ipsum dolor sit amet"

4. build the translated book and note that the translated string appears, even though it's marked fuzzy.
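
The behaviour I would expect is sketched below in Python: entries flagged fuzzy are skipped, so the source string is used instead. This is a minimal illustration, not Publican's implementation; the function name usable_translations is hypothetical, the second PO entry is invented for contrast, and the parser only handles single-line msgid/msgstr values.

```python
PO_SNIPPET = '''\
#. Tag: para
#, fuzzy, no-c-format
msgid "This is a test paragraph"
msgstr "Lorem ipsum dolor sit amet"

#. Tag: para
#, no-c-format
msgid "Another paragraph"
msgstr "Ein weiterer Absatz"
'''


def usable_translations(po_text):
    """Map msgid to msgstr, skipping fuzzy and untranslated entries."""
    translations = {}
    # Entries in a PO file are separated by blank lines.
    for block in po_text.strip().split("\n\n"):
        flags, msgid, msgstr = set(), None, None
        for line in block.splitlines():
            if line.startswith("#,"):
                # Flag line, e.g. "#, fuzzy, no-c-format"
                flags.update(flag.strip() for flag in line[2:].split(","))
            elif line.startswith("msgid "):
                msgid = line[len("msgid "):].strip()[1:-1]
            elif line.startswith("msgstr "):
                msgstr = line[len("msgstr "):].strip()[1:-1]
        if msgid and msgstr and "fuzzy" not in flags:
            translations[msgid] = msgstr
    return translations


print(usable_translations(PO_SNIPPET))
```

With this filter, the "Lorem ipsum" string from the example above would not appear in the built book.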

Comment 5 Jeff Fearn 🐞 2010-03-30 21:04:22 UTC
(In reply to comment #4)
> Tested commit 1128 (build t25)
> 
> * UTF8 characters in English XML are matched correctly
> * Common content is built correctly 
> 
> but now we have a different regression... :(
> 
> * Publican includes fuzzy strings when building translated content. This
> happens whether there are UTF8 characters involved or not. To reproduce this:

You will need a bug number requesting this feature for it to be a regression.

Comment 6 Ruediger Landmann 2010-03-30 21:50:41 UTC
This previously came up during the 1.0 Beta phase; I was sure there was a bug for it, but apparently not. Sorry!

So, not a regression but a fresh bug: I'll document it properly.

Comment 7 Jeff Fearn 🐞 2010-03-30 22:06:05 UTC
(In reply to comment #6)
> This previously came up during the 1.0 Beta phase; I was sure there was a bug
> for it, but apparently not. Sorry!
> 
> So, not a regression but a fresh bug: I'll document it properly.    

It's not a bug, it's an RFE!

Comment 8 Fedora Update System 2010-04-01 05:14:58 UTC
publican-1.6.2-0.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/publican-1.6.2-0.fc12

Comment 9 Fedora Update System 2010-04-01 05:15:25 UTC
publican-1.6.2-0.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/publican-1.6.2-0.fc11

Comment 10 Fedora Update System 2010-04-01 05:15:50 UTC
publican-1.6.2-0.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/publican-1.6.2-0.fc13

Comment 11 Fedora Update System 2010-04-03 04:39:33 UTC
publican-1.6.2-0.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 12 Fedora Update System 2010-04-03 04:50:14 UTC
publican-1.6.2-0.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 13 Fedora Update System 2010-04-09 04:27:39 UTC
publican-1.6.2-0.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.