Bug 678186

Summary: [l10n] update_po produces inconsistent results
Product: [Community] Publican
Component: publican
Version: 2.5
Reporter: Ruediger Landmann <r.landmann>
Assignee: Jeff Fearn <jfearn>
QA Contact: tools-bugs <tools-bugs>
CC: hpeters, mhideo, mmcallis, publican-list
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Target Milestone: 3.0
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: 3.0.0
Doc Type: Bug Fix
Last Closed: 2012-10-30 23:11:12 EDT
Attachments (description, flags):
example XML file (flags: none)
PO file with the extra lines (flags: none)
PO file without the extra lines -- update_po doesn't add them in. (flags: none)

Description Ruediger Landmann 2011-02-16 22:08:47 EST
Created attachment 479256 [details]
example XML file

Description of problem:

Publican sometimes includes lines in msgid entries in PO files that contain nothing but two sets of quote marks. This happens when the corresponding XML file includes a <screen> element with a carriage return before the closing tag. For example:

<screen>
rhnpush --server=http://localhost/APP -c 'rhel-5.3-beta' -d /var/satellite/custom-distro/rhel-i386-server-5.3-beta/Server/
</screen>

is represented in the PO file as:

#. Tag: screen
#, no-c-format
msgid "\n"
"rhnpush --server=http://localhost/APP -c 'rhel-5.3-beta' -d /var/satellite/custom-distro/rhel-i386-server-5.3-beta/Server/\n"
""
msgstr ""
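
For illustration only, here is a minimal Python sketch of how a serializer that emits one quoted segment per source line would produce that trailing "" line. This is an assumption about the mechanism, not Publican's actual code; the names po_escape and to_msgid_lines are hypothetical, and the command is shortened.

```python
def po_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_msgid_lines(text):
    """Emit one quoted PO segment per source line (hypothetical serializer)."""
    parts = text.split("\n")
    # Every part except the last was followed by a newline in the source.
    segments = [part + "\n" for part in parts[:-1]] + [parts[-1]]
    quoted = ['"' + po_escape(seg) + '"' for seg in segments]
    return ["msgid " + quoted[0]] + quoted[1:]


# Content of a <screen> element whose closing tag sits on its own line.
screen_body = "\nrhnpush --server=http://localhost/APP -c 'rhel-5.3-beta'\n"
for line in to_msgid_lines(screen_body):
    print(line)
```

Because the text ends with a newline, the last piece after splitting is empty, and a serializer of this shape writes it out as a bare "" segment.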

If Publican updates a PO file with an msgid that matches except for the extra line, Publican doesn't add this line. For example, update_po does not change this entry:

#. Tag: screen
#, no-c-format
msgid "\n"
"rhnpush --server=http://localhost/APP -c 'rhel-5.3-beta' -d /var/satellite/custom-distro/rhel-i386-server-5.3-beta/Server/\n"
msgstr ""
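
In the PO format, adjacent quoted segments are concatenated, so a "" segment contributes nothing to the msgid. The sketch below decodes both forms of the entry and compares them, which may explain why a merge sees nothing to update. The helper decode_msgid is hypothetical, and it approximates PO escapes with Python string-literal syntax, which is an assumption that holds for simple escapes such as \n.

```python
import ast


def decode_msgid(entry_lines):
    """Join the quoted segments of a msgid; stop at msgstr."""
    segments = []
    in_msgid = False
    for line in entry_lines:
        line = line.strip()
        if line.startswith("msgid "):
            in_msgid = True
            line = line[len("msgid "):]
        elif line.startswith("msgstr"):
            break
        elif not (in_msgid and line.startswith('"')):
            continue  # comment lines such as "#. Tag: screen"
        # Assumption: PO escapes are close enough to Python literal escapes.
        segments.append(ast.literal_eval(line))
    return "".join(segments)


with_extra = [
    "#. Tag: screen",
    r'msgid "\n"',
    r'"rhnpush --server=http://localhost/APP\n"',
    '""',
    'msgstr ""',
]
without_extra = [line for line in with_extra if line != '""']

print(decode_msgid(with_extra) == decode_msgid(without_extra))
```

Both entries decode to the same string, so from a merge tool's point of view they are the same message.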

We have examples of books where PO files lack these lines for reasons that are unclear.

These lines aren't a problem in themselves, but Publican also counts each of these lines as a word when you run publican lang_stats, which means that the word counts reported for different languages can differ.
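
To make the effect concrete, here is a rough Python sketch of two counting strategies. Neither is taken from Publican's source; both functions are hypothetical. The first counts whitespace-separated tokens on the raw PO lines, so a bare "" line adds one to the total. The second counts words in the decoded msgid text, so it is unaffected.

```python
def _strip_keyword(line):
    """Drop a leading 'msgid ' keyword so only the quoted part remains."""
    if line.startswith("msgid "):
        return line[len("msgid "):]
    return line


def count_raw_tokens(msgid_lines):
    """Naive count: every whitespace-separated token on every raw line."""
    return sum(len(_strip_keyword(line).split()) for line in msgid_lines)


def count_decoded_words(msgid_lines):
    """Count words in the concatenated, unquoted msgid text."""
    text = ""
    for line in msgid_lines:
        # Remove the surrounding quotes and treat an escaped \n as whitespace.
        text += _strip_keyword(line).strip()[1:-1].replace("\\n", "\n")
    return len(text.split())


with_extra = ['msgid "\\n"', '"rhnpush --server=http://localhost/APP\\n"', '""']
without_extra = with_extra[:-1]

print(count_raw_tokens(with_extra), count_raw_tokens(without_extra))
print(count_decoded_words(with_extra), count_decoded_words(without_extra))
```

With the naive counter the two files disagree by one; with the decoded counter they agree.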

This doesn't impact on translation directly, but can make life interesting for anyone managing a translation project; the word counts that lang_stats produces serve as a crude but handy checksum to make sure that all members of a translation team are working on up-to-date PO files. When different languages report different word counts, it's not immediately obvious that everyone's translating the same thing :)

Version-Release number of selected component (if applicable):
2.5-1

How reproducible:
100%

Steps to Reproduce:
1. generate a PO file from an XML file that includes a <screen> element with its closing tag on a new line
2. run lang_stats on the target language and note the result
3. edit the PO file to remove any lines in msgid entries that consist only of '""' 
4. run lang_stats on the target language and note the result
5. run publican update_po to refresh the PO file
6. open the PO file to note that the '""' lines are not restored
7. run lang_stats on the PO file yet again and note the result
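
Step 3 can be tedious by hand on a large PO file. A minimal Python sketch that removes lines consisting only of "" from msgid entries, while leaving msgstr entries and the msgid "" header untouched, could look like this. The function name is hypothetical and the sketch ignores msgid_plural.

```python
def strip_empty_msgid_segments(po_text):
    """Drop bare "" continuation lines that belong to a msgid."""
    kept = []
    in_msgid = False
    for line in po_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("msgid "):
            in_msgid = True
        elif stripped.startswith("msgstr") or not stripped.startswith('"'):
            # Any keyword, comment or blank line ends the msgid.
            in_msgid = False
        if in_msgid and stripped == '""':
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


sample = '#. Tag: screen\nmsgid "\\n"\n"foo\\n"\n""\nmsgstr ""\n'
print(strip_empty_msgid_segments(sample), end="")
```

Reading a PO file, passing its text through this function, and writing the result back would reproduce the state examined in step 4.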
  
Actual results:
results in steps 4 and 7 are the same, but differ from the result in step 2

Expected results:
same results in steps 2, 4, and 7

Additional info:
Comment 1 Ruediger Landmann 2011-02-16 22:09:34 EST
Created attachment 479257 [details]
PO file with the extra lines
Comment 2 Ruediger Landmann 2011-02-16 22:10:31 EST
Created attachment 479258 [details]
PO file without the extra lines -- update_po doesn't add them in.
Comment 3 Jeff Fearn 2011-02-26 03:24:22 EST
Merging POT and PO files is currently done by msgmerge from gettext. It might be worth testing various options to msgmerge to see if the behaviour can be changed.

Current options would look like:

msgmerge --no-wrap --quiet --backup=none --update foo.po

Try it without --update. Trying the new code path in https://bugzilla.redhat.com/show_bug.cgi?id=661569 would also be worth a shot.
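
For anyone scripting that comparison, here is a small Python sketch that builds the two argument lists. msgmerge takes the existing PO file followed by the reference POT file; without --update it writes the merged result to standard output or to --output-file instead of rewriting the PO file in place. The file names and the ".new" suffix are assumptions for illustration, and the helper name is hypothetical.

```python
def build_msgmerge_cmd(po_path, pot_path, update=True):
    """Return an argv list for msgmerge, with or without --update."""
    cmd = ["msgmerge", "--no-wrap", "--quiet"]
    if update:
        # In-place merge: the PO file itself is rewritten.
        cmd += ["--backup=none", "--update", po_path, pot_path]
    else:
        # Non-update merge: write the result next to the original instead.
        cmd += ["--output-file=" + po_path + ".new", po_path, pot_path]
    return cmd


print(" ".join(build_msgmerge_cmd("foo.po", "foo.pot")))
print(" ".join(build_msgmerge_cmd("foo.po", "foo.pot", update=False)))
```

Either list can be passed to subprocess.run(cmd, check=True) on a machine with gettext installed; diffing foo.po against foo.po.new would then show whether the "" lines are handled differently.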
Comment 4 Jeff Fearn 2011-04-18 02:10:35 EDT
This should have been fixed by #661569; it requires testing.
Comment 5 Michael Hideo 2012-06-07 21:22:26 EDT
Create an XML file that contains a screen element with a carriage return before the closing tag, then create the POT file and check for empty strings.
Comment 6 Michael Hideo 2012-06-07 21:29:17 EDT
Create an XML file that contains a screen element with a carriage return before the closing tag, then create the POT file and check for empty strings.
Comment 7 Hedda Peters 2012-06-13 22:03:34 EDT
Verified.

I followed Rudi's steps 1-7 to reproduce using the attached example XML file.
Publican still produces those lines with only two sets of quote marks in the po-file. After removing them manually and updating the po-file, it does not add them in again - as described by Rudi. 

However, running publican lang_stats now produces consistent statistics: with or without those lines, the results for steps 2, 4 and 7 are the same.

This matches the expected result in the OP, hence verified.