Bug 524990

Summary: setroubleshoot should use ngettext
Product: [Fedora] Fedora Reporter: Miloš Komarčević <kmilos>
Component: setroubleshootAssignee: Daniel Walsh <dwalsh>
Status: CLOSED RAWHIDE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: rawhideCC: dwalsh, jdennis, mgrepl, piotrdrag
Target Milestone: ---Keywords: Reopened
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-10-01 13:26:05 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 525036    
Attachments:
Description Flags
C ngettext patch
none
Improved C ngettext patch none

Description Miloš Komarčević 2009-09-22 21:58:58 UTC
Hacks like

num_of_times = _("s")
if alert.report_count == 1:
  num_of_times = ""
...
start_label_text = _("This alert has occurred <b>%d time%s</b> since %s") % (alert.report_count, num_of_times, start_date.format(date_format))

are highly inappropriate and don't work for languages other than English.

http://live.gnome.org/TranslationProject/DevGuidelines/Plurals

Please use ngettext instead, see bug #467603 for an example.

Comment 1 Daniel Walsh 2009-09-23 01:07:08 UTC
Fixed in setroubleshoot-2_2_30-1_fc12

Comment 2 Miloš Komarčević 2009-09-23 07:48:49 UTC
Daniel, thanks for picking this up so quickly. Unfortunately, this is just another hack, admittedly less bad than the last one from i18n point of view. It still might not work well for some languages.

I'm reopening this so you can fix it properly using ngettext when you have some more time.

Comment 3 Daniel Walsh 2009-09-23 14:20:26 UTC
How about

        start_label_text = gettext.ngettext("This alert has occurred <b>%d time</b> since %s", "This alert has occurred <b>%d times</b> since %s", alert.report_count) %  % (alert.report_count, start_date.format(date_format))

Comment 4 Miloš Komarčević 2009-09-23 21:23:29 UTC
That's the idea, and it might "just work", but I don't get the inner workings of python, automake, and intltool.

More typically an alias for ngettext() is made, like

        start_label_text = P_("This alert has occurred <b>%d
time</b> since %s", "This alert has occurred <b>%d times</b> since %s",
alert.report_count) %  % (alert.report_count, start_date.format(date_format))

with P_() bound to gettext.ngettext() wherever _() is bound to gettext.gettext()

Then, you need to tell intltool how to extract these messages by passing some extra args to xgettext. From the intltool README:


Changing keywords used in xgettext invocation
.............................................

If you need to change default keywords used to extract messages from 
source code, you need to add variable XGETTEXT_KEYWORDS to 
Makefile.in.in file inside directory where intltool-update is run
from, eg.

        --- start ----

        XGETTEXT_KEYWORDS = --keyword --keyword=P_

        --- end ----

Default keywords xgettext looks for if no XGETTEXT_KEYWORDS is defined
are _, N_ and U_.



So, to your po/Makefile.in.in you'd need to add something like

XGETTEXT_KEYWORDS = --keyword=P_:1,2


As I said, I'm not an expert, so some details are probably missing...

Comment 5 Miloš Komarčević 2009-09-23 23:32:15 UTC
Found more candidates:

browser.py:        return _("%d days ago") % days

seapplet.c:	sprintf(msg, _("Since your last login, there are %d new security alerts to view."), yellow+red);  

seapplet.c:		sprintf(msg, _("Since your last login, there are %d new security alerts to view.  %d of the alerts may be very serious security violations."), yellow+red, red);

seapplet.c:	sprintf(title, _("%d New Security Alerts"), yellow+red);

Note that the second message in seapplet.c is probably best broken down, the added benefit being that the first sentence already appears earlier, will be recognized as such by xgettext during extraction, and will need to be translated only once.

Comment 6 Miloš Komarčević 2009-09-24 20:24:16 UTC
Created attachment 362563 [details]
C ngettext patch

Attaching a proposed ngettext patch for the C part, just missing the Python bit (could figure out where gettext is imported and _() bound).

Comment 7 John Dennis 2009-09-24 20:39:57 UTC
Dan: this is only somewhat tangentially related. In the python code we were using somewhat of a hack to get around the fact the default encoding in Python was hardcoded to ascii instead of utf-8. This required use to tell gettext() to return utf-8 in Python str object instead of UCS-4 in Unicode objects.

Since that code was first written we've figured out a way to force Python into the correct utf-8 default encoding mode. Which means we can get rid of the old hack and clean up some misleading comments in the code.

This is really just a FYI, it won't affect the output, but at some point should be cleaned up and use the "correct" idoms.

Comment 8 Miloš Komarčević 2009-09-24 21:07:45 UTC
Created attachment 362568 [details]
Improved C ngettext patch

Splitting the two strings, second sentence form might depend on the count in some more languages like it does for Serbian.

Comment 9 Daniel Walsh 2009-09-25 14:12:51 UTC
All this is fixed in setroubleshoot-2.2.31-1.fc12

But we still are not seeing the actual translations in the code.  The problem seems to be that we are shipping the viewer in one package "setroubleshoot"  and the text in another package "setroubleshoot-plugins"

And I think the text is not being translated because of this.

Comment 10 Miloš Komarčević 2009-09-25 21:37:17 UTC
Hm, now we're missing desktop and glade strings...

Btw, don't think putting XGETTEXT_KEYWORDS in Makefile.am is appropriate, the README clearly indicates po/Makefile.in.in. Also the --keywords option without any keywords overrides the default keywords, which is not what one would want I think (I'd use only --keyworrd=_P:1,2 as in my patch). Unfortunately neither of these changes bring back the desktop and glade strings, could really use an intltool expert here.


One plural string also slipped through the net:

--- setroubleshoot-2.2.32/src/browser.py	2009-09-25 22:13:40.168164466 +0100
+++ setroubleshoot-2.2.32_ngettext/src/browser.py	2009-09-25 22:18:17.618165876 +0100
@@ -736,7 +736,7 @@
             return _("Today")
         if days == 1:
             return _("Yesterday")
-        return _("%d days ago") % days
+        return P_("%d day ago", "%d days ago", days) % days
      
     def show(self):
         self.main_window.present()

Comment 11 John Dennis 2009-09-25 21:59:40 UTC
Have you checked to see that locale dir is correct and the domain names match?

Comment 12 Miloš Komarčević 2009-09-25 22:11:38 UTC
John, they are missing during string extraction to pot. When XGETTEXT_KEYWORDS is commented out, the extraction works as before, without the plural strings of course.

Comment 13 John Dennis 2009-09-25 22:13:45 UTC
I wrote up this guide to localization and i18n. It tries to give a concise overview, a lot of the other guides are hard to follow IMHO, 

https://wiki.idm.lab.bos.redhat.com/export/idmwiki/Internationalization_Overview

HTH

Comment 14 John Dennis 2009-09-25 22:14:45 UTC
re comment #12, so the pot files are empty?

Comment 15 Miloš Komarčević 2009-09-25 22:52:41 UTC
They are not empty, just missing/skipping desktop and glade files, although they are listed in POTFILES.in. Strings from Python and C files still get extracted. See e.g. 

http://hg.fedoraproject.org/hg/setroubleshoot/diff/712c021e447c/framework/po/setroubleshoot.pot

Comment 16 Piotr Drąg 2009-10-01 13:17:48 UTC
It seems that it has been fixed in this commit:

http://hg.fedorahosted.org/hg/setroubleshoot/rev/ef68b011d6cb

Comment 17 Daniel Walsh 2009-10-01 13:26:05 UTC
I believe this is fixed in setroubleshoot-2.2.33-1.fc12.x86_64

Comment 18 Miloš Komarčević 2009-10-01 15:37:22 UTC
Yep, just missing one more P_() marked string (see bottom of comment #10).

Thanks Daniel and Piotr.

Comment 19 Daniel Walsh 2009-10-01 15:40:15 UTC
No because if it happened 1 day ago, the tool prints "Yesterday"  So it will only print the plural.

Comment 20 Piotr Drąg 2009-10-01 15:47:22 UTC
Some languages have more than one plural form.

Comment 21 Miloš Komarčević 2009-10-01 15:50:30 UTC
Sure Daniel, that makes perfect sense, but again, only for English. For other languages you could use different plural word forms for "days" even for numbers greater than 2. For example, in Serbian "3 dana", but "21 dan" (same is singular because it ends wit a 1). Plural rules get even more complex for other languages, as mentioned in the link in the opening comment.

This is the whole point of using ngettext. You as a developer do not have to, and indeed shoudn't think about strings with plurals. As they say: "assumption is..." ;)

Comment 22 Miloš Komarčević 2009-10-01 20:49:35 UTC
Last change confirmed in hg, thank you very much Daniel!