Bug 1224423 - [RFE] eliminate pushing duplicated strings to Zanata due to resource files inheritance
Summary: [RFE] eliminate pushing duplicated strings to Zanata due to resource files in...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: RFEs
Version: ---
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ovirt-4.0.0-beta
Target Release: 4.0.0
Assignee: Scott Dickerson
QA Contact: Scott Dickerson
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-05-22 22:34 UTC by Einav Cohen
Modified: 2016-08-01 12:27 UTC (History)

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-01 12:27:15 UTC
oVirt Team: UX
Embargoed:
rule-engine: ovirt-4.0.0+
rule-engine: exception+
mgoldboi: planning_ack+
oourfali: devel_ack+
pstehlik: testing_ack+




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1287408 | 0 | unspecified | CLOSED | [RFE] [CodeChange] reduce number of gwt compilation permutations by developing an alternate localization mechanism | 2021-02-22 00:41:40 UTC
oVirt gerrit | 55238 | 0 | master | ABANDONED | engine: i18n conversion of GWT Constants and Messages | 2020-02-27 09:08:12 UTC
oVirt gerrit | 57318 | 0 | master | MERGED | userportal, webadmin: move constant and message text to properties files | 2020-02-27 09:08:12 UTC

Internal Links: 1287408

Description Einav Cohen 2015-05-22 22:34:16 UTC
Some of the texts displayed in the oVirt UI are maintained in several GWT Messages/Constants Java interfaces. 

Since Zanata doesn't know how to work with Java files, we are automatically generating .properties files out of the GWT Messages/Constants Java interfaces, and then pushing those .properties files into Zanata for translation. 

The automatic generation is done by annotating the relevant GWT Messages/Constants Java interfaces with the "@Generate" annotation [1] and performing a GWT compilation on the code with the -"extraParam" value set to "true". 

It seems that if there is some inheritance within the GWT Messages/Constants Java interfaces, the generated .properties files of the derived interfaces include all of the values from their base interface. 

Example:
we have: 
CommonApplicationConstants [base]
  [user portal] ApplicationConstants [derived]
  [web admin]   ApplicationConstants [derived]

The auto-generated user-portal ApplicationConstants.properties file will include all texts in the user portal ApplicationConstants interface + all texts in the CommonApplicationConstants interface. 
For the auto-generated web-admin ApplicationConstants.properties file - same thing: it will include all texts in the web-admin ApplicationConstants interface + all texts in the CommonApplicationConstants interface. 

So when pushing to Zanata CommonApplicationConstants.properties, the user portal ApplicationConstants.properties and the web admin ApplicationConstants.properties, we are actually ending up with all of the CommonApplicationConstants texts appearing 3 times in Zanata, in 3 different documents. 
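
To illustrate why the derived documents repeat the base texts, here is a rough analogy using plain Java reflection. This is not GWT's actual generator; the interface names (CommonConstantsSketch, WebadminConstantsSketch) and their keys are made up for the example. The point is only that anything that walks "all members visible through the derived interface" will also see the inherited ones, while "declared members only" would not.

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-ins for the shared base interface and one derived interface.
interface CommonConstantsSketch {
    String ok();
    String cancel();
}

interface WebadminConstantsSketch extends CommonConstantsSketch {
    String dataCenter();
}

final class InheritedKeysSketch {
    // Every key reachable through the interface, inherited ones included.
    static Set<String> allKeys(Class<?> iface) {
        Set<String> keys = new TreeSet<>();
        for (Method m : iface.getMethods()) {
            keys.add(m.getName());
        }
        return keys;
    }

    // Only the keys declared directly on the interface itself.
    static Set<String> ownKeys(Class<?> iface) {
        Set<String> keys = new TreeSet<>();
        for (Method m : iface.getDeclaredMethods()) {
            keys.add(m.getName());
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println("all keys: " + allKeys(WebadminConstantsSketch.class));
        System.out.println("own keys: " + ownKeys(WebadminConstantsSketch.class));
    }
}
```

A catalog built from the "all keys" view contains the base interface's entries again, which matches the duplication described above.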

This creates extra work for the translators for no good reason, and is prone to unnecessary inconsistencies. 

Need to find a way in which each text will be pushed only once to Zanata, without unnecessary duplicates due to interface inheritance. 

One way to achieve this (I think) is to move the English text values from the Messages/Constants Java files into English properties files in the code-repo. That way, we will not need the Java-to-.properties auto-generation to begin with, and we will have control over what goes into the .properties files (and obviously, we will make sure not to duplicate texts across the .properties files). 

[1] http://www.gwtproject.org/javadoc/latest/com/google/gwt/i18n/client/LocalizableResource.Generate.html

Comment 1 Einav Cohen 2015-06-11 12:54:45 UTC
pushing to 4.0 - we are almost at 3.6 FF and this one should preferably wait until after the Korean/German improvement/merge work that is being done at the moment.

Comment 2 Red Hat Bugzilla Rules Engine 2015-10-19 10:53:43 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 3 Vojtech Szocs 2015-12-09 17:30:42 UTC
> It seems that if there is some inheritance within the GWT Messages/Constants
> Java interfaces, the generated .properties files of the derived interfaces
> include all of the values from their base interface.

Assuming these properties files are generated by the GWT compiler by annotating Constants/Messages interfaces like so:

  @Generate(format = "com.google.gwt.i18n.server.PropertyCatalogFactory")

then yes, generated properties files will contain *all* values, e.g. files:

  [webadmin]   ApplicationConstants.properties
  [userportal] ApplicationConstants.properties

will have some degree of duplication due to CommonApplicationConstants inheritance.

This is because properties files have no standard means of "importing" key/value pairs from other properties files, so each properties file contains everything, including the inherited stuff.

More generally, the problem here is how to efficiently share common texts between different applications, given the general limitation of using properties files (representing texts as key/value pairs).

> So when pushing to Zanata CommonApplicationConstants.properties, the user
> portal ApplicationConstants.properties and the web admin
> ApplicationConstants.properties, we are actually ending up with all of the
> CommonApplicationConstants texts appearing 3 times in Zanata, in 3 different
> documents. 
> 
> This creates extra work for the translators for no good reason, and is prone to
> unnecessary inconsistencies.

If we're pushing CommonApplicationConstants.properties into Zanata today, we should *remove* its keys from the specific ApplicationConstants.properties files before pushing to Zanata. This way, we can avoid duplication while still using the current (GWT standard) i18n mechanism.
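
A minimal sketch of that key removal, assuming the generated files are loaded into java.util.Properties objects before the push. The class name, method name and sample keys are hypothetical; how the files are actually read and written in the build is not covered here.

```java
import java.util.Properties;

final class StripCommonKeysSketch {
    // Returns a copy of 'specific' without any key that 'common' already defines.
    static Properties withoutCommonKeys(Properties specific, Properties common) {
        Properties result = new Properties();
        for (String key : specific.stringPropertyNames()) {
            if (common.getProperty(key) == null) {
                result.setProperty(key, specific.getProperty(key));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties common = new Properties();
        common.setProperty("ok", "OK");

        Properties webadmin = new Properties();
        webadmin.setProperty("ok", "OK"); // inherited duplicate
        webadmin.setProperty("dataCenter", "Data Center");

        // Only the webadmin-specific key should remain.
        System.out.println(withoutCommonKeys(webadmin, common).stringPropertyNames());
    }
}
```

The filter works on keys, not on English text, so two different keys that happen to share the same text are both kept.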

I would expect the translation team to advise here on their preferred way of sharing common texts. We can adapt our (to-be-written) custom i18n mechanism to support the chosen approach.

> One way to achieve this (I think) is to move the English text values from
> the Messages/Constants Java files into English properties files in the
> code-repo. That way, we will not need the Java-to-.properties
> auto-generation to begin with, and we will have control over what goes into
> the .properties files (and obviously, we will make sure to not include
> duplication across the .properties files).

Taking default English texts out of Java code and putting them inside properties files is the first step.

As I wrote above, we'll need to find a way to share common texts that suits the translation team (doesn't make their life too hard), then adapt the i18n mechanism accordingly.

Comment 4 Yuko Katabami 2015-12-09 21:56:17 UTC
Hi Christophe and Jérôme,

I just added you to the Cc list, as this was originally requested by you when we had a discussion on French UI translation review.
It would be greatly appreciated if you could help me as I am lacking some technical knowledge.

Many thanks in advance.

Yuko

Comment 5 Jérôme Fenal 2015-12-10 11:13:24 UTC
Hi Yuko,

Reading the bugzilla, and not having a proper GWT/Java knowledge, I won't be able to comment very far.

As far as I understand:
- currently, English strings are directly in the code, then extracted using a GWT factory
- multiple extracts may overlap in their keys (source strings), and they carry no context, because only the source strings are used, without the comments/context that can be added with gettext
- currently those overlapping keys (source strings) cannot be merged, because there is no mechanism to do so
- it seems it could be done, if source strings are first (manually, but once, with a policy to avoid English strings in code thereafter) extracted and maintained in a separate properties file
- which would then make it possible to merge those properties files before translation, so that only one instance of each source string is shown
- which would then be redistributed in one or multiple translated properties files, matching each instance of source strings found in the English properties files.
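
The merge-then-redistribute idea in the last two points could look roughly like the sketch below. It assumes translations are looked up by English source text, which is only one possible reading of the proposal; the class, method names and sample keys are hypothetical, and the German texts are example data.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

final class MergeRedistributeSketch {
    // Step 1: collect each distinct English source string once, across all files.
    static Set<String> uniqueSourceStrings(Collection<Properties> englishFiles) {
        Set<String> unique = new TreeSet<>();
        for (Properties file : englishFiles) {
            for (String key : file.stringPropertyNames()) {
                unique.add(file.getProperty(key));
            }
        }
        return unique;
    }

    // Step 2: rebuild a localized file by looking up each key's English text.
    // Keys whose source text has no translation yet are left out.
    static Properties redistribute(Properties english, Map<String, String> translations) {
        Properties localized = new Properties();
        for (String key : english.stringPropertyNames()) {
            String translated = translations.get(english.getProperty(key));
            if (translated != null) {
                localized.setProperty(key, translated);
            }
        }
        return localized;
    }

    public static void main(String[] args) {
        Properties common = new Properties();
        common.setProperty("ok", "OK");
        Properties webadmin = new Properties();
        webadmin.setProperty("confirm", "OK");
        webadmin.setProperty("dataCenter", "Data Center");

        System.out.println(uniqueSourceStrings(Arrays.asList(common, webadmin)));

        Map<String, String> german = new HashMap<>();
        german.put("Data Center", "Rechenzentrum");
        System.out.println(redistribute(webadmin, german));
    }
}
```

Because the lookup is by source text, every key with the same English text receives the same translation, which is exactly what has to be checked for strings whose meaning depends on context.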

Seems it will have to wait until work is started on ovirt 4.0, and will have to involve the development team to export the English strings and insert the mechanism to rely on such exported strings, whatever the language is.

Did I get it correctly?

Comment 6 Yuko Katabami 2015-12-13 11:37:21 UTC
(In reply to Jérôme Fenal from comment #5)
Hi Jérôme,

Thank you very much for your explanation.
I am starting to understand what processes are involved.

I am wondering how we can determine which strings need to be merged and which don't.
For example, in CommonApplicationConstants, strings like "New" and "Edit" are used for a single purpose, so we should have a single translation for all instances, whereas strings like "Up" and "Down" are used for different purposes, such as "up and running" / "down and stopped" or "move up" / "move down" for an item in a list.
Are they going to be sorted manually?

Currently, we check the resource ID to determine where a string is used.

If keys are removed from those strings, are the resource IDs also removed?
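
One possible aid for the manual sorting asked about above is a report of English texts that appear under more than one key, so a translator can decide case by case whether they share a meaning. This is only a sketch; the class name and the sample keys (vmStatusUp, moveItemUp, newButton) are hypothetical and not taken from the oVirt code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;
import java.util.TreeSet;

final class SharedSourceTextReport {
    // Groups keys by their English text and keeps only texts used by two or more keys.
    static Map<String, List<String>> keysSharingText(Properties english) {
        Map<String, List<String>> byText = new TreeMap<>();
        for (String key : new TreeSet<>(english.stringPropertyNames())) {
            byText.computeIfAbsent(english.getProperty(key), t -> new ArrayList<>()).add(key);
        }
        byText.values().removeIf(keys -> keys.size() < 2);
        return byText;
    }

    public static void main(String[] args) {
        Properties english = new Properties();
        english.setProperty("vmStatusUp", "Up");   // "up and running"
        english.setProperty("moveItemUp", "Up");   // "move up" in a list
        english.setProperty("newButton", "New");
        System.out.println(keysSharingText(english));
    }
}
```

The keys stay attached to each string in this report, so the information about where a string is used is not lost.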

Comment 15 Sandro Bonazzola 2016-05-02 10:02:11 UTC
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has already been released and the bug is not ON_QA.

Comment 16 Scott Dickerson 2016-05-20 13:57:45 UTC
Patch http://gerrit.ovirt.org/57318 has been merged and implements the following changes:
  1. extracts the default/source English text from the annotations
  2. puts that text into new properties files
  3. updates the existing localized versions of the properties files to remove
     any key not defined in the source properties file (this removes inherited
     duplicate keys)

Now, each translation document in Zanata will only contain keys that are explicitly defined by that document's GWT i18n interface.  No inherited key will be present.
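
As an illustration of step 3 only, here is a minimal sketch of pruning a localized properties file down to the keys defined in its source file. It is not the code from the merged patch; the class, method names and sample content are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

final class PruneLocalizedSketch {
    // Parses .properties text; stands in for reading a real file.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    // Keeps only the localized entries whose key exists in the source file.
    static Properties retainSourceKeys(Properties localized, Properties source) {
        Properties pruned = new Properties();
        for (String key : localized.stringPropertyNames()) {
            if (source.getProperty(key) != null) {
                pruned.setProperty(key, localized.getProperty(key));
            }
        }
        return pruned;
    }

    public static void main(String[] args) {
        Properties source = parse("dataCenter=Data Center\n");
        // "ok" was inherited from the common interface and is not in the source file.
        Properties german = parse("dataCenter=Rechenzentrum\nok=OK\n");
        System.out.println(retainSourceKeys(german, source));
    }
}
```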

Going forward, when a new key is added to a Constants or Messages interface, the default text will need to be defined in a properties file instead of a source code annotation.

This work also forms the basis for the work to be done in BZ 1287408.

