Bug 1215274 - Should be able to specify minimum percentage completion on pull
Status: CLOSED CURRENTRELEASE
Product: Zanata
Classification: Community
Component: Component-zanata-client
Version: unspecified
Hardware: Unspecified Unspecified
Priority: high, Severity: high
: ---
: client-3.7
Assigned To: Patrick Huang
Ding-Yi Chen
: screened
Depends On:
Blocks:
Reported: 2015-04-24 16:55 EDT by stephane
Modified: 2015-07-29 21:57 EDT (History)

See Also:
Fixed In Version: Commit a0cd16978d6590b6509ec8fab5f8adff2fb8c521
Doc Type: Bug Fix
Doc Text:
Story Points: 2
Clone Of:
Environment:
Last Closed: 2015-07-29 21:57:52 EDT
Type: Bug


Description stephane 2015-04-24 16:55:52 EDT
We currently use Transifex and are looking to switch over to Zanata. However, in trying to replicate our workflow, we found that we can't specify a minimum percentage of accepted translations when we do a pull using zanata-cli. We can specify this when we use the Transifex client to pull. This functionality is important to us as we don't want to pull down translations which aren't very complete.
Comment 1 Michelle Kim 2015-05-05 21:38:28 EDT
Hi Stephane,

We would like to clarify one point while discussing the implementation details:

Do you expect this threshold to be evaluated at the project level or at the document level? The answer changes how we implement it.

For example, suppose you set the minimum percentage to 80% and there are two documents in French, one 100% translated and the other 78% translated. Would you still want to download the whole set of documents, given that the overall percentage for French is above 80%?

Also, if the folder contains old files from an earlier full pull and you run another pull with 80% as the minimum threshold, do you expect the old files that are less than 80% translated to be cleared?

Thanks
Michelle
Comment 2 Andreas Jaeger 2015-05-07 15:08:54 EDT
Let me answer instead of Stephane here.

In OpenStack we use these two scenarios:
1) For most projects, we use 75 % for all translated files
2) For one project, we use 75 % for most files and a different value for two other files

The goal is to import new translations for the first time only if they are sufficiently translated (75 %), but then to keep updating them even if their translation rate drops.


What we do right now is the following:

1)
# Download new files that are at least 75 % translated.
# Also downloads updates for existing files that are at least 75 %
# translated.
tx pull -a -f --minimum-perc=75

# Pull upstream translations of all downloaded files but do not
# download new files.
tx pull -f

The effect of this is that we update existing files and get new files that are at least 75 % translated.

2)
Set up the config file with percentages per file

# Download new files.
# Also downloads updates for existing files that are
# translated to a certain amount as configured in the config file
tx pull -a -f

# Pull upstream translations of all downloaded files but do not
# download new files.
# Use lower percentage here to update the existing files.
tx pull -f --minimum-perc=50


A download with a minimum percentage only updates files that reach that percentage. It does not delete any already downloaded files.

The behaviour that Transifex has - as documented above - serves our needs well. 


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We also have a postprocessing script (in bash) that runs afterwards to delete downloaded files with less than 20 % translation. So, our full use case is:

1)
* Download new files with more than 75 %
* Update all downloaded files
* Delete files with less than 20 % - done via bash script
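
The three steps above can be sketched as a single planning function. This is only an illustration of the decision logic, not the actual OpenStack bash script or any Transifex or Zanata code; the name `plan_sync`, its parameters, and the file names in the example are hypothetical. It also collapses "update, then delete" into a single decision for files that end up below 20 %.

```python
from typing import Dict, List, Set, Tuple


def plan_sync(
    remote_percent: Dict[str, float],
    local_files: Set[str],
    new_min: float = 75.0,
    delete_below: float = 20.0,
) -> Tuple[List[str], List[str]]:
    """Return (files_to_download, files_to_delete).

    remote_percent maps a file name to its translated percentage on the
    server; local_files holds the names already present on disk.
    """
    to_download: List[str] = []
    to_delete: List[str] = []
    for name, pct in sorted(remote_percent.items()):
        if name in local_files:
            if pct < delete_below:
                # Existing file that has fallen below the cleanup threshold.
                to_delete.append(name)
            else:
                # Existing files are always refreshed, whatever their percentage.
                to_download.append(name)
        elif pct >= new_min:
            # New files are only fetched once they are sufficiently translated.
            to_download.append(name)
    return to_download, to_delete


if __name__ == "__main__":
    remote = {"de.po": 80.0, "es.po": 50.0, "fr.po": 60.0, "ja.po": 10.0}
    print(plan_sync(remote, {"fr.po", "ja.po"}))
```

In the example, `de.po` is new and above 75 %, so it is downloaded; `es.po` is new but only 50 %, so it is skipped; `fr.po` already exists and is refreshed even at 60 %; `ja.po` already exists but is below 20 %, so it is marked for deletion.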
Comment 3 Carlos Munoz 2015-05-08 02:32:55 EDT
Hi Andreas, thanks for the explanation. I think the design we have in mind will cover *most* of your needs. The only thing it won't cover is the per-file assignment of minimum percentages within a project (i.e. the minimum percentage is a command line option only, and it applies to all files being downloaded).

So, having said that, the answers to Michelle's questions are:
1. The minimum translation percentage has to be evaluated on a document basis (not overall).
2. Already written files should not be cleared, even if they don't satisfy the minimum percentage. As I understood, your bash postprocessing script will take care of some of that.
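
To make the difference between the two evaluation modes concrete, here is a minimal sketch using Michelle's example (one document at 100%, one at 78%, threshold 80%). The function names, the document names, and the assumption that each document has 100 messages are all hypothetical; this is not the zanata-client implementation.

```python
from typing import Dict, List, Tuple

# Maps a document name to (translated_messages, total_messages).
Stats = Dict[str, Tuple[int, int]]


def percent(translated: int, total: int) -> float:
    """Translated percentage; an empty document counts as 0%."""
    return 100.0 * translated / total if total else 0.0


def pull_per_document(stats: Stats, min_percent: float) -> List[str]:
    """Per-document evaluation: each document must reach the threshold itself."""
    return [
        name
        for name, (done, total) in stats.items()
        if percent(done, total) >= min_percent
    ]


def pull_overall(stats: Stats, min_percent: float) -> List[str]:
    """Overall evaluation: all documents are pulled if the aggregate passes."""
    done = sum(d for d, _ in stats.values())
    total = sum(t for _, t in stats.values())
    return list(stats) if percent(done, total) >= min_percent else []


if __name__ == "__main__":
    french = {"a.pot": (100, 100), "b.pot": (78, 100)}
    print(pull_per_document(french, 80))  # only the fully translated document
    print(pull_overall(french, 80))       # both, since the aggregate is 89%
```

Neither function returns anything to delete, which matches point 2: files already on disk are left alone even if they fall below the threshold.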

Let me know if you have any questions.
Comment 4 Andreas Jaeger 2015-05-08 03:38:20 EDT
Hi Carlos,

This sounds good and will cover most projects - and we can handle the odd case differently.

> 2. Already written files should not be cleared, even if they don't satisfy the
> minimum percentage. As I understood, your bash postprocessing script will take
> care of some of that.

Correct, my bash script takes care of that.
Comment 5 Patrick Huang 2015-05-18 21:44:11 EDT
https://github.com/zanata/zanata-client/pull/63
Comment 6 Ding-Yi Chen 2015-05-21 01:29:03 EDT
Note that with --min-doc-percent 100, only fully translated documents are downloaded.

For any other value, however, rounding is applied, i.e. a document
that is 94.98% translated is still downloaded with --min-doc-percent 95.
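
The rounding behaviour described above can be illustrated with a small check. This is a sketch under assumptions, not the actual zanata-client code: the function name `meets_min_doc_percent` is hypothetical, and rounding half up to the nearest whole percent is an assumed reading of "round-up".

```python
import math


def meets_min_doc_percent(percent_translated: float, min_doc_percent: int) -> bool:
    """Return True if a document would pass the threshold.

    Assumption: a threshold of 100 demands a fully translated document,
    while any lower threshold compares against the percentage rounded
    half up to a whole number.
    """
    if min_doc_percent >= 100:
        # No rounding here: 99.6% must not be treated as complete.
        return percent_translated >= 100.0
    rounded = math.floor(percent_translated + 0.5)
    return rounded >= min_doc_percent


if __name__ == "__main__":
    print(meets_min_doc_percent(94.98, 95))
    print(meets_min_doc_percent(99.6, 100))
```

Special-casing 100 is what keeps an almost-complete document from being rounded up and treated as fully translated.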
Comment 7 Ding-Yi Chen 2015-05-26 02:18:02 EDT
Hi Andreas,

Should the percentage be word-based or message-based?

For example:

You have 100 messages in a document. 99 of them are one-word messages and 1 has 99 words.

The 99-word message is translated; the others are not.

Would you prefer to see the document as
1% translated (message-based),
or 50% translated (word-based)?
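
The arithmetic behind those two figures, as a minimal sketch. The data layout (a list of `(word_count, is_translated)` pairs) and the function names are made up for illustration and do not reflect how Zanata stores statistics.

```python
from typing import List, Tuple

# Each message is (word_count, is_translated).
Message = Tuple[int, bool]


def message_percent(messages: List[Message]) -> float:
    """Share of translated messages, ignoring their length."""
    translated = sum(1 for _, done in messages if done)
    return 100.0 * translated / len(messages)


def word_percent(messages: List[Message]) -> float:
    """Share of translated words across all messages."""
    translated_words = sum(words for words, done in messages if done)
    total_words = sum(words for words, _ in messages)
    return 100.0 * translated_words / total_words


if __name__ == "__main__":
    # 99 untranslated one-word messages plus one translated 99-word message.
    doc = [(1, False)] * 99 + [(99, True)]
    print(message_percent(doc))  # 1 of 100 messages
    print(word_percent(doc))     # 99 of 198 words
```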
Comment 8 Ding-Yi Chen 2015-05-27 04:13:40 EDT
After a team discussion, we picked message-based statistics because this option is mostly for project maintainers, and other translation services such as GNOME use message-based statistics by default.

https://l10n.gnome.org/languages/cs/gnome-gimp/ui-part/

Other maintainers have also asked to see message-based statistics.

To view the message-based statistics:

1. From the project-version page, go to LANG_YOU_INTEREST -> ANY_DOCUMENT.
2. Click the breadcrumb -> LANG_YOU_INTEREST.
3. In the "Stats by" radio box, choose Message.

You can then see the message-based statistics.
Comment 9 Ding-Yi Chen 2015-05-27 04:14:53 EDT
VERIFIED with zanata-client-3.7.0-SNAPSHOT
Commit a0cd16978d6590b6509ec8fab5f8adff2fb8c521
Comment 10 Ding-Yi Chen 2015-05-27 20:33:33 EDT
Merge commit d13462f1dd31e37438a634743fe5a2861bbadf4e
Comment 11 Andreas Jaeger 2015-05-29 15:56:47 EDT
I agree with the message-based statistics! Thanks for implementing it!
