Bug 1215274

Summary: Should be able to specify minimum percentage completion on pull
Product: [Retired] Zanata Reporter: stephane <stephane>
Component: Component-zanata-client Assignee: Patrick Huang <pahuang>
Status: CLOSED CURRENTRELEASE QA Contact: Ding-Yi Chen <dchen>
Severity: high Docs Contact:
Priority: high    
Version: unspecified CC: aeng, camunoz, dchen, jaegerandi, mkim, stephane, zanata-bugs
Target Milestone: --- Keywords: screened
Target Release: client-3.7   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Commit a0cd16978d6590b6509ec8fab5f8adff2fb8c521 Doc Type: Bug Fix
Doc Text:
Story Points: 2
Clone Of: Environment:
Last Closed: 2015-07-30 01:57:52 UTC Type: Bug

Description stephane 2015-04-24 20:55:52 UTC
We currently use Transifex and are looking to switch over to Zanata. However, in trying to replicate our workflow, we found that we can't specify a minimum percentage of accepted translations when we do a pull using zanata-cli. We can specify this when we use the Transifex client to pull. This functionality is important to us as we don't want to pull down translations which aren't very complete.

Comment 1 Michelle Kim 2015-05-06 01:38:28 UTC
Hi Stephane,

We would like to clarify one point while discussing the implementation details:

Do you expect this threshold to be applied at the project level or at the document level? The answer changes how we implement it.

For example, suppose you set the minimum percentage to 80% and there are two documents in French, one 100% translated and the other 78% translated. Would you still want to download the whole set of documents because the overall percentage for French is above 80%?

Also, if the folder contains old files from an earlier full pull and you run another pull with 80% as the minimum threshold, do you expect the old files below 80% to be cleared?


Comment 2 Andreas Jaeger 2015-05-07 19:08:54 UTC
Let me answer instead of Stephane here.

In OpenStack we use these two scenarios:
1) For most projects, we use 75 % for all translated files
2) For one project, we use 75 % for most files and a different value for two other files

The goal is to import new translations for the first time only if they are sufficiently translated (75 %) but then keep updating them even if their translation rate later drops.

What we do right now is the following:

# Download new files that are at least 75 % translated.
# Also downloads updates for existing files that are at least 75 %
# translated.
tx pull -a -f --minimum-perc=75

# Pull upstream translations of all downloaded files but do not
# download new files.
tx pull -f

The effect of this is that we update existing files and get new files that are at least 75 % translated.

For the second scenario, we set up the config file with percentages per file:

# Download new files.
# Also downloads updates for existing files that are
# translated to a certain amount as configured in the config file
tx pull -a -f

# Pull upstream translations of all downloaded files but do not
# download new files.
# Use lower percentage here to update the existing files.
tx pull -f --minimum-perc=50

A download with a minimum percentage only updates files that meet that minimum percentage; it does not delete any already downloaded files.

The behaviour that transifex has - as documented above - serves our needs well. 


We also have a postprocessing script (in bash) that runs afterwards to delete downloaded files with less than 20 % translation. So, our full use case is:

* Download new files with more than 75 %
* Update all downloaded files
* Delete files with less than 20 % - done via bash script
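
The three steps above can be sketched as a per-file decision. This is only an illustration of the workflow described in this comment, not code from the Transifex client or from zanata-cli; the function name `plan_action` and its parameters are hypothetical, and "more than 75 %" is treated as "at least 75 %" to match the `tx pull` comments earlier in this comment.

```python
def plan_action(percent, exists_locally, min_new=75.0, delete_below=20.0):
    """Decide what to do with one translation file (hypothetical helper).

    percent        -- translation percentage of the file on the server
    exists_locally -- True if the file was already downloaded earlier
    """
    if exists_locally:
        # Existing files keep being updated even below min_new,
        # unless the postprocessing step would delete them.
        return "delete" if percent < delete_below else "update"
    # New files are only imported once they are sufficiently translated.
    return "download" if percent >= min_new else "skip"
```

For example, a new file at 60 % is skipped, while an already downloaded file at 60 % is still updated.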

Comment 3 Carlos Munoz 2015-05-08 06:32:55 UTC
Hi Andreas, thanks for the explanation. I think the design we have in mind will cover *most* of your needs. The only thing it won't cover is the per-file assignment of minimum percentages within a project (i.e. the minimum percentage is a command-line option only and applies to all files being downloaded).

So, having said that, the answers to Michelle's questions are:
1. The minimum translation percentage has to be evaluated on a document basis (not overall).
2. Already written files should not be cleared, even if they don't satisfy the minimum percentage. As I understood, your bash postprocessing script will take care of some of that.
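
As a rough sketch of point 1, applied to the example from Comment 1 (two French documents at 100% and 78% with a minimum of 80%): the check runs per document, so only the first one is selected. The function and file names below are hypothetical and are not taken from the client code. The function returns only the documents to write; it never produces a list of files to delete, in line with point 2.

```python
def select_documents(doc_percent, min_percent):
    """Return the documents whose own percentage meets the minimum.

    doc_percent -- mapping of document name to translated percentage
    """
    return [name for name, pct in doc_percent.items() if pct >= min_percent]


# Example from Comment 1 (document names are made up for illustration).
french = {"full.po": 100.0, "partial.po": 78.0}
selected = select_documents(french, 80)
```

Here `selected` contains only `full.po`; `partial.po` is not downloaded, but an older local copy of it would be left untouched.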

Let me know if you have any questions.

Comment 4 Andreas Jaeger 2015-05-08 07:38:20 UTC
Hi Carlos,

This sounds good and will cover most projects - and we can handle the odd case differently.

> 2. Already written files should not be cleared, even if they don't satisfy the
> minimum percentage. As I understood, your bash postprocessing script will take
> care of some of that.

Correct, my bash script takes care of that.

Comment 5 Patrick Huang 2015-05-19 01:44:11 UTC

Comment 6 Ding-Yi Chen 2015-05-21 05:29:03 UTC
Note that with --min-doc-percent 100, only fully translated documents are downloaded.

However, for other values the percentage is rounded up, i.e.
a document at 94.98% is still downloaded with --min-doc-percent 95.
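
One possible reading of this behaviour, sketched in Python. This is an assumption about the logic, not the actual zanata-client implementation: the function name, the integer round-up, and the treatment of empty documents are all hypothetical.

```python
def meets_threshold(translated, total, min_percent):
    """Check a document against a minimum percentage (hypothetical sketch).

    translated, total -- message counts for the document
    min_percent       -- integer value given to --min-doc-percent
    """
    if total == 0:
        # Assumption: an empty document counts as fully translated.
        return True
    if min_percent >= 100:
        # 100 means strictly complete; no rounding is applied.
        return translated == total
    # Round the percentage up to the next integer using integer arithmetic.
    rounded_up = -(-100 * translated // total)
    return rounded_up >= min_percent
```

With 9498 of 10000 messages translated (94.98%), the check passes for a minimum of 95, while 9999 of 10000 does not pass for a minimum of 100.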

Comment 7 Ding-Yi Chen 2015-05-26 06:18:02 UTC
Hi Andreas,

Should the percentage be word-based or message-based?

For example,

You have 100 messages in a document: 99 of them are one-word messages and 1 has 99 words.

The 99-word message is translated; the others are not.

Would you prefer to see the document as
1% translated (message-based),
or 50% translated (word-based)?
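
The two figures can be checked with a short calculation, assuming the single 99-word message is the translated one (1 of 100 messages, 99 of 198 words). The helper names are made up for this illustration.

```python
def message_percent(messages):
    """Percentage of messages that are translated."""
    done = sum(1 for _, translated in messages if translated)
    return 100 * done / len(messages)


def word_percent(messages):
    """Percentage of words that belong to translated messages."""
    total_words = sum(words for words, _ in messages)
    done_words = sum(words for words, translated in messages if translated)
    return 100 * done_words / total_words


# Each entry is (word_count, translated): 99 untranslated one-word
# messages plus one translated 99-word message.
messages = [(1, False)] * 99 + [(99, True)]
by_message = message_percent(messages)  # 1 of 100 messages
by_word = word_percent(messages)        # 99 of 198 words
```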

Comment 8 Ding-Yi Chen 2015-05-27 08:13:40 UTC
After team discussion, we picked message-based statistics because this option is mostly for project maintainers, and other translation services such as GNOME use message-based statistics by default.


Other maintainers have also requested to see the message-based statistics.

To view the message-based statistics:

1. From the project-version page -> LANG_YOU_INTEREST -> ANY_DOCUMENT,
2. click on the breadcrumb -> LANG_YOU_INTEREST
3. In the "Stats by" radio box, choose Message

Then you can see the message-based statistics.

Comment 9 Ding-Yi Chen 2015-05-27 08:14:53 UTC
VERIFIED with zanata-client-3.7.0-SNAPSHOT
Commit a0cd16978d6590b6509ec8fab5f8adff2fb8c521

Comment 10 Ding-Yi Chen 2015-05-28 00:33:33 UTC
Merge commit d13462f1dd31e37438a634743fe5a2861bbadf4e

Comment 11 Andreas Jaeger 2015-05-29 19:56:47 UTC
I agree with the message-based statistics! Thanks for implementing it!