We currently use Transifex and are looking to switch over to Zanata. However, in trying to replicate our workflow, we found that we can't specify a minimum percentage of accepted translations when we do a pull using zanata-cli. We can specify this when we use the Transifex client to pull. This functionality is important to us as we don't want to pull down translations which aren't very complete.
Hi Stephane, We would like to clarify one point while discussing the implementation details: do you expect this threshold to be applied at the project level or at the document level? It changes how we implement it. For example, if you set the minimum percentage to 80% and there are two documents in that language, one 100% translated into French and the other 78% translated, would you still want to download the whole set of documents because the overall percentage for French is above 80%? Also, if the folder contains old files from an earlier pull of everything and you run another command with 80% as the minimum threshold, do you expect the old files below 80% to be cleared? Thanks Michelle
Let me answer instead of Stephane here. In OpenStack we use these two scenarios:

1) For most projects, we use 75 % for all translated files
2) For one project, we use 75 % for most files and a different value for two other files

The goal is to import new translations for the first time only if they are sufficiently translated (75 %), but then keep updating them even if their translation rate drops.

What we do right now is the following:

1)
    # Download new files that are at least 75 % translated.
    # Also downloads updates for existing files that are at least 75 %
    # translated.
    tx pull -a -f --minimum-perc=75
    # Pull upstream translations of all downloaded files but do not
    # download new files.
    tx pull -f

The effect of this is that we update existing files and get new files that are at least 75 % translated.

2) Set up the config file with percentages per file
    # Download new files.
    # Also downloads updates for existing files that are
    # translated to a certain amount as configured in the config file
    tx pull -a -f
    # Pull upstream translations of all downloaded files but do not
    # download new files.
    # Use lower percentage here to update the existing files.
    tx pull -f --minimum-perc=50

A download where a minimum percentage is given only updates files that reach that minimum percentage; it does not delete any already downloaded files. The behaviour that Transifex has, as documented above, serves our needs well.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We also have a postprocessing script (in bash) that runs afterwards to delete downloaded files with less than 20 % translation. So, our full use case is:

1)
* Download new files with more than 75 %
* Update all downloaded files
* Delete files with less than 20 % - done via bash script
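To make the full use case above easier to discuss, here is a minimal Python sketch of the selection logic only. It is not how tx or zanata-cli is implemented; `plan_pull`, its parameters, and the file names are hypothetical and exist purely for illustration. It assumes the per-file translation percentages are already known.

```python
def plan_pull(remote_percent, local_files, new_min=75, delete_below=20):
    """Decide which files to fetch and which to delete afterwards.

    remote_percent: dict mapping file name -> translated percentage (hypothetical input)
    local_files:    set of file names that were downloaded earlier
    """
    # New files are fetched only if they reach new_min;
    # files that already exist locally are always updated.
    fetch = sorted(
        name for name, pct in remote_percent.items()
        if name in local_files or pct >= new_min
    )
    # Postprocessing step: drop fetched files that have fallen below delete_below.
    delete = sorted(name for name in fetch if remote_percent[name] < delete_below)
    return fetch, delete


remote = {"fr.po": 80, "de.po": 60, "ja.po": 10, "es.po": 40}
local = {"de.po", "ja.po"}
fetch, delete = plan_pull(remote, local)
print(fetch)   # files to download or update
print(delete)  # files the cleanup step would remove
```

In this example es.po is skipped because it is new and below 75 %, de.po is still updated although it is below 75 %, and ja.po is updated and then removed by the cleanup step.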
Hi Andreas, thanks for the explanation. I think the design we have in mind will cover *most* of your needs. The only thing it won't cover is the per-file assignment of minimum percentages within a project (i.e. the minimum percentage is a command-line option only, and it applies to all files being downloaded). So, having said that, the answers to Michelle's questions are: 1. The minimum translation percentage has to be evaluated on a per-document basis (not overall). 2. Already written files should not be cleared, even if they don't satisfy the minimum percentage. As I understood, your bash postprocessing script will take care of some of that. Let me know if you have any questions.
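The difference between per-document and overall evaluation can be illustrated with Michelle's two-document example. This is only a sketch: the function names, the document names, and the assumption that both documents contain 100 messages are made up for illustration and are not taken from the client.

```python
def overall_percent(docs):
    """Percentage across all documents; docs maps name -> (translated, total)."""
    translated = sum(t for t, _ in docs.values())
    total = sum(n for _, n in docs.values())
    return 100.0 * translated / total


def docs_meeting_threshold(docs, min_percent):
    """Per-document evaluation: keep only documents that reach min_percent."""
    return sorted(
        name for name, (t, n) in docs.items()
        if 100.0 * t / n >= min_percent
    )


# Hypothetical French documents: one 100 % translated, one 78 % translated.
docs = {"a.pot": (100, 100), "b.pot": (78, 100)}
overall = overall_percent(docs)
selected = docs_meeting_threshold(docs, 80)
print(overall, selected)
```

With an 80 % threshold the overall figure would pass, but evaluated per document only a.pot qualifies, which is the behaviour agreed on in answer 1.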
Hi Carlos, This sounds good and will cover most projects - and we can handle the odd case differently.

> 2. Already written files should not be cleared, even if they don't satisfy the
> minimum percentage. As I understood, your bash postprocessing script will take
> care of some of that.

Correct, my bash script takes care of that.
https://github.com/zanata/zanata-client/pull/63
Note that with --min-doc-percent 100, only fully translated documents are downloaded. For any other value, the percentage is rounded up, i.e. a document that is 94.98% translated is still downloaded with --min-doc-percent 95.
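A small Python sketch of the rule described in this comment, for readers who want to reason about edge cases. The exact rounding used by the client is not stated here, so this assumes rounding to the nearest whole percent; `passes_min_doc_percent` is a hypothetical name, not a function in zanata-client.

```python
import math


def passes_min_doc_percent(percent_translated, min_doc_percent):
    """Return True if a document would be downloaded under the described rule."""
    if min_doc_percent >= 100:
        # Special case: 100 means strictly fully translated, no rounding.
        return percent_translated >= 100.0
    # Assumption: round to the nearest whole percent (halves round up).
    rounded = math.floor(percent_translated + 0.5)
    return rounded >= min_doc_percent


print(passes_min_doc_percent(94.98, 95))
print(passes_min_doc_percent(99.98, 100))
```

Under this assumption the 94.98% document passes a threshold of 95, while a 99.98% document does not pass a threshold of 100.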
Hi Andreas, Should the percentage be word-based or message-based? For example, say you have 100 messages in a document: 99 of them are one-word messages and 1 has 99 words. The 99-word message is translated; the others are not. Do you prefer to see the document as 1% translated (message-based) or 50% translated (word-based)?
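The two figures in this example can be checked with a short Python calculation. It reads the example as: the single 99-word message is translated and the 99 one-word messages are not. The function name and data layout are illustrative only.

```python
def percent_translated(messages):
    """messages: list of (word_count, is_translated) pairs.

    Returns (message_based_percent, word_based_percent).
    """
    total_messages = len(messages)
    total_words = sum(words for words, _ in messages)
    done_messages = sum(1 for _, done in messages if done)
    done_words = sum(words for words, done in messages if done)
    by_message = 100.0 * done_messages / total_messages
    by_word = 100.0 * done_words / total_words
    return by_message, by_word


# 99 untranslated one-word messages plus one translated 99-word message.
doc = [(1, False)] * 99 + [(99, True)]
by_message, by_word = percent_translated(doc)
print(by_message, by_word)  # 1.0 50.0
```

Message-based counting gives 1 of 100 messages, while word-based counting gives 99 of 198 words.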
After team discussion, we picked message-based statistics because this option is mostly for project maintainers, and other translation services such as GNOME use message-based statistics by default. https://l10n.gnome.org/languages/cs/gnome-gimp/ui-part/ We also have other maintainers requesting to see message-based statistics.

To view the message-based statistics:
1. From the project-version page, go to LANG_YOU_INTEREST -> ANY_DOCUMENT.
2. Click on the breadcrumb -> LANG_YOU_INTEREST.
3. In the "Stats by" radio box, choose Message.
Then you can see the message-based statistics.
VERIFIED with zanata-client-3.7.0-SNAPSHOT Commit a0cd16978d6590b6509ec8fab5f8adff2fb8c521
Merge commit d13462f1dd31e37438a634743fe5a2861bbadf4e
I agree with the message base statistics! Thanks for implementing it!