Description of problem:
Zanata does not copy the most recent translation when an update is pushed from PressGang.

Version-Release number of selected component (if applicable):
3.3

How reproducible:
I am not sure.

Steps to Reproduce:
1. Push an update of a topic that contains translation strings that have more than one 100% match translation available in TM
2. Check which translation memory match is copied

Actual results:
The older translation is copied as fuzzy

Expected results:
It should pick the most recent translation

Additional info:
This is similar to the resolved bug: https://bugzilla.redhat.com/show_bug.cgi?id=896332
@lnewson @carlos

Wearing my Localization Supervisor hat:
The above examples Yuko has provided mean that all books for RHEV 3.3.1 will need to be proofread again, even though we already did this for the GA of RHEV 3.3. This is a lot of extra work, and something seems to have gone wrong, since incorrect strings were copied as "translated".

Wearing my Zanata Product Manager hat:
@lnewson: Has there been any change between January 2014 and now in how PressGang formulates the hash? Considering that we are working on a z-stream update, copytrans shouldn't have had any issues copying over the correct strings, however it seems to have copied across very old ones again :S

Thanks for the assist guys!
Isaac
I think we need to change the perception of Copy Trans a bit (among other technical aspects of it). There is no way that Zanata will get the exact desired translation every single time, especially if there are multiple translations for the same string in the system. Zanata won't be able to determine exactly which one is desired, save for the options that are given.

In this particular case, changes in PressGang have thrown Zanata's matching algorithm off. Knowing this, we should change the default copy trans settings for this project so that Zanata either doesn't mark copied strings as 'Translated' or simply doesn't look for strings outside the project. In general, maybe anything that is copied and that has more than one possible copy candidate should be forced to fuzzy.

If matches from other projects are still desired, then maybe a two-step copy trans could be done (as described in comment #2) to take advantage of Translation memory.

I will look at the newly reported cases and let you know my findings.
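To make the proposal above concrete, here is a minimal sketch of the rule "pick the most recent candidate, and force fuzzy when there is more than one". It is not Zanata's actual copy trans code: the `Candidate` class, the `pick_copy` function, and the state strings are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    """A hypothetical 100% match found for a source string."""
    translation: str
    last_changed: datetime


def pick_copy(candidates: List[Candidate]) -> Optional[Tuple[str, str]]:
    """Return (translation, state), or None when there is no match."""
    if not candidates:
        return None
    # Prefer the most recently changed translation, as the report expects.
    newest = max(candidates, key=lambda c: c.last_changed)
    # More than one candidate means the choice is ambiguous: force fuzzy.
    state = "translated" if len(candidates) == 1 else "fuzzy"
    return newest.translation, state
```

With this rule, a string that has both an old and a new 100% match would receive the newer text but would still be left as fuzzy for a translator to confirm.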
(In reply to Isaac Rooskov from comment #7)
> Wearing my Zanata Product Manager hat:
> @lnewson: Has there been any change between January 2014 and now as to how
> PressGang formulates the hash? Considering that we are working on a z-stream
> update, copytrans shouldn't have had any issues copying over the correct
> strings, however it seems to have copied across very old ones again :S

Hey Isaac, no, this was way before then. It would have been around July 2012 if I had to guess.

As for the strings being copied as translated, that does seem weird, as that would mean the match is in the same project and has the same resId (since anything from another project should be marked fuzzy). I'm wondering if the translation has been copied in from another book, in which case we'll probably need to implement BZ#1066765 sooner rather than later. Anyway, I'll take a look from our side of things today as well and see what I can find.
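The expectation described above (same project and same resId gives "translated", anything else gives fuzzy) can be sketched as follows. This only illustrates the rule as stated in this comment; `TextFlowRef` and `copied_state` are hypothetical names, and the real copy trans options may be more fine-grained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextFlowRef:
    """Hypothetical identifier for a string: its project and its resId."""
    project: str
    res_id: str


def copied_state(target: TextFlowRef, source: TextFlowRef) -> str:
    """State a copied translation is expected to receive under the rule above."""
    if target.project == source.project and target.res_id == source.res_id:
        return "translated"
    # A different project or a different resId is only a suggestion.
    return "fuzzy"
```

Under this rule, a string copied as "translated" from a different project would indicate that something other than the expected matching took place.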
(In reply to Lee Newson from comment #9)
> (In reply to Isaac Rooskov from comment #7)
> > Wearing my Zanata Product Manager hat:
> > @lnewson: Has there been any change between January 2014 and now as to how
> > PressGang formulates the hash? Considering that we are working on a z-stream
> > update, copytrans shouldn't have had any issues copying over the correct
> > strings, however it seems to have copied across very old ones again :S
>
> Hey Isaac, no this was way before then. It would have been around about July
> 2012 if I had to guess.
>
> As for the strings being copied as translated, that does seem weird as that
> would mean it's in the same project and has the same resId (since anything
> from another project should be marked fuzzy). I'm wondering if the
> translation has been copied in from another book, in which case we'll
> probably need to implement BZ#1066765 sooner rather than later. Anyways I'll
> take a look from our side of things today as well and see what I can find.

Some came from a non-skynet project, such as the old V2V Guide. Many came from JBoss.

I am just wondering if there is a way to restore our translations to their state as of 3.3 GA.
When copy trans copies a translation, it gives the credit to the original author of the translation. So when you see a JBoss translator as the author, it could be because the copied translation was originally done by that translator.

Another thing to take into account is that copy trans will reuse translations from deleted documents (but not from deleted projects or versions). This might be why finding the original source is proving difficult without looking directly in the database.
I've requested a DB backup to look at this even more closely.
See also: https://bugzilla.redhat.com/show_bug.cgi?id=1077439 This bug will be scheduled for next sprint and will completely overhaul the operation of copy trans to leverage the translation memory.
Yuko, I have a db backup, so in order not to block your work, please feel free to work on the affected documents with the assumption that Copy Trans did not do its job.
Migrated; check JIRA for bug status: http://zanata.atlassian.net/browse/ZNTA-121