Bug 873489 - XLIFF/Properties/PO upload should check that translations correspond to the current text flow contents
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Zanata
Classification: Retired
Component: Component-Logic
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Sean Flanigan
QA Contact: Zanata-QA Mailling List
URL:
Whiteboard:
Duplicates: 873500 873930
Depends On:
Blocks:
Reported: 2012-11-06 01:13 UTC by Sean Flanigan
Modified: 2015-07-29 03:35 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-07-29 03:35:34 UTC
Embargoed:



Links
System: Red Hat Bugzilla
ID: 1194543
Private: 0
Priority: high
Status: CLOSED
Summary: Manual document re-upload makes previous translations fuzzy
Last Updated: 2021-02-22 00:41:40 UTC

Internal Links: 1194543

Description Sean Flanigan 2012-11-06 01:13:16 UTC
Description of problem:

When pushing XLIFF content, it is possible for the translation files to have older source strings than the source files, but with the same trans-unit ids.  For instance:

source file:
...
<trans-unit id="greeting">
  <source>Hello World</source>
</trans-unit>
...

trans file:
...
<trans-unit id="greeting">
  <source>Hello</source>
  <target>Hallo</target>
</trans-unit>
...

NB: even though the ids match, "Hallo" is not a valid translation of the current source "Hello World".


Version-Release number of selected component (if applicable):
2.0.0

How reproducible:
100%

Steps to Reproduce:
1. Push both files above
2. Pull the translation
  
Actual results:
greeting has the incorrect translation "Hallo"

Expected results:
greeting should be untranslated

Additional info:

We may need to change the way we generate resIds for XLIFF files, to be similar to the Gettext approach, so that the server will reject the outdated translation.  If so, we will need a migration path for existing content in the database.
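
A minimal sketch of the first idea, assuming the resId would be derived from a hash of the source content. The function name `content_res_id` and the choice of MD5 are illustrative assumptions, not Zanata's actual scheme. Because the id changes whenever the source text changes, a translation made against the old text no longer matches any current text flow:

```python
import hashlib


def content_res_id(unit_id: str, source: str) -> str:
    """Hypothetical resId that depends on both the trans-unit id and the source text."""
    digest = hashlib.md5(source.encode("utf-8")).hexdigest()
    return f"{unit_id}:{digest}"


# Text flows currently on the server, keyed by the content-derived id.
current = {content_res_id("greeting", "Hello World"): "Hello World"}

# An outdated translation file still carries the old source "Hello".
stale_id = content_res_id("greeting", "Hello")

print(stale_id in current)  # False: the outdated translation has nothing to attach to
```

The cost of this approach is the one noted above: ids already stored in the database would no longer match, so existing content would need migrating.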

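Alternatively, the client may be able to load both the source and the translation files and compare the <source> elements of trans-units with the same id, skipping any translation whose source differs.  We will need to weigh both approaches.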
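
The client-side alternative could look roughly like the sketch below. It uses the two fragments from the description, wrapped in a `<body>` element and without XML namespaces purely so that they parse on their own; real XLIFF files are namespaced and would need namespace-aware lookups. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

SOURCE_FILE = """<body>
<trans-unit id="greeting">
  <source>Hello World</source>
</trans-unit>
</body>"""

TRANS_FILE = """<body>
<trans-unit id="greeting">
  <source>Hello</source>
  <target>Hallo</target>
</trans-unit>
</body>"""


def sources_by_id(xml_text):
    """Map each trans-unit id to the text of its <source> element."""
    root = ET.fromstring(xml_text)
    return {tu.get("id"): tu.findtext("source") for tu in root.iter("trans-unit")}


def usable_translations(source_xml, trans_xml):
    """Keep only targets whose <source> matches the current source file."""
    current = sources_by_id(source_xml)
    usable = {}
    for tu in ET.fromstring(trans_xml).iter("trans-unit"):
        uid = tu.get("id")
        if uid in current and tu.findtext("source") == current[uid]:
            usable[uid] = tu.findtext("target")
    return usable


print(usable_translations(SOURCE_FILE, TRANS_FILE))  # {} ("Hallo" is skipped)
```

This keeps the check entirely in the client, so it protects only pushes made with an updated client.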

We have a similar problem for Properties files.

Comment 1 Sean Flanigan 2012-11-07 02:19:59 UTC
See bug 873500 for Properties files.

Comment 2 Sean Flanigan 2012-11-20 06:06:08 UTC
We should add sourceContentHash to TextFlowTarget, and have the XLIFF and Properties clients pass a hash of the source content.  Then the server could verify that the sourceContentHash matches the current version of the HTextFlow, else reject the translation.
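
A rough sketch of that check, with plain Python classes standing in for HTextFlow and TextFlowTarget. The field names, the `accept` function and the use of MD5 are assumptions made for illustration only:

```python
import hashlib
from dataclasses import dataclass


def content_hash(text: str) -> str:
    """Hash of the source content (MD5 is an assumption for this sketch)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


@dataclass
class TextFlow:
    """Stand-in for the server's current HTextFlow."""
    res_id: str
    content: str


@dataclass
class TextFlowTarget:
    """Stand-in for a pushed translation, carrying the proposed hash field."""
    res_id: str
    content: str
    source_content_hash: str


def accept(target: TextFlowTarget, text_flow: TextFlow) -> bool:
    """Accept the translation only if it was made against the current source."""
    return target.source_content_hash == content_hash(text_flow.content)


server_flow = TextFlow("greeting", "Hello World")
stale = TextFlowTarget("greeting", "Hallo", content_hash("Hello"))
fresh = TextFlowTarget("greeting", "Hallo Welt", content_hash("Hello World"))

print(accept(stale, server_flow), accept(fresh, server_flow))  # False True
```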

Comment 3 Sean Flanigan 2012-11-20 06:06:17 UTC
*** Bug 873500 has been marked as a duplicate of this bug. ***

Comment 4 Sean Flanigan 2013-12-12 04:00:41 UTC
*** Bug 873930 has been marked as a duplicate of this bug. ***

Comment 5 Sean Flanigan 2013-12-12 04:03:24 UTC
As of Zanata 3.2.0, TextFlowTarget includes a sourceHash property.  If sourceHash is provided, TranslatedDocResourceService on the server will check that the sourceHash matches the HTextFlow before persisting the HTextFlowTarget.

The gettext formats (including offlinepo) now provide sourceHash when pushing to the server, but this still needs to be implemented for xliff and properties files.
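
The remaining gap can be illustrated with a small sketch in which the hash is optional, as described above. This is not the TranslatedDocResourceService code; `should_persist`, `md5_hex` and the hash algorithm are hypothetical stand-ins:

```python
import hashlib
from typing import Optional


def md5_hex(text: str) -> str:
    """Hash of the source content (algorithm assumed for illustration)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def should_persist(current_source: str, source_hash: Optional[str]) -> bool:
    """Check the hash only when the client supplied one."""
    if source_hash is None:
        return True  # no hash sent, so nothing to verify
    return source_hash == md5_hex(current_source)


current_source = "Hello World"

# A gettext push sends the hash of the source it was translated from.
po_push = should_persist(current_source, md5_hex("Hello"))

# An xliff or properties push does not send a hash yet.
xliff_push = should_persist(current_source, None)

print(po_push, xliff_push)  # False True
```

The outdated gettext translation is rejected, while the same outdated translation pushed without a hash is still persisted.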

Comment 6 Sean Flanigan 2014-03-21 05:07:24 UTC
Pull request (WIP): https://github.com/zanata/zanata-common/pull/1

Comment 7 Sean Flanigan 2014-03-21 05:13:59 UTC
Also https://github.com/zanata/zanata-client/pull/14

Comment 8 Sean Flanigan 2014-03-26 03:12:11 UTC
Both pull requests have been updated.

Comment 9 Sean Flanigan 2014-03-26 04:05:11 UTC
Note that the required changes for PO files were in server commit 0a3971e, common commit 24c7951 and client commit 880c69a157.  Source content checking for PO files requires both client >= 3.2.0 and Zanata server >= 3.2.0.  (There was no separate Bugzilla bug for that change.)

Comment 10 Michelle Kim 2015-02-25 05:40:35 UTC
We have a similar bug, bug 1194543, which implements the solution for formats that use positional identifiers.

Comment 11 David Mason 2015-02-26 01:56:30 UTC
(In reply to Michelle Kim from comment #10)
> We have a similar bug, bug 1194543, which implements the solution for
> formats that use positional identifiers.

That bug is about source text that changes which identifier it uses (e.g. due to position change), but this bug is about translations being pushed that were translated from an old version of the source. I do not think they are related since the cause and the fixes for them are different.

Comment 12 Zanata Migrator 2015-07-29 03:35:34 UTC
Migrated; check JIRA for bug status: http://zanata.atlassian.net/browse/ZNTA-336

