1086459 – RFE: List multiple identical TM matches in descending chronological order

Keywords:
Status:	CLOSED UPSTREAM
Alias:	None
Product:	Zanata
Classification:	Retired
Component:	TranslationEditor
Sub Component:
Version:	3.3
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Patrick Huang
QA Contact:	Zanata-QA Mailing List
Docs Contact:	David Mason
URL:
Whiteboard:
Duplicates (1):	1086059
Depends On:
Blocks:

Reported:	2014-04-11 01:03 UTC by Noriko Mizumoto
Modified:	2015-07-31 01:46 UTC
CC List:	9 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Story Points:	---
Clone Of:
Environment:
Last Closed:	2015-07-31 01:46:25 UTC
Embargoed:

Description Noriko Mizumoto 2014-04-11 01:03:40 UTC

Description of problem:
In translation memory, when more than one match has the same similarity, those matches are listed in ascending chronological order (oldest first), followed by the lower-similarity matches. A newer match is generally an improved translation, so translators prefer to use it. The translator therefore has to scroll down to the bottom of the exact matches. That is not the true bottom of the list but often somewhere in the middle, because the lower-similarity matches follow.

Version-Release number of selected component (if applicable):


How reproducible:

For example, assume 7 memories show up. The newest exact match is the one modified on 2012/3/30. The translator needs to scroll down to the last exact-match entry, which is located in the middle of the entire list. In addition, to find the modified date, the translator needs to click the Info icon and scroll down to the bottom again for each entry.

100% blah blah blah (last modified 2011/1/24)
100% blah,blah,blah (last modified 2011/6/4)
100% blah, blah, blah (last modified 2011/12/30)
100% blah、blah、blah (last modified 2012/3/30)
45% blah? 
30% blah!
4% Blah!?

Actual results:
Matches with the same similarity are listed in ascending chronological order (oldest first).

Expected results:
Matches with the same similarity are listed in descending chronological order (newest first).

Additional info:

Comment 1 Ding-Yi Chen 2014-04-17 02:08:32 UTC

*** Bug 1086059 has been marked as a duplicate of this bug. ***

Comment 2 Michelle Kim 2015-05-29 01:11:12 UTC

Thanks Noriko and Yuko for filing this RFE. Marking it high priority to work on for our TM story.

Zanata team,

I believe this RFE makes good sense and is an important feature for translators. To reconfirm, here are the acceptance criteria:

If there are TM results with the same matching rate, e.g. two entries with 90% matches, the current implementation displays the older translation first and the newer translation next. It would be better to have the most recently translated entry show up on top when the rate is the same. A higher match should always show up above a lower match.

So we should change the way the TM displays the results as follows:

1. 100% match (translated June 1 2014)
2. 90% match (translated today)
3. 90% match (translated March 2015)
4. 80% match (translated today)
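
The ordering in the list above can be sketched as a two-key comparator: match rate descending, then translation date descending. This is only an illustration of the acceptance criteria, not Zanata's code. `TmMatchOrdering`, `TmMatch` and their fields are hypothetical names, and the concrete dates for "today" (taken as the date of this comment) and "March 2015" (day of month chosen arbitrarily) are assumptions.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class TmMatchOrdering {

    /** Hypothetical stand-in for one TM result row; not a Zanata class. */
    static final class TmMatch {
        final String label;
        final int percent;
        final LocalDate translated;

        TmMatch(String label, int percent, LocalDate translated) {
            this.label = label;
            this.percent = percent;
            this.translated = translated;
        }
    }

    /** Higher match rate first; among equal rates, most recent translation first. */
    static final Comparator<TmMatch> BY_RATE_THEN_NEWEST =
            Comparator.comparingInt((TmMatch m) -> m.percent).reversed()
                    .thenComparing(Comparator.comparing((TmMatch m) -> m.translated).reversed());

    /** Returns a sorted copy and leaves the input list untouched. */
    static List<TmMatch> sorted(List<TmMatch> matches) {
        List<TmMatch> copy = new ArrayList<>(matches);
        copy.sort(BY_RATE_THEN_NEWEST);
        return copy;
    }

    public static void main(String[] args) {
        // Assumption: "today" is the date of the comment, 2015-05-29.
        LocalDate today = LocalDate.of(2015, 5, 29);
        List<TmMatch> unsorted = new ArrayList<>();
        // Assumption: the day of month for "March 2015" is arbitrary.
        unsorted.add(new TmMatch("90% match (translated March 2015)", 90, LocalDate.of(2015, 3, 15)));
        unsorted.add(new TmMatch("80% match (translated today)", 80, today));
        unsorted.add(new TmMatch("100% match (translated June 1 2014)", 100, LocalDate.of(2014, 6, 1)));
        unsorted.add(new TmMatch("90% match (translated today)", 90, today));

        for (TmMatch m : sorted(unsorted)) {
            System.out.println(m.label);
        }
    }
}
```

The date only acts as a tie-breaker here, so an older 100% match still ranks above a 90% match translated today.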

Carlos, I am passing this issue to you so that Zanata team can discuss the story points.

Thanks,
Michelle

Comment 3 Carlos Munoz 2015-05-29 03:41:32 UTC

Assigning to Patrick for initial assessment.

Comment 4 Patrick Huang 2015-05-29 05:30:13 UTC

It's a simple change. We should be able to make it as part of 3.7 if we want.

Comment 5 Michelle Kim 2015-05-29 05:43:20 UTC

Hi Patrick

It would be great if it can make it into 3.7.

Carlos, do you agree?

Thanks
Michelle

Comment 6 Patrick Huang 2015-06-01 00:19:30 UTC

More implementation detail:
For the old editor, it seems that we only need to change a boolean value at org.zanata.service.impl.TranslationMemoryServiceImpl#searchTransMemory

This will set Lucene to sort by date. I am not sure what this will do to the overall sorting (e.g. whether it will sort by score first and then by date, or ONLY by date).
If Lucene sorting does not give us what we want, a post-process sort should be easy enough to do.

For the new editor, since it is still a work in progress, we just need to make sure it is implemented this way.
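
A minimal sketch of the post-process fallback mentioned above, assuming the search backend already returns hits in descending score order. It leaves the score order alone and only reorders each run of equal scores so that the newest entry comes first. `TieBreakPostProcess`, `Hit` and their fields are hypothetical names, not Zanata or Lucene types. The four exact matches use the dates from the Description, while the dates on the lower matches are made up.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class TieBreakPostProcess {

    /** Hypothetical search hit; not a Zanata or Lucene type. */
    static final class Hit {
        final String text;
        final int percent;
        final LocalDate lastModified;

        Hit(String text, int percent, LocalDate lastModified) {
            this.text = text;
            this.percent = percent;
            this.lastModified = lastModified;
        }
    }

    private static final Comparator<Hit> NEWEST_FIRST =
            Comparator.comparing((Hit h) -> h.lastModified).reversed();

    /**
     * Expects hits already ordered by descending score. Returns a copy in which
     * each run of equal scores is reordered newest first; runs keep their positions.
     */
    static List<Hit> newestFirstWithinEqualScores(List<Hit> hitsByScore) {
        List<Hit> out = new ArrayList<>(hitsByScore);
        int start = 0;
        while (start < out.size()) {
            int end = start + 1;
            // Extend the run while the next hit has the same score.
            while (end < out.size() && out.get(end).percent == out.get(start).percent) {
                end++;
            }
            out.subList(start, end).sort(NEWEST_FIRST);
            start = end;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("blah blah blah", 100, LocalDate.of(2011, 1, 24)));
        hits.add(new Hit("blah,blah,blah", 100, LocalDate.of(2011, 6, 4)));
        hits.add(new Hit("blah, blah, blah", 100, LocalDate.of(2011, 12, 30)));
        hits.add(new Hit("blah、blah、blah", 100, LocalDate.of(2012, 3, 30)));
        // Assumption: the Description gives no dates for the lower matches.
        hits.add(new Hit("blah?", 45, LocalDate.of(2012, 1, 1)));
        hits.add(new Hit("blah!", 30, LocalDate.of(2013, 1, 1)));
        hits.add(new Hit("Blah!?", 4, LocalDate.of(2014, 1, 1)));

        for (Hit h : newestFirstWithinEqualScores(hits)) {
            System.out.println(h.percent + "% " + h.text + " (last modified " + h.lastModified + ")");
        }
    }
}
```

Because only runs of equal scores are touched, this works regardless of how the backend breaks ties, at the cost of one extra pass over the result list.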

Comment 7 Luke Brooker 2015-07-21 03:26:42 UTC

Note: this is implemented on the front-end of the new editor in 3.7, but not in the current editor.

Comment 8 Michelle Kim 2015-07-21 04:06:45 UTC

An additional note: we intend to fix issues and introduce new features in the new editor rather than the current editor, with the intention of replacing the old editor with the new editor in the near future. We have a list of essential features that the new editor needs before the switch happens: https://bugzilla.redhat.com/show_bug.cgi?id=1232090

Patrick, shall we mark this issue as Verified in the 3.7 release?

Comment 9 Sean Flanigan 2015-07-27 02:34:41 UTC

As Patrick said, it should be trivial to get Hibernate Search to use date as a tie-breaker for search results with the same score.  This should ensure that the newest of several identical translations is in the top 10 cut-off.  However, this change won't directly affect the grouping of similar translations by itself.  That grouping currently ignores timestamps (see TransMemoryResultComparator).

That said, I think we should consider making the small change soon, even in the old editor, because the sometimes unintuitive selection of the top 10 results leads to user complaints and wasted time.
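
To illustrate why the tie-breaker matters for the top 10 cut-off, here is a small sketch that sorts hits and then truncates them to N entries. It is plain Java, not Hibernate Search code, and all names (`TopTenCutOff`, `Hit`, `topN`, `containsDate`) are hypothetical. The oldest-first comparator is included only as a contrasting case and is not a claim about how the current search orders ties.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class TopTenCutOff {

    /** Hypothetical search hit; not a Hibernate Search or Zanata type. */
    static final class Hit {
        final double score;
        final LocalDate lastModified;

        Hit(double score, LocalDate lastModified) {
            this.score = score;
            this.lastModified = lastModified;
        }
    }

    private static final Comparator<Hit> BY_SCORE_DESC =
            Comparator.comparingDouble((Hit h) -> h.score).reversed();

    /** Contrasting case: equal scores ordered oldest first. */
    static final Comparator<Hit> SCORE_THEN_OLDEST =
            BY_SCORE_DESC.thenComparing(Comparator.comparing((Hit h) -> h.lastModified));

    /** Proposed behaviour: equal scores ordered newest first. */
    static final Comparator<Hit> SCORE_THEN_NEWEST =
            BY_SCORE_DESC.thenComparing(Comparator.comparing((Hit h) -> h.lastModified).reversed());

    /** Sorts with the given order, then keeps only the first n hits. */
    static List<Hit> topN(List<Hit> hits, Comparator<Hit> order, int n) {
        return hits.stream().sorted(order).limit(n).collect(Collectors.toList());
    }

    static boolean containsDate(List<Hit> hits, LocalDate date) {
        return hits.stream().anyMatch(h -> h.lastModified.equals(date));
    }

    public static void main(String[] args) {
        // Twelve identical-score hits, one per day; the last one is the newest.
        List<Hit> hits = new ArrayList<>();
        LocalDate first = LocalDate.of(2015, 1, 1);
        for (int i = 0; i < 12; i++) {
            hits.add(new Hit(1.0, first.plusDays(i)));
        }
        LocalDate newest = first.plusDays(11);

        System.out.println("oldest-first keeps newest: "
                + containsDate(topN(hits, SCORE_THEN_OLDEST, 10), newest));
        System.out.println("newest-first keeps newest: "
                + containsDate(topN(hits, SCORE_THEN_NEWEST, 10), newest));
    }
}
```

With more than N hits sharing the top score, the cut-off drops whichever entries the tie-breaker ranks last, so a newest-first tie-breaker is what keeps the latest translation in the visible results.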

Comment 10 Damian Jansen 2015-07-27 05:51:13 UTC

https://github.com/zanata/zanata-server/pull/923

Comment 11 Sean Flanigan 2015-07-27 06:07:16 UTC

Damian, I wouldn't say https://github.com/zanata/zanata-server/pull/923 is trying to implement the enhancement described by this bug (match sorting in the editor), just a subset of it (ensuring newer matches are in there somewhere).

Comment 12 Damian Jansen 2015-07-28 04:46:48 UTC

Agreed.

Comment 13 Zanata Migrator 2015-07-31 01:46:25 UTC

Migrated; check JIRA for bug status: http://zanata.atlassian.net/browse/ZNTA-533
