Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1111021

Summary: Translation Memory search option does not display all the occurrences of a term
Product: [Retired] Zanata
Reporter: Julie Carbone <jcarbone>
Component: Usability, TranslationEditor
Assignee: Damian Jansen <djansen>
Status: CLOSED UPSTREAM
QA Contact: Zanata-QA Mailing List <zanata-qa>
Severity: medium
Docs Contact:
Priority: medium
Version: unspecified
CC: damason, jcarbone, yshao, zanata-bugs
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-07-29 03:33:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Julie Carbone 2014-06-19 04:54:22 UTC
Description of problem:
The TM provides only a restricted number of translation options when given a specific term, rather than displaying all the occurrences of that term.

Version-Release number of selected component (if applicable):
Zanata 3.4.1 (git-server-3.4.1-dirty)

How reproducible:
Always

Steps to Reproduce:
1. Log in to Zanata 
2. Open a translation project
3. In the translation editor, search for any term, such as "mount", and observe the results
4. Run another search with an additional term next to "mount", such as "mount the", and observe the results (more results appear than in the previous search)


Actual results:
In the first search, the Translation Memory displays only the closest memory entries based on similarity with the term(s), such as "mount point" and "# mount -a", rather than displaying every string where the term appears.
To obtain more results, the only solution seems to be to add more terms to the search box.

Expected results:
Being able to see all the translation results for any term, regardless of their similarity with the source term or sentence.

Comment 1 Isaac Rooskov 2014-10-01 01:39:50 UTC
Might be something to do with the similarity algorithm.
May be related to single word vs multiple word searches.

Comment 2 David Mason 2015-06-11 05:59:35 UTC
Technical note:

I recently looked at the similarity percentage algorithm. There should be no difference in calculated scores between "mount" and "mount the", since "the" is considered a stop-word and is removed before comparison.
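
To illustrate the stop-word point, here is a minimal Python sketch. It is not Zanata's code: the stop-word list and the `normalize` helper are assumptions made for illustration, and the real analyzer's list and tokenization may differ.

```python
# Hypothetical stop-word list for illustration only; the list actually
# used by the server may differ.
STOP_WORDS = {"a", "an", "the", "of", "to"}


def normalize(query):
    """Lowercase the query, split on whitespace, and drop stop-words."""
    return [token for token in query.lower().split() if token not in STOP_WORDS]


# Both queries reduce to the same token list, so any similarity score
# computed from these tokens alone would be identical.
print(normalize("mount"))
print(normalize("mount the"))
```

If the scoring input is the same for both queries, a difference in the displayed results would have to come from an earlier or later stage than the similarity calculation.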

The two most likely possibilities are:

 - The Lucene search returns fewer results for the shorter search, before we have a chance to do any similarity calculation.

 - Many identical results are returned and combined into a single item. If this happens, the total result count will be lower. Look at the detailed info for a match to see how many results it represents.

Comment 3 Julie Carbone 2015-07-01 00:32:41 UTC
I am still experiencing the same problem. Another example:

Look up "version" in the TM
This gives you only two results: 
--version
Version

Then look up "Memory size exceeds supported limit for given cluster version." in the TM. This gives you a choice of 6 strings.

It would be helpful if the search of one word, like "version" here, could give us actual sentences where the term appears, rather than just a one-word TM result.

Thank you

Julie

Comment 4 David Mason 2015-07-01 02:25:17 UTC
(In reply to Julie Carbone from comment #3)
> I am still experiencing the same problem. Another example:
> 
> Look up "version" in the TM
> This gives you only two results: 
> --version
> Version

Please run that search again and let me know how many matches there are for each result. This is shown in the "#" column of the TM (just to the left of the Copy button).

That will help figure out why there are so few matches being shown.

Comment 5 Julie Carbone 2015-07-01 04:00:03 UTC
Version: #9
--version: #1

Comment 6 David Mason 2015-07-01 22:21:30 UTC
(In reply to Julie Carbone from comment #5)
> Version: #9
> --version: #1

It looks like the search finds 10 results, but most of them are "Version" with the same translation, so they are all combined into a single row in the results.
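
The combining step can be pictured with a short Python sketch. This is an illustration only, not the Zanata implementation: the hit tuples and the placeholder translations are made up, and only the counts (9 and 1) come from comment 5.

```python
from collections import Counter

# Hypothetical raw hits as (source text, translation) pairs. The
# translations are placeholders; only the counts mirror comment 5.
hits = [("Version", "<translation A>")] * 9 + [("--version", "<translation B>")]

# Hits with an identical source and translation collapse into one row;
# the count per row is what the "#" column would show.
rows = Counter(hits)

for (source, _translation), count in rows.items():
    print(f"{source}: #{count}")
```

Ten raw hits therefore produce only two visible rows, which matches what the reporter sees.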

We should review the different stages of search and filtering, and try to make sure they return enough results for 10 or more rows to display in the results, regardless of how many actual matches there are for each row.

Another suggestion is to include a way to search for more results. For example, a button below the list of results that searches for more and adds them to the list.
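
One way to picture both suggestions is a loop that keeps requesting further pages of raw hits until enough distinct rows exist. The Python sketch below is a hedged illustration: `search_page`, `collect_rows`, the page size, and the sample data are all hypothetical and do not correspond to Zanata's or Lucene's API.

```python
def collect_rows(search_page, min_rows=10, page_size=10, max_pages=20):
    """Fetch pages of raw hits until `min_rows` distinct rows are collected.

    `search_page(offset, limit)` is a hypothetical callable that returns a
    list of (source, translation) hits; an empty list means no more hits.
    """
    rows = {}  # (source, translation) -> number of matches combined
    for page in range(max_pages):
        hits = search_page(page * page_size, page_size)
        if not hits:
            break  # the index has no further hits
        for hit in hits:
            rows[hit] = rows.get(hit, 0) + 1
        if len(rows) >= min_rows:
            break  # enough distinct rows to fill the list
    return rows


# Made-up index: nine duplicate hits first, then twelve distinct sentences.
INDEX = [("Version", "t0")] * 9 + [
    (f"sentence {i} about version", f"t{i}") for i in range(1, 13)
]


def fake_search(offset, limit):
    """Stand-in for a paged search backend (assumption for this sketch)."""
    return INDEX[offset:offset + limit]


rows = collect_rows(fake_search)
print(len(rows))
```

A "search for more" button could call the same function with a larger `min_rows`; the `max_pages` cap keeps the loop bounded when the index is dominated by duplicates.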

Comment 7 Zanata Migrator 2015-07-29 03:33:50 UTC
Migrated; check JIRA for bug status: http://zanata.atlassian.net/browse/ZNTA-323