Red Hat Bugzilla – Bug 1269333
Result of searching assets by business-central includes duplicate records
Last modified: 2015-11-24 03:40:07 EST
Description of problem:
Result of searching assets by business-central shows 2 records for an asset.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. On business-central, create a drl named "english" and save it.
2. On business-central, search asset by entering "english"
Actual results:
Search result has 2 records for an asset named "english"
Expected results:
Search result has only 1 record for an asset named "english"
Still reproducible in one special case. Please follow these steps:
1. create drl file named "any_file_name.drl"
2. type "*_file_*" as search pattern
3. click magnifying glass icon
The any_file_name.drl file appears twice in the search results.
Did you delete the existing .index folder before testing?
Please be more specific: which .index folder do you mean? Is it a folder of the application server or of the web browser?
Confirmed: the described behaviour is still reproducible even after I deleted the .index folder of the application server.
(In reply to Jozef Marko from comment #9)
> Confirmed, even if I deleted .index folder of application server is
> described behaviour reproducible.
Can you please be more specific; is just the "special case" still reproducible, or do you mean it's still "generally" broken?
It is broken only in special cases. I was able to reproduce duplicated search results with the following:
Created files: "any_drl_file.drl", "any_scenario_file.scenario"
Search patterns: *_drl_*, *_scenario_*
For example, searching for business processes works well; there are no duplicated search results.
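As a rough illustration of the expected behaviour for the patterns above, each pattern should return its matching file exactly once. This is only a sketch: Python's fnmatch stands in for the real search backend (an assumption), and the helper name expected_hits is hypothetical.

```python
from fnmatch import fnmatchcase

# File names and search patterns taken from the reproduction steps above.
files = ["any_drl_file.drl", "any_scenario_file.scenario"]
patterns = ["*_drl_*", "*_scenario_*"]


def expected_hits(pattern, names):
    """Return every name matching the wildcard pattern, once each, in order."""
    return [name for name in names if fnmatchcase(name, pattern)]


# One list of hits per pattern; a correct search shows one record per file.
results = {pattern: expected_hits(pattern, files) for pattern in patterns}
```

Any search result that lists a file more often than this sketch does would count as a duplicate.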
The duplication is not necessarily a bug (OK, it looks bad in the web page). What happens is basically that the index engine indexes all branches, and origin/master and master end up with exactly the same content.
So it's necessary to limit the results to display only the current branch's results (master or whatever, as asset management may have changed the current branch as well).
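The branch filtering proposed above could look roughly like the following. This is a minimal sketch, not the Business Central implementation: the Hit record, the hits_on_branch helper, and the repository name "repo" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical search hit: one indexed copy of an asset."""
    repository: str
    branch: str
    path: str


def hits_on_branch(hits, current_branch):
    """Keep only the hits that were indexed from the current branch."""
    return [hit for hit in hits if hit.branch == current_branch]


# The same asset indexed from two branches with identical content.
hits = [
    Hit("repo", "master", "english.drl"),
    Hit("repo", "origin/master", "english.drl"),
]

# Only the copy from the current branch should be displayed.
visible = hits_on_branch(hits, "master")
```

The current branch is passed in as a parameter rather than hard-coded, since asset management may have switched it away from master.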
(In reply to Alexandre Porcelli from comment #12)
> The duplication is not necessarly a bug (ok, it looks bad in the WEB page) -
> what happens is basically the index engine index all branches... And
> origin/master and master end up with exact same content...
> So it's necessary to limit the results to just display current branch
> results (master or whatever - as asset management may have changed the
> current branch as well)
Hi Alexandre, it's not that(*)!
I can replicate it quite easily (following Jozef's steps), and the Lucene index (queried with Luke) returns two records for "fullText:*_drl_*"; each has the same ClusterId (Repository), SegmentId (branch) and Key (Path). I suspect it might relate to the use of StandardAnalyzer for fullText, which treats underscores as word breaks. I'll look into it more.
(*) Plus, I fixed another BZ in which a BatchIndex LIST_MODE of "all" on restart led to multiple branches being indexed.
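The check described above (two records sharing the same ClusterId, SegmentId and Key) can be sketched as a simple grouping. The records here are plain dictionaries with made-up values, and duplicate_keys is a hypothetical helper; this does not query a real Lucene index.

```python
from collections import Counter


def duplicate_keys(records):
    """Return the (ClusterId, SegmentId, Key) triples that occur more than once."""
    counts = Counter(
        (record["ClusterId"], record["SegmentId"], record["Key"])
        for record in records
    )
    return {triple: count for triple, count in counts.items() if count > 1}


# Hypothetical index dump: two entries for one asset, one for another.
records = [
    {"ClusterId": "repo", "SegmentId": "master", "Key": "any_drl_file.drl"},
    {"ClusterId": "repo", "SegmentId": "master", "Key": "any_drl_file.drl"},
    {"ClusterId": "repo", "SegmentId": "master", "Key": "any_scenario_file.scenario"},
]

duplicates = duplicate_keys(records)
```

Because the duplicated records agree on repository and branch as well as path, a per-branch filter alone would not remove them.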
OK, the issue was that multiple index entries were being created for "full text" searches (which happen to be what is used in the textbox/magnifying glass use case). The issue has nothing to do with the use of underscores: you can equally search for just "*drl*" etc.
Processes are not affected as they do not have any "additional indexing" implementation (perhaps they should, and this should be the subject of a new BZ?). Additional indexing is used to create meta-data to support (a) impact analysis and (b) refactoring support (features which are to be exposed more in Business Central in the future).
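A toy model of the symptom described above: when two indexing passes each append a full-text entry for the same asset, a search returns it twice, whereas replacing by key returns it once. This is only an illustration under assumptions; ToyIndex, its methods, and the key value are hypothetical and do not describe the actual fix.

```python
class ToyIndex:
    """Hypothetical in-memory index of (key, full_text) entries."""

    def __init__(self):
        self.entries = []

    def add(self, key, text):
        """Append an entry without checking for an existing one."""
        self.entries.append((key, text))

    def upsert(self, key, text):
        """Replace any existing entry for the key, then add the new one."""
        self.entries = [(k, t) for k, t in self.entries if k != key]
        self.entries.append((key, text))

    def search(self, needle):
        """Return the key of every entry whose text contains the needle."""
        return [key for key, text in self.entries if needle in text]


KEY = ("repo", "master", "any_drl_file.drl")  # made-up identity triple

# Two passes (say, a main and an additional indexer) both append an entry.
appended = ToyIndex()
appended.add(KEY, "any_drl_file.drl")
appended.add(KEY, "any_drl_file.drl")

# The same two passes, but each one replaces the entry for the key.
replaced = ToyIndex()
replaced.upsert(KEY, "any_drl_file.drl")
replaced.upsert(KEY, "any_drl_file.drl")
```

Searching for "drl" (no underscores involved) already shows the difference between the two indexes.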
No duplicated records in search results.
Verified on 6.2.0.CR1.