Bug 1074705

Summary: Look at caching minhash xor values
Product: [Community] PressGang CCMS
Reporter: Matthew Casperson <mcaspers>
Component: CCMS-Core
Assignee: Lee Newson <lnewson>
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Version: 1.4
CC: cbredesen, lnewson
Target Milestone: ---
Target Release: 1.5
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Story Points: ---
Last Closed: 2014-05-01 21:39:59 UTC
Type: Bug

Description Matthew Casperson 2014-03-10 22:43:45 UTC
The MySQL slow query log is full of entries where the complete list of minhash XOR values is queried and returned. The query comes from the recalculateMinHashes() method in RESTv1.java:

final List<MinHashXOR> minHashXORs = entityManager.createQuery(MinHashXOR.SELECT_ALL_QUERY).getResultList();

Since this information is very unlikely to change, we should consider caching it between calls.

Comment 1 Lee Newson 2014-04-11 03:16:19 UTC
The same caching should be applied to the Topic XML factory, which issues the same slow query.

Comment 2 Lee Newson 2014-04-11 04:42:57 UTC
Fixed in 1.5-SNAPSHOT build 201404111431

The result of the MinHashXOR select-all query is now cached on first use in the CacheEntityLoader class. All code that needs the full list of MinHashXORs now goes through the CacheEntityLoader, and the cached values are invalidated whenever the MinHashXORs are recalculated.
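
For anyone following along, the cache-on-first-use and invalidate-on-recalculate behaviour described here could look roughly like the sketch below. This is not the actual CacheEntityLoader code: the class name XorValueCache and the Supplier-based loader are hypothetical, with the loader standing in for the MinHashXOR.SELECT_ALL_QUERY call.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for the caching added to CacheEntityLoader. */
final class XorValueCache<T> {
    // Assumption: the loader wraps the expensive select-all query.
    private final Supplier<List<T>> loader;
    // null until first use, and again after invalidate().
    private List<T> cached;

    XorValueCache(Supplier<List<T>> loader) {
        this.loader = loader;
    }

    /** Returns the cached list, running the loader only when nothing is cached. */
    synchronized List<T> getAll() {
        if (cached == null) {
            cached = Collections.unmodifiableList(loader.get());
        }
        return cached;
    }

    /** Drops the cached list so the next getAll() reloads it. */
    synchronized void invalidate() {
        cached = null;
    }
}
```

With something like this, repeated getAll() calls hit the database once, and the recalculation code would call invalidate() after writing new values.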

I've also optimized how minhashes are calculated: logic that only needs to run once is no longer repeated for every MinHashXOR value, which saves processing time.
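
The comment does not say which logic was moved, so the following is only a guess at the general shape of such an optimization, not the PressGang code. It assumes a common minhash scheme in which each shingle is hashed once and the hash is then XORed with each stored value; the class MinHashSketch, the use of String.hashCode(), and the int types are all illustrative assumptions.

```java
import java.util.List;

/** Hypothetical sketch of hoisting per-shingle work out of the per-XOR loop. */
final class MinHashSketch {
    /**
     * Computes one minhash per XOR value. The shingle hashes do not depend on
     * the XOR value, so they are computed once, before the per-XOR loop.
     */
    static int[] minHashes(List<String> shingles, int[] xorValues) {
        // Done once: hash every shingle.
        int[] baseHashes = new int[shingles.size()];
        for (int i = 0; i < baseHashes.length; i++) {
            baseHashes[i] = shingles.get(i).hashCode();
        }

        // Done per XOR value: only the cheap XOR and min remain.
        int[] result = new int[xorValues.length];
        for (int x = 0; x < xorValues.length; x++) {
            int min = Integer.MAX_VALUE;
            for (int base : baseHashes) {
                min = Math.min(min, base ^ xorValues[x]);
            }
            result[x] = min;
        }
        return result;
    }
}
```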

Note: This version is currently live on the test/development server.

Comment 4 Matthew Casperson 2014-04-23 19:36:49 UTC
I changed the HashMap in CacheEntityLoader to a ConcurrentHashMap. From what I can tell, concurrent read and write access to a plain HashMap is not thread-safe and can lead to unpredictable results.
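
To illustrate the kind of change, here is a minimal sketch of a ConcurrentHashMap-backed cache. It is not the real CacheEntityLoader: the class ConcurrentListCache, the String keys, and the Supplier-based loader are assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

/** Hypothetical thread-safe cache keyed by an arbitrary name. */
final class ConcurrentListCache {
    private final ConcurrentMap<String, List<?>> cache = new ConcurrentHashMap<>();

    /**
     * Returns the cached list for the key, loading it if absent.
     * ConcurrentHashMap.computeIfAbsent runs the loader at most once per key,
     * even when several threads ask at the same time. The loader must not
     * return null, otherwise nothing is cached.
     */
    @SuppressWarnings("unchecked")
    <T> List<T> getOrLoad(String key, Supplier<List<T>> loader) {
        return (List<T>) cache.computeIfAbsent(key, k -> loader.get());
    }

    /** Removes the entry so the next getOrLoad() reloads it. */
    void invalidate(String key) {
        cache.remove(key);
    }
}
```

Readers and writers can then share the map without external locking, which a plain HashMap does not allow once any thread modifies it.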