Description of problem:
Calling Category.removeChild(ACSObject) triggers the following query:

    select ao.object_id as c_1,
           ao.object_type as c_2,
           ao.display_name as c_3,
           ao.default_domain_class as c_4,
           link__cocm.object_id as c_5,
           link__cocm.category_id as c_6,
           link__cocm.default_p as c_7,
           link__cocm.index_p as c_8,
           link__cocm.sort_key as c_9
    from acs_objects ao
         cross join cat_object_category_map link__cocm
    where link__cocm.category_id = ?
      and link__cocm.object_id = ?

The where clause restricts cat_object_category_map to a single row, but that row is then cartesian-joined against acs_objects. As a result, this code from com.redhat.persistence.Session now iterates over every single row in acs_objects:

    public void onLink(Link link) {
        Query q = getQuery(obj, link, value);
        Cursor c = retrieve(q).getDataSet().getCursor();
        while (c.next()) {
            e.expand(new DeleteEvent(Session.this, c.get("link")));
        }
    }

On my database this is 12,000 rows and takes 6 seconds.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Removing an object from a category takes 6 seconds

Expected results:
Removing an object from a category takes ~10ms

Additional info:
Incidentally, I ran a profiler on the 'remove' method, and it showed c.r.persistence.metadata.Model.getQualifiedName as a hotspot, accounting for 20% of the execution time. This suggests that the qualified name should be denormalized, or perhaps that the method should change from String getQualifiedName() to void buildQualifiedName(StringBuffer buf), to avoid a huge number of redundant String allocations/concatenations.
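To make the getQualifiedName() suggestion above concrete, here is a minimal sketch of the two styles. The class QualifiedNameSketch, its parent/name fields, and the dot separator are hypothetical stand-ins for illustration; this is not the actual c.r.persistence.metadata.Model code, whose internals are not shown in this ticket.

```java
// Hypothetical stand-in for a metadata node with a parent and a simple name.
// NOT the real com.redhat.persistence.metadata.Model API.
final class QualifiedNameSketch {
    private final QualifiedNameSketch parent; // null for the root
    private final String name;

    QualifiedNameSketch(QualifiedNameSketch parent, String name) {
        this.parent = parent;
        this.name = name;
    }

    // Allocating style: every level of nesting builds a new intermediate
    // String, and all of that work is repeated on every call.
    String getQualifiedName() {
        if (parent == null) {
            return name;
        }
        return parent.getQualifiedName() + "." + name;
    }

    // Buffer style: every level appends into one caller-supplied buffer,
    // so no intermediate Strings are created along the way.
    void buildQualifiedName(StringBuffer buf) {
        if (parent != null) {
            parent.buildQualifiedName(buf);
            buf.append('.');
        }
        buf.append(name);
    }

    public static void main(String[] args) {
        QualifiedNameSketch com = new QualifiedNameSketch(null, "com");
        QualifiedNameSketch arsdigita = new QualifiedNameSketch(com, "arsdigita");
        QualifiedNameSketch kernel = new QualifiedNameSketch(arsdigita, "kernel");

        StringBuffer buf = new StringBuffer();
        kernel.buildQualifiedName(buf);

        // Both styles produce the same text.
        System.out.println(kernel.getQualifiedName());
        System.out.println(buf);
    }
}
```

The denormalized alternative would compute the qualified name once (for example in the constructor) and return the stored String; whether that is safe depends on whether names or parents can change after construction, which this ticket does not say.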
There should probably be a separate ticket for getQualifiedName(). Otherwise, this may get lost.
Dan, what sort of profiler do you use these days?
I was evaluating JFluid (see http://research.sun.com/projects/jfluid/index.html) to see whether it would identify the remove method as the hotspot more effectively than my manual debugging. However, it's not a viable long-term solution because of this license term: <quote> 3.4. Licensee shall have no right to use the Licensed Software for productive or commercial use. </quote>
Note to self: Create a content item and categorize it, so that there is at least one entry in the cat_object_category_map table:

    select category_id, object_id from cat_object_category_map;

     category_id | object_id
    -------------+-----------
            1054 |      9008

A shorter (and slightly more readable [for me]) form of the offending query is this:

    select ao.object_id as aoid, cocm.object_id, cocm.category_id
    from acs_objects ao
         cross join cat_object_category_map cocm
    where cocm.category_id = 1054
      and cocm.object_id = 9008;

This is in fact equivalent to a cartesian join of acs_objects and a single-row table.

Category#removeChild(ACSObject) basically does this:

    remove(CHILD_OBJECTS, acsObj);

which seems correct on the face of it. The association is specified as follows:

    association {
        Category[0..n] categories = ...
        ACSObject[0..n] childObjects = ..
        Boolean[0..1] isDefault = cat_object_category_map.default_p CHAR(1);
        ...
    }

I have no idea why remove() triggers the above "select" query. Since I'm totally out of my depth on this one, I'm cc'ing Archit so he can take a look at this when time permits.
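The row-count effect of the cross join in the comment above can be illustrated without a database. The sketch below simulates the query with in-memory lists: filtering only on the map table yields one result row per acs_objects row, while also requiring ao.object_id = cocm.object_id yields a single row. That extra join condition is an assumption about what a complete join would look like, not the actual fix; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// In-memory illustration only; no persistence-layer classes are involved.
final class CrossJoinSketch {

    // Builds object ids 1..n, standing in for the rows of acs_objects.
    static List<Long> sequentialIds(int n) {
        List<Long> ids = new ArrayList<>(n);
        for (long id = 1; id <= n; id++) {
            ids.add(id);
        }
        return ids;
    }

    // Counts the rows of "acs_objects cross join cat_object_category_map"
    // that survive the where clause. Each map row is {category_id, object_id}.
    // When joinOnObjectId is true, the (assumed) condition
    // ao.object_id = cocm.object_id is applied as well.
    static int countRows(List<Long> acsObjectIds, List<long[]> mapRows,
                         long categoryId, long objectId, boolean joinOnObjectId) {
        int rows = 0;
        for (long aoId : acsObjectIds) {
            for (long[] m : mapRows) {
                boolean filter = m[0] == categoryId && m[1] == objectId;
                boolean join = !joinOnObjectId || aoId == m[1];
                if (filter && join) {
                    rows++;
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Long> acsObjects = sequentialIds(12000);
        List<long[]> map = Collections.singletonList(new long[] {1054L, 9008L});

        System.out.println("cross join only:     "
                + countRows(acsObjects, map, 1054L, 9008L, false));
        System.out.println("with join condition: "
                + countRows(acsObjects, map, 1054L, 9008L, true));
    }
}
```

With 12,000 simulated acs_objects rows, the first count is 12000 and the second is 1, which matches the difference between the observed and expected behaviour of onLink().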
Forgot to add Archit to the cc list when posting comment #5.
Have we got an ETA on the fix for this bug? It is severely hampering use of the 'Assign Categories' form in CMS. Now that an initial 'empty' APLAWS install has about 20,000 rows in acs_objects, each category removal takes ~20 seconds.
This is *very* high priority for the client. What's the status?
After our client encountered this bug two days before a delivery deadline (with 120,000 rows in acs_objects), I have modified com.redhat.persistence.Session.getQuery() so it works when the value parameter is not null. See changelist 40423.

What I found is that getQuery() returns a Query like this when invoked with a null "value" parameter:

    com.arsdigita.kernel.ACSObject(id, objectType, displayName, defaultDomainClass,
        link.childObjects.id, link.categories.id, link.isDefault, link.isIndex, link.sortKey)
    filter(((link.categories = __from__) and (link.childObjects = null)))
    order()
    parameters{com.arsdigita.categorization.Category __from__=[com.arsdigita.categorization.Category:{id=268}]}

When invoked with a not-null value, the Query looks like this:

    com.arsdigita.kernel.ACSObject(id, objectType, displayName, defaultDomainClass,
        link.childObjects.id, link.categories.id, link.isDefault, link.isIndex, link.sortKey)
    filter(((link.categories = __from__) and (link.childObjects = __to__)))
    order()
    parameters{com.arsdigita.categorization.Category __from__=[com.arsdigita.categorization.Category:{id=268}],
        com.arsdigita.kernel.ACSObject __to__=[com.arsdigita.cms.ContentBundle:{id=306}]}

To me this looks like a bug in the Query class, leading to an incomplete join in the second case. But I only had enough time and knowledge about persistence to fix the getQuery() method, which is enough to speed up category removal.
Now that test-qgen has landed, the cartesian join is no longer generated.