Bug 617577 - Perf: Save database work on uninventory
Summary: Perf: Save database work on uninventory
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Core Server
Version: 4.0.0
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Heiko W. Rupp
QA Contact: Rajan Timaniya
URL:
Whiteboard:
Depends On:
Blocks: jon-sprint12-bugs rhq-perf
Reported: 2010-07-23 13:18 UTC by Heiko W. Rupp
Modified: 2013-09-02 07:24 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2013-09-02 07:24:41 UTC
Embargoed:


Attachments
Patch described (9.33 KB, application/octet-stream)
2010-07-23 13:18 UTC, Heiko W. Rupp
jProfiler output before/after patch + comparison (7.69 MB, application/zip)
2010-07-23 15:08 UTC, Heiko W. Rupp

Description Heiko W. Rupp 2010-07-23 13:18:10 UTC
Created attachment 433952 [details]
Patch described

1) Uninventorying a resource first loads the whole resource ancestor tree, including configuration objects, just to get the resource for a resource ID.

2) Uninventory queries the database for all children with an expensive query, only to rerun the expensive part again for the update.

The attached patch brings the uninventory time for 1 resource (service) down from ~12-20 sec to ~6 sec, on 1 server with > 1000 children.

This has been tested in the perf environment and locally on Postgres with an inventory of 1700 resources.

Comment 1 Charles Crouch 2010-07-23 13:20:40 UTC
Needs triage

Comment 2 Heiko W. Rupp 2010-07-23 13:23:26 UTC
Positive side effect: no more Tx deadlocks, even when uninventorying 90 servers with
>1000 children each.
Before the patch this ended (not reliably reproducibly) in ORA-0600 (or -0060?)
deadlock exceptions.

Comment 3 Heiko W. Rupp 2010-07-23 13:25:45 UTC
The following query accounted for > 50% of the activity on the database for quite some time during the uninventory of each resource. With the patch it is gone.

update RHQ_RESOURCE set INVENTORY_STATUS=:1 , AGENT_ID=null, PARENT_RESOURCE_ID=null, RESOURCE_KEY='deleted'
where ID=:2
or ID in (select resource1_.ID from RHQ_RESOURCE resource1_ inner join RHQ_RESOURCE resource2_ on resource1_.PARENT_RESOURCE_ID=resource2_.ID where resource2_.ID=:3 )
or ID in (select resource3_.ID from RHQ_RESOURCE resource3_ inner join RHQ_RESOURCE resource4_ on resource3_.PARENT_RESOURCE_ID=resource4_.ID inner join RHQ_RESOURCE resource5_ on resource4_.PARENT_RESOURCE_ID=resource5_.ID where resource5_.ID=:4 )
or ID in (select resource6_.ID from RHQ_RESOURCE resource6_ inner join RHQ_RESOURCE resource7_ on resource6_.PARENT_RESOURCE_ID=resource7_.ID inner join RHQ_RESOURCE resource8_ on resource7_.PARENT_RESOURCE_ID=resource8_.ID inner join RHQ_RESOURCE resource9_ on resource8_.PARENT_RESOURCE_ID=resource9_.ID where resource9_.ID=:5 )
or ID in (select resource10_.ID from RHQ_RESOURCE resource10_ inner join RHQ_RESOURCE resource11_ on resource10_.PARENT_RESOURCE_ID=resource11_.ID inner join RHQ_RESOURCE resource12_ on resource11_.PARENT_RESOURCE_ID=resource12_.ID inner join RHQ_RESOURCE resource13_ on resource12_.PARENT_RESOURCE_ID=resource13_.ID inner join RHQ_RESOURCE resource14_ on resource13_.PARENT_RESOURCE_ID=resource14_.ID where resource14_.ID=:6 )
or ID in (select resource15_.ID from RHQ_RESOURCE resource15_ inner join RHQ_RESOURCE resource16_ on resource15_.PARENT_RESOURCE_ID=resource16_.ID inner join RHQ_RESOURCE resource17_ on resource16_.PARENT_RESOURCE_ID=resource17_.ID inner join RHQ_RESOURCE resource18_ on resource17_.PARENT_RESOURCE_ID=resource18_.ID inner join RHQ_RESOURCE resource19_ on resource18_.PARENT_RESOURCE_ID=resource19_.ID inner join RHQ_RESOURCE resource20_ on resource19_.PARENT_RESOURCE_ID=resource20_.ID where resource20_.ID=:7 )
or ID in (select resource21_.ID from RHQ_RESOURCE resource21_ inner join RHQ_RESOURCE resource22_ on resource21_.PARENT_RESOURCE_ID=resource22_.ID inner join RHQ_RESOURCE resource23_ on resource22_.PARENT_RESOURCE_ID=resource23_.ID inner join RHQ_RESOURCE resource24_ on resource23_.PARENT_RESOURCE_ID=resource24_.ID inner join RHQ_RESOURCE resource25_ on resource24_.PARENT_RESOURCE_ID=resource25_.ID inner join RHQ_RESOURCE resource26_ on resource25_.PARENT_RESOURCE_ID=resource26_.ID inner join RHQ_RESOURCE resource27_ on resource26_.PARENT_RESOURCE_ID=resource27_.ID where resource27_.ID=:8 )
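
For illustration only: the statement above expands the descendant set with one subquery per tree depth, each re-joining RHQ_RESOURCE. One way to avoid that shape is to collect the descendant IDs once, level by level, and then update by primary key. This is an assumption about a possible approach and not necessarily what the attached patch does. The sketch below uses Python's sqlite3 with a cut-down RHQ_RESOURCE table; the function names, the batch size, and the 'UNINVENTORIED' / 'COMMITTED' status values are hypothetical.

```python
import sqlite3


def collect_descendant_ids(conn, root_id, batch=500):
    """Return root_id plus all descendant IDs, one simple query per tree level."""
    found = [root_id]
    frontier = [root_id]
    while frontier:
        next_level = []
        # Batch the IN list so the number of bind parameters stays bounded.
        for i in range(0, len(frontier), batch):
            chunk = frontier[i:i + batch]
            marks = ",".join("?" * len(chunk))
            rows = conn.execute(
                f"SELECT ID FROM RHQ_RESOURCE WHERE PARENT_RESOURCE_ID IN ({marks})",
                chunk,
            ).fetchall()
            next_level.extend(row[0] for row in rows)
        found.extend(next_level)
        frontier = next_level
    return found


def mark_uninventoried(conn, ids, status="UNINVENTORIED", batch=500):
    """Update the already collected rows by primary key, in batches."""
    for i in range(0, len(ids), batch):
        chunk = ids[i:i + batch]
        marks = ",".join("?" * len(chunk))
        conn.execute(
            "UPDATE RHQ_RESOURCE SET INVENTORY_STATUS=?, AGENT_ID=NULL, "
            "PARENT_RESOURCE_ID=NULL, RESOURCE_KEY='deleted' "
            f"WHERE ID IN ({marks})",
            [status, *chunk],
        )
    conn.commit()


def demo():
    """Build a tiny in-memory inventory and uninventory platform 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE RHQ_RESOURCE (ID INTEGER PRIMARY KEY, "
        "PARENT_RESOURCE_ID INTEGER, AGENT_ID INTEGER, "
        "INVENTORY_STATUS TEXT, RESOURCE_KEY TEXT)"
    )
    # Platform 1 -> server 2 -> services 3 and 4; platform 5 is unrelated.
    rows = [(1, None), (2, 1), (3, 2), (4, 2), (5, None)]
    conn.executemany(
        "INSERT INTO RHQ_RESOURCE VALUES (?, ?, 1, 'COMMITTED', 'k')", rows
    )
    ids = collect_descendant_ids(conn, 1)
    mark_uninventoried(conn, ids)
    return conn, ids
```

The trade-off is more round trips (one per tree level) in exchange for queries that only filter on PARENT_RESOURCE_ID or ID, with no self-joins.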

Comment 4 Charles Crouch 2010-07-23 14:02:40 UTC
This is an optimization, push into master only.

Comment 5 Heiko W. Rupp 2010-07-23 15:08:18 UTC
Created attachment 433981 [details]
jProfiler output before/after patch + comparison

Comment 6 Heiko W. Rupp 2010-07-23 15:09:32 UTC
(In reply to comment #5)
> Created an attachment (id=433981) [details]
> jProfiler output before/after patch + comparison

Forgot to mention that this was on my local box, uninventorying 1750 resources on 1 platform (i.e. uninventorying the whole platform).

Comment 7 Heiko W. Rupp 2010-07-25 19:57:46 UTC
d3ecc7e in master (RHQ 4)

Comment 8 Heiko W. Rupp 2010-08-09 08:00:58 UTC
Setting to ON_QA, as this is already in master (see previous comment).

Comment 9 Rajan Timaniya 2010-08-20 05:22:42 UTC
Heiko,

Can you please give steps to reproduce/test bug and confirm that the fix is in JON 2.4 GA?

Comment 10 Heiko W. Rupp 2010-08-20 05:56:23 UTC
Rajan, this is master only, so it will be in RHQ 4, but it is not in JON 2.4 GA.

Comment 11 Heiko W. Rupp 2010-10-18 13:30:12 UTC
To try to reproduce:
- Take a platform with multiple servers and multiple services per server into inventory. More than 1000 services works best. E.g. take a Postgres server, create some logical databases, and in each create 1100 tables (via a script).
- Take this into inventory. When everything has settled, uninventory the whole platform.
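
The table-creation step above could be scripted along these lines. This is a minimal sketch that only generates SQL text; the table name prefix and the single `id` column are hypothetical, and nothing here is taken from the actual test setup.

```python
def create_table_statements(count=1100, prefix="rhq_perf_t"):
    """Yield one CREATE TABLE statement per table (names are made up)."""
    for n in range(1, count + 1):
        yield f"CREATE TABLE {prefix}{n:04d} (id integer PRIMARY KEY);"


def render(count=1100, prefix="rhq_perf_t"):
    """Return all statements as one newline-separated script."""
    return "\n".join(create_table_statements(count, prefix))
```

Printing `render()` and piping the result into `psql -d <database>` once per logical database would create the tables; the database names are left to the tester.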

Comment 12 Rajan Timaniya 2010-10-19 11:18:56 UTC
Verified on RHQ-Master build #423
(http://hudson-qe.rhq.rdu.redhat.com:8080/view/RHQ/job/ci-rhq-master/423/)

Steps:
1) Installed RHQ server and agent
2) The agent box has multiple servers configured (JBoss EAP 5.0 with 1048 services and Postgres with 2008 services)
3) Made sure that all are in inventory and settled
4) Uninventoried the whole platform

Observation:
The platform was uninventoried with all its resources within 3-4 sec.

2010-10-19 16:44:27,372 INFO  [org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean] 10.65.193.1 took [37]ms to reload cache for 1 agents
2010-10-19 16:44:31,480 INFO  [org.rhq.enterprise.server.discovery.DiscoveryServerServiceImpl] Processed AV:[rajantest][329][full] - need full=[false] in (233)ms
2010-10-19 16:44:35,918 INFO  [org.rhq.enterprise.server.resource.ResourceManagerBean] User [org.rhq.core.domain.auth.Subject[id=2,name=rhqadmin]] is marking resource [Resource[id=10331, type=Linux, key=rajantest, name=rajantest, parent=<null>, version=Linux 2.6.18-164.el5]] for asynchronous uninventory

Comment 13 Heiko W. Rupp 2013-09-02 07:24:41 UTC
Bulk closing of issues that were VERIFIED, had no target release and where the status changed more than a year ago.

