Bug 617577 - Perf: Save database work on uninventory
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Core Server
Version: 4.0.0
Hardware: All
OS: All
Priority: high
Severity: high
Assigned To: Heiko W. Rupp
QA Contact: Rajan Timaniya
Blocks: jon-sprint12-bugs rhq-perf
Reported: 2010-07-23 09:18 EDT by Heiko W. Rupp
Modified: 2013-09-02 03:24 EDT

Doc Type: Bug Fix
Last Closed: 2013-09-02 03:24:41 EDT

Attachments
Patch described (9.33 KB, application/octet-stream)
2010-07-23 09:18 EDT, Heiko W. Rupp
jProfiler output before/after patch + comparison (7.69 MB, application/zip)
2010-07-23 11:08 EDT, Heiko W. Rupp

Description Heiko W. Rupp 2010-07-23 09:18:10 EDT
Created attachment 433952
Patch described

1) Uninventorying a resource first loads the whole resource ancestor tree, including config objects, just to get the resource for a given resource ID.

2) Uninventory queries the database for all children with an expensive query, only to rerun the expensive part again for the update.

The attached patch brings the uninventory time for 1 resource (service) down from ~12-20 sec to ~6 sec, for 1 server with > 1000 children.

This has been tested in the perf environment and locally on Postgres with an inventory of 1700 resources.
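
The two points above can be illustrated with a minimal sketch. This is not the attached patch; it only assumes a parent-to-children mapping held in memory, and the names `children_of`, `collect_descendant_ids` and `mark_uninventoried`, as well as the status strings, are hypothetical. The idea shown is to resolve the IDs of the subtree once and reuse that list for the status update, rather than deriving the children a second time inside the update. It assumes the resources form a tree (no cycles).

```python
from collections import deque


def collect_descendant_ids(root_id, children_of):
    """Return root_id plus the IDs of all its descendants, breadth-first.

    children_of is a hypothetical dict mapping a resource ID to the list
    of its direct children's IDs; it stands in for a cheap lookup keyed
    on the parent ID.
    """
    ordered = [root_id]
    queue = deque([root_id])
    while queue:
        current = queue.popleft()
        for child_id in children_of.get(current, []):
            ordered.append(child_id)
            queue.append(child_id)
    return ordered


def mark_uninventoried(resource_ids, status_by_id):
    """Apply the status change to an already-known list of IDs.

    No second tree walk happens here: the caller passes in the IDs that
    were collected once by collect_descendant_ids.
    """
    for resource_id in resource_ids:
        status_by_id[resource_id] = "UNINVENTORIED"  # placeholder status
    return len(resource_ids)


# Hypothetical inventory: one server (1) with two services (2, 3), one of
# which has a child of its own (4).
children_of = {1: [2, 3], 3: [4]}
status_by_id = {rid: "COMMITTED" for rid in (1, 2, 3, 4)}

ids = collect_descendant_ids(1, children_of)
updated = mark_uninventoried(ids, status_by_id)
```

In a database-backed version, the collected list would presumably feed a plain update keyed on the IDs, so the expensive child lookup runs only once per uninventory.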
Comment 1 Charles Crouch 2010-07-23 09:20:40 EDT
Needs triage
Comment 2 Heiko W. Rupp 2010-07-23 09:23:26 EDT
Positive side effect: no more Tx deadlocks, even when uninventorying 90 servers with
>1000 children each.
Before (not reliably reproducible), this ended up in ORA-0600 (or -0060?)
deadlock exceptions.
Comment 3 Heiko W. Rupp 2010-07-23 09:25:45 EDT
The following query accounted for > 50% of the activity on the database for quite some time during the uninventory of each resource. With the patch, this query is gone.

update RHQ_RESOURCE set INVENTORY_STATUS=:1, AGENT_ID=null, PARENT_RESOURCE_ID=null, RESOURCE_KEY='deleted'
where ID=:2
or ID in (select resource1_.ID from RHQ_RESOURCE resource1_ inner join RHQ_RESOURCE resource2_ on resource1_.PARENT_RESOURCE_ID=resource2_.ID where resource2_.ID=:3)
or ID in (select resource3_.ID from RHQ_RESOURCE resource3_ inner join RHQ_RESOURCE resource4_ on resource3_.PARENT_RESOURCE_ID=resource4_.ID inner join RHQ_RESOURCE resource5_ on resource4_.PARENT_RESOURCE_ID=resource5_.ID where resource5_.ID=:4)
or ID in (select resource6_.ID from RHQ_RESOURCE resource6_ inner join RHQ_RESOURCE resource7_ on resource6_.PARENT_RESOURCE_ID=resource7_.ID inner join RHQ_RESOURCE resource8_ on resource7_.PARENT_RESOURCE_ID=resource8_.ID inner join RHQ_RESOURCE resource9_ on resource8_.PARENT_RESOURCE_ID=resource9_.ID where resource9_.ID=:5)
or ID in (select resource10_.ID from RHQ_RESOURCE resource10_ inner join RHQ_RESOURCE resource11_ on resource10_.PARENT_RESOURCE_ID=resource11_.ID inner join RHQ_RESOURCE resource12_ on resource11_.PARENT_RESOURCE_ID=resource12_.ID inner join RHQ_RESOURCE resource13_ on resource12_.PARENT_RESOURCE_ID=resource13_.ID inner join RHQ_RESOURCE resource14_ on resource13_.PARENT_RESOURCE_ID=resource14_.ID where resource14_.ID=:6)
or ID in (select resource15_.ID from RHQ_RESOURCE resource15_ inner join RHQ_RESOURCE resource16_ on resource15_.PARENT_RESOURCE_ID=resource16_.ID inner join RHQ_RESOURCE resource17_ on resource16_.PARENT_RESOURCE_ID=resource17_.ID inner join RHQ_RESOURCE resource18_ on resource17_.PARENT_RESOURCE_ID=resource18_.ID inner join RHQ_RESOURCE resource19_ on resource18_.PARENT_RESOURCE_ID=resource19_.ID inner join RHQ_RESOURCE resource20_ on resource19_.PARENT_RESOURCE_ID=resource20_.ID where resource20_.ID=:7)
or ID in (select resource21_.ID from RHQ_RESOURCE resource21_ inner join RHQ_RESOURCE resource22_ on resource21_.PARENT_RESOURCE_ID=resource22_.ID inner join RHQ_RESOURCE resource23_ on resource22_.PARENT_RESOURCE_ID=resource23_.ID inner join RHQ_RESOURCE resource24_ on resource23_.PARENT_RESOURCE_ID=resource24_.ID inner join RHQ_RESOURCE resource25_ on resource24_.PARENT_RESOURCE_ID=resource25_.ID inner join RHQ_RESOURCE resource26_ on resource25_.PARENT_RESOURCE_ID=resource26_.ID inner join RHQ_RESOURCE resource27_ on resource26_.PARENT_RESOURCE_ID=resource27_.ID where resource27_.ID=:8)
Comment 4 Charles Crouch 2010-07-23 10:02:40 EDT
This is an optimization, push into master only.
Comment 5 Heiko W. Rupp 2010-07-23 11:08:18 EDT
Created attachment 433981
jProfiler output before/after patch + comparison
Comment 6 Heiko W. Rupp 2010-07-23 11:09:32 EDT
(In reply to comment #5)
> Created an attachment (id=433981)
> jProfiler output before/after patch + comparison

Forgot to mention that this was on my local box, uninventorying 1750 resources on 1 platform (i.e. uninventorying the whole platform).
Comment 7 Heiko W. Rupp 2010-07-25 15:57:46 EDT
d3ecc7e in master (RHQ 4)
Comment 8 Heiko W. Rupp 2010-08-09 04:00:58 EDT
Setting to ON_QA, as this is already in master (see previous comment).
Comment 9 Rajan Timaniya 2010-08-20 01:22:42 EDT
Heiko,

Can you please provide steps to reproduce/test this bug and confirm whether the fix is in JON 2.4 GA?
Comment 10 Heiko W. Rupp 2010-08-20 01:56:23 EDT
Rajan, this is in master only, so it will be in RHQ 4, but it is not in JON 2.4 GA.
Comment 11 Heiko W. Rupp 2010-10-18 09:30:12 EDT
To try to reproduce:
- Take a platform with multiple servers and multiple services per server into inventory. More than 1000 services is best. E.g. take a Postgres server, create some logical databases, and in each of them create 1100 tables (via a script).
- When everything has settled, uninventory the whole platform.
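
The "via a script" step above could look roughly like the following sketch. It only generates SQL text; the table name prefix, the single-column layout and the function names are assumptions made for illustration, and the resulting statements would still need to be run against each logical database with a database client.

```python
def create_table_statements(count=1100, prefix="perf_table_"):
    """Build CREATE TABLE statements for `count` trivial tables.

    The prefix and the one-column layout are placeholders; presumably any
    table shape would do, since the goal is just a large number of tables.
    """
    return [
        f"CREATE TABLE {prefix}{i:04d} (id INTEGER PRIMARY KEY);"
        for i in range(1, count + 1)
    ]


def write_script(path, statements):
    """Write the statements to a file, one per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(statements) + "\n")


statements = create_table_statements()
```

Calling `write_script` with a file name of your choice and feeding that file to the client once per logical database gives 1100 tables in each of them.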
Comment 12 Rajan Timaniya 2010-10-19 07:18:56 EDT
Verified on RHQ-Master build #423
(http://hudson-qe.rhq.rdu.redhat.com:8080/view/RHQ/job/ci-rhq-master/423/)

Steps:
1) Installed the RHQ server and agent
2) The agent box has multiple servers configured (JBoss EAP 5.0 with 1048 services and Postgres with 2008 services)
3) Made sure that all of them were in inventory and settled
4) Uninventoried the whole platform

Observation:
The platform was uninventoried with all its resources within 3-4 sec.

2010-10-19 16:44:27,372 INFO  [org.rhq.enterprise.server.cloud.instance.CacheConsistencyManagerBean] 10.65.193.1 took [37]ms to reload cache for 1 agents
2010-10-19 16:44:31,480 INFO  [org.rhq.enterprise.server.discovery.DiscoveryServerServiceImpl] Processed AV:[rajantest][329][full] - need full=[false] in (233)ms
2010-10-19 16:44:35,918 INFO  [org.rhq.enterprise.server.resource.ResourceManagerBean] User [org.rhq.core.domain.auth.Subject[id=2,name=rhqadmin]] is marking resource [Resource[id=10331, type=Linux, key=rajantest, name=rajantest, parent=<null>, version=Linux 2.6.18-164.el5]] for asynchronous uninventory
Comment 13 Heiko W. Rupp 2013-09-02 03:24:41 EDT
Bulk closing of issues that were VERIFIED, had no target release and where the status changed more than a year ago.