Bug 720794 - it takes a long time to import a large number of Resources
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Core Server
Version: 4.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Robert Buck
QA Contact: Mike Foley
URL:
Whiteboard:
Duplicates: 717257
Depends On:
Blocks: jon30-perf rhq-gui-timeouts
Reported: 2011-07-12 19:08 UTC by Ian Springer
Modified: 2013-08-06 00:39 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-02-07 19:21:20 UTC
Embargoed:


Attachments
Diffs to offload server to client comm to background quartz job; reduces user perceived latency from 15s+ to 3s. (15.04 KB, patch)
2011-10-18 21:21 UTC, Robert Buck

Description Ian Springer 2011-07-12 19:08:01 UTC
It took me about 10 minutes to import 1,500 Resources (300 platforms and 1,200 top-level servers), which works out to roughly 0.4 seconds per Resource. Assuming import time scales linearly, importing more than 25 Resources would take more than 10 seconds. Since we are aiming to have all GUI pages load in less than 10 seconds, and 25 is a fairly small number of Resources, we should try to improve the performance here.

Comment 1 Ian Springer 2011-08-18 20:11:13 UTC
*** Bug 717257 has been marked as a duplicate of this bug. ***

Comment 2 Ian Springer 2011-08-18 20:15:12 UTC
Note, Heiko reports that importing an AS7 domain-controller Resource takes much longer than 10 seconds (I'm presuming because it has a lot of descendant services). That is a very basic use case that demonstrates this issue.

Comment 3 Ian Springer 2011-09-06 21:09:18 UTC
A solution for this would be to split Resource import into two parts:

1) the call to importResources() would flip all of the NEW Resources to a new COMMITTING inventory status and then return.
2) a background job would periodically scan for COMMITTING Resources and do the real work necessary to commit them to inventory (syncing to Agents, etc.) and flip them to COMMITTED status.

This would allow the GUI to return very quickly after the user clicks the Import button to import a set of Resources. It could then display an "Import of 207 Resources initiated." message, and the Resources would no longer be listed on the autodiscovery queue view, since they would no longer be NEW. The downside is that the GUI would not know when the import had fully completed, so it could not display a second message telling the user that the import has finished.
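
The two-part split described above could be sketched roughly as follows. This is a minimal in-memory illustration, not RHQ code: `InventoryStatus`, `DiscoveredResource`, `TwoPhaseImporter`, and `syncToAgent()` are hypothetical stand-ins, and only the `importResources()` name and the NEW/COMMITTING/COMMITTED statuses come from this comment. In a real server the second phase would be driven by a scheduled background job rather than called directly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical statuses; COMMITTING is the new intermediate state proposed above.
enum InventoryStatus { NEW, COMMITTING, COMMITTED }

// Hypothetical stand-in for an inventory Resource.
class DiscoveredResource {
    final int id;
    InventoryStatus status = InventoryStatus.NEW;

    DiscoveredResource(int id) { this.id = id; }
}

class TwoPhaseImporter {
    private final List<DiscoveredResource> inventory = new ArrayList<>();

    void discover(DiscoveredResource r) { inventory.add(r); }

    // Part 1: only flip NEW -> COMMITTING, then return to the caller right away.
    int importResources(List<Integer> ids) {
        int flipped = 0;
        for (DiscoveredResource r : inventory) {
            if (r.status == InventoryStatus.NEW && ids.contains(r.id)) {
                r.status = InventoryStatus.COMMITTING;
                flipped++;
            }
        }
        return flipped;
    }

    // Part 2: what a periodic background job would do on each run.
    int finishCommit() {
        int committed = 0;
        for (DiscoveredResource r : inventory) {
            if (r.status == InventoryStatus.COMMITTING) {
                syncToAgent(r); // the slow work happens here, off the GUI request path
                r.status = InventoryStatus.COMMITTED;
                committed++;
            }
        }
        return committed;
    }

    // Placeholder for the expensive Agent synchronization (assumption: no-op here).
    private void syncToAgent(DiscoveredResource r) { }

    long countWithStatus(InventoryStatus s) {
        return inventory.stream().filter(r -> r.status == s).count();
    }
}
```

The value returned by `importResources()` is what the GUI could plug into the "Import of N Resources initiated." message; nothing in this sketch addresses how the GUI would learn that the import has completed.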

Comment 4 Ian Springer 2011-09-16 15:25:50 UTC
Note, we already support importing Resources from an Agent that is currently down.

Comment 5 Ian Springer 2011-09-16 15:33:27 UTC
Rather than introducing a new COMMITTING inventory status, the finishCommit background job could probably use an existing field to determine if a COMMITTED Resource has not been fully committed (i.e. synced to its Agent) yet:

1) if (!resource.isConnected())
2) if (resource.getUuid() == null)
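
As a rough illustration of how such a scan might look, the sketch below filters a list of COMMITTED Resources using the two checks listed above. `PendingResource` and `FinishCommitScan` are hypothetical stand-ins, not RHQ classes; only the `isConnected()` and `getUuid()` accessor names come from this comment. Combining the two checks with a logical OR is an assumption made for the example, since the comment only lists them as candidates.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in exposing the two fields named in the comment.
class PendingResource {
    private final String uuid;
    private final boolean connected;

    PendingResource(String uuid, boolean connected) {
        this.uuid = uuid;
        this.connected = connected;
    }

    String getUuid() { return uuid; }
    boolean isConnected() { return connected; }
}

class FinishCommitScan {
    // Assumption: either condition alone marks a Resource as not yet synced to its Agent.
    static boolean needsSync(PendingResource r) {
        return !r.isConnected() || r.getUuid() == null;
    }

    // Returns the COMMITTED Resources that a background job would still have to sync.
    static List<PendingResource> findUnsynced(List<PendingResource> committed) {
        List<PendingResource> result = new ArrayList<>();
        for (PendingResource r : committed) {
            if (needsSync(r)) {
                result.add(r);
            }
        }
        return result;
    }
}
```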

Comment 6 Ian Springer 2011-09-16 15:38:08 UTC
I just noticed when I imported Resources from an Agent that was down, I got an ugly stack trace in the Server log:

11:21:02,019 ERROR [ClientCommandSenderTask] {ClientCommandSenderTask.send-failed}Failed to send command [Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.send-throttle=true}]; params=[{invocation=NameBasedInvocation[synchronizeInventory], targetInterfaceName=org.rhq.core.clientapi.agent.discovery.DiscoveryAgentService}]]. Cause: org.jboss.remoting.CannotConnectException:Can not get connection to server. Problem establishing socket connection for InvokerLocator [socket://127.0.0.1:16163/?backlog=200&clientMaxPoolSize=304&enableTcpNoDelay=true&maxPoolSize=303&numAcceptThreads=1&rhq.communications.connector.rhqtype=agent&socketTimeout=60000] -> java.net.ConnectException:Connection refused. Cause: org.jboss.remoting.CannotConnectException: Can not get connection to server. Problem establishing socket connection for InvokerLocator [socket://127.0.0.1:16163/?backlog=200&clientMaxPoolSize=304&enableTcpNoDelay=true&maxPoolSize=303&numAcceptThreads=1&rhq.communications.connector.rhqtype=agent&socketTimeout=60000]
11:21:02,021 WARN  [DiscoveryBossBean] Could not perform commit synchronization with agent for platform [jetengine]
org.jboss.remoting.CannotConnectException: Can not get connection to server. Problem establishing socket connection for InvokerLocator [socket://127.0.0.1:16163/?backlog=200&clientMaxPoolSize=304&enableTcpNoDelay=true&maxPoolSize=303&numAcceptThreads=1&rhq.communications.connector.rhqtype=agent&socketTimeout=60000]
	at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:579)
	at org.jboss.remoting.MicroRemoteClientInvoker.invoke(MicroRemoteClientInvoker.java:122)
	at org.jboss.remoting.Client.invoke(Client.java:1634)
	at org.jboss.remoting.Client.invoke(Client.java:548)
	at org.rhq.enterprise.communications.command.client.JBossRemotingRemoteCommunicator.rawSend(JBossRemotingRemoteCommunicator.java:514)
	at org.rhq.enterprise.communications.command.client.JBossRemotingRemoteCommunicator.sendWithoutCallbacks(JBossRemotingRemoteCommunicator.java:456)
	at org.rhq.enterprise.communications.command.client.JBossRemotingRemoteCommunicator.sendWithoutInitializeCallback(JBossRemotingRemoteCommunicator.java:475)
	at org.rhq.enterprise.communications.command.client.JBossRemotingRemoteCommunicator.send(JBossRemotingRemoteCommunicator.java:496)
	at org.rhq.enterprise.communications.command.client.AbstractCommandClient.invoke(AbstractCommandClient.java:143)
	at org.rhq.enterprise.communications.command.client.ClientCommandSender.send(ClientCommandSender.java:1087)
	at org.rhq.enterprise.communications.command.client.ClientCommandSenderTask.send(ClientCommandSenderTask.java:229)
	at org.rhq.enterprise.communications.command.client.ClientCommandSenderTask.call(ClientCommandSenderTask.java:107)
	at org.rhq.enterprise.communications.command.client.ClientCommandSenderTask.call(ClientCommandSenderTask.java:55)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
	at java.net.Socket.connect(Socket.java:529)
	at org.jboss.remoting.transport.socket.SocketClientInvoker.createSocket(SocketClientInvoker.java:192)
	at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.getConnection(MicroSocketClientInvoker.java:827)
	at org.jboss.remoting.transport.socket.MicroSocketClientInvoker.transport(MicroSocketClientInvoker.java:569)
	... 17 more


Since this is a known (and handled) condition, we should not be logging an error or a stack trace. Instead we should just log a warning.
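
One way to get that behavior is to log a single warning line that names the root cause, and to keep the full stack trace only at a debug-style level. The sketch below uses `java.util.logging` so that it is self-contained; that is a substitution for whatever logging API the Server actually uses, and `CommitSyncLogging`, `describe()`, and `logAgentUnreachable()` are hypothetical names. The message text mirrors the existing `DiscoveryBossBean` warning shown above.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class CommitSyncLogging {
    private static final Logger LOG = Logger.getLogger(CommitSyncLogging.class.getName());

    // Builds a one-line message that ends with the root cause of the failure.
    static String describe(String platform, Throwable t) {
        Throwable root = t;
        while (root.getCause() != null && root.getCause() != root) {
            root = root.getCause();
        }
        return "Could not perform commit synchronization with agent for platform ["
            + platform + "]: " + root;
    }

    // Known, handled condition: warn without a stack trace.
    static void logAgentUnreachable(String platform, Exception e) {
        LOG.warning(describe(platform, e));
        // The full trace is still available when FINE logging is enabled.
        LOG.log(Level.FINE, "Full stack trace for unreachable agent", e);
    }
}
```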

Comment 7 Robert Buck 2011-10-18 21:21:27 UTC
Created attachment 528896
Diffs to offload server to client comm to background quartz job; reduces user perceived latency from 15s+ to 3s.

Comment 8 Robert Buck 2011-10-19 14:10:02 UTC
QA: Please make sure to test this in HA mode. Thanks.

Comment 9 Robert Buck 2011-10-19 14:20:34 UTC
commit fe75f0f04101c110a722317515043cf063099bd8
Author: Robert Buck <rbuck>
Date:   2011-10-19 10:08:59 -0400

[BZ 720794] Decrease user perceived latency when importing lots of resources by scheduling all server-agent communication as a background quartz task.

Comment 10 Mike Foley 2011-10-26 15:07:25 UTC
observing no functional or performance issues with import of resources.

Comment 11 Mike Foley 2012-02-07 19:21:20 UTC
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE

