Bug 534616 (RHQ-1396) - Increase the length of time the agent can take to download things from the server
Keywords:
Status: CLOSED NEXTRELEASE
Alias: RHQ-1396
Product: RHQ Project
Classification: Other
Component: Communications Subsystem
Version: unspecified
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: John Mazzitelli
QA Contact: Corey Welton
URL: http://jira.rhq-project.org/browse/RH...
Whiteboard:
Depends On:
Blocks:
Reported: 2009-01-23 20:00 UTC by Charles Crouch
Modified: 2015-02-01 23:24 UTC (History)

Fixed In Version: 1.2
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:



Description Charles Crouch 2009-01-23 20:00:00 UTC
Right now, when downloading a patch from the server, the agent is allowed 10 minutes to finish streaming the file before its request times out and the following exception is seen:

2009-01-23 12:45:53,453 ERROR [ResourceContainer.invoker.nonDaemon-5] (enterprise.communications.command.client.ClientCommandSenderTask)- {ClientCommandSenderTask.send-failed}Failed to send command [Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.security-token=1232638162187-1204607085-8541727374551626108, rhq.send-throttle=true}]; params=[{targetInterfaceName=org.rhq.core.clientapi.server.content.ContentServerService, invocation=NameBasedInvocation[downloadPackageBitsGivenResource]}]]. Cause: java.util.concurrent.TimeoutException:null
java.util.concurrent.TimeoutException
	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:211)
	at java.util.concurrent.FutureTask.get(FutureTask.java:85)
	at org.rhq.enterprise.communications.command.client.ClientCommandSenderTask.run(ClientCommandSenderTask.java:143)
	at org.rhq.enterprise.communications.command.client.ClientCommandSender.sendSynch(ClientCommandSender.java:616)
	at org.rhq.enterprise.communications.command.client.ClientRemotePojoFactory$RemotePojoProxyHandler.invoke(ClientRemotePojoFactory.java:407)
	at $Proxy9.downloadPackageBitsGivenResource(Unknown Source)
	at org.rhq.core.pc.content.ContentManager.downloadPackageBits(ContentManager.java:265)
	at com.jboss.jbossnetwork.product.jbpm.handlers.JONServerDownloadActionHandler.downloadBits(JONServerDownloadActionHandler.java:68)
	at com.jboss.jbossnetwork.product.jbpm.handlers.JONServerDownloadActionHandler.run(JONServerDownloadActionHandler.java:48)
	at com.jboss.jbossnetwork.product.jbpm.handlers.BaseHandler.execute(BaseHandler.java:130)
	at org.jbpm.graph.def.Action.execute(Action.java:123)
	at org.jbpm.graph.def.Node.execute(Node.java:328)
	at org.jbpm.graph.def.Node.enter(Node.java:316)
	at org.jbpm.graph.def.Transition.take(Transition.java:119)
	at org.jbpm.graph.def.Node.leave(Node.java:382)
	at org.jbpm.graph.node.StartState.leave(StartState.java:70)
	at org.jbpm.graph.exe.Token.signal(Token.java:174)
	at org.jbpm.graph.exe.Token.signal(Token.java:123)
	at org.jbpm.graph.exe.ProcessInstance.signal(ProcessInstance.java:217)
	at org.rhq.plugins.jbossas.JBPMWorkflowManager.run(JBPMWorkflowManager.java:149)
	at org.rhq.plugins.jbossas.JBossASServerComponent.deployPackages(JBossASServerComponent.java:382)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at org.rhq.core.pc.inventory.ResourceContainer$ComponentInvocationThread.call(ResourceContainer.java:450)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
	at java.util.concurrent.FutureTask.run(FutureTask.java:123)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
	at java.lang.Thread.run(Thread.java:595)
2009-01-23 12:45:53,453 DEBUG [ResourceContainer.invoker.nonDaemon-5] (enterprise.communications.command.client.ClientRemotePojoFactory)- {ClientRemotePojoFactory.execution-failure}Failed to execute remote POJO method [downloadPackageBitsGivenResource]. Cause: java.util.concurrent.TimeoutException:null

As discussed below, we should increase this timeout, probably to 45 minutes, in case the server is connected to the CSP over a slow connection or a very large patch is being downloaded.
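
The TimeoutException in the stack trace above comes out of FutureTask.get, which stops waiting once its time limit has passed even if the work is still running. A minimal Java sketch of that behavior follows. It assumes nothing about the real ClientCommandSender internals: the class name, the sendWithTimeout helper, and the sleep durations are all hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SendTimeoutDemo {

    /**
     * Hypothetical helper: runs the task on a worker thread and waits at most
     * timeoutMillis for it. Returns the task's result, or null if the wait
     * timed out before the task finished.
     */
    static String sendWithTimeout(Callable<String> task, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = executor.submit(task);
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupt the still-running task
                return null;
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        Callable<String> slowDownload = () -> {
            Thread.sleep(2000); // stands in for a slow download
            return "bits";
        };
        System.out.println(sendWithTimeout(slowDownload, 100));  // wait is shorter than the task
        System.out.println(sendWithTimeout(slowDownload, 5000)); // wait is longer than the task
    }
}
```

With the short wait the caller gives up while the download is still in progress, which is the situation described in this report when the transfer needs more than 10 minutes.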

(12:55:46 PM) ccrouch: so the agent makes a call to the server which then goes and downloads the patch from the CSP, and then streams it back to the agent
(12:55:56 PM) ccrouch: at least that's how it used to work
(12:56:05 PM) mazz: yeah, that makes sense from this stack
(12:56:45 PM) mazz: well, we can use my comm layer's @Timeout annotation on this download method - since 10 minutes in general might not be enough for this kind of thing (we could be downloading very large binaries)
(12:57:12 PM) ccrouch: yeah i'll raise a jira for this
(12:57:28 PM) ccrouch: for right now though, assuming they have a fast pipe, they should be ok
(12:57:33 PM) mazz: [{targetInterfaceName=org.rhq.core.clientapi.server.content.ContentServerService, invocation=NameBasedInvocation[downloadPackageBitsGivenResource]}]]
(12:57:40 PM) mazz: that's the interface that needs a new @Timeout
(12:58:19 PM) mazz: there are other "download" methods around here too - might need to look at these also
(12:59:42 PM) ccrouch: the txn timeout is 45mins
(12:59:53 PM) ccrouch:   @TransactionTimeout(45 * 60)
(12:59:53 PM) ccrouch:     public long outputPackageVersionBitsGivenResource(int resourceId, PackageDetailsKey packageDetailsKey,
(12:59:53 PM) ccrouch:   
(1:00:04 PM) ccrouch: so that may not be a method call timeout
(1:00:23 PM) mazz: this is the JPA timeout - what you hit was the agent comm timeout
(1:00:38 PM) mazz: it makes sense to make them the same here - 45 minutes
(1:00:58 PM) mazz: so, I would put @Timeout(45 * 60 * 1000) annotation on that comm interface
(1:01:27 PM) ccrouch: sorry, what i meant to say was 
(1:01:27 PM) ccrouch: "so that may not be a *bad* method call timeout *to use too*"
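
The chat above proposes a per-method @Timeout annotation on the comm interface in place of the general 10-minute limit. The following Java sketch shows one way such an annotation could be looked up with reflection. It is not the RHQ comm layer's code: the Timeout annotation, the DownloadService interface, the default constant, and the timeoutFor helper are hypothetical stand-ins defined only for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class TimeoutAnnotationSketch {

    /** Hypothetical stand-in for a comm-layer timeout annotation; value is in milliseconds. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Timeout {
        long value();
    }

    /** Assumed default, matching the 10 minutes mentioned in the report. */
    static final long DEFAULT_TIMEOUT_MILLIS = 10L * 60 * 1000;

    /** Hypothetical remote interface, not the real ContentServerService. */
    interface DownloadService {
        @Timeout(45 * 60 * 1000) // 45 minutes for long-running downloads
        long downloadBits(int resourceId);

        String ping(); // no annotation, so the default applies
    }

    /** Returns the annotated timeout if present, otherwise the default. */
    static long timeoutFor(Method method) {
        Timeout t = method.getAnnotation(Timeout.class);
        return (t != null) ? t.value() : DEFAULT_TIMEOUT_MILLIS;
    }

    public static void main(String[] args) throws Exception {
        Method download = DownloadService.class.getMethod("downloadBits", int.class);
        Method ping = DownloadService.class.getMethod("ping");
        System.out.println(timeoutFor(download)); // prints 2700000
        System.out.println(timeoutFor(ping));     // prints 600000
    }
}
```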

Comment 1 Joseph Marques 2009-01-30 13:32:14 UTC
we definitely don't want to make the product less usable for people trying to connect to our CSP over a slow connection - marking for 1.2 inclusion.

Comment 2 John Mazzitelli 2009-01-30 15:29:52 UTC
will make sure ContentServerService methods match timeouts for the ContentManagerBean and ContentSourceManagerBean SLSB tx timeouts.

Comment 3 John Mazzitelli 2009-01-30 15:36:23 UTC
added comm annotation @Timeout(45 * 60 * 1000) to ContentServerService interface's methods related to downloading bits
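
The two values quoted in this report use different units: the transaction timeout is written as 45 * 60 (seconds) and the comm timeout as 45 * 60 * 1000 (milliseconds), both meaning 45 minutes. A small Java sketch of a consistency check between them follows; the class, constant, and method names are hypothetical, and only the two formulas come from this report.

```java
import java.util.concurrent.TimeUnit;

public class TimeoutUnits {

    // Formula quoted for @TransactionTimeout in the chat log (seconds).
    static final int TX_TIMEOUT_SECONDS = 45 * 60;

    // Formula quoted for the comm @Timeout annotation (milliseconds).
    static final long COMM_TIMEOUT_MILLIS = 45 * 60 * 1000;

    /** True when a value in seconds and a value in milliseconds describe the same duration. */
    static boolean sameDuration(long seconds, long millis) {
        return TimeUnit.SECONDS.toMillis(seconds) == millis;
    }

    public static void main(String[] args) {
        System.out.println(sameDuration(TX_TIMEOUT_SECONDS, COMM_TIMEOUT_MILLIS)); // prints true
        System.out.println(sameDuration(TX_TIMEOUT_SECONDS, 45 * 60));             // prints false
        System.out.println(TimeUnit.MILLISECONDS.toMinutes(COMM_TIMEOUT_MILLIS));  // prints 45
    }
}
```

The second call shows what happens if the "* 1000" factor is left out: the value would be read as 2700 milliseconds, under three seconds, rather than 45 minutes.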

Comment 4 Corey Welton 2009-03-30 16:54:57 UTC
Testing notes:
A suitable test would be to use iptables and reduce throughput across an ethernet device to slow data transfer to a trickle.  We've been unsuccessful in doing this so far.

That said, this looks to be a simple code change (adding '* 1000' to the timeout formula). Given this, and given that QA does not currently have the resources/knowledge to test it, dev and QA have agreed it can be closed.
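
As a possible alternative to shaping traffic with iptables, a slow transfer could be imitated inside a test by wrapping the source stream. The Java sketch below is only an illustration of that idea and was not part of this bug's testing. ThrottledInputStream, drain, and the chunk and delay values are hypothetical, and nothing here hooks into the actual agent or server code.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

public class SlowStreamSketch {

    /** Hypothetical test helper: delays every read and caps its size to imitate a slow link. */
    static class ThrottledInputStream extends FilterInputStream {
        private final int maxChunkBytes;
        private final long delayMillis;

        ThrottledInputStream(InputStream in, int maxChunkBytes, long delayMillis) {
            super(in);
            if (maxChunkBytes <= 0) {
                throw new IllegalArgumentException("maxChunkBytes must be positive");
            }
            this.maxChunkBytes = maxChunkBytes;
            this.delayMillis = delayMillis;
        }

        @Override
        public int read() throws IOException {
            pause();
            return super.read();
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            pause();
            return super.read(buf, off, Math.min(len, maxChunkBytes));
        }

        private void pause() throws IOException {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while throttling");
            }
        }
    }

    /** Reads the stream to the end and returns the number of bytes read. */
    static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream slow = new ThrottledInputStream(
                new ByteArrayInputStream(new byte[10]), 4, 50);
        long start = System.nanoTime();
        long bytes = drain(slow);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(bytes + " bytes in about " + elapsedMillis + " ms");
    }
}
```

In a test, a short comm timeout combined with a small chunk size and a long delay would exercise the same timeout path without waiting 45 minutes.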



Comment 5 Red Hat Bugzilla 2009-11-10 20:32:03 UTC
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1396


