Bug 534616 - (RHQ-1396) Increase the length of time the agent can take to download things from the server
Status: CLOSED NEXTRELEASE
Product: RHQ Project
Classification: Other
Component: Communications Subsystem
unspecified
All All
medium Severity medium
Assigned To: John Mazzitelli
Corey Welton
http://jira.rhq-project.org/browse/RH...
: Improvement
 
Reported: 2009-01-23 15:00 EST by Charles Crouch
Modified: 2015-02-01 18:24 EST
Fixed In Version: 1.2
Doc Type: Enhancement
Description Charles Crouch 2009-01-23 15:00:00 EST
Right now, when downloading a patch from the server, the agent is allowed 10 minutes to finish streaming the file before its request times out and the following exception is seen:

2009-01-23 12:45:53,453 ERROR [ResourceContainer.invoker.nonDaemon-5] (enterprise.communications.command.client.ClientCommandSenderTask)- {ClientCommandSenderTask.send-failed}Failed to send command [Command: type=[remotepojo]; cmd-in-response=[false]; config=[{rhq.security-token=1232638162187-1204607085-8541727374551626108, rhq.send-throttle=true}]; params=[{targetInterfaceName=org.rhq.core.clientapi.server.content.ContentServerService, invocation=NameBasedInvocation[downloadPackageBitsGivenResource]}]]. Cause: java.util.concurrent.TimeoutException:null
java.util.concurrent.TimeoutException
	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:211)
	at java.util.concurrent.FutureTask.get(FutureTask.java:85)
	at org.rhq.enterprise.communications.command.client.ClientCommandSenderTask.run(ClientCommandSenderTask.java:143)
	at org.rhq.enterprise.communications.command.client.ClientCommandSender.sendSynch(ClientCommandSender.java:616)
	at org.rhq.enterprise.communications.command.client.ClientRemotePojoFactory$RemotePojoProxyHandler.invoke(ClientRemotePojoFactory.java:407)
	at $Proxy9.downloadPackageBitsGivenResource(Unknown Source)
	at org.rhq.core.pc.content.ContentManager.downloadPackageBits(ContentManager.java:265)
	at com.jboss.jbossnetwork.product.jbpm.handlers.JONServerDownloadActionHandler.downloadBits(JONServerDownloadActionHandler.java:68)
	at com.jboss.jbossnetwork.product.jbpm.handlers.JONServerDownloadActionHandler.run(JONServerDownloadActionHandler.java:48)
	at com.jboss.jbossnetwork.product.jbpm.handlers.BaseHandler.execute(BaseHandler.java:130)
	at org.jbpm.graph.def.Action.execute(Action.java:123)
	at org.jbpm.graph.def.Node.execute(Node.java:328)
	at org.jbpm.graph.def.Node.enter(Node.java:316)
	at org.jbpm.graph.def.Transition.take(Transition.java:119)
	at org.jbpm.graph.def.Node.leave(Node.java:382)
	at org.jbpm.graph.node.StartState.leave(StartState.java:70)
	at org.jbpm.graph.exe.Token.signal(Token.java:174)
	at org.jbpm.graph.exe.Token.signal(Token.java:123)
	at org.jbpm.graph.exe.ProcessInstance.signal(ProcessInstance.java:217)
	at org.rhq.plugins.jbossas.JBPMWorkflowManager.run(JBPMWorkflowManager.java:149)
	at org.rhq.plugins.jbossas.JBossASServerComponent.deployPackages(JBossASServerComponent.java:382)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at org.rhq.core.pc.inventory.ResourceContainer$ComponentInvocationThread.call(ResourceContainer.java:450)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
	at java.util.concurrent.FutureTask.run(FutureTask.java:123)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
	at java.lang.Thread.run(Thread.java:595)
2009-01-23 12:45:53,453 DEBUG [ResourceContainer.invoker.nonDaemon-5] (enterprise.communications.command.client.ClientRemotePojoFactory)- {ClientRemotePojoFactory.execution-failure}Failed to execute remote POJO method [downloadPackageBitsGivenResource]. Cause: java.util.concurrent.TimeoutException:null

As discussed below, we should increase this timeout, probably to 45 minutes, in case the server is connected to the CSP over a slow connection or a very large patch is being downloaded.
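The stack trace above ends in FutureTask.get, so the failure is a bounded wait on a task that was still running. The following is a minimal sketch of that mechanism using only java.util.concurrent; it is not RHQ code. The class name DownloadTimeoutSketch, the method completesWithin, and the sleep that stands in for streaming the package bits are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names come from RHQ.
class DownloadTimeoutSketch {

    // Runs a simulated download lasting workMillis and waits at most
    // timeoutMillis for it. Returns true if it finished in time and
    // false if the bounded wait threw TimeoutException.
    static boolean completesWithin(long workMillis, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<String> download = executor.submit(() -> {
                Thread.sleep(workMillis); // stand-in for streaming the file
                return "done";
            });
            try {
                download.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                download.cancel(true); // interrupt the simulated transfer
                return false;
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // The same transfer fails or succeeds depending only on the allowed wait.
        System.out.println("300 ms transfer, 50 ms timeout:   " + completesWithin(300, 50));
        System.out.println("300 ms transfer, 3000 ms timeout: " + completesWithin(300, 3000));
    }
}
```

The point of the sketch is that nothing is wrong with the transfer itself; only the length of the wait decides whether the caller sees a TimeoutException.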

(12:55:46 PM) ccrouch: so the agent makes a call to the server which then goes and downloads the patch from the CSP, and then streams it back to the agent
(12:55:56 PM) ccrouch: at least that's how it used to work
(12:56:05 PM) mazz: yeah, that makes sense from this stack
(12:56:45 PM) mazz: well, we can use my comm layer's @Timeout annotation on this download method - since 10 minutes in general might not be enough for this kind of thing (we could be downloading very large binaries)
(12:57:12 PM) ccrouch: yeah i'll raise a jira for this
(12:57:28 PM) ccrouch: for right now though, assuming they have a fast pipe, they should be ok
(12:57:33 PM) mazz: [{targetInterfaceName=org.rhq.core.clientapi.server.content.ContentServerService, invocation=NameBasedInvocation[downloadPackageBitsGivenResource]}]]
(12:57:40 PM) mazz: that's the interface that needs a new @Timeout
(12:58:19 PM) mazz: there are other "download" methods around here too - might need to look at these also
(12:59:42 PM) ccrouch: the txn timeout is 45mins
(12:59:53 PM) ccrouch:   @TransactionTimeout(45 * 60)
(12:59:53 PM) ccrouch:     public long outputPackageVersionBitsGivenResource(int resourceId, PackageDetailsKey packageDetailsKey,
(12:59:53 PM) ccrouch:   
(1:00:04 PM) ccrouch: so that may not be a method call timeout
(1:00:23 PM) mazz: this is the JPA timeout - what you hit was the agent comm timeout
(1:00:38 PM) mazz: it makes sense to make them the same here - 45 minutes
(1:00:58 PM) mazz: so, I would put @Timeout(45 * 60 * 1000) annotation on that comm interface
(1:01:27 PM) ccrouch: sorry, what i meant to say was 
(1:01:27 PM) ccrouch: "so that may not be a *bad* method call timeout *to use too*"
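
The chat above settles on matching two timeouts that are written in different units: the transaction timeout is given as 45 * 60 and the comm timeout as 45 * 60 * 1000. The sketch below shows how a per-method timeout annotation might be declared and read, and checks that the two values agree once converted. It is self-contained and does not use the real RHQ classes: the Timeout annotation, the ContentServerServiceSketch interface, and the ping method are hypothetical stand-ins, and treating the first value as seconds and the second as milliseconds is an assumption based on the numbers in the chat.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; these are not the RHQ types.
class CommTimeoutSketch {

    // Stand-in for a comm-layer timeout annotation; value assumed to be milliseconds.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Timeout {
        long value();
    }

    // Stand-in for the interface the agent calls on the server.
    interface ContentServerServiceSketch {
        @Timeout(45L * 60 * 1000) // 45 minutes, in milliseconds
        long downloadPackageBits(int resourceId);

        long ping(); // no annotation, so the default applies
    }

    // The 10-minute default described in the report, in milliseconds.
    static final long DEFAULT_TIMEOUT_MILLIS = 10L * 60 * 1000;

    // Returns the annotated timeout for a method, or the default if it has none.
    static long timeoutMillisFor(Method method) {
        Timeout timeout = method.getAnnotation(Timeout.class);
        return (timeout != null) ? timeout.value() : DEFAULT_TIMEOUT_MILLIS;
    }

    public static void main(String[] args) throws Exception {
        Method download = ContentServerServiceSketch.class.getMethod("downloadPackageBits", int.class);
        Method ping = ContentServerServiceSketch.class.getMethod("ping");

        // Transaction timeout as quoted in the chat, assumed to be in seconds.
        long txTimeoutMillis = TimeUnit.SECONDS.toMillis(45 * 60);

        System.out.println("download timeout (ms): " + timeoutMillisFor(download));
        System.out.println("ping timeout (ms):     " + timeoutMillisFor(ping));
        System.out.println("matches tx timeout:    " + (timeoutMillisFor(download) == txTimeoutMillis));
    }
}
```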
Comment 1 Joseph Marques 2009-01-30 08:32:14 EST
we definitely don't want to make the product less usable for people trying to connect to our CSP over a slow connection - marking for 1.2 inclusion.
Comment 2 John Mazzitelli 2009-01-30 10:29:52 EST
will make sure ContentServerService methods match timeouts for the ContentManagerBean and ContentSourceManagerBean SLSB tx timeouts.
Comment 3 John Mazzitelli 2009-01-30 10:36:23 EST
added comm annotation @Timeout(45 * 60 * 1000) to ContentServerService interface's methods related to downloading bits
Comment 4 Corey Welton 2009-03-30 12:54:57 EDT
Testing notes:
A suitable test would be to use iptables and reduce throughput across an ethernet device to slow data transfer to a trickle.  We've been unsuccessful in doing this so far.

That said -- this looks to be a simple code change (adding '* 1000') to the timeout formula.  Given this, and given that QA does not currently have the resources/knowledge to test it, dev and QA have agreed it can be closed.
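
For reference, the slow transfer described in the testing notes could also be approximated inside a test harness rather than at the network level. The sketch below is one possible approach and is not what QA or dev actually did: it wraps any InputStream so that each read is delayed and capped in size. The names ThrottledStreamSketch, ThrottledInputStream, and drain are hypothetical, and wiring such a stream into the server's download path is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

// Hypothetical illustration only; not part of RHQ.
class ThrottledStreamSketch {

    // Slows a stream to a trickle: every read sleeps first and returns
    // at most maxBytesPerRead bytes (which must be greater than zero).
    static class ThrottledInputStream extends FilterInputStream {
        private final int maxBytesPerRead;
        private final long delayMillisPerRead;

        ThrottledInputStream(InputStream in, int maxBytesPerRead, long delayMillisPerRead) {
            super(in);
            this.maxBytesPerRead = maxBytesPerRead;
            this.delayMillisPerRead = delayMillisPerRead;
        }

        @Override
        public int read() throws IOException {
            pause();
            return super.read();
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            pause();
            return super.read(buf, off, Math.min(len, maxBytesPerRead));
        }

        private void pause() throws IOException {
            try {
                Thread.sleep(delayMillisPerRead);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while throttling");
            }
        }
    }

    // Reads the stream to the end and returns the number of bytes seen.
    static int drain(InputStream in) throws IOException {
        byte[] buf = new byte[4096];
        int total = 0;
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream slow = new ThrottledInputStream(new ByteArrayInputStream(new byte[1000]), 100, 5);
        long start = System.nanoTime();
        int total = drain(slow);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("read " + total + " bytes in about " + elapsedMillis + " ms");
    }
}
```

Raising the per-read delay until the total transfer time exceeds the configured timeout would exercise the same path as a genuinely slow link, under the assumption that the test can substitute the stream being served.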

Comment 5 Red Hat Bugzilla 2009-11-10 15:32:03 EST
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-1396
