Bug 849964 - [eap6] Starting managed server fails after 10 seconds - Read timeout
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Plugins
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHQ 4.5.0
Assigned To: Stefan Negrea
QA Contact: Mike Foley
Depends On:
Blocks: as7-plugin 851655
Reported: 2012-08-21 07:10 EDT by Libor Zoubek
Modified: 2015-11-01 19:43 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 851655
Environment:
Last Closed: 2013-08-31 06:10:28 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments: None
Description Libor Zoubek 2012-08-21 07:10:55 EDT
Description of problem:


Version-Release number of selected component (if applicable):
JON 3.1.1.ER2 + EAP 6.0

How reproducible: not always


Steps to Reproduce:
1. Have EAP6 running in domain mode and present in the inventory.
2. Create a new managed server (with autostart=false).
3. Start it (with blocking=true) right after it appears in the inventory (its availability should be DOWN or UNKNOWN).
  
Actual results: The operation fails after 10 seconds with the following message:

java.lang.Exception: Read timed out, rolled-back=false, rolled-back=false
	at org.rhq.core.pc.operation.OperationInvocation.run(OperationInvocation.java:278)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:722)


In fact, the server really has been started, and its availability turns to UP soon afterwards.

Expected results: Start operation succeeds


Additional info: I also tried waiting 10 minutes between the creation of the managed server resource and the first attempt to start it (blocking=true), and in that case the start succeeded.

Could there be some interference between the start operation and a running discovery scan?
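
The symptom in "Actual results" (the caller reports a read timeout while the server-side work still completes) can be reproduced in isolation. The sketch below is not RHQ code: the class name, the plain-socket transport, and the millisecond values are all assumptions made for illustration. It only shows that a client-side read timeout does not cancel work already in progress on the other end.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; none of these names come from RHQ.
public class ReadTimeoutDemo {

    /** Result of one run: did the client give up, and did the server finish anyway? */
    public static final class Outcome {
        public final boolean clientTimedOut;
        public final boolean serverFinished;

        Outcome(boolean clientTimedOut, boolean serverFinished) {
            this.clientTimedOut = clientTimedOut;
            this.serverFinished = serverFinished;
        }
    }

    public static Outcome run(int clientTimeoutMs, long serverWorkMs) throws Exception {
        AtomicBoolean finished = new AtomicBoolean(false);
        try (ServerSocket listener = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread server = new Thread(() -> {
                try (Socket s = listener.accept()) {
                    Thread.sleep(serverWorkMs);    // stands in for the slow "start" operation
                    finished.set(true);            // the work completes whether or not the client waits
                    s.getOutputStream().write(1);  // the reply may arrive after the client gave up
                } catch (IOException | InterruptedException ignored) {
                    // Client went away or thread was interrupted; nothing to clean up in this sketch.
                }
            });
            server.start();

            boolean timedOut = false;
            try (Socket client = new Socket(listener.getInetAddress(), listener.getLocalPort())) {
                client.setSoTimeout(clientTimeoutMs); // read timeout on the caller's side
                client.getInputStream().read();       // blocks until a reply or the timeout
            } catch (SocketTimeoutException e) {
                timedOut = true;                      // the caller sees a failure here
            }

            server.join();                            // let the server-side work run to completion
            return new Outcome(timedOut, finished.get());
        }
    }

    public static void main(String[] args) throws Exception {
        Outcome o = run(100, 500);
        System.out.println("client timed out: " + o.clientTimedOut
                + ", server finished: " + o.serverFinished);
    }
}
```

With a client timeout shorter than the simulated work, the caller reports a timeout although the work finishes, which matches the observation that availability turns to UP shortly after the failed operation.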
Comment 4 Heiko W. Rupp 2012-08-23 13:18:13 EDT
Is there a longer stack trace available somewhere?
And are you sure it is 10 seconds (and not 20 seconds)?
Comment 5 Heiko W. Rupp 2012-08-23 13:54:18 EDT
The operation code is in org.rhq.modules.plugins.jbossas7.ManagedASComponent#invokeOperation,

which calls getASConnection().execute(op);
which is implemented as
    public Result execute(Operation op) {
        return execute(op, false, 10);
    }

So this is where the 10s timeout is defined.

Option a) Increase the timeout from 10s to 30s by calling
  getASConnection().execute(op,<timeout in sec>);

We have already done that in one other place.

Option b) Add a config property that lets the user specify a timeout value,
then pass that value to the call shown in a).
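
The two options can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the Connection class is a hypothetical stand-in for the real connection type, operations are plain strings, and the property name "operationTimeoutSec" is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; class, method, and property names are illustrative only.
public class TimeoutOverloadSketch {

    public static final int DEFAULT_TIMEOUT_SEC = 10; // the default seen in execute(op)
    public static final int START_TIMEOUT_SEC = 30;   // the longer value proposed in option a)

    /** Stand-in for the connection class; it only records which timeout was used. */
    public static class Connection {
        public int lastTimeoutSec = -1;

        /** Short form: falls back to the default timeout. */
        public String execute(String op) {
            return execute(op, DEFAULT_TIMEOUT_SEC);
        }

        /** Long form: the caller chooses the timeout explicitly. */
        public String execute(String op, int timeoutSec) {
            lastTimeoutSec = timeoutSec;
            return op + " (timeout " + timeoutSec + "s)";
        }
    }

    /** Option b): read a user-supplied timeout, falling back when it is missing or invalid. */
    public static int resolveTimeout(Map<String, String> config, int fallbackSec) {
        String raw = config.get("operationTimeoutSec"); // assumed property name
        if (raw == null || raw.trim().isEmpty()) {
            return fallbackSec;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : fallbackSec;     // reject zero and negative values
        } catch (NumberFormatException e) {
            return fallbackSec;
        }
    }

    public static void main(String[] args) {
        Connection conn = new Connection();

        // Option a): pass the longer timeout explicitly.
        System.out.println(conn.execute("start", START_TIMEOUT_SEC));

        // Option b): take the timeout from configuration, then call as in a).
        Map<String, String> config = new HashMap<>();
        config.put("operationTimeoutSec", "45");
        System.out.println(conn.execute("start", resolveTimeout(config, START_TIMEOUT_SEC)));
    }
}
```

Falling back to a fixed value when the property is absent or malformed keeps option b) compatible with option a): resources that never set the property simply get the longer default.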
Comment 6 Heiko W. Rupp 2012-08-23 15:31:07 EDT
[15:27:41] <mfoley> yeah ... increasing the timeout ... that seems low-risk fix for this point in the JON 3.1.1 lifecycle ... i am good with that
Comment 7 Stefan Negrea 2012-08-24 11:34:24 EDT
Applied option a) from comment #5. The timeout was increased from 10 to 30 seconds to avoid situations in which managed server operations run slower than expected due to heavy load on the host machine.
Comment 9 Libor Zoubek 2012-08-30 12:53:40 EDT
Verified on JON 3.1.1.CR1.
Comment 10 Heiko W. Rupp 2013-08-31 06:10:28 EDT
Bulk close of old bugs in VERIFIED state.
