Description of problem:

Version-Release number of selected component (if applicable):
JON 3.1.1.ER2 + EAP 6.0

How reproducible:
not always

Steps to Reproduce:
1. have EAP6 running in domain mode in inventory
2. create new managed server (with autostart=false)
3. start it up (set blocking=true) right after it appears in your inventory (availability should be DOWN or UNKNOWN)

Actual results:
Operation fails after 10 seconds with the following message:

java.lang.Exception: Read timed out, rolled-back=false, rolled-back=false
    at org.rhq.core.pc.operation.OperationInvocation.run(OperationInvocation.java:278)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)

In fact, the server really has been started; its availability turns to UP soon afterwards.

Expected results:
Start operation succeeds

Additional info:
I also tried waiting 10 minutes between the creation of the managed server resource and the first attempt to start it (blocking=true), and the operation succeeded. Could there be some interference between the start operation and a running discovery scan?
Is there a longer stack trace around somewhere? And are you sure it was 10 secs (and not 20 secs)?
The operation code is in org.rhq.modules.plugins.jbossas7.ManagedASComponent#invokeOperation, which calls

    getASConnection().execute(op);

which is defined as

    public Result execute(Operation op) {
        return execute(op, false, 10);
    }

So this is where the 10s timeout is defined.

Options:
a) increase 10s to 30s by calling getASConnection().execute(op,<timeout in sec>); we have done that in one other place as well.
b) add a config property to let the user specify a timeout value, then continue with a)
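For illustration only, here is a rough sketch of how options a) and b) could fit together. It is not the plugin code: TimeoutSketch, FakeConnection, the String-based execute methods and the property name "operationTimeoutSec" are all hypothetical stand-ins made up for this example. Only the 10s and 30s values come from the comment above.

```java
import java.util.Map;

// Hypothetical stand-ins; these are NOT the real RHQ plugin classes.
final class TimeoutSketch {

    /** Default used when nothing is configured (option a: 30s instead of 10s). */
    static final int DEFAULT_TIMEOUT_SEC = 30;

    /** Assumed property name; the real plugin may use a different one, or none. */
    static final String TIMEOUT_PROPERTY = "operationTimeoutSec";

    /** Minimal stand-in for the connection; it only records the requested timeout. */
    static final class FakeConnection {
        int lastTimeoutSec = -1;

        // Mirrors the quoted one-argument overload, which hard-codes 10 seconds.
        String execute(String op) {
            return execute(op, 10);
        }

        // Overload that lets the caller choose the read timeout.
        String execute(String op, int timeoutSec) {
            lastTimeoutSec = timeoutSec;
            return op + " (timeout " + timeoutSec + "s)";
        }
    }

    /** Option b: read the timeout from configuration, falling back to the default. */
    static int resolveTimeoutSec(Map<String, String> config) {
        String raw = config.get(TIMEOUT_PROPERTY);
        if (raw == null) {
            return DEFAULT_TIMEOUT_SEC;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            // Reject zero and negative values rather than passing them on.
            return value > 0 ? value : DEFAULT_TIMEOUT_SEC;
        } catch (NumberFormatException e) {
            return DEFAULT_TIMEOUT_SEC;
        }
    }

    public static void main(String[] args) {
        FakeConnection connection = new FakeConnection();
        int timeout = resolveTimeoutSec(Map.of(TIMEOUT_PROPERTY, "45"));
        System.out.println(connection.execute("start", timeout));
    }
}
```

With no property set, the sketch falls back to 30s, which corresponds to option a) alone; an unparsable or non-positive value is treated the same way, so that a bad setting does not leave the operation with an unusable timeout.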
[15:27:41] <mfoley> yeah ... increasing the timeout ... that seems low-risk fix for this point in the JON 3.1.1 lifecycle ... i am good with that
Applied option a) from comment #5. The timeout was increased from 10 to 30 seconds to avoid situations in which managed server operations run slower than expected due to heavy load on the host machine.
master branch commit: http://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?id=b8a6a496d40915abb22fdd87753aec0c131ed95d
verified on JON 3.1.1.CR1
Bulk close of old bugs in VERIFIED state.