Bug 1031100

Summary: Domain mode: Restart of the managed server fails with read timed out
Product: [JBoss] JBoss Operations Network
Reporter: Radim Hatlapatka <rhatlapa>
Component: Plugin -- JBoss EAP 6
Assignee: Thomas Segismont <tsegismo>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: high
Priority: unspecified
Version: JON 3.2
CC: tsegismo
Target Milestone: ER07
Target Release: JON 3.2.0
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Bug Blocks: 1012435

Description Radim Hatlapatka 2013-11-15 15:42:51 UTC
Description of problem:
There is a 20 s limit for restarting a managed server (the JON server reported the read timed out error after 20 s). Sometimes the restart takes longer, and the operation then ends with error [1]. In my test, for example, the restart took 30 s.


[1]
java.lang.Exception: Read timed out, rolled-back=false, rolled-back=false
    at org.rhq.core.pc.operation.OperationInvocation.run(OperationInvocation.java:278)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
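
The failure mode can be reproduced in isolation. The following is a minimal sketch, assuming the 20 s limit is a socket read timeout on the management connection (the "Read timed out" message suggests this, but the report does not confirm it). The class `ReadTimeoutDemo`, the method `replyArrivesInTime`, and the scaled-down millisecond values are all hypothetical and are not part of the JON plugin.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /**
     * Connects to a throwaway local server that waits replyDelayMs before
     * answering, and reads the reply with the given socket read timeout.
     * Returns true if the reply arrived in time, false on a read timeout.
     */
    public static boolean replyArrivesInTime(int replyDelayMs, int readTimeoutMs) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread slowServer = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    Thread.sleep(replyDelayMs);      // stands in for the slow restart
                    OutputStream out = peer.getOutputStream();
                    out.write(1);                     // hypothetical "restart finished" reply
                    out.flush();
                } catch (IOException | InterruptedException ignored) {
                    // The client may already have given up; nothing to do in this demo.
                }
            });
            slowServer.setDaemon(true);
            slowServer.start();

            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.setSoTimeout(readTimeoutMs);   // plays the role of the assumed 20 s limit
                InputStream in = client.getInputStream();
                return in.read() == 1;
            } catch (SocketTimeoutException e) {
                return false;                         // the read gave up before the reply arrived
            } finally {
                slowServer.join(replyDelayMs + 1000L);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // The reply takes 1000 ms; only the second call waits long enough for it.
        System.out.println("read timeout 200 ms, reply in time: " + replyArrivesInTime(1000, 200));
        System.out.println("read timeout 3000 ms, reply in time: " + replyArrivesInTime(1000, 3000));
    }
}
```

In this sketch the server side still completes its work after the client has timed out, which matches the report: the restart itself succeeds, and only the caller's wait is too short.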

Version-Release number of selected component (if applicable): JON 3.2.0.ER5 against EAP 6.1.1


How reproducible: 10% (mostly happens on Solaris 10 SPARC)


Steps to Reproduce:
1. Import EAP 6 in domain mode.
2. Restart one of the managed servers (e.g. server-one).


Actual results:
The restart is shown as failed with "Read timed out" because it took longer than 20 s.

Expected results:
No failure is reported if the restart succeeds, even if it takes a little longer.


Additional info:
The best approach would be a fix similar to the one made for https://bugzilla.redhat.com/show_bug.cgi?id=911327

It would also be good to increase the timeout for the stop and start operations on the managed server.
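
One generic way to tolerate a slow restart is to wait for the server to come back under a configurable deadline instead of relying on a single fixed read timeout. The sketch below illustrates that idea only. It is not the fix from bug 911327 or from the commits in the comments below, whose contents are not shown in this report. `RestartWait`, `waitUntil`, and the `serverIsUp` check are hypothetical names.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

public class RestartWait {

    /**
     * Polls isUp until it returns true or the timeout elapses.
     * Returns true if the condition was met within the timeout.
     */
    public static boolean waitUntil(BooleanSupplier isUp, Duration timeout, Duration pollInterval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (isUp.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Sleep for the poll interval, but never past the deadline.
            long remainingMillis = Math.max(1, remainingNanos / 1_000_000);
            Thread.sleep(Math.min(pollInterval.toMillis(), remainingMillis));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        // Hypothetical status check: the "server" counts as up 300 ms after start.
        BooleanSupplier serverIsUp =
                () -> System.nanoTime() - start >= Duration.ofMillis(300).toNanos();
        boolean restarted = waitUntil(serverIsUp, Duration.ofSeconds(2), Duration.ofMillis(50));
        System.out.println("restart observed within timeout: " + restarted);
    }
}
```

With the timeout exposed as a parameter, a slow platform can be given a longer limit without changing the code.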

Comment 1 Thomas Segismont 2013-11-18 11:19:11 UTC
Fixed in master

commit 09d3e4fc3fd3350364a2eca08ad8714202d9f48d
Author: Thomas Segismont <tsegismo>
Date:   Mon Nov 18 12:15:35 2013 +0100

Comment 2 Thomas Segismont 2013-11-18 13:46:46 UTC
Cherry-picked to release/jon3.2.x

commit e14361863f517bca0e8b3208f1f424200768b777
Author: Thomas Segismont <tsegismo>
Date:   Mon Nov 18 12:15:35 2013 +0100

Comment 3 Simeon Pinder 2013-11-19 15:47:59 UTC
Moving to ON_QA, as this is available for testing with the new brew build.

Comment 4 Simeon Pinder 2013-11-22 05:13:36 UTC
Mass moving all of these from ER6 to target milestone ER07 since the ER6 build was bad and QE was halted for the same reason.

Comment 5 Radim Hatlapatka 2013-11-27 16:31:39 UTC
I am no longer able to hit the issue with JON 3.2.0.ER7.