1031100 – Domain mode: Restart of the managed server fails with read timed out

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	JBoss Operations Network
Classification:	JBoss
Component:	Plugin -- JBoss EAP 6
Sub Component:
Version:	JON 3.2
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	ER07
Target Release:	JON 3.2.0
Assignee:	Thomas Segismont
QA Contact:	Mike Foley
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1012435

Reported:	2013-11-15 15:42 UTC by Radim Hatlapatka
Modified:	2014-01-02 20:37 UTC
Fixed In Version:
Clone Of:
Environment:
Last Closed:
Type:	Bug
Embargoed:

Description Radim Hatlapatka 2013-11-15 15:42:51 UTC

Description of problem:
There is a 20 s limit for restarting a managed server (the JON server reported the read timed out error after 20 s). A restart sometimes takes longer than that and therefore ends with error [1]. For example, in my test the restart took 30 s.


[1]
java.lang.Exception: Read timed out, rolled-back=false, rolled-back=false
	at org.rhq.core.pc.operation.OperationInvocation.run(OperationInvocation.java:278)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)
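
For illustration only: the message in [1] matches what a Java socket read reports when the peer sends nothing within the configured read timeout. The sketch below reproduces that condition on the loopback interface. It assumes a plain java.net.Socket; this report does not say which client the plugin actually uses, and the class name ReadTimeoutDemo and its method are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of RHQ/JON.
final class ReadTimeoutDemo {

    /**
     * Connects to a local server socket that never writes anything and
     * returns true if the client's read gives up after timeoutMillis.
     */
    static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // The read timeout plays the role of the 20 s limit in this report.
            client.setSoTimeout(timeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read(); // blocks: the "server" side stays silent
                return false;
            } catch (SocketTimeoutException e) {
                // The peer was still connected, it just did not answer in time.
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

The point of the sketch is that a timeout like this says nothing about whether the remote operation failed: the restart may still complete after the caller has stopped waiting.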

Version-Release number of selected component (if applicable): JON 3.2.0.ER5 vs EAP 6.1.1


How reproducible: about 10% of attempts (mostly on Solaris 10 SPARC)


Steps to Reproduce:
1. Import EAP 6 in domain mode
2. Restart one of the managed servers (e.g. server-one)


Actual results:
The restart is shown as failed with "Read timed out" because it took longer than 20 s.

Expected results:
No failure is reported if the restart succeeds, even if it takes a little longer.


Additional info:
The best approach would be a fix similar to the one made for https://bugzilla.redhat.com/show_bug.cgi?id=911327

It would also be good to increase the timeout for the stop and start operations on the managed server.
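
One way to picture the requested behaviour is a wait loop whose deadline is a parameter rather than a fixed 20 s. The following is a minimal sketch under that assumption, not the fix that was committed for this bug; RestartWait, waitUntil and the simulated server check are hypothetical names, and the durations are scaled down from seconds to milliseconds so the example finishes quickly.

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the RHQ/JON plugin code.
final class RestartWait {

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true in time, false otherwise
     * (including when the waiting thread is interrupted).
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            try {
                // Never sleep past the deadline.
                TimeUnit.NANOSECONDS.sleep(Math.min(pollInterval.toNanos(), remainingNanos));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // Simulated managed server that is back up after 300 ms
        // (a stand-in for the 30 s restart described above).
        BooleanSupplier serverUp = () -> System.nanoTime() - start >= 300_000_000L;

        // 200 ms stands in for the 20 s limit, 1000 ms for a raised limit.
        boolean shortWait = waitUntil(serverUp, Duration.ofMillis(200), Duration.ofMillis(20));
        boolean longWait = waitUntil(serverUp, Duration.ofMillis(1000), Duration.ofMillis(20));
        System.out.println("short timeout succeeded: " + shortWait);
        System.out.println("long timeout succeeded: " + longWait);
    }
}
```

With a configurable deadline, a slow platform can be given more time without changing the logic, and a restart that finishes shortly after the old limit is no longer reported as a failure.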

Comment 1 Thomas Segismont 2013-11-18 11:19:11 UTC

Fixed in master

commit 09d3e4fc3fd3350364a2eca08ad8714202d9f48d
Author: Thomas Segismont <tsegismo>
Date:   Mon Nov 18 12:15:35 2013 +0100

Comment 2 Thomas Segismont 2013-11-18 13:46:46 UTC

Cherry-picked to release/jon3.2.x

commit e14361863f517bca0e8b3208f1f424200768b777
Author: Thomas Segismont <tsegismo>
Date:   Mon Nov 18 12:15:35 2013 +0100

Comment 3 Simeon Pinder 2013-11-19 15:47:59 UTC

Moving to ON_QA as available for testing with new brew build.

Comment 4 Simeon Pinder 2013-11-22 05:13:36 UTC

Mass moving all of these from ER6 to target milestone ER07 since the ER6 build was bad and QE was halted for the same reason.

Comment 5 Radim Hatlapatka 2013-11-27 16:31:39 UTC

I am no longer able to hit the issue with JON 3.2.0.ER7.
