Bug 1015573

Summary: Increase timeout for domain startup.
Product: [JBoss] JBoss Enterprise Application Platform 6
Reporter: Petr Kremensky <pkremens>
Component: Installer
Assignee: Francisco Canas <fcanas>
Status: CLOSED CURRENTRELEASE
QA Contact: Petr Kremensky <pkremens>
Severity: urgent
Docs Contact: Russell Dickenson <rdickens>
Priority: unspecified
Version: 6.2.0
CC: fcanas, pkremens, thauser
Target Milestone: ER6
Keywords: Regression
Target Release: EAP 6.2.0
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2013-12-15 16:20:46 UTC
Type: Bug
Bug Depends On: 1007768
Attachments:
Description: Starting domain with installer in windows (6.2.0.ER4)
Flags: none

Description Petr Kremensky 2013-10-04 14:10:27 UTC
Description of problem:
The domain start timeout is too low, so the installer never starts the domain successfully.

Version-Release number of selected component (if applicable):
EAP 6.2.0.ER4

How reproducible:
always

Steps to Reproduce:
1. On the server launch screen, choose to start the server in domain mode, then continue to the Processing screen using default values.

Actual results:
Failed to start Server after 2 attempts.

Expected results:
Application Server successfully started.

Comment 1 Francisco Canas 2013-10-04 17:01:14 UTC
Hi Petr,

Can you comment with info on your environment: OS, JVM, maybe the machine specs? I have been unable to reproduce this, but it's definitely possible that on lower-spec machines the domain server start-up takes longer than the timeout we give it.

Comment 2 Petr Kremensky 2013-10-07 07:53:27 UTC
Created attachment 808741
Starting domain with installer in windows (6.2.0.ER4)

Comment 3 Petr Kremensky 2013-10-07 07:59:35 UTC
Hi Francisco,
I am able to reproduce this locally on my Lenovo T520 with Fedora 16 and Oracle JDK 1.7.0_40 (not 100% of the time), and also on my virtual RHEL 6.4 machine.

Moreover, when I try this on one of the Windows hosts in our lab, I get a bunch of IAEs (see attachment 808741). I am able to start the domain without any exception on the same host (W2k8r2-x86_64) with the ER3 installer. No IAEs on Fedora/RHEL.

Comment 4 Petr Kremensky 2013-10-07 14:16:49 UTC
I am getting the IAE message on the Processing screen every time on Windows, even if I don't start the server from the installer (using just default values wherever possible).
W:\pkremens>java -jar jboss-eap-6.2.0.ER4-installer.jar

Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: bad position: 49
        at javax.swing.text.JTextComponent.setCaretPosition(JTextComponent.java:1678)
        at com.izforge.izpack.panels.ProcessPanel$1.run(ProcessPanel.java:253)
        at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:251)
        at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:733)
        at java.awt.EventQueue.access$200(EventQueue.java:103)
        at java.awt.EventQueue$3.run(EventQueue.java:694)
        at java.awt.EventQueue$3.run(EventQueue.java:692)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
        at java.awt.EventQueue.dispatchEvent(EventQueue.java:703)
        at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:242)
        at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:161)
        at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:146)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:138)
        at java.awt.EventDispatchThread.run(EventDispatchThread.java:91)

However, I am able to finish the installation successfully.

Shall I create a new BZ for this?

Comment 5 Francisco Canas 2013-10-07 14:35:26 UTC
I think this IAE message is a completely different bug than the server timing out on some systems. So yes, you can make a new BZ for it. Thanks!

Comment 6 Francisco Canas 2013-10-07 15:47:06 UTC
I've increased the number of attempts made to connect to the server during the domain jobs, and also made the default wait between tries 3 seconds instead of 1.

http://git.app.eng.bos.redhat.com/jbossas-installer.git/commit/?h=eap-6.2&id=2772f77dc359f272718bd073d307d94c991bf7a1
http://git.app.eng.bos.redhat.com/jbossas-installer.git/commit/?h=eap-6.2&id=921445ef646fabc4a18c73f5625bdee5d60513b7

Note: We may have to find a better solution than polling the server repeatedly until it has responded or timed out, because the server sometimes responds with an HTTP OK before all of its subsystems are actually ready, which could lead to failures on slower systems that take longer to start up. This happens more often in domain mode.
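
For illustration only, the polling approach described in this comment (a fixed number of attempts with a configurable wait between tries) might look roughly like the sketch below. The class name, method name, and parameters are hypothetical and are not taken from the installer's source; the readiness check is passed in as a callback, so nothing is assumed about how the installer actually contacts the server.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not the installer's actual code. */
final class ServerPoller {

    /**
     * Runs the readiness check up to maxAttempts times, sleeping waitMillis
     * between failed tries. Returns the 1-based number of the attempt that
     * succeeded, or -1 if every attempt failed or the thread was interrupted.
     */
    static int waitUntilReady(BooleanSupplier isReady, int maxAttempts, long waitMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (isReady.getAsBoolean()) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(waitMillis); // e.g. 3000 for a 3-second wait between tries
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt status
                    return -1;
                }
            }
        }
        return -1;
    }
}
```

Raising maxAttempts or waitMillis lengthens the total timeout, but as noted above, a check that succeeds too early (HTTP OK before all subsystems are ready) is not fixed by polling longer.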

Comment 7 Petr Kremensky 2013-10-08 07:08:33 UTC
I am able to start the domain via the installer on all platforms. But on Windows the installer still makes only 3 attempts (Checking if server is ready...(1/4).). Is this a problem only in my environment, or can you reproduce it as well?

Comment 8 Petr Kremensky 2013-10-08 07:53:45 UTC
You may also try to grep server logs for "Started in" string.

for standalone:
jboss-eap-6.2/standalone/log/server.log
for domain:
jboss-eap-6.2/domain/servers/server-one/log/server.log
jboss-eap-6.2/domain/servers/server-two/log/server.log

It's common practice in our tests to combine these two approaches (HTTP == OK && "Started in") in waitForServerStart() methods.
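
A minimal sketch of such a combined check is shown below, assuming an http:// URL and a readable server.log. The class and method names are hypothetical and this is not the actual waitForServerStart() from the test suite. The marker string is a parameter, because the exact wording and case of the start-up message should be confirmed against a real server.log.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not the test suite's actual code. */
final class ServerStartCheck {

    /** Returns true if any line of the log contains the marker (e.g. "Started in"). */
    static boolean logContains(Path serverLog, String marker) {
        try (BufferedReader reader = Files.newBufferedReader(serverLog, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(marker)) {
                    return true;
                }
            }
        } catch (IOException e) {
            // A missing or unreadable log is treated as "not started yet".
        }
        return false;
    }

    /** Returns true if a GET on the URL answers with HTTP 200. Assumes an http:// URL. */
    static boolean httpIsOk(URL url, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            try {
                return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            return false; // connection refused, timeout, and similar failures
        }
    }

    /** Both conditions must hold: HTTP == OK && marker found in the log. */
    static boolean isServerStarted(URL url, Path serverLog, String marker, int timeoutMillis) {
        return httpIsOk(url, timeoutMillis) && logContains(serverLog, marker);
    }
}
```

In domain mode the log check would have to pass for each server log listed above (server-one and server-two) before the domain is considered started.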

Comment 9 Petr Kremensky 2013-10-09 09:58:50 UTC
I am returning this to ASSIGNED; however, we can close it once we solve BZ1007768.

Comment 10 Francisco Canas 2013-10-10 19:53:51 UTC
We've now implemented something similar, where we scan the output of the server start-up jobs for particular server codes to determine whether the server is ready. If the expected code never appears, we do one final check of the management or HTTP interfaces. Only if both of those methods fail do we time out and output a failure message.

In addition, we'll be increasing the number of tries, as the domain server can take a long time to become completely ready on some systems. With frequent tries, this won't affect faster systems much.

See this commit, and the 3 other commits directly following it for details:
http://git.app.eng.bos.redhat.com/jbossas-installer.git/commit/?h=eap-6.2&id=d80226536782768612ac2f82fbe39ce70d967689
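
As a rough sketch of the two-stage check described in this comment (scan the start-up output for an expected code, then fall back to one final interface check), something like the following could be used. All names here are hypothetical, and the ready code is a parameter because the comment does not say which server codes are scanned for; the linked commits are the authoritative reference.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not the installer's actual code. */
final class StartupOutputScanner {

    /** Reads the output line by line until the code is seen or the stream ends. */
    static boolean outputContainsCode(Reader processOutput, String readyCode) {
        try (BufferedReader reader = new BufferedReader(processOutput)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(readyCode)) {
                    return true;
                }
            }
        } catch (IOException e) {
            // An unreadable stream is treated as "code not seen".
        }
        return false;
    }

    /**
     * Stage 1: look for the expected code in the start-up output.
     * Stage 2: only if the code never appeared, run one final check of the
     * management or HTTP interface (supplied by the caller).
     */
    static boolean isReady(Reader processOutput, String readyCode, BooleanSupplier finalInterfaceCheck) {
        if (outputContainsCode(processOutput, readyCode)) {
            return true;
        }
        return finalInterfaceCheck.getAsBoolean();
    }
}
```

Note that reading a live process's output this way blocks until the stream ends, so a real implementation would also need its own timeout around the scan.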

Comment 11 Petr Kremensky 2013-10-18 13:27:41 UTC
Verified on the EAP 6.2.0.ER6 installer.

This fix also solved BZ982191: starting the domain on Windows with JDK 1.6 no longer hangs.