Red Hat Bugzilla – Bug 1015573
Increase timeout for domain startup.
Last modified: 2014-09-03 00:57:17 EDT
Description of problem:
The domain start timeout is too low; the installer never starts the domain successfully.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. On the server launch screen, choose to start the server in domain mode, then continue to the Processing screen using default values.
Actual results:
Failed to start Server after 2 attempts.
Expected results:
Application Server successfully started.
Can you comment with info on your environment? OS, JVM, maybe the machine specs? I have been unable to reproduce this, but it's definitely possible that on lower-spec machines the domain server start-up takes longer than the timeout we give it.
Created attachment 808741 [details]
Starting domain with installer in windows (6.2.0.ER4)
I am able to reproduce this locally on my Lenovo T520 with Fedora 16 and Oracle JDK 1.7.0_40 (though not 100% of the time), and also on my virtual RHEL 6.4 machine.
Moreover, when I try this on one of the Windows hosts in our lab, I get a bunch of IAEs (see attachment 808741 [details]). I am able to start the domain without any exception on the same host (W2k8r2-x86_64) with the ER3 installer. There are no IAEs on Fedora/RHEL.
I get the IAE message on the Processing screen every time on Windows, even if I don't start the server from the installer (using default values wherever possible).
W:\pkremens>java -jar jboss-eap-6.2.0.ER4-installer.jar
Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: bad position: 49
at java.security.AccessController.doPrivileged(Native Method)
However, I am able to finish the installation successfully.
Shall I create a new BZ for this?
I think this IAE message is a completely different bug than the server timing out on some systems. So yes, you can make a new BZ for it. Thanks!
I've increased the number of attempts made to connect to server during the domain jobs, and also made the default wait-between-tries 3 seconds instead of 1.
Note: We may have to find a better solution than polling the server repeatedly until it responds or times out, because the server sometimes responds with an HTTP OK before all of its subsystems are actually ready. This could lead to failures on slower systems that take longer to start up, and it happens more often in domain mode.
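For illustration, the retry behaviour described in this comment could look roughly like the Java sketch below. ServerPoller, waitUntilReady, and the probe callback are hypothetical names, not the installer's actual code; only the 3-second wait between tries comes from the comment, and the attempt count is left as a parameter because the new value is not stated here.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not taken from the installer source. */
final class ServerPoller {
    private ServerPoller() {}

    /**
     * Calls the probe up to maxAttempts times, sleeping waitMillis after each
     * failed try except the last. Returns true as soon as the probe succeeds,
     * false if every attempt fails or the thread is interrupted while waiting.
     */
    static boolean waitUntilReady(BooleanSupplier probe, int maxAttempts, long waitMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (probe.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(waitMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

Passing 3000 as waitMillis would correspond to the new 3-second default. As the note above points out, a probe that only checks for an HTTP OK can still report success before all subsystems are ready.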
I am able to start the domain via the installer on all platforms. But on Windows the installer still makes only 3 attempts (Checking if server is ready...(1/4).). Is this problem specific to my environment, or can you reproduce it as well?
You may also try to grep the server logs for the "Started in" string.
It's common practice in our tests to combine these two approaches (HTTP == OK && "Started in") in waitForServerStart() methods.
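A minimal sketch of that combined check in Java, assuming the HTTP status code and the server log lines have already been collected by the caller. StartCheck and its methods are hypothetical names and are not the waitForServerStart() implementation from the test suite; only the "Started in" marker and the HTTP OK condition come from this comment.

```java
import java.util.List;

/** Hypothetical helper for illustration; not the actual test-suite code. */
final class StartCheck {
    private StartCheck() {}

    /** Returns true if any log line contains the "Started in" marker. */
    static boolean logShowsStarted(List<String> logLines) {
        for (String line : logLines) {
            if (line.contains("Started in")) {
                return true;
            }
        }
        return false;
    }

    /** Treats the server as started only when both signals agree. */
    static boolean isStarted(int httpStatus, List<String> logLines) {
        return httpStatus == 200 && logShowsStarted(logLines);
    }
}
```

Requiring both conditions guards against the case where the server answers with HTTP OK before it has finished starting.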
I am returning this to assigned; however, we can close it once we solve BZ1007768.
We've now implemented something similar: we scan the output of the server start-up jobs for particular server codes to determine whether the server is ready. If the expected code never appears, we do one final check of the management or HTTP interfaces. Only if both of those methods fail do we time out and output a failure message.
In addition we'll be increasing the number of tries, as the domain server can really take a long time to completely ready itself on some systems. But with frequent tries, this won't impact speedier systems much.
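The decision order described in this comment can be sketched in Java as follows. This is an illustration under stated assumptions, not the code from the commits: ReadinessDecision, decide, the Result values, and the expectedCode parameter are all hypothetical, and the actual server codes the installer looks for are not given in this thread.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of the two-stage readiness check. */
final class ReadinessDecision {
    private ReadinessDecision() {}

    enum Result { READY_FROM_OUTPUT, READY_FROM_INTERFACE, TIMED_OUT }

    /**
     * Stage 1: scan the start-up job output for the expected code.
     * Stage 2: only if the code never appeared, run one final interface check.
     * The caller reports a failure only when both stages fail.
     */
    static Result decide(Iterable<String> outputLines, String expectedCode,
                         BooleanSupplier interfaceCheck) {
        for (String line : outputLines) {
            if (line.contains(expectedCode)) {
                return Result.READY_FROM_OUTPUT;
            }
        }
        return interfaceCheck.getAsBoolean()
                ? Result.READY_FROM_INTERFACE
                : Result.TIMED_OUT;
    }
}
```

The interface check is passed in as a callback so that it runs only when the output scan finds nothing.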
See this commit, and the 3 other commits directly following it for details:
Verified on EAP 6.2.0.ER6 installer.
This fix also resolves BZ982191: starting a domain on Windows with JDK 1.6 no longer hangs.