Created attachment 918356 [details]
Description of problem:
When applying Cumulative Patch 1 (CP1) to a JON 3.2.0 server, the following exception is logged in the update.log file:
org.jboss.as.cli.CliInitializationException: Failed to connect to the controller
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
Caused by: org.jboss.as.cli.CommandLineException: The controller is not available at localhost:6999
... 8 more
Caused by: java.io.IOException: java.net.ConnectException: JBAS012144: Could not connect to remote://localhost:6999. The connection timed out
... 11 more
Caused by: java.net.ConnectException: JBAS012144: Could not connect to remote://localhost:6999. The connection timed out
... 13 more
The server seems to work fine after it is started later. However, this exception does not appear in a RHEL 6.5 environment, so the upgrade process seems to behave inconsistently across platforms.
Version-Release number of selected component (if applicable):
JON 3.2.0 GA + JON 3.2.0.GA Update 01
Steps to Reproduce:
1. Install JON 3.2.0 GA in Windows environment (Windows Server 2008)
2. Apply CP1
3. Check 'update.log' file
Actual results:
The exception is logged.
Expected results:
No error message is logged.
The same issue happens while applying CP2 on top of CP1.
Do you have a log for CP2? The log I see attached does not seem to match the script I see in the git repo.
Yes I do, I am attaching it as update-cp2.log. Which script in the git repo do you have in mind?
Created attachment 920108 [details]
This should not be targeted to 3.3. It's a CP issue, it has to be fixed in a future CP. I'll set to 3.2.3 since 3.2.2 just went out.
The issue seems to be that this test system starts the JON server (actually, just the underlying EAP, to the point where it will accept connections from the jboss cli) much more slowly than we anticipated. If that is right, I doubt this will hit a production system, or most test systems (my Win7 laptop had no issues).
*** For Jan ***
I think the fix is to just "sleep" longer before we attempt the CLI connection. To test this, please try to apply CP2 again, but this time, prior to executing, please edit apply-updates.bat, line 375 (the ping command). Change "21" to "46". This will more than double the wait time before we attempt to connect.
When this exception occurs the CP files have been completely applied. The only patch not applied is the "no-tx-separate-pool" fix, useful only for Oracle installs.
If the re-test works with the longer sleep, we'll just want to update the real script for CP3. One note: this same fix is fully applicable to the .sh script, which currently uses the same 20s sleep and would suffer the same fate on a slow startup.
If for some reason this does not solve the issue, we need to look further into why the connection does not happen.
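As a possible alternative to a fixed sleep, the wait could poll the management port until it accepts a connection. The following is only a minimal sketch of that idea in Python, not what apply-updates.bat or the .sh script actually does. The function name wait_for_port and its timeout and interval parameters are hypothetical; localhost:6999 is taken from the exception in update.log. An open port only shows that something is listening, so the CLI connection could still fail for other reasons.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; not part of the JON update scripts.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows that something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: give up once the deadline passes,
            # otherwise wait a little and try again.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # localhost:6999 is the controller address reported in update.log.
    ready = wait_for_port("localhost", 6999, timeout=60.0)
    print("controller port open" if ready else "controller port still closed")
```

With this approach a fast system proceeds as soon as the port opens, while a slow one gets the full timeout instead of a fixed 20s.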
I tried the update process once more after changing the 'sleep' time from "21" to "46", both for JON 3.2.0 GA -> CP1 and CP1 -> CP2, but the same exception is still being logged (in both cases).
According to the timestamps, shortly after the update is applied the server itself also logs a few exceptions while being started - I added the attachment 'excerpt-server.log'
Created attachment 921730 [details]
Excerpt from server's log file after applying CP2
Jan, if you have the complete update and server logs please attach. Thanks, -Jay
Here you go - I am attaching both update logs (CP1 and CP2) and full server's log.
Created attachment 922037 [details]
Server's log file
Created attachment 922038 [details]
Update log after applying CP1
Created attachment 922039 [details]
Update log after applying CP2
Jan, since this is closed, I attached an updated version of apply-updates.bat to bug 1084009. Please re-test 3.2.2 update using that batch file and post the results over there. Thanks, jay
Adding a comment to remove needinfo flag. The results of testing this BZ are posted here: BZ 1084009.