Bug 1120090

Summary: CliInitializationException logged in update.log during applying CP1 in Windows environment
Product: [JBoss] JBoss Operations Network Reporter: Jan Bednarik <jbednari>
Component: UpgradeAssignee: Jay Shaughnessy <jshaughn>
Status: CLOSED DUPLICATE QA Contact: Mike Foley <mfoley>
Severity: high Docs Contact:
Priority: high    
Version: JON 3.2.1CC: jbednari, jshaughn, mkoci, myarboro
Target Milestone: ---   
Target Release: JON 3.2.3   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-07-25 14:40:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
update.log
none
update-cp2.log
none
Excerpt from server's log file after applying CP2
none
Server's log file
none
Update log after applying CP1
none
Update log after applying CP2 none

Description Jan Bednarik 2014-07-16 09:14:15 UTC
Created attachment 918356 [details]
update.log

Description of problem:
When applying Cummulative Patch 1 on JON 3.2.0 server, the following exception is logged in update.log file:

org.jboss.as.cli.CliInitializationException: Failed to connect to the controller
	at org.jboss.as.cli.impl.CliLauncher.initCommandContext(CliLauncher.java:280)
	at org.jboss.as.cli.impl.CliLauncher.main(CliLauncher.java:240)
	at org.jboss.as.cli.CommandLineMain.main(CommandLineMain.java:34)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.jboss.modules.Module.run(Module.java:270)
	at org.jboss.modules.Main.main(Main.java:411)
Caused by: org.jboss.as.cli.CommandLineException: The controller is not available at localhost:6999
	at org.jboss.as.cli.impl.CommandContextImpl.tryConnection(CommandContextImpl.java:951)
	at org.jboss.as.cli.impl.CommandContextImpl.connectController(CommandContextImpl.java:790)
	at org.jboss.as.cli.impl.CommandContextImpl.connectController(CommandContextImpl.java:766)
	at org.jboss.as.cli.impl.CliLauncher.initCommandContext(CliLauncher.java:278)
	... 8 more
Caused by: java.io.IOException: java.net.ConnectException: JBAS012144: Could not connect to remote://localhost:6999. The connection timed out
	at org.jboss.as.controller.client.impl.AbstractModelControllerClient.executeForResult(AbstractModelControllerClient.java:129)
	at org.jboss.as.controller.client.impl.AbstractModelControllerClient.execute(AbstractModelControllerClient.java:71)
	at org.jboss.as.cli.impl.CommandContextImpl.tryConnection(CommandContextImpl.java:929)
	... 11 more
Caused by: java.net.ConnectException: JBAS012144: Could not connect to remote://localhost:6999. The connection timed out
	at org.jboss.as.protocol.ProtocolConnectionUtils.connectSync(ProtocolConnectionUtils.java:131)
	at org.jboss.as.protocol.ProtocolConnectionManager$EstablishingConnection.connect(ProtocolConnectionManager.java:256)
	at org.jboss.as.protocol.ProtocolConnectionManager.connect(ProtocolConnectionManager.java:70)
	at org.jboss.as.protocol.mgmt.FutureManagementChannel$Establishing.getChannel(FutureManagementChannel.java:176)
	at org.jboss.as.cli.impl.CLIModelControllerClient.getOrCreateChannel(CLIModelControllerClient.java:156)
	at org.jboss.as.cli.impl.CLIModelControllerClient$2.getChannel(CLIModelControllerClient.java:118)
	at org.jboss.as.protocol.mgmt.ManagementChannelHandler.executeRequest(ManagementChannelHandler.java:115)
	at org.jboss.as.protocol.mgmt.ManagementChannelHandler.executeRequest(ManagementChannelHandler.java:90)
	at org.jboss.as.controller.client.impl.AbstractModelControllerClient.executeRequest(AbstractModelControllerClient.java:236)
	at org.jboss.as.controller.client.impl.AbstractModelControllerClient.execute(AbstractModelControllerClient.java:141)
	at org.jboss.as.controller.client.impl.AbstractModelControllerClient.executeForResult(AbstractModelControllerClient.java:127)
	... 13 more

The server seems to work fine after it is started later, though this exception does not appera in RHEL 6.5 environment so there seems to be inconsistency in the upgrade process.

Version-Release number of selected component (if applicable):
JON 3.2.0 GA + JON 3.2.0.GA Update 01

How reproducible:
always

Steps to Reproduce:
1. Install JON 3.2.0 GA in Windows environment (Windows Server 2008)
2. Apply CP1
3. Check 'update.log' file

Actual results:
The exception is logged.

Expected results:
No error message is logged.

Additional info:

Comment 1 Jan Bednarik 2014-07-16 09:28:29 UTC
The same issue happens while applying CP2 on top of CP1.

Comment 2 Jay Shaughnessy 2014-07-22 20:48:13 UTC
Do you have a log for CP2? The log I see attached does not seem to match the script I see in the git repo.

Comment 3 Jan Bednarik 2014-07-23 06:53:30 UTC
Yes I do, I am attaching it as update-cp2.log. Which script in the git repo do you have in mind?

Comment 4 Jan Bednarik 2014-07-23 06:54:04 UTC
Created attachment 920108 [details]
update-cp2.log

Comment 5 Jay Shaughnessy 2014-07-23 13:23:00 UTC
This should not be targeted to 3.3.  It's a CP issue, it has to be fixed in a future CP.  I'll set to 3.2.3 since 3.2.2 just went out.

The issue seems to be that this test system starts the JON server (actually, just the underlying EAP, to the point where it will accept connections from the jboss cli) much more slowly than we anticipated.  If that is right, and given that fact, I doubt this will hit a production system, or most test systems (my Win7 laptop had no issues).

*** For Jan ***
I think the fix is to just "sleep" longer before we attempt the CLI connection.  To test this please try and to apply CP2 again, but this time, prior to executing please edit apply-updates.bat, line 375 (the ping command).  Change "21" to "46".  This will more than double the wait time before we attempt to connect.
***************

When this exception occurs the CP files have been completely applied. The only patch not applied is the "no-tx-separate-pool" fix, useful only for Oracle installs.

If the re-test works with the longer sleep, we'll just want to update the real script for the CP3.  One note - this same fix is totally applicable to the .sh script, which uses the same 20s sleep currently and would suffer the same fate for a slow startup.

If for some reason this does not solve the issue, we need to look further into why the connection does not happen.

Comment 7 Jan Bednarik 2014-07-28 08:54:17 UTC
I tried the update process once more after changing the 'sleep' time from "21"  to "46", both for JON 3.2.0 GA -> CP1 and CP1 -> CP2, but the same exception is still being logged (in both cases).

According to the timestamps, shortly after the update is applied the server itself also logs a few exceptions while being started - I added the attachment 'excerpt-server.log'

Comment 8 Jan Bednarik 2014-07-28 08:54:59 UTC
Created attachment 921730 [details]
Excerpt from server's log file after applying CP2

Comment 9 Jay Shaughnessy 2014-07-28 15:03:40 UTC
Jan, if you have the complete update and server logs please attach. Thanks, -Jay

Comment 10 Jan Bednarik 2014-07-29 06:50:04 UTC
Here you go - I am attaching both update logs (CP1 and CP2) and full server's log.

Comment 11 Jan Bednarik 2014-07-29 06:50:49 UTC
Created attachment 922037 [details]
Server's log file

Comment 12 Jan Bednarik 2014-07-29 06:51:29 UTC
Created attachment 922038 [details]
Update log after applying CP1

Comment 13 Jan Bednarik 2014-07-29 06:51:54 UTC
Created attachment 922039 [details]
Update log after applying CP2

Comment 14 Jay Shaughnessy 2014-07-31 19:16:52 UTC
Jan, since this is closed, I attached an updated version of apply-updates.bat to bug 1084009.  Please re-test 3.2.2 update using that batch file and post the results over there.  Thanks,  jay

Comment 15 Jan Bednarik 2014-09-08 11:21:11 UTC
Adding a comment to remove needinfo flag. The results of testing this BZ are posted here: BZ 1084009.