Description of problem:
$Summary

Version-Release number of selected component (if applicable):
rhq-server-4.10.0-SNAPSHOT 4f60fa0d3c7a

How reproducible:
Always

Steps to Reproduce:
1. unzip rhq-server-4.10.0-SNAPSHOT.zip
2. cd rhq-server-4.10.0-SNAPSHOT/bin/
3. ./rhqctl install
4. set jboss.bind.address to 0.0.0.0

Actual results:
RHQ storage is installed correctly but RHQ server installation failed with:

06:46:15,286 INFO [org.rhq.server.control.command.Install] The RHQ Server must be started to complete its installation. Starting the RHQ server in preparation of running the server installer...
06:46:15,304 INFO [org.rhq.server.control.command.Install] Waiting for the RHQ Server to start in preparation of running the server installer...
Trying to start the RHQ Server...
RHQ Server (pid 6847 ) is ✘ down
Failed to start - make sure the RHQ Server is fully configured properly
06:46:20,850 INFO [org.jboss.modules] JBoss Modules version 1.2.0.CR1
06:46:20,971 INFO [org.rhq.enterprise.server.installer.InstallerServiceImpl] The server is preconfigured and ready for auto-install.
06:46:21,055 INFO [org.xnio] XNIO Version 3.0.7.GA
06:46:21,066 INFO [org.xnio.nio] XNIO NIO Implementation Version 3.0.7.GA
06:46:21,074 INFO [org.jboss.remoting] JBoss Remoting version 3.2.14.GA
06:46:31,338 ERROR [org.rhq.enterprise.server.installer.Installer] The installer will now exit due to previous errors: java.lang.Exception: Cannot obtain client connection to the RHQ app server!!
    at org.rhq.enterprise.server.installer.InstallerServiceImpl.testModelControllerClient(InstallerServiceImpl.java:1100) [rhq-installer-util-4.10.0-SNAPSHOT.jar:4.10.0-SNAPSHOT]
    at org.rhq.enterprise.server.installer.InstallerServiceImpl.preInstall(InstallerServiceImpl.java:217) [rhq-installer-util-4.10.0-SNAPSHOT.jar:4.10.0-SNAPSHOT]
    at org.rhq.enterprise.server.installer.InstallerServiceImpl.test(InstallerServiceImpl.java:142) [rhq-installer-util-4.10.0-SNAPSHOT.jar:4.10.0-SNAPSHOT]
    at org.rhq.enterprise.server.installer.Installer.doInstall(Installer.java:90) [rhq-installer-util-4.10.0-SNAPSHOT.jar:4.10.0-SNAPSHOT]
    at org.rhq.enterprise.server.installer.Installer.main(Installer.java:57) [rhq-installer-util-4.10.0-SNAPSHOT.jar:4.10.0-SNAPSHOT]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.6.0_24]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) [rt.jar:1.6.0_24]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.6.0_24]
    at java.lang.reflect.Method.invoke(Method.java:616) [rt.jar:1.6.0_24]
    at org.jboss.modules.Module.run(Module.java:262) [jboss-modules.jar:1.2.0.CR1]
    at org.jboss.modules.Main.main(Main.java:329) [jboss-modules.jar:1.2.0.CR1]
Caused by: java.io.IOException: java.net.ConnectException: JBAS012144: Could not connect to remote://127.0.0.1:9999. The connection timed out
    at org.jboss.as.controller.client.impl.AbstractModelControllerClient.executeForResult(AbstractModelControllerClient.java:129) [jboss-as-controller-client-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    at org.jboss.as.controller.client.impl.AbstractModelControllerClient.execute(AbstractModelControllerClient.java:81) [jboss-as-controller-client-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    at org.rhq.common.jbossas.client.controller.JBossASClient.execute(JBossASClient.java:270) [rhq-jboss-as-dmr-client-4.10.0-SNAPSHOT.jar:4.10.0-SNAPSHOT]
    at org.rhq.common.jbossas.client.controller.CoreJBossASClient.getSystemProperties(CoreJBossASClient.java:103) [rhq-jboss-as-dmr-client-4.10.0-SNAPSHOT.jar:4.10.0-SNAPSHOT]
    at org.rhq.enterprise.server.installer.InstallerServiceImpl.testModelControllerClient(InstallerServiceImpl.java:1051) [rhq-installer-util-4.10.0-SNAPSHOT.jar:4.10.0-SNAPSHOT]
    ... 10 more
Caused by: java.net.ConnectException: JBAS012144: Could not connect to remote://127.0.0.1:9999. The connection timed out
    at org.jboss.as.protocol.ProtocolConnectionUtils.connectSync(ProtocolConnectionUtils.java:130) [jboss-as-protocol-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    at org.jboss.as.protocol.ProtocolConnectionManager$EstablishingConnection.connect(ProtocolConnectionManager.java:256) [jboss-as-protocol-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    at org.jboss.as.protocol.ProtocolConnectionManager.connect(ProtocolConnectionManager.java:70) [jboss-as-protocol-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    at org.jboss.as.protocol.mgmt.FutureManagementChannel$Establishing.getChannel(FutureManagementChannel.java:176) [jboss-as-protocol-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    at org.jboss.as.controller.client.impl.RemotingModelControllerClient.getOrCreateChannel(RemotingModelControllerClient.java:144) [jboss-as-controller-client-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    at org.jboss.as.controller.client.impl.RemotingModelControllerClient$1.getChannel(RemotingModelControllerClient.java:65) [jboss-as-controller-client-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    at org.jboss.as.protocol.mgmt.ManagementChannelHandler.executeRequest(ManagementChannelHandler.java:115) [jboss-as-protocol-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    at org.jboss.as.protocol.mgmt.ManagementChannelHandler.executeRequest(ManagementChannelHandler.java:98) [jboss-as-protocol-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    at org.jboss.as.controller.client.impl.AbstractModelControllerClient.executeRequest(AbstractModelControllerClient.java:236) [jboss-as-controller-client-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    at org.jboss.as.controller.client.impl.AbstractModelControllerClient.execute(AbstractModelControllerClient.java:141) [jboss-as-controller-client-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    at org.jboss.as.controller.client.impl.AbstractModelControllerClient.executeForResult(AbstractModelControllerClient.java:127) [jboss-as-controller-client-7.2.0.Alpha1-redhat-4.jar:7.2.0.Alpha1-redhat-4]
    ... 14 more
06:46:31,352 ERROR [org.rhq.server.control.command.Install] An error occurred while starting the RHQ server: Process exited with an error: 2 (Exit value: 2)
RHQ Server (pid 6847 ) is ✘ down

Expected results:
Installation is successful
Update: This issue is related to JDK1.6. Installation works correctly on JDK1.7.

Fails on:
[hudson@last-rhq-server bin]$ java -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.1) (rhel-1.45.1.11.1.el6-x86_64)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

Works on:
[hudson@last-rhq-server bin]$ java -version
java version "1.7.0_03-icedtea"
OpenJDK Runtime Environment (rhel-2.1.el6.7-x86_64)
OpenJDK 64-Bit Server VM (build 22.0-b10, mixed mode)
This is the first build with this issue: http://hudson.qa.jboss.com/hudson/view/RHQ/job/rhq-master-gwt-locales/967/
Can be replicated also on HotSpot JVM (1.6.0_24).
The build above (comment 2) contains 86 changes. Using git bisect, finding the culprit takes in the worst case ceil(log_2(86)) = 7 "build-install-check" cycles. It would be nice to have an established tool based on Foreman/Jenkins to do this automatically. I am investigating the commits, looking for the culprit.
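A minimal sketch of how the bisection estimate and the automation could look. The script name check-install.sh is hypothetical (assumed to build the server, run "rhqctl install" and exit 0 on success, non-zero on failure), and the commit arguments are placeholders; only "git bisect start", "git bisect run" and "git bisect reset" are real git commands.

```shell
# Worst-case number of build-install-check cycles for N candidate commits,
# i.e. ceil(log2(N)), computed with integer arithmetic only.
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Hypothetical automation, defined here but not invoked.
# ./check-install.sh is an assumed script: exit 0 marks the commit as good,
# exit 1-127 (except 125) marks it as bad, exit 125 tells git to skip it.
run_bisect() {
    bad_commit=$1
    good_commit=$2
    git bisect start "$bad_commit" "$good_commit" &&
        git bisect run ./check-install.sh
    git bisect reset
}
```

With the 86 changes mentioned above, "bisect_steps 86" prints 7, matching the estimate.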
The JVM param that causes this issue ("-XX:StringTableSize=1000003") was added in commit d6af564d06ce769d1f. Java 6 apparently has a problem with this parameter. In Java 7 and higher, the pooled strings are stored on the heap, while in Java 6 and lower they are stored in the permgen space.
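One way to confirm this on a given machine is to probe the JVM with the flag before using it. This is only a sketch: the function name is hypothetical, and it assumes the JVM exits with a non-zero status when it is handed an option it does not accept.

```shell
# Hypothetical probe: returns 0 if the given java command starts with the
# given option, non-zero otherwise. Output of "java -version" is discarded.
jvm_accepts_flag() {
    java_cmd=$1
    flag=$2
    "$java_cmd" "$flag" -version >/dev/null 2>&1
}

# Example (not executed here; assumes "java" is on the PATH):
#   if jvm_accepts_flag java -XX:StringTableSize=1000003; then
#       echo "flag accepted"
#   else
#       echo "flag rejected"
#   fi
```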
When running a server with 5 attached agents (4 of them having EAP in full-ha profile and 1 having the RHQ itself), omitting this JVM param and running it on Java 6, the perm gen was enough to handle this situation. However, in bigger environments with multiple EAPs monitored by 1 agent this could be an issue. Luckily, most of the _different_ strings come from the plugin descriptor => multiple resources of the same type do not increase the amount of memory significantly. So the potential OOM could happen when having a large number of different monitored resources (a lot of different plugins, a lot of resource types). For EAPs the agent's permgen was constantly about 37 megs, the server's permgen was about 50 megs.

To make it run on Java 6 (which we currently support), I could either

1) add this workaround to our rhq-server.sh and rhq-agent.sh:

    _JAVA_VERSION=`java -version 2>&1 | grep "java version" | sed -e 's/java version \"1\.\([0-9]\).*/\1/g'`
    if [ "$_JAVA_VERSION" -lt "7" ]; then
        echo "lower than 7"
    else
        echo "equal or higher than 7"
    fi
    # ^ works for OpenJDK, IBM java and HotSpot, all it requires is that
    #   "java -version" returns (among others) a line containing 'java version "1.X'

or

2) remove the JVM param completely (until we abandon Java 6 support), with the risk that the perm gen may not be enough and the user would have to increase it (especially for an agent having multiple plugins that monitor various kinds of resources).
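A sketch of how the version check could be turned into a conditional JVM option. The function names and the EXTRA_JAVA_OPTS variable are hypothetical and not taken from rhq-server.sh or rhq-agent.sh; like the snippet above, it assumes the version output contains a line of the form 'java version "1.X'.

```shell
# Hypothetical helper: extract X from a 'java version "1.X...' line.
# Takes the text of "java -version" as its first argument.
java_minor_version() {
    printf '%s\n' "$1" |
        sed -n 's/.*java version "1\.\([0-9][0-9]*\).*/\1/p' |
        head -n 1
}

# Print the StringTableSize option only for Java 7 and higher;
# print nothing for Java 6 and lower or for unrecognized output.
string_table_opt() {
    _minor=$(java_minor_version "$1")
    if [ -n "$_minor" ] && [ "$_minor" -ge 7 ]; then
        echo "-XX:StringTableSize=1000003"
    fi
}

# Example (not executed here; EXTRA_JAVA_OPTS is an assumed variable name):
#   EXTRA_JAVA_OPTS="$EXTRA_JAVA_OPTS $(string_table_opt "$(java -version 2>&1)")"
```

Taking the version text as an argument keeps the parsing testable without a JVM installed.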
For now, I'll push a commit doing the 2) approach from comment 6, because QE are not able to run their tests. If necessary, I've got the 1) solution/hack prepared as well. This has the benefit that the QE test suite still uses Java 6 and can discover any potential OOM errors (caused by insufficient PermGen) before this goes public.

Just for the record: if the param is not specified on Java 7, it defaults to a lower number (1009), so in the worst-case scenario multiple Strings end up in the same bucket, making the time complexity of String lookup closer to O(n) [instead of O(1)]. So the number is not a hard limit on the number of pooled strings or anything similar.

branch: master
link: http://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?id=8f0ce053c
time: 2014-02-10 16:46:14 +0100
commit: 8f0ce053c09a02886f670eff6d0295c0c62ced19
author: Jirka Kremser - jkremser
message: [BZ 1058267] - RHQ installation on JDK1.6 fails with 'Cannot obtain client connection to the RHQ app server!! - removing the JVM parameter for increasing the size of a hashtable, where Strings are pooled (after calling .intern() on them), because this is not supported on Java 6. This commit can be reverted later on when not supporting Java 6 or when using another solution (check for Java version in the bash scripts)

Heiko, is this ^ "solution" ok with you or would you prefer the 1)?
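To put rough numbers on the bucket argument, here is a small sketch that estimates the average number of strings per bucket. It assumes the strings are spread evenly over the buckets, and the count of 1000000 pooled strings is a made-up figure for illustration, not a measurement from RHQ.

```shell
# Average chain length, rounded up: ceil(strings / buckets).
# Assumes a uniform spread of pooled strings across the buckets.
avg_chain_length() {
    strings=$1
    buckets=$2
    echo $(( (strings + buckets - 1) / buckets ))
}

# Illustrative comparison for a hypothetical 1000000 pooled strings:
#   avg_chain_length 1000000 1009      (the Java 7 default table size)
#   avg_chain_length 1000000 1000003   (the size set by the removed param)
```

Under these assumptions the default table would hold on the order of a thousand strings per bucket, while the larger table would hold about one, which is why the smaller size affects lookup time rather than capacity.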
master 9000242dd
Bulk closing of 4.10 issues. If an issue is not solved for you, please open a new BZ (or clone the existing one) with a version designator of 4.10.
For future RHQ versions we should only support JDK 7+, where this flag is needed. JDK 8 actually has a better default setting and, since a certain version, even automatic string deduplication. I guess the current default is ok for now.