Bug 1058267
Summary: | RHQ installation on JDK1.6 fails with 'Cannot obtain client connection to the RHQ app server!!' | | |
---|---|---|---|
Product: | [Other] RHQ Project | Reporter: | Filip Brychta <fbrychta> |
Component: | Installer | Assignee: | Jirka Kremser <jkremser> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Mike Foley <mfoley> |
Severity: | urgent | Docs Contact: | |
Priority: | high | | |
Version: | 4.10 | CC: | hrupp, jkremser |
Target Milestone: | --- | | |
Target Release: | RHQ 4.10 | | |
Hardware: | All | | |
OS: | All | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 1063364 | Environment: | |
Last Closed: | 2014-04-23 12:30:01 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1063364 | | |
Description
Filip Brychta
2014-01-27 11:53:18 UTC
Update: This issue is specific to JDK 1.6. Installation works correctly on JDK 1.7.

Fails on:

    [hudson@last-rhq-server bin]$ java -version
    java version "1.6.0_24"
    OpenJDK Runtime Environment (IcedTea6 1.11.1) (rhel-1.45.1.11.1.el6-x86_64)
    OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

Works on:

    [hudson@last-rhq-server bin]$ java -version
    java version "1.7.0_03-icedtea"
    OpenJDK Runtime Environment (rhel-2.1.el6.7-x86_64)
    OpenJDK 64-Bit Server VM (build 22.0-b10, mixed mode)

This is the first build with this issue:
http://hudson.qa.jboss.com/hudson/view/RHQ/job/rhq-master-gwt-locales/967/

It can also be replicated on the HotSpot JVM (1.6.0_24).

The build above (comment 2) contains 86 changes. Using git bisect, that is in the worst case ceil(log_2(86)) = 7 "build-install-check" cycles. It would be nice to have some established tool based on foreman/jenkins to do this automatically. I am investigating the commits, looking for the culprit.

The JVM param "-XX:StringTableSize=1000003", added in commit d6af564d06ce769d1f, causes this issue. Java 6 apparently has a problem with this parameter. In Java 7 and higher, the pooled strings are stored on the heap, while in Java 6 and lower they are stored in the PermGen space.

When running a server with 5 attached agents (4 of them having EAP in full-ha profile and 1 having the RHQ itself), omitting this JVM param and running on Java 6, the PermGen was large enough to handle the load. However, in bigger environments with multiple EAPs monitored by one agent this could be an issue. Luckily, most of the _different_ strings come from the plugin descriptors, so multiple resources of the same type do not increase memory use significantly. A potential OOM could therefore happen with a large number of different monitored resources (a lot of different plugins, a lot of resource types). For EAPs the agent's PermGen was constantly about 37 MB and the server's PermGen was about 50 MB.
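The bisect estimate mentioned above (86 changes, at most 7 cycles) can be checked with a few lines of shell. This is only a sketch: `bisect_steps` is a hypothetical helper name, and the `build-install-check.sh` script in the commented usage is an assumed wrapper around the build, install and connection check, not something that exists in the RHQ repository.

```shell
# bisect_steps N: prints ceil(log2(N)), the worst-case number of
# build-install-check cycles needed to bisect N candidate changes.
bisect_steps() {
    _bs_n=$1
    _bs_steps=0
    _bs_span=1
    # Each cycle halves the remaining range, so count the doublings
    # needed to cover all N changes.
    while [ "$_bs_span" -lt "$_bs_n" ]; do
        _bs_span=$((_bs_span * 2))
        _bs_steps=$((_bs_steps + 1))
    done
    echo "$_bs_steps"
}

# Usage:
#   bisect_steps 86
#
# Automating the cycles with git itself (script name is hypothetical;
# it must exit 0 for a good build, 1 for a bad one, 125 to skip):
#   git bisect start <bad-commit> <good-commit>
#   git bisect run ./build-install-check.sh
```

`git bisect run` covers the "do this automatically" part on a single machine; wiring it into foreman/jenkins would still need a job that provides the build environment.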
To make it run on Java 6 (which we currently support), I could either:

1) add this workaround to our rhq-server.sh and rhq-agent.sh:

    _JAVA_VERSION=$(java -version 2>&1 | grep "java version" | sed -e 's/java version "1\.\([0-9]\).*/\1/')
    if [ "$_JAVA_VERSION" -lt "7" ]; then
        echo "lower than 7"
    else
        echo "equal or higher than 7"
    fi
    # ^ works for OpenJDK, IBM java and HotSpot, all it requires is that "java -version" returns (among other) a line containing 'java version "1.X'

or

2) remove the JVM param completely (until we abandon Java 6 support), with the risk that the PermGen might not be large enough and the user would have to increase it (especially for an agent with multiple plugins that monitor various kinds of resources).

For now, I'll push a commit doing approach 2) from comment 6, because QE are not able to run their tests. If necessary, I've got the 1) solution/hack prepared as well. This has the benefit that the QE test suite still uses Java 6 and can discover any potential OOM errors (caused by insufficient PermGen) before this goes public.

Just for the record: if the param is not specified on Java 7, it defaults to a lower number (1009), so in the worst-case scenario multiple Strings end up in the same bucket, making the time complexity of a String lookup closer to O(n) [instead of O(1)]. So the number is not a hard limit on the number of pooled strings or anything similar.

branch: master
link: http://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?id=8f0ce053c
time: 2014-02-10 16:46:14 +0100
commit: 8f0ce053c09a02886f670eff6d0295c0c62ced19
author: Jirka Kremser - jkremser
message: [BZ 1058267] - RHQ installation on JDK1.6 fails with 'Cannot obtain client connection to the RHQ app server!!' - removing the JVM parameter for increasing the size of a hashtable, where Strings are pooled (after calling .intern() on them), because this is not supported on Java 6.
This commit can be reverted later on, once Java 6 is no longer supported or once another solution is used (a check for the Java version in the bash scripts).

Heiko, is this ^ "solution" ok with you or would you prefer the 2)?

master 9000242dd

Bulk closing of 4.10 issues. If an issue is not solved for you, please open a new BZ (or clone the existing one) with a version designator of 4.10.

For future RHQ versions we should only support jdk7+ where this flag is needed. Jdk8 actually has a better default setting and, since a certain version, even automatic string deduplication. I guess the current default is ok for now.
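For reference, the alternative mentioned above (checking the Java version in the bash scripts instead of dropping the parameter) could look roughly like the sketch below. It is not the change that was committed. The function names and the `EXTRA_JAVA_OPTS` variable are hypothetical, and the parsing assumes, like the earlier workaround, that `java -version` prints a line containing `java version "1.X`.

```shell
# java_minor_version TEXT: prints X from a line such as
#   java version "1.X.0_24"
# found in TEXT (the captured output of `java -version 2>&1`).
# Prints nothing if no such line exists.
java_minor_version() {
    printf '%s\n' "$1" \
        | sed -n 's/.*java version "1\.\([0-9][0-9]*\).*/\1/p' \
        | head -n 1
}

# string_table_opt TEXT: prints the StringTableSize flag only when the
# detected version is 1.7 or higher. On Java 6, or when the version
# cannot be parsed, it prints nothing so the JVM still starts.
string_table_opt() {
    _sto_minor=$(java_minor_version "$1")
    if [ -n "$_sto_minor" ] && [ "$_sto_minor" -ge 7 ]; then
        echo "-XX:StringTableSize=1000003"
    fi
}

# Usage in a launcher script (variable name is an assumption):
#   EXTRA_JAVA_OPTS="$EXTRA_JAVA_OPTS $(string_table_opt "$(java -version 2>&1)")"
```

Failing closed (no flag when the version is unknown) is a deliberate choice here: a missing flag only costs lookup speed, while an unsupported flag prevents the server from starting.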