Description of problem:
I'm not sure what details to enter in this defect, nevertheless....
I have a tomcat6 server from EWS 1.0.1 inventoried on my RHQ server. The Tomcat server runs as root on a RHEL 5 box with JAVA_HOME=/usr/lib/jvm/jre-openjdk/. When I run the Tomcat Server Shutdown operation from within RHQ, the operation fails; the tomcat6 server never seems to get the message. The agent.log contains:
2010-06-15 16:50:36,930 WARN [ResourceDiscoveryComponent.invoker.daemon-32] (jboss.on.plugins.tomcat.TomcatDiscoveryComponent)- Failed to determine Tomcat Server Version Given:
VersionInfo:/root/JBossEWS101/jboss-ews-1.0/tomcat6/bin/catalina.sh: line 450: /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin/java: No such file or directory
The rhq operation failure gives the following stack trace:
java.lang.RuntimeException: Server failed to shutdown
at sun.reflect.GeneratedMethodAccessor186.invoke(Unknown Source)
Here are some IRC chat notes with jshaughn:
<jshaughn> i'm wondering if there is missing environment vars or something, when run via the spawned process
<jsefler> my guess is that JAVA_HOME somehow gets confused
<jsefler> in the agent log where it says:
/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin/java: No such file or directory
if it had an extra jre in the path, then it might work...
but I don't know from where that path gets expanded.
<jshaughn> it's expecting a jdk install and its getting a jre install
that seems to be the issue
well, that's the issue for the version thing
that's a problem in and of itself. we're looking for a jdk bin dir and we're getting a jre
can you please BZ that
it's the cause of the unknown version
<jshaughn> you know what, hang on a sec
it's not us expecting a jdk, perhaps it's tomcat
<jsefler> yeah - I think it's TC
<jshaughn> well, I think TC's version script requires a JDK
but you said it was running for you
<jsefler> it runs for me as root from the command line
I wonder if TC5.5 required a JDK and not 6.0
<jshaughn> anyway, this has something to do with things in general
it may be a combination of JRE and symlink, or maybe just JRE. I'm not sure. I bet if JAVA_HOME pointed to a JDK everything would work.
Can you generate a BZ and assign it to regarding the fact that a JRE will fail for version determination. I'll use that to also check on the other stuff. I can see in our code that we are expecting a JDK. But if TC does not require a JDK then we are in error
<jshaughn> I want to try executing the scripts in TC5.5 and see if they work with just a JRE. I'm not sure where our expectation came from....
Version-Release number of selected component (if applicable):
JBoss Operations Network
build number: 10655
build number: cfa1b8d
Steps to Reproduce:
Thinking about this some more, my guess is that the RHQ agent sets its own JAVA_HOME within the spawned shell that it then uses to launch Tomcat's shutdown.sh script. If that JAVA_HOME differs from the JAVA_HOME that Tomcat is using, it would interfere with the environment variables that catalina.sh sets and uses, with the result that catalina.sh cannot find java. Or something of this sort. When I manually run the tomcat6/bin scripts (startup.sh, shutdown.sh, version.sh, catalina.sh) from the command line with export JAVA_HOME=/usr/lib/jvm/jre-openjdk/ on RHEL5, all is well. For example:
[root@auto-rhq01 bin]# ./catalina.sh version
Using CATALINA_BASE: /root/JBossEWS100/jboss-ews-1.0/tomcat5
Using CATALINA_HOME: /root/JBossEWS100/jboss-ews-1.0/tomcat5
Using CATALINA_TMPDIR: /root/JBossEWS100/jboss-ews-1.0/tomcat5/temp
Using JRE_HOME: /usr/lib/jvm/jre-openjdk/
Server version: Apache Tomcat/5.5.23
Server built: Mar 25 2009 03:58:01
Server number: 5.5.23.0
OS Name: Linux
OS Version: 2.6.18-194.el5
JVM Version: 1.6.0_0-b16
JVM Vendor: Sun Microsystems Inc.
[root@auto-rhq01 bin]# env | grep JAVA
Yet when I run the Shutdown (or Restart) operations from within RHQ, the agent.log reports the following warning and never completes the operation:
2010-06-16 10:19:34,914 WARN [ResourceDiscoveryComponent.invoker.daemon-68] (jboss.on.plugins.tomcat.TomcatDiscoveryComponent)- Failed to determine Tomcat Server Version Given:
VersionInfo:/root/JBossEWS100/jboss-ews-1.0/tomcat6/bin/catalina.sh: line 362: /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin/java: No such file or directory
Notice the path it uses to launch java (/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin/java). This path does not actually exist on the file system; however, if you insert a jre directory into the path as follows, the file does exist:
[root@auto-rhq01 bin]# ls -l /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java
-rwxr-xr-x 1 root root 42232 Mar 30 17:06 /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java
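The check above can be scripted when comparing several candidate directories. The following is a minimal sketch only; locate_java is a hypothetical helper name, not something shipped with RHQ or Tomcat. It reports whether the java launcher sits directly under bin/ or only under a nested jre/bin/.

```shell
#!/bin/bash
# Hypothetical diagnostic helper (not part of RHQ or Tomcat): print the java
# launcher found under a candidate home directory, or fail if there is none.
locate_java() {
    local home="${1%/}"                 # tolerate a trailing slash
    if [ -x "$home/bin/java" ]; then
        echo "$home/bin/java"           # launcher directly under bin/
    elif [ -x "$home/jre/bin/java" ]; then
        echo "$home/jre/bin/java"       # launcher only under the nested jre/
    else
        echo "no java launcher under $home" >&2
        return 1
    fi
}
```

Calling it with the directory from the warning and with the directory JAVA_HOME points to shows which of the two layouts each one has.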
The failure to run catalina.sh from the rhq-agent affects other areas too:
1. See the attached screenshot.
2. Response time measurement collection also fails, as noted by the following warning in the agent.log:
// 2010-06-16 10:22:00,928 WARN [MeasurementManager.collector-1] (rhq.core.pc.measurement.MeasurementCollectorRunner)- Failure to collect measurement data for Resource[id=10883, type=Tomcat Server, key=/root/JBossEWS100/jboss-ews-1.0/tomcat6, name=Tomcat (8180), parent=auto-rhq01.usersys.redhat.com, version=Unknown Version] - cause: org.rhq.core.pc.inventory.TimeoutException:Call to [org.jboss.on.plugins.tomcat.TomcatServerComponent.getValues()] with args [[org.rhq.core.domain.measurement.MeasurementReport@22bf991d, [ScheduledMeasurementInfo[res=10883, name=Catalina:type=Server:serverInfo, sched=19141]]]] timed out. Invocation thread will be interrupted
Created attachment 424506
Unknown version results when agent fails to run catalina.sh
*** Bug 598610 has been marked as a duplicate of this bug. ***
fix commit: 0225e8ee6395a730295e1af1fc903cd27d09b93e
This problem affected any script execution in the tomcat plugin, most notably
server version detection and the start and shutdown operations.
In short, the problem resulted from the fact that the RHQ Agent supports
JAVA_HOME being set to a JRE. Tomcat does not, and instead requires JRE_HOME to
be set in that situation. Also, the RHQ Agent's environment initializes the
process execution environment. So the TC plugin had to ensure that it does not
forward the Agent's setting for JAVA_HOME, and it needed to set JRE_HOME and
JAVA_HOME as TC expects.
As a note, Tomcat requires only a JRE starting with 5.5, unless TC is being
run in debug, in which case a JDK is required.
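For anyone reproducing this by hand, the idea behind the fix can be approximated from a shell. This is a sketch under stated assumptions, not the code from the fix commit: tomcat_java_var and run_for_tomcat are hypothetical names, and "has bin/javac" is used here as an assumed heuristic for telling a JDK from a JRE.

```shell
#!/bin/bash
# Hypothetical helper, not the plugin's actual code: decide which variable
# Tomcat's scripts should receive for a given Java installation directory.
tomcat_java_var() {
    local home="${1%/}"            # drop a trailing slash, if any
    if [ -x "$home/bin/javac" ]; then
        echo "JAVA_HOME=$home"     # compiler present: assume a JDK
    else
        echo "JRE_HOME=$home"      # no compiler: assume a JRE
    fi
}

# Run a command with any inherited JAVA_HOME/JRE_HOME removed and only the
# variable chosen above set, so the caller's own setting is not forwarded.
run_for_tomcat() {
    local java_dir="$1"; shift
    env -u JAVA_HOME -u JRE_HOME "$(tomcat_java_var "$java_dir")" "$@"
}
```

For example, run_for_tomcat /usr/lib/jvm/jre-openjdk/ ./catalina.sh version would invoke the script with JRE_HOME set and without the caller's JAVA_HOME.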
I think this fix solves some fairly mysterious failures seen in the past. Good
I tested this with the jon-server-2.4.0.GA_QA build and was able to shut down apache-tomcat.
Marking this bug as verified.
I am moving this back to ON_QA; this test wasn't as cut-and-dried as it appears. I want to ensure we have the correct, very specific environment in place so that the proper scenario is covered.
build number: b9ca90d
$ git rev-list b9ca90d | grep 0225e8ee6395a730295e1af1fc903cd27d09b93e
The grep match indicates that this RHQ build includes the fix from Jay.
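An alternative to grepping the rev-list output is git's own ancestry test, which reports the result through its exit status. A minimal sketch follows; build_contains_fix is a hypothetical wrapper name, and it assumes it is run inside a checkout of the repository in question.

```shell
#!/bin/bash
# Hypothetical helper: succeed if commit $1 is an ancestor of (or the same as)
# commit $2 in the repository of the current directory.
build_contains_fix() {
    local fix="$1" build="$2"
    git merge-base --is-ancestor "$fix" "$build"
}

# Example call with the hashes quoted in this report:
#   build_contains_fix 0225e8ee6395a730295e1af1fc903cd27d09b93e b9ca90d \
#       && echo "fix included"
```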
Note: verification was done on the same environment where the original problems were discovered.
* The Start/Stop/Restart operations complete successfully on inventoried Tomcat5 and Tomcat6 servers.
* The Tomcat5 and Tomcat6 versions are successfully captured by RHQ.
Could not yet verify in a JON build. Awaiting a newer build with fix included.
JBoss Operations Network
build number: 10745:647a602
Mass-closure of verified bugs against JON.