Description of problem:
I'm not sure what details to enter in this defect, nevertheless....

I have a tomcat6 server from EWS 1.0.1 inventoried on my rhq server, the tomcat server is running as root on a rhel 5 box with JAVA_HOME=/usr/lib/jvm/jre-openjdk/

When running the Tomcat Server operations for Shutdown from within rhq, the operation fails. The tomcat6 server never seems to get the message.

The agent.log contains:

2010-06-15 16:50:36,930 WARN [ResourceDiscoveryComponent.invoker.daemon-32] (jboss.on.plugins.tomcat.TomcatDiscoveryComponent)- Failed to determine Tomcat Server Version Given:
VersionInfo:/root/JBossEWS101/jboss-ews-1.0/tomcat6/bin/catalina.sh: line 450: /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin/java: No such file or directory
catalinaHome: /root/JBossEWS101/jboss-ews-1.0/tomcat6
Script:/root/JBossEWS101/jboss-ews-1.0/tomcat6/bin/version.sh
timeout=10000

The rhq operation failure gives the following stack trace:

java.lang.RuntimeException: Server failed to shutdown
	at org.jboss.on.plugins.tomcat.TomcatServerOperationsDelegate.shutdown(TomcatServerOperationsDelegate.java:281)
	at org.jboss.on.plugins.tomcat.TomcatServerOperationsDelegate.shutdown(TomcatServerOperationsDelegate.java:274)
	at org.jboss.on.plugins.tomcat.TomcatServerOperationsDelegate.invoke(TomcatServerOperationsDelegate.java:128)
	at org.jboss.on.plugins.tomcat.TomcatServerComponent.invokeOperation(TomcatServerComponent.java:420)
	at sun.reflect.GeneratedMethodAccessor186.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:616)
	at org.rhq.core.pc.inventory.ResourceContainer$ComponentInvocationThread.call(ResourceContainer.java:525)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:636)

Here are some IRC chat notes with jshaughn:

<jshaughn> i'm wondering if there is missing environment vars or something, when run via the spawned process
<jsefler> my guess is that JAVA_HOME somehow gets confused
<jshaughn> yeah
<jsefler> in the agent log where is says:
          /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin/java: No such file or directory
          if it had an extra jre in the path, then it might work...
          /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java
          but I don't know from where that path gets expanded.
<jshaughn> it's expecting a jdk install and its getting a jre install
           that seems to be the issue
           well, that's the issue for the version thing
           that's a problem in and of itself.
           we're looking for a jdk bin dir and we're getting a jre
           can you please BZ that
           it's the cause of the unknown version
<jshaughn> you know what, hang on a sec
           it;s not us expecting a jdk, perhaps it's tomcat
<jsefler> yeah - I think its TC
<jshaughn> well, I think TC's version script requires a JDK
           but you said it was running for you
<jsefler> it runs for me as root from the command line
<jshaughn> curious
           I wonder if TC5.5 required a JDK and not 6.0
<jshaughn> anyway, this has something to do with things in general
           it may be a combination of JRE and symlink, or maybe just JRE. I'm not sure.
           I bet if JAVA_HOME pointed to a JDK everything would work.
           Can you generate a BZ and assign it to regarding the fact that a JRE will fail for version determination.
           I'll use that to also check on the other stuff.
           I can see in our code that we are expecting a JDK. But if TC does not require a JDK then we are in error
<jshaughn> I want to try executing the scripts in TC5.5 and see if they work with just a JRE. I'm not sure where our expectation came from....
Version-Release number of selected component (if applicable):
JBoss Operations Network version: 2.4.0-SNAPSHOT build number: 10655
RHQ version: 3.0.0-SNAPSHOT build number: cfa1b8d
Thinking about this some more, my guess is that the rhq agent is setting its own JAVA_HOME within the spawned shell that it then uses to launch tomcat's shutdown.sh script. If that JAVA_HOME differs from the JAVA_HOME that tomcat is using, I think it interferes with the environment variables that are set/used in catalina.sh, and the result is that catalina cannot find java. Or something of this sort.

When I manually run the tomcat6/bin scripts (startup.sh, shutdown.sh, version.sh, catalina.sh) from the command line with export JAVA_HOME=/usr/lib/jvm/jre-openjdk/ on RHEL5, all is well. For example:

[root@auto-rhq01 bin]# ./catalina.sh version
Using CATALINA_BASE: /root/JBossEWS100/jboss-ews-1.0/tomcat5
Using CATALINA_HOME: /root/JBossEWS100/jboss-ews-1.0/tomcat5
Using CATALINA_TMPDIR: /root/JBossEWS100/jboss-ews-1.0/tomcat5/temp
Using JRE_HOME: /usr/lib/jvm/jre-openjdk/
Server version: Apache Tomcat/5.5.23
Server built: Mar 25 2009 03:58:01
Server number: 5.5.23.0
OS Name: Linux
OS Version: 2.6.18-194.el5
Architecture: amd64
JVM Version: 1.6.0_0-b16
JVM Vendor: Sun Microsystems Inc.
[root@auto-rhq01 bin]# env | grep JAVA
JAVA_HOME=/usr/lib/jvm/jre-openjdk/

Yet when I run the Shutdown (or Restart) operations from within RHQ, the agent.log reports the following warning and the operation never completes:

2010-06-16 10:19:34,914 WARN [ResourceDiscoveryComponent.invoker.daemon-68] (jboss.on.plugins.tomcat.TomcatDiscoveryComponent)- Failed to determine Tomcat Server Version Given:
VersionInfo:/root/JBossEWS100/jboss-ews-1.0/tomcat6/bin/catalina.sh: line 362: /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin/java: No such file or directory
catalinaHome: /root/JBossEWS100/jboss-ews-1.0/tomcat6
Script:/root/JBossEWS100/jboss-ews-1.0/tomcat6/bin/version.sh
timeout=10000

Notice the path it is using to launch java (/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin/java). This path does not actually exist on the file system; however, if you insert a jre into the path as follows, the file does exist.

[root@auto-rhq01 bin]# ls -l /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java
-rwxr-xr-x 1 root root 42232 Mar 30 17:06 /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java

The failure to run catalina.sh from the rhq-agent affects other areas too:
1. See the attached screenshot.
2. Response time measurement collection also fails, as noted by the following WARNing in the agent.log:

2010-06-16 10:22:00,928 WARN [MeasurementManager.collector-1] (rhq.core.pc.measurement.MeasurementCollectorRunner)- Failure to collect measurement data for Resource[id=10883, type=Tomcat Server, key=/root/JBossEWS100/jboss-ews-1.0/tomcat6, name=Tomcat (8180), parent=auto-rhq01.usersys.redhat.com, version=Unknown Version] - cause: org.rhq.core.pc.inventory.TimeoutException:Call to [org.jboss.on.plugins.tomcat.TomcatServerComponent.getValues()] with args [[org.rhq.core.domain.measurement.MeasurementReport@22bf991d, [ScheduledMeasurementInfo[res=10883, name=Catalina:type=Server:serverInfo, sched=19141]]]] timed out. Invocation thread will be interrupted
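For anyone wanting to repeat the path check above on another box, here is a small diagnostic sketch. It is not part of RHQ or Tomcat; the class name JavaLocator and the method findJava are made up for illustration. It only reports whether a java launcher exists directly under a given home (bin/java) or one level down (jre/bin/java), which is the difference observed above.

```java
import java.io.File;

// Hypothetical diagnostic helper, not RHQ or Tomcat code.
class JavaLocator {

    /**
     * Returns the first existing java launcher under the given home,
     * checking bin/java and then jre/bin/java, or null if neither exists.
     */
    static File findJava(File home) {
        String[] candidates = { "bin/java", "jre/bin/java" };
        for (String candidate : candidates) {
            File launcher = new File(home, candidate);
            if (launcher.isFile()) {
                return launcher;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Use the first argument if given, otherwise fall back to JAVA_HOME.
        String home = args.length > 0 ? args[0] : System.getenv("JAVA_HOME");
        if (home == null) {
            System.out.println("no home given and JAVA_HOME is unset");
            return;
        }
        File launcher = findJava(new File(home));
        System.out.println(launcher == null
                ? "no java launcher under " + home
                : "found " + launcher);
    }
}
```

Running it against both the JAVA_HOME from the shell and the directory named in the agent.log warning should show which of the two layouts each one has.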
Created attachment 424506 [details] Unknown version results when agent fails to run catalina.sh
*** Bug 598610 has been marked as a duplicate of this bug. ***
fix commit: 0225e8ee6395a730295e1af1fc903cd27d09b93e

This problem affected any script execution in the tomcat plugin, most notably server version detection and the start and shutdown operations.

In short, the problem resulted from the fact that the RHQ Agent supports JAVA_HOME being set to a JRE. Tomcat does not, and instead requires JRE_HOME to be set in that situation. Also, the RHQ Agent's environment initializes the process execution env. So the TC plugin had to make sure not to forward the Agent's setting for JAVA_HOME, and needed to set JRE_HOME and JAVA_HOME as TC would expect.

As a note, Tomcat requires only a JRE starting with 5.5, unless TC is being run in debug, in which case a JDK is required.

I think this fix solves some fairly mysterious failures seen in the past. Good find.
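To make the idea concrete, here is a minimal sketch of that kind of environment handling. This is not the code from the fix commit; the class TomcatScriptEnv, its methods, and the "bin/javac present means JDK" heuristic are all assumptions for illustration. It copies the agent's environment, drops the inherited JAVA_HOME, and then sets either JAVA_HOME (when the directory looks like a JDK) or JRE_HOME (when it looks like a JRE).

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual plugin code from the fix commit.
class TomcatScriptEnv {

    /** Assumed heuristic: a JDK home contains bin/javac, a JRE home does not. */
    static boolean looksLikeJdk(File javaHome) {
        return new File(javaHome, "bin/javac").isFile()
                || new File(javaHome, "bin/javac.exe").isFile();
    }

    /**
     * Builds the environment for a Tomcat script from the agent's environment.
     * The agent's JAVA_HOME is never forwarded as-is: it becomes JAVA_HOME only
     * if it looks like a JDK, otherwise it is passed on as JRE_HOME.
     */
    static Map<String, String> buildScriptEnv(Map<String, String> agentEnv) {
        Map<String, String> env = new HashMap<String, String>(agentEnv);
        String javaHome = env.remove("JAVA_HOME");
        if (javaHome == null || javaHome.isEmpty()) {
            return env; // nothing to translate
        }
        if (looksLikeJdk(new File(javaHome))) {
            env.put("JAVA_HOME", javaHome);
        } else {
            env.put("JRE_HOME", javaHome);
        }
        return env;
    }

    public static void main(String[] args) {
        Map<String, String> env = buildScriptEnv(System.getenv());
        System.out.println("JAVA_HOME=" + env.get("JAVA_HOME"));
        System.out.println("JRE_HOME=" + env.get("JRE_HOME"));
    }
}
```

The resulting map could then be applied to whatever launches the script, for example by copying it into ProcessBuilder.environment() before starting version.sh or shutdown.sh.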
I tested this with the jon-server-2.4.0.GA_QA build and was able to shut down apache-tomcat. Marking this bug as verified.
I am moving this back to ON_QA -- this test wasn't as cut-and-dried as it appears. I'm doing so if only to make sure we have the correct, very specific environment in place, so that the proper scenario is covered.
Verified:
RHQ version: 3.0.0-SNAPSHOT build number: b9ca90d

$ git rev-list b9ca90d | grep 0225e8ee6395a730295e1af1fc903cd27d09b93e
0225e8ee6395a730295e1af1fc903cd27d09b93e

The grep match indicates that this RHQ build includes the fix from Jay.

Note: verification was done on the same environment where the original problems were discovered.
* The Start/Stop/Restart operations are successfully completing on inventoried Tomcat5 and Tomcat6 servers.
* The Tomcat5 and Tomcat6 Version is successfully captured by RHQ

Could not yet verify in a JON build. Awaiting a newer build with fix included.
Verified: JBoss Operations Network version: 2.4.0.GA_QA build number: 10745:647a602
Mass-closure of verified bugs against JON.