Bug 604444 - Tomcat Operation for Shutdown is failing. Could have to do with JRE/JDK assumptions.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RHQ Project
Classification: Other
Component: Operations
Version: 3.0.0
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Jay Shaughnessy
QA Contact: Corey Welton
URL:
Whiteboard:
Duplicates: 598610
Depends On:
Blocks: rhq_auto_blocker jon24-ews
 
Reported: 2010-06-15 21:59 UTC by John Sefler
Modified: 2010-08-12 16:45 UTC (History)

Fixed In Version: 2.4
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-08-12 16:45:09 UTC
Embargoed:


Attachments
Unknown version results when agent fails to run catalina.sh (178.07 KB, image/png)
2010-06-16 15:34 UTC, John Sefler

Description John Sefler 2010-06-15 21:59:24 UTC
Description of problem:
I'm not sure what details to enter in this defect, nevertheless....

I have a tomcat6 server from EWS 1.0.1 inventoried on my rhq server.  The tomcat server is running as root on a RHEL 5 box with JAVA_HOME=/usr/lib/jvm/jre-openjdk/.  When I run the Tomcat Server Shutdown operation from within rhq, the operation fails, and the tomcat6 server never seems to get the message.  The agent.log contains:

 2010-06-15 16:50:36,930 WARN  [ResourceDiscoveryComponent.invoker.daemon-32] (jboss.on.plugins.tomcat.TomcatDiscoveryComponent)- Failed to determine Tomcat Server Version Given:
 VersionInfo:/root/JBossEWS101/jboss-ews-1.0/tomcat6/bin/catalina.sh: line 450: /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin/java: No such file or directory
 catalinaHome: /root/JBossEWS101/jboss-ews-1.0/tomcat6
 Script:/root/JBossEWS101/jboss-ews-1.0/tomcat6/bin/version.sh
 timeout=10000

The rhq operation failure gives the following stack trace:

java.lang.RuntimeException: Server failed to shutdown
	at org.jboss.on.plugins.tomcat.TomcatServerOperationsDelegate.shutdown(TomcatServerOperationsDelegate.java:281)
	at org.jboss.on.plugins.tomcat.TomcatServerOperationsDelegate.shutdown(TomcatServerOperationsDelegate.java:274)
	at org.jboss.on.plugins.tomcat.TomcatServerOperationsDelegate.invoke(TomcatServerOperationsDelegate.java:128)
	at org.jboss.on.plugins.tomcat.TomcatServerComponent.invokeOperation(TomcatServerComponent.java:420)
	at sun.reflect.GeneratedMethodAccessor186.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:616)
	at org.rhq.core.pc.inventory.ResourceContainer$ComponentInvocationThread.call(ResourceContainer.java:525)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:636)


Here are some IRC chat notes with jshaughn:

<jshaughn> i'm wondering if there is missing environment vars or something, when run via the spawned process
<jsefler> my guess is that JAVA_HOME somehow gets confused
<jshaughn> yeah
<jsefler> in the agent log where it says:
 /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin/java: No such file or directory
 if it had an extra jre in the path, then it might work...
 /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java
 but I don't know from where that path gets expanded.
<jshaughn> it's expecting a jdk install and its getting a jre install
 that seems to be the issue
 well, that's the issue for the version thing
 that's a problem in and of itself. we're looking for a jdk bin dir and we're getting a jre
 can you please BZ that
 it's the cause of the unknown version
<jshaughn> you know what, hang on a sec
 it's not us expecting a jdk, perhaps it's tomcat
<jsefler> yeah - I think it's TC
<jshaughn> well, I think TC's version script requires a JDK
 but you said it was running for you 
<jsefler> it runs for me as root from the command line 
<jshaughn> curious
 I wonder if TC5.5 required a JDK and not 6.0
<jshaughn> anyway, this has something to do with things in general
 it may be a combination of JRE and symlink, or maybe just JRE. I'm not sure.  I bet if JAVA_HOME pointed to a JDK everything would work.
 Can you generate a BZ and assign it to me regarding the fact that a JRE will fail for version determination.  I'll use that to also check on the other stuff.  I can see in our code that we are expecting a JDK.  But if TC does not require a JDK then we are in error.
<jshaughn> I want to try executing the scripts in TC5.5 and see if they work with just a JRE. I'm not sure where our expectation came from....


Version-Release number of selected component (if applicable):
 JBoss Operations Network
version: 2.4.0-SNAPSHOT
build number: 10655 
 RHQ
version: 3.0.0-SNAPSHOT
build number: cfa1b8d 


Comment 1 John Sefler 2010-06-16 15:33:17 UTC
Thinking about this some more, my guess is that the rhq agent sets its own JAVA_HOME within the spawned shell that it uses to launch tomcat's shutdown.sh script.  If that JAVA_HOME differs from the one tomcat is using, it could interfere with the environment variables that catalina.sh sets and uses, so that catalina.sh cannot find java.  Or something of this sort.  When I manually run the tomcat6/bin scripts (startup.sh, shutdown.sh, version.sh, catalina.sh) from the command line with export JAVA_HOME=/usr/lib/jvm/jre-openjdk/ on RHEL5, all is well.  For example:

[root@auto-rhq01 bin]# ./catalina.sh version
Using CATALINA_BASE:   /root/JBossEWS100/jboss-ews-1.0/tomcat5
Using CATALINA_HOME:   /root/JBossEWS100/jboss-ews-1.0/tomcat5
Using CATALINA_TMPDIR: /root/JBossEWS100/jboss-ews-1.0/tomcat5/temp
Using JRE_HOME:       /usr/lib/jvm/jre-openjdk/
Server version: Apache Tomcat/5.5.23
Server built:   Mar 25 2009 03:58:01
Server number:  5.5.23.0
OS Name:        Linux
OS Version:     2.6.18-194.el5
Architecture:   amd64
JVM Version:    1.6.0_0-b16
JVM Vendor:     Sun Microsystems Inc.
[root@auto-rhq01 bin]# env | grep JAVA
JAVA_HOME=/usr/lib/jvm/jre-openjdk/

Yet when I run the Shutdown (or Restart) operations from within RHQ, the agent.log reports the following warning and never completes the operation:

2010-06-16 10:19:34,914 WARN  [ResourceDiscoveryComponent.invoker.daemon-68] (jboss.on.plugins.tomcat.TomcatDiscoveryComponent)- Failed to determine Tomcat Server Version Given:
VersionInfo:/root/JBossEWS100/jboss-ews-1.0/tomcat6/bin/catalina.sh: line 362: /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin/java: No such file or directory

catalinaHome: /root/JBossEWS100/jboss-ews-1.0/tomcat6
Script:/root/JBossEWS100/jboss-ews-1.0/tomcat6/bin/version.sh
timeout=10000


Notice the path it is using to launch java (/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/bin/java).  This path does not actually exist on the file system; however, if you insert jre into the path as follows, the file does exist:
[root@auto-rhq01 bin]# ls -l /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java
-rwxr-xr-x 1 root root 42232 Mar 30 17:06 /usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/bin/java


The failure to run catalina.sh from the rhq-agent affects other areas too:
1. See the attached screenshot.
2. Response time measurement collection also fails as noted by the following WARNing in the agent.log

2010-06-16 10:22:00,928 WARN  [MeasurementManager.collector-1] (rhq.core.pc.measurement.MeasurementCollectorRunner)- Failure to collect measurement data for Resource[id=10883, type=Tomcat Server, key=/root/JBossEWS100/jboss-ews-1.0/tomcat6, name=Tomcat (8180), parent=auto-rhq01.usersys.redhat.com, version=Unknown Version] - cause: org.rhq.core.pc.inventory.TimeoutException:Call to [org.jboss.on.plugins.tomcat.TomcatServerComponent.getValues()] with args [[org.rhq.core.domain.measurement.MeasurementReport@22bf991d, [ScheduledMeasurementInfo[res=10883, name=Catalina:type=Server:serverInfo, sched=19141]]]] timed out. Invocation thread will be interrupted

Comment 2 John Sefler 2010-06-16 15:34:46 UTC
Created attachment 424506
Unknown version results when agent fails to run catalina.sh

Comment 3 John Sefler 2010-06-22 15:17:05 UTC
*** Bug 598610 has been marked as a duplicate of this bug. ***

Comment 4 Jay Shaughnessy 2010-06-22 18:21:27 UTC
fix commit: 0225e8ee6395a730295e1af1fc903cd27d09b93e

This problem affected any script execution in the tomcat plugin, most notably
server version detection and the start and shutdown operations.

In short, the problem resulted from the fact that the RHQ Agent supports
JAVA_HOME being set to a JRE.  Tomcat does not, and instead requires JRE_HOME to
be set in that situation.  Also, the RHQ Agent's environment initializes the
process execution env.  So the TC plugin had to avoid forwarding the Agent's
setting for JAVA_HOME, and needed to set JRE_HOME and JAVA_HOME as TC
would expect.
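
The environment handling described above can be sketched in shell. This is an illustration under stated assumptions, not the code from the fix commit (the plugin itself is Java): the hypothetical function run_tomcat_script clears any inherited JAVA_HOME and JRE_HOME, then sets JAVA_HOME when the given directory looks like a JDK (it contains bin/javac) and JRE_HOME otherwise. The bin/javac test is an assumed heuristic.

```shell
#!/bin/sh
# Hypothetical wrapper, not the actual plugin code: run a Tomcat script
# without inheriting the caller's JAVA_HOME or JRE_HOME.
# Usage: run_tomcat_script <java-dir> <script> [args...]
run_tomcat_script() {
    java_dir=${1%/}                  # drop one trailing slash, if any
    shift
    (
        # The subshell keeps these changes away from the caller's environment.
        unset JAVA_HOME JRE_HOME
        if [ -x "$java_dir/bin/javac" ]; then
            # Assumed heuristic: a compiler is present, so treat it as a JDK.
            JAVA_HOME=$java_dir; export JAVA_HOME
        else
            # No compiler: treat it as a JRE and tell Tomcat via JRE_HOME.
            JRE_HOME=$java_dir; export JRE_HOME
        fi
        exec "$@"                    # must be an external command
    )
}
```

For example, run_tomcat_script /usr/lib/jvm/jre-openjdk ./version.sh would run version.sh with only JRE_HOME set, provided that directory has no bin/javac.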

As a note, Tomcat requires only a JRE starting with 5.5, unless TC is being
run in debug, in which case a JDK is required.

I think this fix solves some fairly mysterious failures seen in the past. Good
find.

Comment 5 Sudhir D 2010-06-23 13:26:58 UTC
I tested this with the jon-server-2.4.0.GA_QA build and was able to shut down apache-tomcat.

Marking this bug as verified.

Comment 6 Corey Welton 2010-06-23 15:36:39 UTC
I am moving this back to ON_QA -- this test isn't as cut-and-dried as it appears.  I'm doing so if only to ensure we have the correct, very specific environment in place and the proper scenario covered.

Comment 7 John Sefler 2010-06-23 17:37:44 UTC
Verified:
 RHQ
version: 3.0.0-SNAPSHOT
build number: b9ca90d

$ git rev-list b9ca90d  | grep 0225e8ee6395a730295e1af1fc903cd27d09b93e
0225e8ee6395a730295e1af1fc903cd27d09b93e
The grep match indicates that this RHQ build includes the fix from Jay.

Note: verification was done on the same environment where the original problems were discovered.

* The Start/Stop/Restart operations are successfully completing on inventoried Tomcat5 and Tomcat6 servers.
* The Tomcat5 and Tomcat6 versions are successfully captured by RHQ.

Could not yet verify in a JON build.  Awaiting a newer build with fix included.

Comment 8 John Sefler 2010-06-25 15:18:05 UTC
Verified:
 JBoss Operations Network
version: 2.4.0.GA_QA
build number: 10745:647a602

Comment 9 Corey Welton 2010-08-12 16:45:09 UTC
Mass-closure of verified bugs against JON.

