Bug 535361 (RHQ-2065) - Server is in really bad shape if started with jre-1.6.0-ibm
Summary: Server is in really bad shape if started with jre-1.6.0-ibm
Keywords:
Status: CLOSED NEXTRELEASE
Alias: RHQ-2065
Product: RHQ Project
Classification: Other
Component: Core Server
Version: 1.2
Hardware: All
OS: All
Priority: high
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Ian Springer
QA Contact: Corey Welton
URL: http://jira.rhq-project.org/browse/RH...
Whiteboard:
Depends On:
Blocks:
Reported: 2009-05-06 20:39 UTC by Corey Welton
Modified: 2013-08-06 00:33 UTC (History)

Fixed In Version: 1.3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Red Hat Enterprise Linux Server release 5.3 (Tikanga)
Kernel \r on an \m
Linux rlx-0-12.rhndev.redhat.com 2.6.18-128.1.1.el5 #1 SMP Mon Jan 26 13:59:00 EST 2009 i686 i686 i386 GNU/Linux
java version "1.6.0"
Java(TM) SE Runtime Environment (build pxi3260sr
Last Closed:
Embargoed:


Attachments
RHQ-2065_log.txt (63.42 KB, text/plain)
2009-05-06 20:42 UTC, Corey Welton

Description Corey Welton 2009-05-06 20:39:00 UTC
I have a test box whose default java is 'jre-1.6.0-ibm'.  On occasion I end up starting my server without changing RHQ_SERVER_JAVA_EXE_FILE_PATH, etc., beforehand.  When this occurs, Very Bad Things result.

Steps to repro:
1. Make sure that `which java` points to the java binary as distributed with jre-1.6.0-ibm
2. rhq-server.sh start
3. Wait some time, then try to open the server URL.  Note that no page loads.
4. `ps|grep rhq`
5. view your log files.

Current results:
Trying to start the RHQ Server...                                                                           
RHQ Server (pid 24556) is starting  

...but obviously it doesn't work

Expected results:
* Perhaps rhq should "work", using ibm java, even if we don't support it
and/or
* Should we be able to catch such a critical exception and completely bail if we hit an error such as this, and thus return a message to the user, "something Very Bad has taken place upon boot"?


Workaround:
Make sure you have a supported JRE as the default and/or make changes in rhq-server.sh appropriately
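
A minimal sketch of the first half of this workaround, pointing the server at a specific JRE before starting it. The JRE path and the pick_java helper are hypothetical and only for illustration; RHQ_SERVER_JAVA_EXE_FILE_PATH is the variable mentioned in the description.

```shell
# Hypothetical JRE location; substitute the path of a supported JRE on your box.
SUPPORTED_JAVA="${SUPPORTED_JAVA:-/usr/lib/jvm/jre-1.6.0-sun/bin/java}"

# Hypothetical helper: echo the path if it is an executable file, else fail.
pick_java() {
    if [ -x "$1" ]; then
        echo "$1"
        return 0
    fi
    echo "no executable java at $1" >&2
    return 1
}

# Only export the override when the chosen binary really exists.
if java_exe=$(pick_java "$SUPPORTED_JAVA"); then
    export RHQ_SERVER_JAVA_EXE_FILE_PATH="$java_exe"
fi

# Then start the server as usual:
#   ./rhq-server.sh start
```

Checking the path first avoids swapping one broken startup for another when the supported JRE is not installed where expected.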

Other notes:
* This may or may not be two separate bugs.  You all can decide.
[16:28] <mazz> yeah, the server DID start, it's just in a very bad state - same thing would happen if, for example, you start but your database is down

Comment 1 Corey Welton 2009-05-06 20:47:56 UTC
[16:27] <mazz> if you can, capture the output when you set RHQ_SERVER_DEBUG=true and put that in the JIRA too
[16:27] <mazz> need to see the cmd line opts we use to start the VM    




RHQ_SERVER_HOME: /root/220/GA/jon-server-2.2.0.GA
RHQ_SERVER_JAVA_EXE_FILE_PATH: /usr/bin/java
RHQ_SERVER_JAVA_OPTS: -Dapp.name=rhq-server -Xms256M -Xmx1024M -XX:PermSize=128M -XX:MaxPermSize=256M -Djava.net.preferIPv4Stack=true -Djboss.server.log.dir=/root/220/GA/jon-server-2.2.0.GA/logs -Djava.awt.headless=true -Djboss.platform.mbeanserver -Dsun.lang.ClassLoader.allowArraySyntax=true
RHQ_SERVER_ADDITIONAL_JAVA_OPTS:
RHQ_SERVER_CMDLINE_OPTS: -P /root/220/GA/jon-server-2.2.0.GA/bin/rhq-server.properties
_JBOSS_RUN_SCRIPT: /root/220/GA/jon-server-2.2.0.GA/jbossas/bin/run.sh
Trying to start the RHQ Server...
=========================================================================

  JBoss Bootstrap Environment

  JBOSS_HOME: /root/220/GA/jon-server-2.2.0.GA/jbossas

  JAVA: /usr/bin/java

  JAVA_OPTS: -Dprogram.name=run.sh -Dapp.name=rhq-server -Xms256M -Xmx1024M -XX:PermSize=128M -XX:MaxPermSize=256M -Djava.net.preferIPv4Stack=true -Djboss.server.log.dir=/root/220/GA/jon-server-2.2.0.GA/logs -Djava.awt.headless=true -Djboss.platform.mbeanserver -Dsun.lang.ClassLoader.allowArraySyntax=true  -Djava.util.logging.config.file=/server/default/conf/logging.properties -Djava.net.preferIPv4Stack=true

  CLASSPATH: /root/220/GA/jon-server-2.2.0.GA/jbossas/bin/run.jar

=========================================================================


Comment 2 Charles Crouch 2009-07-28 23:52:59 UTC
If this is a problem on AIX then let's investigate. Over to Corey to check that.

Comment 3 Corey Welton 2009-08-06 02:19:23 UTC
After much head scratching and trying to figure out what was going on... I /do/ get this same error on AIX with Java6.

As it turns out, I get a completely different error with Java5 on AIX: the server starts up and can apparently even be installed, but throws tracebacks ("SunX509 KeyManagerFactory not available") all over the place while starting.

Comment 4 Charles Crouch 2009-08-28 18:22:45 UTC
Assigning to Ian to investigate

Comment 5 Ian Springer 2009-08-30 03:34:03 UTC
Problem #1: lots of XSL/Xalan class not found errors
Fix: xalan.jar needs to be added to jon-server/jbossas/lib/endorsed/ (as it is in a default AS 4.2.3 install); not sure if this will cause issues with Sun JVMs - need to talk to Jay and/or Greg

Problem #2: lots of "SunX509 KeyManagerFactory not available" from Tomcat/JBossWeb and JBossWS
Fix: for the Tomcat ones, see https://issues.apache.org/bugzilla/show_bug.cgi?id=45500; for the JBossWS ones, see https://jira.jboss.org/jira/browse/JBWS-1820


Comment 6 Ian Springer 2009-08-31 21:59:12 UTC
r5041 fixes problem #1.

r5042 fixes problem #2 by adding -Drhq.server.tomcat.security.algorithm=IbmX509 to the run.sh command line args if the OS is AIX. A nicer way to do it may have been to 1) set the default value for rhq.server.tomcat.security.algorithm programmatically in the installer, 2) remove it from the default rhq-server.properties file, and 3) change algorithm="${rhq.server.tomcat.security.algorithm}" to algorithm="${rhq.server.tomcat.security.algorithm:SunX509}" in server.xml. I went with the rhq-server.sh fix instead since it was a bit less involved. We can always revisit this later if Mazz likes the programmatic alternative better.
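
For illustration, the kind of check r5042 describes might look roughly like the sketch below. This is not the actual rhq-server.sh code: the add_aix_ssl_opts function name is hypothetical, and taking the OS name from `uname -s` is an assumption. Only the -D option itself comes from this comment.

```shell
# Hypothetical helper: append the IBM algorithm override when the OS is AIX.
# $1: OS name, normally the output of `uname -s`
# $2: the Java options collected so far
add_aix_ssl_opts() {
    os_name="$1"
    java_opts="$2"
    if [ "$os_name" = "AIX" ]; then
        java_opts="$java_opts -Drhq.server.tomcat.security.algorithm=IbmX509"
    fi
    echo "$java_opts"
}

# Typical call inside a start script (assumed usage):
#   RHQ_SERVER_JAVA_OPTS=$(add_aix_ssl_opts "$(uname -s)" "$RHQ_SERVER_JAVA_OPTS")
```

Keying on the OS rather than the JVM vendor is the weakness raised in comment 9, and the check was later removed from rhq-server.sh (comment 11).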

The Server now appears to be running on AIX with both IBM JDK6 and IBM JDK5 without any issues. I started it up, imported the platform and a couple servers and clicked around in the GUI a bit. QA should do more extensive testing, including Web services client testing (talk to Simeon for how to run these tests), in addition to more GUI functional testing.


Comment 7 Ian Springer 2009-08-31 22:03:22 UTC
Note, there turned out to *not* be any JBossWS SSL issues (I think I misread one of the JBossWeb SSL errors as being from JBossWS). However, setting the SSL Stub properties as described in https://jira.jboss.org/jira/browse/JBWS-1820 will still be necessary for JBossWS+SSL-based clients of RHQ Web services.


Comment 8 Ian Springer 2009-09-01 12:04:51 UTC
r5045 - set the SSL algorithm to the IBM impl for the RHQ comm layer, as well as for Tomcat.


Comment 9 John Mazzitelli 2009-09-01 14:28:40 UTC
What this does now is render those rhq-server.properties settings "useless" in the sense that no matter what the customer sets those values to in .properties, they will be unused because we now overwrite them.

The purpose of the .properties file was to allow customers to customize their server (for example, if they want to use IBM's JDK, they can set those settings to IbmX509)... we didn't want to have to do all of these if-then conditionals to set values - just set the .properties appropriately and it would work.

Now, I understand this makes it easier on AIX users but I'm bringing this up for discussion since it does go against the design of the scripts/properties file.  IMO, the fix for this should have simply been a README or documentation/FAQ update to tell people "set these values to IbmX509 if using IBM's JDK" - which would hold true not just for those on AIX but for other platforms that may have an IBM JDK available. As it is now, we STILL need a FAQ of some sort to tell people, "if you are on AIX, you cannot configure these 5 property settings because they will be hardcoded to IbmX509". For example, is it possible to run OpenJDK on AIX? If so, I suspect this would not work and we'd still break on AIX.


Comment 10 John Mazzitelli 2009-09-01 14:48:41 UTC
(10:36:24 AM) ips: alternate solution would be: comment out the algorithm props in rhq-server.properties
(10:37:13 AM) ips: and then in the installer, after it reads in the initial properties file, if any of those algorithm props are not set, set them based on os.name and/or java.vendor
(10:38:16 AM) ccrouch: lets just decide right here right now
(10:38:28 AM) mazz: the installer one fix will be "kinda" ugly because the commented out settings will remain
(10:38:35 AM) mazz: and the "new" real settings get appended
(10:38:46 AM) mazz: the properties file update code doesn't "uncomment" things
(10:39:01 AM) mazz: since it can't parse commented properties to know that it can be uncommented
(10:39:21 AM) ips: so then remove them instead comment
(10:39:21 AM) mazz: but what we can do is have the installer popup a warning
(10:39:32 AM) ips: instead of comment
(10:39:51 AM) mazz: removing them isn't good because for those that want to use the preconfigured way to install (the new way to install), people need to see the properties they can set
(10:40:07 AM) mazz: they edit .properties, and run the server the first time
(10:40:18 AM) mazz: and the installer kicks off automatically without them going to the UI
(10:40:29 AM) mazz: at least having them commented out would show people what they CAN set
(10:40:56 AM) mazz: we can have the installer popup a warning "You are using an IBM JDK but using SUN's algorithms - this is probably not what you want..blah blah"
(10:41:11 AM) jshaughn: the installer could have an actual option, a la, db settings to choose your JDK
(10:41:21 AM) mazz: hmmm... interesting
(10:41:31 AM) mazz: and it could be set automatically by looking at the system properties
(10:41:39 AM) jshaughn: right
(10:41:55 AM) jshaughn: and then set the rhq-server.properties accordingly
(10:42:04 AM) mazz: hey, why can't we do that WITHOUT a JDK drop down?
(10:42:04 AM) ips: but why would the user ever choose a jdk other than the one it's running on?
(10:42:09 AM) mazz: right :)
(10:42:15 AM) jshaughn: right
(10:42:19 AM) jshaughn: just detect and set
(10:42:20 AM) mazz: we can prepare the defaults automagically
(10:42:29 AM) mazz: I like that
(10:42:35 AM) mazz: that should be easy to do I think too
(10:42:45 AM) ips: that's what i was suggesting - is javaVendor.startsWith("IBM") ...
(10:43:30 AM) mazz: javaVendor.contains("IBM") :) because you know in a year they will change it to something like "International Business Machines (IBM)"
(10:44:33 AM) jshaughn: that should work, and then we don't need any doco - woohoo.
(10:44:57 AM) ips: but keep it as ...algorithm=SunX509 in the default props file ?
(10:45:01 AM) mazz: yes
(10:45:07 AM) jshaughn: as long as the installer works on IBM :)
(10:45:20 AM) mazz: the installer can change the default as soon as it starts up essentially
(10:45:37 AM) mazz: so when the fields appear in the UI, you'll see IbmX509 already
(10:45:50 AM) ips: so then won't we still have to doco changing the props file for silent installs for running on AIX?
(10:46:11 AM) mazz: no, because the silent install goes through the same machinations as the UI installer
(10:46:23 AM) ips: ah, cool


Comment 11 John Mazzitelli 2009-09-01 21:02:34 UTC
The installer now sets the necessary properties to IbmX509 if the JRE it's running in is IBM (the "java.vendor" system property must contain "IBM"). The "if AIX" check has been removed from rhq-server.sh since it is no longer needed - the configuration has remained in the .properties file.

Comment 12 Ian Springer 2009-09-02 20:21:45 UTC
It turns out adding xalan.jar back to lib/endorsed/ (r5041) is causing Java preferences errors when the JON Server is run on Java 6, u9 or earlier. See http://jira.rhq-project.org/browse/RHQ-542 for a description of this problem.


Comment 13 Ian Springer 2009-09-02 21:37:18 UTC
r5073 fixes the Xalan issue as follows:

1) In container build, move xalan.jar from jbossas/lib/endorsed/ to etc/ibm/.
2) In rhq-server.sh, copy xalan.jar back from etc/ibm/ to jbossas/lib/endorsed/ if IBM Java is being used


Comment 14 John Mazzitelli 2009-09-03 00:42:18 UTC
we need this tested on different platforms since the server .sh script has changed - need to make sure we didn't regress.

test on Solaris, RHEL/Fedora, AIX - anyone with a Mac should make sure it works too...

Comment 15 Ian Springer 2009-09-03 03:09:20 UTC
You forgot HP-UX  :-)


Comment 16 Ian Springer 2009-09-03 14:15:23 UTC
REPRO STEPS
=============
Start the JON Server via rhq-server.sh on the following OS/JDK combos, ensuring there are no errors in the Server log:

1) RHEL/JDK5
2) RHEL/JDK6 
3) HP-UX/JDK5
4) HP-UX/JDK6
5) Solaris/JDK5
6) Solaris/JDK6
7) AIX/JDK5
8) AIX/JDK6

Optional:
9) CygWin/JDK5
10) CygWin/JDK6
11) MacOSX/JDK5
12) MacOSX/JDK6


Comment 17 John Mazzitelli 2009-09-03 15:49:54 UTC
I tested on Cygwin and the cygwin fork MSYS - both worked.

Comment 18 Corey Welton 2009-09-08 20:56:52 UTC
Will update this comment as variations are tested

1) RHEL/JDK5  - OK
2) RHEL/JDK6  - OK
3) HP-UX/JDK5 - OK
4) HP-UX/JDK6 - OK
5) Solaris/JDK5 - OK
6) Solaris/JDK6 - OK
7) AIX/JDK5 - OK^H^HStill seeing some SunX509 errors.
8) AIX/JDK6 - Still seeing some SunX509 errors.

Comment 19 John Mazzitelli 2009-09-09 15:25:17 UTC
I really think a documentation blurb and release notes blurb should be enough for this kind of issue.
The only other thing I can think of is to see if RHQ_SERVER_ADDITIONAL_JAVA_OPTS can be used to pass in -D options that override the .properties props (I'm not even sure if -D would override the settings - IIRC I think I tried that once but the properties file took effect - maybe if we put the -P first on the command line and then -D at the end?)

If possible, then we could do something like this in the .sh:

"if the server isn't installed yet AND the JRE is IBM then set RHQ_SERVER_ADDITIONAL_JAVA_OPTS+=-D... -D... ..."

We know if the server isn't installed yet because we can check for the existence of "rhq.ear.rej" - if it exists, the installer hasn't run yet.
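
The idea above could be sketched as follows. This is untested against a real server and, as noted, it is not certain that -D options override the .properties values at all. The function name is hypothetical, the location of rhq.ear.rej is left to the caller, IBM detection via the `java -version` output is an assumption, and only the one property named earlier in this bug is shown.

```shell
# Hypothetical helper: add IBM overrides only before the installer has run.
# $1: path to the rhq.ear.rej marker file (its location is not given here)
# $2: output of `java -version 2>&1`
# $3: current value of RHQ_SERVER_ADDITIONAL_JAVA_OPTS
ibm_preinstall_opts() {
    marker="$1"
    version_output="$2"
    opts="$3"
    case "$version_output" in
        *IBM*) is_ibm=yes ;;
        *)     is_ibm=no ;;
    esac
    # Marker present means the installer has not run yet.
    if [ -e "$marker" ] && [ "$is_ibm" = yes ]; then
        opts="${opts:+$opts }-Drhq.server.tomcat.security.algorithm=IbmX509"
    fi
    echo "$opts"
}
```

Once the installer has run and the marker is gone, the function returns the options unchanged, so whatever the installer wrote to the .properties file is what takes effect.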

Comment 20 Ian Springer 2009-09-09 16:38:52 UTC
No SunX509 errors on AIX/JDK5? Are you sure?

Comment 21 Corey Welton 2009-09-09 17:46:32 UTC
The false negative on scenario 7 came from restarting the server on IBM Java5 after having installed it on Java6 already.  I wasn't aware that this issue was only occurring on the first run prior to install.

Reinstalling with Java5 as the base /does/ cause the same SunX509 errors to occur.



Comment 22 Corey Welton 2009-09-09 17:49:28 UTC
QA Verified for the original issue in this case (the xalan issue); no regressions have been found across other platforms.

[11:37] < mazz> I would create another JIRA to document this specific issue about the security protocol

The X509 issue has been granted its own Jira:  RHQ-2416

Comment 23 Red Hat Bugzilla 2009-11-10 20:57:03 UTC
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-2065
Imported an attachment (id=368742)
This bug is related to RHQ-542
This bug relates to RHQ-2416


