I have a test box whose default java is 'jre-1.6.0-ibm'. On occasion I end up starting my server without changing RHQ_SERVER_JAVA_EXE_FILE_PATH, etc., beforehand. When this occurs, Very Bad Things result.

Steps to repro:
1. Make sure that `which java` points to the java binary as distributed with jre-1.6.0-ibm
2. rhq-server.sh start
3. Wait some time, then try to go to the URL. Note that no page can load
4. `ps|grep rhq`
5. View your log files.

Current results:
Trying to start the RHQ Server...
RHQ Server (pid 24556) is starting
...but obviously it doesn't work

Expected results:
* Perhaps rhq should "work", using ibm java, even if we don't support it
and/or
* Should we be able to catch such a critical exception and completely bail if we hit an error such as this, and thus return a message to the user, "something Very Bad has taken place upon boot"?

Workaround: Make sure you have a supported JRE as the default and/or make changes in rhq-server.sh appropriately

Other notes:
* This may or may not be two separate bugs. You all can decide.

[16:28] <mazz> yeah, the server DID start, its just in a very bad state - same thing would happen if, for example, you start but your database is down
[16:27] <mazz> if you can, capture the output when you RHQ_SERVER_DEBUG=true and put that in the JIRA too
[16:27] <mazz> need to see the cmd line opts we use to start the VM

RHQ_SERVER_HOME: /root/220/GA/jon-server-2.2.0.GA
RHQ_SERVER_JAVA_EXE_FILE_PATH: /usr/bin/java
RHQ_SERVER_JAVA_OPTS: -Dapp.name=rhq-server -Xms256M -Xmx1024M -XX:PermSize=128M -XX:MaxPermSize=256M -Djava.net.preferIPv4Stack=true -Djboss.server.log.dir=/root/220/GA/jon-server-2.2.0.GA/logs -Djava.awt.headless=true -Djboss.platform.mbeanserver -Dsun.lang.ClassLoader.allowArraySyntax=true
RHQ_SERVER_ADDITIONAL_JAVA_OPTS:
RHQ_SERVER_CMDLINE_OPTS: -P /root/220/GA/jon-server-2.2.0.GA/bin/rhq-server.properties
_JBOSS_RUN_SCRIPT: /root/220/GA/jon-server-2.2.0.GA/jbossas/bin/run.sh
Trying to start the RHQ Server...
=========================================================================
JBoss Bootstrap Environment
JBOSS_HOME: /root/220/GA/jon-server-2.2.0.GA/jbossas
JAVA: /usr/bin/java
JAVA_OPTS: -Dprogram.name=run.sh -Dapp.name=rhq-server -Xms256M -Xmx1024M -XX:PermSize=128M -XX:MaxPermSize=256M -Djava.net.preferIPv4Stack=true -Djboss.server.log.dir=/root/220/GA/jon-server-2.2.0.GA/logs -Djava.awt.headless=true -Djboss.platform.mbeanserver -Dsun.lang.ClassLoader.allowArraySyntax=true -Djava.util.logging.config.file=/server/default/conf/logging.properties -Djava.net.preferIPv4Stack=true
CLASSPATH: /root/220/GA/jon-server-2.2.0.GA/jbossas/bin/run.jar
=========================================================================
If this is a problem on aix then lets investigate. Over to corey to check that.
After much head scratching and trying to figure out what was going on... I /do/ get this same error on AIX with Java6. As it turns out, I get a completely different error with Java5 on AIX, whereupon the server starts up and can apparently even be installed, but throws tracebacks ("SunX509 KeyManagerFactory not available") all over the place while starting
Assigning to Ian to investigate
Problem #1: lots of XSL/Xalan class not found errors
Fix: xalan.jar needs to be added to jon-server/jbossas/lib/endorsed/ (as it is in a default AS 4.2.3 install); not sure if this will cause issues with Sun JVMs - need to talk to Jay and/or Greg

Problem #2: lots of "SunX509 KeyManagerFactory not available" from Tomcat/JBossWeb and JBossWS
Fix: for the Tomcat ones, see https://issues.apache.org/bugzilla/show_bug.cgi?id=45500; for the JBossWS ones, see https://jira.jboss.org/jira/browse/JBWS-1820
r5041 fixes problem #1. r5042 fixes problem #2 by adding -Drhq.server.tomcat.security.algorithm=IbmX509 to the run.sh command line args if the OS is AIX.

A nicer way to do it may have been to 1) set the default value for rhq.server.tomcat.security.algorithm programmatically in the installer, 2) remove it from the default rhq-server.properties file, and 3) change algorithm="${rhq.server.tomcat.security.algorithm}" to algorithm="${rhq.server.tomcat.security.algorithm:SunX509}" in server.xml. I went with the rhq-server.sh fix instead since it was a bit less involved. We can always revisit this later if Mazz likes the programmatic alternative better.

The Server now appears to be running on AIX with both IBM JDK6 and IBM JDK5 without any issues. I started it up, imported the platform and a couple of servers, and clicked around in the GUI a bit. QA should do more extensive testing, including Web services client testing (talk to Simeon for how to run these tests), in addition to more GUI functional testing.
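For anyone reading along, the "if the OS is AIX" check from r5042 could look roughly like the sketch below. This is not the committed code: ssl_algorithm_opt and EXTRA_OPTS are made-up names for illustration, and only the -D option itself comes from the fix described above.

```sh
#!/bin/sh
# Hypothetical helper (not the actual r5042 change): print the extra JVM
# option needed for the given OS name, or an empty line if none is needed.
ssl_algorithm_opt() {
    os_name="$1"
    case "$os_name" in
        AIX*) printf '%s\n' "-Drhq.server.tomcat.security.algorithm=IbmX509" ;;
        *)    printf '%s\n' "" ;;
    esac
}

# Usage: compute the option from `uname -s`; a start script would then
# append it to the options it passes on to run.sh.
EXTRA_OPTS=$(ssl_algorithm_opt "$(uname -s)")
echo "extra options: ${EXTRA_OPTS:-<none>}"
```

Keying on the OS name is simple, but it says nothing about which JVM is actually in use, so an IBM JDK on another platform would not be covered by a check like this.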
Note, there turned out to *not* be any JBossWS SSL issues (I think I misread one of the JBossWeb SSL errors as being from JBossWS). However, setting the SSL Stub properties as described in https://jira.jboss.org/jira/browse/JBWS-1820 will still be necessary for JBossWS+SSL-based clients of RHQ Web services.
r5045 - set the SSL algorithm to the IBM impl for the RHQ comm layer, as well as for Tomcat.
What this does now is render those rhq-server.properties settings "useless" in the sense that no matter what the customer sets those values to in .properties, they will be unused because we now overwrite them. The purpose of the .properties file was to allow customers to customize their server (for example, if they want to use IBM's JDK, they can set those settings to IbmX509)... we didn't want to have to do all of these if-then conditionals to set values - just set the .properties appropriately and it would work. Now, I understand this makes it easier on AIX users but I'm bringing this up for discussion since it does go against the design of the scripts/properties file. IMO, the fix for this should have simply been a README or documentation/FAQ update to tell people "set these values to IbmX509 if using IBM's JDK" - which would hold true for those not just on AIX but other platforms that may have an IBM JDK available. As it is now, we STILL need a FAQ of some sort to tell people, "if you are on AIX, you cannot configure these 5 property settings because they will be hardcoded to IbmX509". For example, is it possible to run OpenJDK on AIX? If so, I am suspicious that this would work and we'd still break on AIX.
(10:36:24 AM) ips: alternate solution would be: comment out the algorithm props in rhq-server.properties
(10:37:13 AM) ips: and then in the installer, after it reads in the initial properties file, if any of those algorithm props are not set, set them based on os.name and/or java.vendor
(10:38:16 AM) ccrouch: lets just decide right here right now
(10:38:28 AM) mazz: the installer one fix will be "kinda" ugly because the commented out settings will remain
(10:38:35 AM) mazz: and the "new" real settings get appended
(10:38:46 AM) mazz: the properties file update code doesn't "uncomment" things
(10:39:01 AM) mazz: since it can't parse commented properties to know that it can be uncommented
(10:39:21 AM) ips: so then remove them instead comment
(10:39:21 AM) mazz: but what we can do is have the installer popup a warning
(10:39:32 AM) ips: instead of comment
(10:39:51 AM) mazz: removing them isn't good because for those that want to use the preconfigured way to install (the new way to install), people need to see the properties they can set
(10:40:07 AM) mazz: they edit .properties, and run the server the first time
(10:40:18 AM) mazz: and the installer kicks off automatically without them going to the UI
(10:40:29 AM) mazz: at least having them commented out would show people what they CAN set
(10:40:56 AM) mazz: we can have the installer popup a warning "You are using an IBM JDK but using SUN's algorithms - this is probably not what you want..blah blah"
(10:41:11 AM) jshaughn: the installer could have an actual option, a la, db settings to choose your JDK
(10:41:21 AM) mazz: hmmm... interesting
(10:41:31 AM) mazz: and it could be set automatically by looking at the system properties
(10:41:39 AM) jshaughn: right
(10:41:55 AM) jshaughn: and then set the rhq-server.properties accordingly
(10:42:04 AM) mazz: hey, why can't we do that WITHOUT a JDK drop down?
(10:42:04 AM) ips: but why would the user ever choose a jdk other than the one it's running on?
(10:42:09 AM) mazz: right :)
(10:42:15 AM) jshaughn: right
(10:42:19 AM) jshaughn: just detect and set
(10:42:20 AM) mazz: we can prepare the defaults automagically
(10:42:29 AM) mazz: I like that
(10:42:35 AM) mazz: that should be easy to do I think too
(10:42:45 AM) ips: that's what i was suggesting - is javaVendor.startsWith("IBM") ...
(10:43:30 AM) mazz: javaVendor.contains("IBM") :) because you know in a year they will change it to something like "International Business Machines (IBM)"
(10:44:33 AM) jshaughn: that should work, and then we don't need any doco - woohoo.
(10:44:57 AM) ips: but keep it as ...algorithm=SunX509 in the default props file ?
(10:45:01 AM) mazz: yes
(10:45:07 AM) jshaughn: as long as the installer works on IBM :)
(10:45:20 AM) mazz: the installer can change the default as soon as it starts up essentially
(10:45:37 AM) mazz: so when the fields appear in the UI, you'll see IbmX509 already
(10:45:50 AM) ips: so then won't we still have to doco changing the props file for silent installs for running on AIX?
(10:46:11 AM) mazz: no, because the silent install goes through the same machinations as the UI installer
(10:46:23 AM) ips: ah, cool
The installer now sets the necessary properties to IbmX509 if the JRE it's running in is IBM (sysprop "java.vendor" must contain "IBM"). The "if AIX" check has been removed from rhq-server.sh since it is no longer needed - the configuration has remained in the .properties file.
It turns out adding xalan.jar back to lib/endorsed/ (r5041) is causing Java preferences errors when the JON Server is run on Java 6, u9 or earlier. See http://jira.rhq-project.org/browse/RHQ-542 for a description of this problem.
r5073 fixes the Xalan issue as follows: 1) In container build, move xalan.jar from jbossas/lib/endorsed/ to etc/ibm/. 2) In rhq-server.sh, copy xalan.jar back from etc/ibm/ to jbossas/lib/endorsed/ if IBM Java is being used
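A rough sketch of what step 2 amounts to, for reviewers who want to reason about it without opening the script. It is not the r5073 code. Assumptions: the etc/ibm/ and jbossas/lib/endorsed/ directories are taken to be relative to the server home, IBM Java is recognized by the string "IBM" appearing in the `java -version` banner, and the function names are invented here.

```sh
#!/bin/sh
# Hypothetical helper: succeed if the given `java -version` banner mentions IBM.
# Assumption: IBM JVMs print "IBM" somewhere in that banner.
is_ibm_java() {
    case "$1" in
        *IBM*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Hypothetical helper: install_xalan_if_ibm SERVER_HOME VERSION_BANNER
# Copies xalan.jar into the endorsed directory only when IBM Java is in use
# and the jar is present under etc/ibm/. Otherwise it does nothing.
install_xalan_if_ibm() {
    server_home="$1"
    banner="$2"
    src="$server_home/etc/ibm/xalan.jar"
    dest_dir="$server_home/jbossas/lib/endorsed"
    if is_ibm_java "$banner" && [ -f "$src" ]; then
        mkdir -p "$dest_dir" && cp "$src" "$dest_dir/"
    fi
}

# Example call from a start script (not executed here):
#   install_xalan_if_ibm "$RHQ_SERVER_HOME" \
#       "$("$RHQ_SERVER_JAVA_EXE_FILE_PATH" -version 2>&1)"
```

One thing to keep in mind with a copy-on-start approach: once the jar has been copied for an IBM run, it stays in lib/endorsed/ if the same install is later started on another JVM, unless something removes it again.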
We need this tested on different platforms since the server .sh script has changed - need to make sure we didn't regress. Test on Solaris, RHEL/Fedora, AIX - anyone with a Mac should make sure it works too...
You forgot HP-UX :-)
REPRO STEPS
=============
Start the JON Server via rhq-server.sh on the following OS/JDK combos, ensuring there are no errors in the Server log:
1) RHEL/JDK5
2) RHEL/JDK6
3) HP-UX/JDK5
4) HP-UX/JDK6
5) Solaris/JDK5
6) Solaris/JDK6
7) AIX/JDK5
8) AIX/JDK6

Optional:
9) CygWin/JDK5
10) CygWin/JDK6
11) MacOSX/JDK5
12) MacOSX/JDK6
I tested on Cygwin and the cygwin fork MSYS - both worked.
Will update this comment as variations are tested
1) RHEL/JDK5 - OK
2) RHEL/JDK6 - OK
3) HP-UX/JDK5 - OK
4) HP-UX/JDK6 - OK
5) Solaris/JDK5 - OK
6) Solaris/JDK6 - OK
7) AIX/JDK5 - OK^H^HStill seeing some SunX509 errors.
8) AIX/JDK6 - Still seeing some SunX509 errors.
I really think a documentation blurb and release notes blurb should be enough for this kind of issue. The only other thing I can think of is to see if RHQ_SERVER_ADDITIONAL_JAVA_OPTS can be used to pass in -D options that override the .properties props. (I'm not even sure if -D would override the settings - IIRC I tried that once but the properties file took effect - maybe if we put the -P first on the command line and then -D at the end?) If possible, then we could do something like this in the .sh: "if the server isn't installed yet AND the JRE is IBM then set RHQ_SERVER_ADDITIONAL_JAVA_OPTS+=-D... -D... ..." We know if the server isn't installed yet because we can check for the existence of "rhq.ear.rej" - if it exists, the installer hasn't run yet.
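To make the proposal concrete, here is a minimal sketch of that check. It is only an illustration: whether -D actually overrides the .properties values is still unverified, the location of rhq.ear.rej is left to the caller, IBM detection via the `java -version` banner is an assumption, and ibm_override_opts is a made-up name. Only the one property named earlier in this issue is shown; the other algorithm properties would be appended the same way.

```sh
#!/bin/sh
# Hypothetical helper: ibm_override_opts MARKER_FILE VERSION_BANNER CURRENT_OPTS
# Prints CURRENT_OPTS, with the IbmX509 override appended when the marker
# file (rhq.ear.rej, meaning the installer has not run yet) exists AND the
# JVM banner mentions IBM. Otherwise CURRENT_OPTS is printed unchanged.
ibm_override_opts() {
    marker_file="$1"
    banner="$2"
    current_opts="$3"
    case "$banner" in
        *IBM*) is_ibm=yes ;;
        *)     is_ibm=no ;;
    esac
    if [ -e "$marker_file" ] && [ "$is_ibm" = yes ]; then
        # ${var:+...} adds the separating space only if current_opts is non-empty.
        printf '%s\n' "${current_opts:+$current_opts }-Drhq.server.tomcat.security.algorithm=IbmX509"
    else
        printf '%s\n' "$current_opts"
    fi
}

# Example call from a start script (not executed here; the marker path is
# whatever location rhq.ear.rej has in the install):
#   RHQ_SERVER_ADDITIONAL_JAVA_OPTS=$(ibm_override_opts "/path/to/rhq.ear.rej" \
#       "$("$RHQ_SERVER_JAVA_EXE_FILE_PATH" -version 2>&1)" \
#       "$RHQ_SERVER_ADDITIONAL_JAVA_OPTS")
```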
No SunX509 errors on AIX/JDK5? Are you sure?
The false negative on scenario 7 came from restarting the server on IBM Java5 after having installed it on Java6 already. I wasn't aware that this issue was only occurring on the first run prior to install. Reinstalling with Java5 as the base /does/ cause the same SunX509 errors to occur.
QA Verified for the original issue in this case (the xalan issue); no regressions have been found across other platforms.

[11:37] < mazz> I would create another JIRA to document this specific issue about the security protocol

The X509 issue has been granted its own Jira: RHQ-2416
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-2065
Imported an attachment (id=368742)
This bug is related to RHQ-542
This bug relates to RHQ-2416