Red Hat Bugzilla – Bug 487030
no java universe (broken classpath)
Last modified: 2009-02-24 15:55:55 EST
Description of problem:
condor_status -java produces no output
Version-Release number of selected component (if applicable):
StarterLog:2/23 13:36:45 JavaDetect: failure status 256 when executing /usr/bin/java -classpath /usr/lib:/usr/lib/scimark2lib.jar:. -Xmx1906M CondorJavaInfo old 2
scimark2lib.jar is in /usr/share/condor, not in /usr/lib where classpath above is looking for it.
$ rpm -q condor
$ condor_config_val -v JAVA_CLASSPATH_DEFAULT
JAVA_CLASSPATH_DEFAULT: /usr/share/condor /usr/share/condor/scimark2lib.jar .
Defined in '/etc/condor/condor_config', line 1670.
$ ls -al /usr/share/condor/scimark2lib.jar
-rwxr-xr-x 1 root root 13756 2009-01-14 23:06 /usr/share/condor/scimark2lib.jar*
Is it possible that you modified /etc/condor/condor_config and now there's a condor_config.rpmnew that has the proper config for JAVA_CLASSPATH_DEFAULT?
If so, you should move your changes from /etc/condor/condor_config into the condor_config.local file, then replace /etc/condor/condor_config with the rpmnew. In general, keep your changes in condor_config.local and leave condor_config in /etc alone, so that the package can update it as things change.
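A rough sketch of that merge flow, demonstrated here on throwaway stand-in files (on a real system these live under /etc/condor, and the settings below are only illustrative):

```shell
tmp=$(mktemp -d)

# Stand-in for your edited config: stale packaged default plus one site edit
cat > "$tmp/condor_config" <<'EOF'
JAVA_CLASSPATH_DEFAULT = /usr/lib /usr/lib/scimark2lib.jar .
CONDOR_ADMIN = root@example.com
EOF

# Stand-in for the packaged default shipped as .rpmnew, with the fixed classpath
cat > "$tmp/condor_config.rpmnew" <<'EOF'
JAVA_CLASSPATH_DEFAULT = /usr/share/condor /usr/share/condor/scimark2lib.jar .
EOF

# 1. Review what the packaged default changes relative to your edited copy
diff -u "$tmp/condor_config" "$tmp/condor_config.rpmnew" || true

# 2. Carry your site-specific edits over to condor_config.local ...
grep '^CONDOR_ADMIN' "$tmp/condor_config" >> "$tmp/condor_config.local"

# 3. ... and install the packaged default in place of the edited file
cp "$tmp/condor_config.rpmnew" "$tmp/condor_config"
```

After step 3, `condor_config_val -v JAVA_CLASSPATH_DEFAULT` on the real files should point at /usr/share/condor again.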
You're right, sorry about that.
The problem is that in a large pool it's much easier to keep settings common to the pool in a central config file on an NFS share and use condor_config.local for settings local to the machine, as opposed to keeping distribution-default settings in the central config file and pool-local settings in the local config file.
These include most of Part 1, almost all of Part 2, and a large chunk of Part 3, whereas the only part that's really local is the list of daemons and the start/preempt/suspend settings. I.e., my condor_config.local is 10 lines. Things like the uid/fs domain, flocking and access lists, the pool name, the admin e-mail, etc. really belong in a pool-global config file -- otherwise I have to somehow rsync them out to all the nodes while somehow not overwriting node-local settings at the same time.
One possible workaround is to use two local config files: a "pool-global" one and a "really local" one. I might try that some day.
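If memory serves, LOCAL_CONFIG_FILE accepts a list of files read in order, so the two-file split could look something like this (file names here are made up):

```
## in /etc/condor/condor_config.local on every node:
## pool-global settings first (NFS), then the truly node-local file
LOCAL_CONFIG_FILE = /nfs/condor_config/pool-global, /etc/condor/condor_config.node
```

Later entries win on conflicts, so node-local settings override the pool-global ones.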
You can use the command-generated config file trick (see 220.127.116.11 of UW's 7.2.1 manual).
Basically, you leave /etc/condor/condor_config alone, and in your condor_config.local add:
LOCAL_CONFIG_FILE = /bin/cat /nfs/condor_config/global /nfs/condor_config/$(FULL_HOSTNAME)|
(The key is the | at the end of the line)
You put your global stuff in the obvious place and your local changes in a file with the appropriate name.
The immediate downside is that you need a $(FULL_HOSTNAME) file for each node, but you can get around that by exec'ing your own script instead of /bin/cat. If you're worried about hammering NFS, you can be arbitrarily clever in your script.
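Such a script could be as simple as the sketch below (written as a shell function so it's easy to try out; on a real node it would be a standalone script named in condor_config.local with the trailing "|", and the paths are only illustrative):

```shell
# Sketch of a generator script for the LOCAL_CONFIG_FILE pipe trick:
# print the pool-global config, then a per-host override file if present.
gen_condor_config() {
    global=$1      # pool-global config, e.g. /nfs/condor_config/global
    perhost=$2     # per-host override, e.g. /nfs/condor_config/$(hostname -f)
    cat "$global"
    # Nodes without an override file silently get just the global part,
    # instead of plain cat erroring out on the missing file.
    [ -r "$perhost" ] && cat "$perhost"
    return 0
}

# Demonstration on throwaway files (no per-host file for this "node"):
tmp=$(mktemp -d)
echo 'DAEMON_LIST = MASTER, STARTD' > "$tmp/global"
gen_condor_config "$tmp/global" "$tmp/no-such-host"
# prints: DAEMON_LIST = MASTER, STARTD
```

Condor reads whatever the piped command prints as extra configuration, so the script is free to assemble it however it likes.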
LOCAL_CONFIG_FILE = /opt/condor/condor_config $(LOCAL_CONFIG_FILE)
in condor_config.local works, too. The only (minor) complaint I have is that with LOCAL_DIR defined as $(TILDE) I have to set the condor user's home directory to /var/lib/condor in LDAP -- and /var/lib/condor doesn't exist on half the computers here. That, and SELinux whines about some already-defined context.