Description of problem:

A qpidd process started as a job with:

Executable = /usr/sbin/qpidd
Universe = vanilla
arguments = -t --auth no --data-dir /home/whenry/.qpidd
Log = qpidd.log
Output = output.log
Queue

Using condor_hold or condor_rm to shut down the qpidd job often leaves the lock file behind in the data-dir, so the lock has to be removed explicitly from the command line. A graceful shutdown was witnessed only once, using the combination of condor_hold and condor_rm and then resubmitting. An explicit "kill" on the process does cause a graceful shutdown.

Version-Release number of selected component (if applicable):

How reproducible:

Use the job file above.

condor_submit my_qpid_job

Run condor_q a few times until you see that the job is running. Using the job ID, run either condor_rm or the combination of condor_hold and condor_rm. Resubmit the job. Watch as the job starts to run and then drops off the running queue. In fact, you don't even have to resubmit: you can run "ls" on the data-dir and see the lock file. qpidd will not run while that lock file is still there.

Steps to Reproduce:
See above
1.
2.
3.

Actual results:

Expected results:

Additional info:
William,

The lock file in the data-dir is not deleted on a clean qpidd shutdown. The fact that the file is still there is not in itself a problem. Furthermore, killing the qpidd process (even with kill -9) properly cleans up the lock.

Did you ever see the following message?

Cannot lock <data-dir>/lock: Resource temporarily unavailable

This is the only indication you will get that there is a lock contention problem, and it only happens when two qpidd processes are vying for the same data directory.

-Ted
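The distinction above between the lock file and the lock itself can be illustrated with a small sketch. It assumes qpidd takes a POSIX advisory lock on <data-dir>/lock; the "Resource temporarily unavailable" text is the standard EAGAIN message, which is consistent with that, but nothing in this report confirms the mechanism. The names try_lock, demo and HOLDER, and the temporary directory standing in for the data-dir, are hypothetical and are not part of qpidd or Condor.

```python
import errno
import fcntl
import os
import shutil
import subprocess
import sys
import tempfile

# Child script standing in for a running broker: it takes the lock,
# reports that it holds it, and then sleeps until it is killed.
HOLDER = """
import fcntl, os, sys, time
fd = os.open(sys.argv[1], os.O_RDWR | os.O_CREAT, 0o644)
fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
print("locked", flush=True)
time.sleep(60)
"""


def try_lock(path):
    """Open (creating if needed) the lock file and try to take an exclusive,
    non-blocking advisory lock. Returns the fd, or raises OSError."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd


def demo():
    """Return (contended, file_survives, reacquired)."""
    data_dir = tempfile.mkdtemp()  # assumption: stands in for --data-dir
    lock_path = os.path.join(data_dir, "lock")
    holder = subprocess.Popen(
        [sys.executable, "-c", HOLDER, lock_path],
        stdout=subprocess.PIPE, text=True,
    )
    try:
        holder.stdout.readline()  # wait until the child holds the lock

        # A second process contending for the same data directory.
        contended = False
        try:
            os.close(try_lock(lock_path))
        except OSError as exc:
            contended = exc.errno in (errno.EAGAIN, errno.EACCES)

        holder.kill()  # SIGKILL, the equivalent of kill -9
        holder.wait()
        holder.stdout.close()

        # The file is left behind, but the kernel dropped the lock with the process.
        file_survives = os.path.exists(lock_path)
        try:
            os.close(try_lock(lock_path))
            reacquired = True
        except OSError:
            reacquired = False
        return contended, file_survives, reacquired
    finally:
        if holder.poll() is None:
            holder.kill()
            holder.wait()
        shutil.rmtree(data_dir, ignore_errors=True)


if __name__ == "__main__":
    print(demo())
```

If the assumption holds, a leftover lock file after condor_rm would only block a new broker while some process still holds the lock on it, so checking for a surviving qpidd process may be more informative than checking for the file.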
I need to retest this and see what happens. (It seems so long ago now.)
This is not critical for 1.1, as I don't know of any customer that is actually running brokers as a job, so I've pushed it to 1.1.1. I'll retest then.