Description of problem:
I ran the Java tests in a loop over the weekend. After 200+ runs I had only two failures. In both cases the test aborted because the broker failed to start due to a lock file. In both cases the Java logs show the broker exiting normally prior to this. The two Java test cases are unrelated.

How reproducible:
It took about 100,000 broker starts/stops to hit this issue.

Additional info:
The data dir was on NFS, and the broker was running without the store.
This also happens with a local directory. I got it in 200 iterations of:

    while ./qpidd -d -p 8888 --data-dir /tmp/qpidd && ./qpidd -q -p 8888 ; do echo -n .; done
I've also seen this occasionally in testing where there was no loop and the broker was killed normally. Using flock(2) or lockf(3) on the broker's data dir might be more reliable than a simple existence test.
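To illustrate the flock idea, here is a minimal sketch in Python (chosen for brevity, not the broker's own language). The function names and the lock file name "lock" are made up for this example and are not taken from qpidd. The point is that the lock is held by the kernel on an open file, so a file left behind by an earlier broker does not block the next start. Note that advisory locking over NFS depends on kernel and server support, so this would need checking for the NFS case.

```python
import fcntl
import os
import tempfile


def acquire_data_dir_lock(data_dir, name="lock"):
    """Try to take an exclusive advisory lock on data_dir/name.

    Returns the open file descriptor on success, or None if another
    holder already has the lock. The file name is a hypothetical
    default; the same helper could serve a PID file by passing a
    different name.
    """
    path = os.path.join(data_dir, name)
    # Creating the file is harmless: existence alone means nothing here.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail immediately instead of blocking.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def release_data_dir_lock(fd):
    """Closing the descriptor drops the lock; the file may stay on disk."""
    os.close(fd)


if __name__ == "__main__":
    data_dir = tempfile.mkdtemp()
    fd = acquire_data_dir_lock(data_dir)
    print("locked" if fd is not None else "data dir already in use")
    if fd is not None:
        release_data_dir_lock(fd)
```

Because the kernel releases the lock when the holding process exits, however it exits, there is no cleanup step that can be missed or raced at shutdown.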
NB: in fixing this we should use common code for both the PID file and the lock file.
No longer a problem.