+++ This bug was initially created as a clone of Bug #849394 +++ Description of problem: I configured the 'script' plugin to watch for output when checking for availability of my resource. I check the results using a regex check. In some cases, the output is empty. Then RHQ assumes the resource is down, when in fact it is up. For example, /sbin/service is called with the name of my service. In some cases no output appears. 2012-08-18 21:09:38,052 DEBUG [ResourceContainer.invoker.daemon-115] (org.rhq.plugins.script.ScriptServerComponent)- [/sbin/service]: CLI results: exitcode=[0]; error=[null]; output= 2012-08-18 21:09:38,052 DEBUG [ResourceContainer.invoker.daemon-115] (org.rhq.plugins.script.ScriptServerComponent)- [/sbin/service]: CLI output [] did not match regex [(?s).*is running.*], resource is considered DOWN Version-Release number of selected component (if applicable): RHQ 4.2 plugin How reproducible: Eventually is. Additional info: modules/core/util/src/main/java/org/rhq/core/util/exec/ProcessExecutor.java contains this unfortunate note: // WARNING: // It seems there is no way to get around a possible race condition - what if the process // was so fast that it exited already? We didn't get a chance to capture its output. // I see a unit test that periodically fails because it doesn't get any captured output when // it should - I think it is because of this race condition. But there is no Java API that // let's me redirect a process' streams before the process is told to start. It's unclear if this is really still the case or not... http://stackoverflow.com/questions/1670798/using-a-thread-to-capture-process-output There is a note that I/O needs to be consumed in a thread. It could be the process completes but the I/O was never consumed. --- Additional comment from genman on 2012-08-19 02:15:08 EDT --- The problem is that the I/O threads are never run to completion: The main process waits for the exit status, but the I/O threads may not have starts. I will provide a patch, but the solution is simply to return a handle containing both stdout/stderr Thread instances and wait for these threads... modules/core/util/src/main/java/org/rhq/core/util/exec/ProcessExecutor.java /** * Wrapper for threads used for capturing output. * Call {@link #join} to wait for output to be fully captured. */ protected static class RedirectThreads { private final StreamRedirector stdout; private final StreamRedirector stderr; ... protected RedirectThreads redirectStreams(ProcessToStart process, Process childProcess) throws IOException { ... return new RedirectThreads(stdoutThread, stderrThread); then ... + final RedirectThreads redirect = redirectStreams(process, childProcess); ... - int exitCode = childProcess.waitFor(); - retExitCode[0] = new Integer(exitCode); + } finally { + redirect.join(); } Simple as this. --- Additional comment from genman on 2012-09-13 18:02:49 EDT --- Created attachment 612648 [details] fixes process leak --- Additional comment from mazz on 2012-09-18 15:19:42 EDT --- (In reply to comment #2) > Created attachment 612648 [details] > fixes process leak I think the wrong patch was attached - the attached patch deals with PIQL queries. Can you remove that patch and attach the one for this issue? --- Additional comment from genman on 2012-09-18 17:07:02 EDT --- Created attachment 614123 [details] Ensure output thread completes before exiting startProgram Sorry I had the wrong patch. One thing you might want to adjust (for sake of the build) is adjust the number of times we check for correct output: modules/core/util/src/test/java/org/rhq/core/util/exec/ProcessExecTest.java diff --git a/modules/core/util/src/test/java/org/rhq/core/util/exec/ProcessExecTest.java b/modules/core/util/src/test/java/org/rhq/core/util/exec/ProcessExecTest.java index 49ab3f5..60b7aed 100644 --- a/modules/core/util/src/test/java/org/rhq/core/util/exec/ProcessExecTest.java +++ b/modules/core/util/src/test/java/org/rhq/core/util/exec/ProcessExecTest.java @@ -35,7 +35,7 @@ public class ProcessExecTest { public void testProcessExecOutputStream() { // run multiple times to ensure race condition fixed - for (int i = 0; i < 100; i++) { + for (int i = 0; i < 10; i++) { ProcessToStart start = new ProcessToStart(); setupProgram(start); I also have a version with thread pools being attached. I don't think a thread pool is worth it, but might be good idea. --- Additional comment from genman on 2012-09-18 17:08:50 EDT --- Created attachment 614124 [details] optional use of a thread pool Additional possible change to use a thread pool to handle I/O redirect. Might be good if there are a lot of script executions, though I don't know if it is worth it for the average case. --- Additional comment from mazz on 2012-09-18 17:20:41 EDT --- git commit to master: bf4b25f421769909e886eb42f7b99c66b207b200
Ready to be cherry picked to the 31x release branch
git cherry pick to release/jon3.1.x branch: 401c7a219876e5af9075ff09ee1bbb83cca6b884
need to fix the api change
cherry picked master git commit e4b77aefd05bd093816e46792a928c49d2f3b402 to deprecate old API: ad54ee06aabfa72fafaf1bb7936ac4d7175756e8
Moving to ON_QA as available for test with build : https://brewweb.devel.redhat.com//buildinfo?buildID=244662.
Closing. In current release.