Bug 872738

Summary: Agent awaits shutdown of ProcessExecutor idle worker thread; should terminate immediately
Product: [Other] RHQ Project Reporter: Elias Ross <genman>
Component: AgentAssignee: John Mazzitelli <mazz>
Status: CLOSED CURRENTRELEASE QA Contact: Mike Foley <mfoley>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: unspecifiedCC: hrupp
Target Milestone: ---   
Target Release: RHQ 4.9   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugzilla.redhat.com/show_bug.cgi?id=1009658
https://bugzilla.redhat.com/show_bug.cgi?id=1126047
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-03-26 04:31:20 EDT Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Attachments:
Description Flags
incomplete patch, but addresses some of the design
none
This is the complete suggested fix none

Description Elias Ross 2012-11-02 16:48:53 EDT
Description of problem:

If org.rhq.core.util.exec.ProcessExecutor() is used in a plugin, then Java will keep around an idle thread for a bit, like 60 seconds.

This slows down the agent shutdown process.

It might make sense to create a "ProcessExecutor.shutdown()" method that is called by the agent container that will free this thread. Or another approach is recommended.

2012-11-02 19:35:32,928 DEBUG [Thread-7] (org.rhq.enterprise.agent.AgentShutdownHook)- {AgentShutdownHook.thread-is-still-active}Thread [ExecuteProcess-/usr/local/bin/foo] is still alive - its stack trace follows:
java.lang.Throwable: Thread [ExecuteProcess-/usr/local/bin/foo]
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
	at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:453)
	at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:352)
	at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:903)
	at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1043)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1103)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:636)
Comment 1 Elias Ross 2012-11-02 16:57:01 EDT
Or try setting the thread to a daemon one. Although you may not really want that thread terminated if the agent is running some sort of shell command.

You can also set the timeout for pooled threads to be a lot smaller than 60 seconds.

Or remove the 'static' declaration.
Comment 2 Elias Ross 2012-11-06 17:42:48 EST
Created attachment 639678 [details]
incomplete patch, but addresses some of the design


The easiest fix is probably okay, but references to ProcessExecutor should probably be retained otherwise a thread pool is created on every exec. I changed the classes to hold a reference to a ProcessExecutor.

But still the issue is the Agent container should really be responsible and shut down the exec thread pool.

How can this hook be created? Could ResourceContainer's thread pool be used? Etc.

Another issue if the stream I/O threads are really daemon or not.
Comment 3 Elias Ross 2013-06-19 14:47:53 EDT
Created attachment 763100 [details]
This is the complete suggested fix

The only complaint I have is, is that SystemInfoFactory.shutdown() should really be a final shutdown.
Comment 4 John Mazzitelli 2013-06-27 13:41:30 EDT
In org.rhq.core.util.exec.StreamRedirector.StreamRedirector(String, InputStream, OutputStream), I see this has now been removed (because it is no longer extending Thread, but now only implementing Runnable):

    setDaemon(true);

However, I don't see the threads created by the thread factory to be daemon threads (the default, I believe, is to be non-daemon).

In order to get back to the original behavior, I think we need to add:

    t.setDaemon(true);

to:

org.rhq.core.system.SystemInfoFactory.threadFactory.new ThreadFactory() {...}.newThread(Runnable)

after line 72:

        public Thread newThread(Runnable r) {
            Thread t = new Thread(r);
            t.setName("systeminfo-" + threadNumber.getAndIncrement());
            t.setDaemon(true);
            return t;
        }

However, I am not going to do this now because I think the StreamRedirector threads are getting created by the same executor/thread pool that is running everything else (which I don't think we want to be daemon).

The question becomes - what happens if these stream redirect threads hang? It used to be it wouldn't matter - the agent VM will still go down because these were daemon threads. But now they are not. Is is possible this new shutdown code could leave some non-daemon threads hanging around?
Comment 5 John Mazzitelli 2013-06-27 13:56:59 EDT
(In reply to Elias Ross from comment #3)
> Created attachment 763100 [details]
> This is the complete suggested fix
> 
> The only complaint I have is, is that SystemInfoFactory.shutdown() should
> really be a final shutdown.

I pushed this to master branch - git commit : b361a1b07c7d14834fcf4986e09f7bf683be4bf8
Comment 6 John Mazzitelli 2013-09-10 10:38:01 EDT
the patch has been pushed to master and thus will be in RHQ 4.9. Moving this to ON_QA
Comment 7 Heiko W. Rupp 2014-03-26 04:31:20 EDT
Bulk closing now that 4.10 is out.

If you think an issue is not resolved, please open a new BZ and link to the existing one.