Red Hat Bugzilla – Full Text Bug Listing
|Summary:||Component start() called more than once without an intervening stop()|
|Product:||[Other] RHQ Project||Reporter:||Jay Shaughnessy <jshaughn>|
|Component:||Agent||Assignee:||John Mazzitelli <mazz>|
|Status:||CLOSED CURRENTRELEASE||QA Contact:||Mike Foley <mfoley>|
|Version:||4.3||CC:||hrupp, lkrejci, mazz|
|Target Release:||RHQ 4.3.0|
|Fixed In Version:||4.3||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|:||787290 (view as bug list)||Environment:|
|Last Closed:||2013-09-01 06:04:29 EDT||Type:||---|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
|Bug Depends On:|
|Bug Blocks:||760116, 787290|
Description Jay Shaughnessy 2012-01-10 12:43:39 EST
It seems that at agent startup, or after a container restart (e.g after a 'plugins update' agent command), that a resource component may have its start() method invoked more than once without an intervening stop() call. This breaks the lifecycle expectations. I have seen this with the AS4 server resource but it may affect other/all resources, I'm not sure.
Comment 1 John Mazzitelli 2012-01-10 15:48:08 EST
Created attachment 551933 [details] stack traces for the two start() calls attaching a file that shows the stack traces for the two start() calls. There are two threads involved in each call. So the first start() triggers the first two thread stack traces; the second start() call triggers the second two thread stack traces
Comment 2 Lukas Krejci 2012-01-11 07:09:06 EST
I think this might be the cause for Augeas leaks in resource components that inherit from AugeasConfigurationComponent (i.e. all the augeas plugins except Apache). I have seen this using the https://github.com/rhq-project/samples/tree/master/agent/debug-tools/augeas-leak-detector but didn't spend enough time to get to the bottom of it. The bug 766959 contains an example output of that leak detection capturing the create locations that weren't closed inside the start() methods of various Augeas-based resource components.
Comment 3 Jay Shaughnessy 2012-01-11 17:00:43 EST
master commit 3a40aefab45f851facda766e766bb312675d1e63 A variety of auto-formatting changes took place, actual changes are in: InventoryManager.prepareResourceForActivation InventoryManager.activateResource The resource component state could be STOPPED or STARTED. There was a large window while a component was actually starting that a call to prepareResourceForActivation would happily allow the component to again be activated. So, added an actual STARTING state that can block redundant activation. Additionally, if forcing reactivation ensure a STARTED component is stopped before being restarted. Also: - remove unnecessary and bad lazy state check - comment out some unused/debug code - remove warnings Test Notes Not easily verifiable. General lookout for seemingly related issues at agent startup (like resources not showing as available), or errors in the agent log after an agent shutdown/start sequence.
Comment 8 John Mazzitelli 2012-02-02 08:22:36 EST
if this is truly the reason for that augeas leak (and if that is so, maybe its also the cause of those agent crashes on shutdown due to augeas??) I would put this in.
Comment 10 John Mazzitelli 2012-02-02 12:29:23 EST
talked to jay after I did some peer review and I think we are going to make some minor additions to this code. Nothing that affects the core behavior, but more for better handing of possible error conditions.
Comment 11 John Mazzitelli 2012-02-02 15:29:17 EST
I tweeked the code after some peer review and discussion with jay. There really isn't any functionality change here, so unsure if QA wants to retest or not. Its really an internal code change for better exception handling. master commit: f819631220c83f1ae8874953035e60720e61cff4
Comment 12 John Mazzitelli 2012-02-03 13:12:11 EST
added a couple asserts in ResourceUpgradeTest to make sure the component state is set properly... see git commit 16224a4 in master branch
Comment 13 John Mazzitelli 2012-02-03 15:17:16 EST
this was pushed to the release/jon3.0.x branch
Comment 14 John Mazzitelli 2012-02-03 15:18:17 EST
opps... forgot git commit sha - 841dcce in release/jon3.0.x
Comment 15 John Mazzitelli 2012-02-03 15:20:10 EST
i cloned this for the tracking of it in the release/jon3.0.x branch.
Comment 16 Heiko W. Rupp 2013-09-01 06:04:29 EDT
Bulk closing of items that are on_qa and in old RHQ releases, which are out for a long time and where the issue has not been re-opened since.