Bug 826715

Summary: re-enable jmx plugin to detect logging category changes
Product: [Other] RHQ Project Reporter: John Mazzitelli <mazz>
Component: PluginsAssignee: John Mazzitelli <mazz>
Status: ASSIGNED --- QA Contact: Mike Foley <mfoley>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.4CC: hrupp, loleary
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Attachments:
Description Flags
patch for PoC to implement this none

Description John Mazzitelli 2012-05-30 16:19:58 EDT
in bug 535697, we turned off by default the detection of logging categories (read that issue for why we did that).

We want to re-enable this while avoiding the issues of that original bug.

Can we have the initial discovery of the categories be done, but turn if off thereafter. We can provide an operation to manually run another detection of new categories in case the user wants to check for new categories.
Comment 2 John Mazzitelli 2012-05-31 09:24:38 EDT
Created attachment 588089 [details]
patch for PoC to implement this

i am attaching a potential patch that can implement this. I did run some quick tests and used a debugger to make sure its doing what I think its doing. Small patch that only affects one java class and the plugin descriptor for the JMX plugin.
Comment 3 John Mazzitelli 2012-06-04 09:49:39 EDT
Charles - adding a NEEDINFO here to you. Just a reminder to figure out what you want to do with this - the patch is attached and my testing shows this working. However, we need to decide if we really want to introduce this to master or not. So, triage this and put the issue where you see fit.
Comment 4 Charles Crouch 2012-06-05 11:27:07 EDT
Mazz, this is still under discussion as part of PRODMGT-130

Another enhancement Larry just mentioned is for existing customers who have this enabled is to add another plugin configuration property to use to set the frequency of collection of configuration changes. This way customers that want this enabled, also get the ability to determine how frequently its collected, if 30mins is too frequent say.
Comment 5 Larry O'Leary 2012-06-05 13:31:50 EDT
My suggestion in comment #4 deals primarily with the existing user use-case. Although this proposed solution sounds perfect to me, my concern is how it will impact the user who is using the existing implementation. In other words, they have enabled configuration management for one or more Logging service resources and just expect it to update the configuration elements every 30 minutes as it does currently when enabled.

So, my suggestion of adding this as a configurable plug-in option or resource option will allow the user to set the interval of when the operation is invoked. Or they can set it to 0 or disabled or something with similar meaning to make it a manual process. For users that have legacy Logging resources in inventory, the interval would be 30 minutes (or what ever the current interval is when config management is enabled) and for all new resources, we would set config management to enabled with the interval disabled, requiring the operation be invoked if updates are needed.
Comment 6 John Mazzitelli 2012-07-03 11:25:10 EDT
Aside from making this resource-specific (that is, adding a plugin configuration property so this particular resource won't report differing configuration until X minutes passes), I assume the agent-wide preference "rhq.agent.plugins.configuration-discovery.period-secs" isn't good enough? Its default is 1 hour.
Comment 7 John Mazzitelli 2012-07-03 11:28:12 EDT
I can also make the proposed plugin config prop not a boolean (as seen in the attached patch) but rather as an integer. If 0, its as if its "false" or "disabled". If non-zero, that's the time that must elapse before the resource declares its configuration changed (if it has changed). This should be simple to do.
Comment 8 Larry O'Leary 2012-07-03 15:19:23 EDT
To make sure I understand Comment 7:

  - Setting the integer = 0 will result in the configuration being discovered upon the initial service discovery and never again (unless executed manually)

  - Setting the integer > 0 will result in the configuration being discovered upon the initial service discovery and then again at <interval> seconds

  - If the configuration property is unset (null) for existing resources it defaults to rhq.agent.plugins.configuration-discovery.period-secs

  - manageConfigurationEnabled will now default to "true"

  - For newly discovered resources, the integer will be set to 0




The end result will be:

  - Logging services already in inventory with manageConfigurationEnabled set to true will continue to have their Logging configuration re-discovered every rhq.agent.plugins.configuration-discovery.period-secs interval (no change)

  - Logging services in inventory with manageConfigurationEnabled set to false will have their Logging configuration re-discovered every rhq.agent.plugins.configuration-discovery.period-secs interval in the event they set manageConfigurationEnabled to true (no change)

  - Newly discovered Logging services will have their manageConfigurationEnabled set to true and the configurationDiscovery interval set to 0 meaning that the configuration will be discovered on the initial discovery but updates will not happen automatically
Comment 9 Charles Crouch 2012-07-10 11:12:15 EDT
And also that we support a manual operation

+         <operation name="rhq.reloadResourceConfiguration" displayName="Reload Configuration" description="When invoked, this operation prepares the resource to reload its resource configuration the next time it is requested.">
+         </operation>
+
Comment 10 Charles Crouch 2012-07-10 11:13:36 EDT
We need to make sure we put tests in place for this feature, particularly around the upgrade scenarios which Larry outlined.
Comment 11 John Mazzitelli 2012-07-10 11:35:34 EDT
(In reply to comment #8)
> To make sure I understand Comment 7:

Let me clarify below:

>   - Setting the integer = 0 will result in the configuration being
> discovered upon the initial service discovery and never again (unless
> executed manually)

Yes.  If configuManagementEnabled is false, we simply NEVER obtain the configuration, but if that setting is true, and this new integer property =  0, only obtain the configuration the very first time the plugin container asks for the configuration but never after that; if integer is > 0, then do what I will explain below...

>   - Setting the integer > 0 will result in the configuration being
> discovered upon the initial service discovery and then again at <interval>
> seconds

Not quite. Remember, it is the *plugin container* that asks for periodic configuration discovery - not the resources. The resources only obtain the configuration on-demand - that is, when they are asked for it. So what I am saying is, if the integer > 0, then only if the plugin container asks for the configuration AFTER that time period has passed *since the last time the resource obtained the configuration*, will the resource probe the logging data again and report it in a new configuration. Here's a fuller description:

1) resource is initialized (i.e. the plugin component is started)

2) the plugin container asks the resource (the plugin component) for its configuration (the plugin container would do this for a number of reasons - maybe the user asked to see the current configuration by navigating to the resource page in the GUI or maybe the plugin container is in the middle of a configuration scan and asked for the current configuration - this is what rhq.agent.plugins.configuration-discovery.period-secs defines - the period of the PC's configuration scan)

2a) IF configManagementEnabled = false (and we can add in the additional constraint, "OR if the integer < 0"), that PC request for the configuration is ignored and an empty configuration object is returned

2b) IF integer = 0, and the resource has not yet obtained any configuration before, it will do so now and return that configuration and cache that configuration. If the resource has already obtained its configuration and has it cached away, that cached configuration is immediately returned (and this will be true for all subsequent requests for the configuration since it will always have the cached configuration squirreled away - until, of course, the resource is shutdown during agent shutdown for example or the cache is cleared by the operation which I will describe now).  Note that we can create an operation on the resource (like in my current patch attached to this issue) such that that operation "clears the cache" so the next time the PC asks for the configuration, it will obtain the configuration again and cache that new configuration.

2c) IF integer > 0, and the resource has not yet obtained any configuration before, it will do so now and return that configuration and cache that configuration. If the resource has already obtained its configuration and its in the cache BUT IT OBTAINED IT A LONG TIME AGO, LONGER THAN THE NUMBER OF SECONDS SPECIFIED BY INTEGER, then the resource will AGAIN go obtain the current live configuration, cache it (removing the old cached config) and return it. IF the cached configuration is younger than the number of seconds specified by the integer, the current live configuration is not obtained by the resource and simply the cached configuration is immediately returned. Note that we can create an operation on the resource (like in my current patch attached to this issue) such that that operation "clears the cache" so the next time the PC asks for the configuration, it will obtain the configuration again and cache that new configuration.

>   - If the configuration property is unset (null) for existing resources it
> defaults to rhq.agent.plugins.configuration-discovery.period-secs

For existing resources, that could be OK because at least it would keep the behavior as backward-compatible as possible for existing things already in inventory.

>   - manageConfigurationEnabled will now default to "true"
>   - For newly discovered resources, the integer will be set to 0

Yes. It should default to true/0 when the discovery component finds a new resource and sets this value for it. This would at least provide some initial configuration to the resource (though it won't necessarily be complete since, as the JVM runs, its possible new logging categories would be created - but if the user wants to see that data, they would up the value to > 0 or use that operation to clear the cache and force it to obtain the new configuration the next time it is asked for it.

> The end result will be:
> 
>   - Logging services already in inventory with manageConfigurationEnabled
> set to true will continue to have their Logging configuration re-discovered
> every rhq.agent.plugins.configuration-discovery.period-secs interval (no
> change)
> 
>   - Logging services in inventory with manageConfigurationEnabled set to
> false will have their Logging configuration re-discovered every
> rhq.agent.plugins.configuration-discovery.period-secs interval in the event
> they set manageConfigurationEnabled to true (no change)

Yes, because the new integer property would be null, so we would catch that, and assume its value to be a default of rhq.agent.plugins.configuration-discovery.period-secs.

>   - Newly discovered Logging services will have their
> manageConfigurationEnabled set to true and the configurationDiscovery
> interval set to 0 meaning that the configuration will be discovered on the
> initial discovery but updates will not happen automatically

Correct.
Comment 12 Larry O'Leary 2012-07-10 13:17:59 EDT
(In reply to comment #11)
> 2b) IF integer = 0, and the resource has not yet obtained any configuration
> before, it will do so now and return that configuration and cache that
> configuration. If the resource has already obtained its configuration and
> has it cached away, that cached configuration is immediately returned (and
> this will be true for all subsequent requests for the configuration since it
> will always have the cached configuration squirreled away - until, of
> course, the resource is shutdown during agent shutdown for example or the
> cache is cleared by the operation which I will describe now).  Note that we
> can create an operation on the resource (like in my current patch attached
> to this issue) such that that operation "clears the cache" so the next time
> the PC asks for the configuration, it will obtain the configuration again
> and cache that new configuration.

So, the configuration would/could change each time the plugin container is started/restarted? I am basing this off the impression that the cache is volatile. 

> 2c) IF integer > 0, and the resource has not yet obtained any configuration
> before, it will do so now and return that configuration and cache that
> configuration. If the resource has already obtained its configuration and
> its in the cache BUT IT OBTAINED IT A LONG TIME AGO, LONGER THAN THE NUMBER
> OF SECONDS SPECIFIED BY INTEGER, then the resource will AGAIN go obtain the
> current live configuration, cache it (removing the old cached config) and
> return it. IF the cached configuration is younger than the number of seconds
> specified by the integer, the current live configuration is not obtained by
> the resource and simply the cached configuration is immediately returned.
> Note that we can create an operation on the resource (like in my current
> patch attached to this issue) such that that operation "clears the cache" so
> the next time the PC asks for the configuration, it will obtain the
> configuration again and cache that new configuration.

So, the interval isn't really an interval but a "cache-max-age-secs" property? We should make this clear as to what the value means as to not to confuse it with rhq.agent.plugins.configuration-discovery.period-secs. From what you are describing:

   - Every <rhq.agent.plugins.configuration-discovery.period-secs> seconds we ask all resources for configuration changes/updates.
   - For the Logging service, we will provide the cached copy, if it exists, and if the cache-max-age-secs value has not expired or is 0. Otherwise, we will re-discovery/read the configuration for the Logging service and re-populate the cache. This all assuming manageConfigurationEnabled is set to true.


Which of course brings me to another confusing point... This configuration isn't really plugin configuration but resource configuration, correct? The property we are using to determine the scan interval for configuration changes is the correct one? This seems like it should be more like rhq.agent.resource.configuration-discovery.period-secs.




Overall, this sounds good to me.
Comment 13 John Mazzitelli 2012-07-10 13:39:41 EDT
> So, the configuration would/could change each time the plugin container is
> started/restarted? I am basing this off the impression that the cache is
> volatile. 

Correct. It is volatile. Specifically, it is stored in a member variable of the plugin component - when it is shutdown, it clears the cache. When the agent restarts, obviously, it starts clean. This is an in-memory cache only.

> So, the interval isn't really an interval but a "cache-max-age-secs"
> property?

Correct. That's what it is.

> We should make this clear as to what the value means as to not to
> confuse it with rhq.agent.plugins.configuration-discovery.period-secs.

Agreed. We will name it to express what it really is, plus, the <simple-property> metadata will have a description defined so the user will get a human-readable description of this value and what it does.

> From what you are describing:
> 
>    - Every <rhq.agent.plugins.configuration-discovery.period-secs> seconds
> we ask all resources for configuration changes/updates.
>    - For the Logging service, we will provide the cached copy, if it exists,
> and if the cache-max-age-secs value has not expired or is 0. Otherwise, we
> will re-discovery/read the configuration for the Logging service and
> re-populate the cache. This all assuming manageConfigurationEnabled is set
> to true.

That's exactly right.

> Which of course brings me to another confusing point... This configuration
> isn't really plugin configuration but resource configuration, correct?

Correct. The configuration we are loading and caching is the RESOURCE configuration - the actual configuration of the Logging resource.

> The property we are using to determine the scan interval for configuration
> changes is the correct one? This seems like it should be more like
> rhq.agent.resource.configuration-discovery.period-secs.

It is named "rhq.agent.plugins.configuration-discovery.period-secs" because this is an agent configuration setting specifically for the plugin container (PC). All plugin container settings found in the agent's config are prefixed with rhq.agent.plugins for consistency - for example:

rhq.agent.plugins.directory (where the PC can find its plugins)
rhq.agent.plugins.operation-invocation-timeout-secs (the PC's default op timeout)
rhq.agent.plugins.server-discovery.period-secs (the PC's interval for running server scans)
rhq.agent.plugins.drift-detection.period-secs (the PC's interval for drift detection runs)

... and many more...

So for consistency, all PC configuration settings are prefixed with "rhq.agent.plugins". The "plugins." doesn't mean its a "plugin configuration" discovery - its just part of the prefix to denote this is a PC setting.