Bug 829350 - Unable to configure the cadence of child discovery
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Agent
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Target Release: JON 3.1.1
Assigned To: Lukas Krejci
QA Contact: Mike Foley
Depends On: 802550
Blocks: jon310-sprint11/rhq44-sprint11
Reported: 2012-06-06 10:23 EDT by Lukas Krejci
Modified: 2013-09-03 11:11 EDT

Doc Type: Bug Fix
Clone Of: 802550
Last Closed: 2013-09-03 11:11:35 EDT

Description Lukas Krejci 2012-06-06 10:23:24 EDT
+++ This bug was initially created as a clone of Bug #802550 +++

Description of problem:
The discovery of child resources is currently hardcoded to occur 5 seconds after the agent receives the newly committed resources. This delay should be made configurable because it affects the load the plugin container generates during the initial import of inventories.

Version-Release number of selected component (if applicable):
4.4.0-SNAPSHOT

How reproducible:
always

Steps to Reproduce:
1. Configure the agent to have both server and service discovery periods set to an hour (3600s).
2. Watch the agent.log
3. Import a single server (with some child servers or services) into an otherwise empty platform
4. At the time you import the server in the UI, there should be a message in the agent.log stating:

Syncing local inventory with Server inventory...

5. 5 seconds after that message (but no sooner), there should be a message:

Executing runtime discovery scan rooted at [platform]

Actual results:
There is no way to change the 5-second interval.

Expected results:
There should be a configuration property to set this interval.

Additional info:
Having this configuration property would also be useful for plugin and plugin container tests that require deep hierarchies of resources.

--- Additional comment from lkrejci@redhat.com on 2012-03-12 15:48:22 EDT ---

Created attachment 569496 [details]
proposed patch

Attaching a patch that adds such a configuration property to the agent and plugin container configuration.
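
The attached patch is not reproduced here. As a rough illustration of the idea only, the sketch below reads a delay from a java.util.Properties object and falls back to the previously hardcoded 5 seconds. The property name is the one quoted in the later comments of this bug; the class name, the method name, and the fallback handling for missing or invalid values are assumptions made for this example and are not taken from the RHQ code base.

```java
import java.util.Properties;

public class ChildDiscoveryDelay {
    // Property name as quoted in this bug report.
    static final String PROP = "rhq.agent.plugins.child-discovery.delay-secs";
    // Mirrors the value that used to be hardcoded.
    static final long DEFAULT_DELAY_SECS = 5L;

    // Hypothetical helper: returns the configured delay in seconds,
    // or the default when the property is missing, empty, negative or not a number.
    static long delaySecs(Properties config) {
        String raw = config.getProperty(PROP);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_DELAY_SECS;
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value >= 0 ? value : DEFAULT_DELAY_SECS;
        } catch (NumberFormatException e) {
            return DEFAULT_DELAY_SECS;
        }
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        System.out.println(delaySecs(config)); // no property set, falls back to the default
        config.setProperty(PROP, "120");
        System.out.println(delaySecs(config)); // uses the configured value
    }
}
```

Falling back to the old default keeps the behaviour unchanged for agents that never set the property, which is one plausible way to introduce such a setting without surprising existing installations.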

--- Additional comment from ccrouch@redhat.com on 2012-03-19 11:55:56 EDT ---

If this is tested and easily testable, then let's apply the patch.

--- Additional comment from mazz@redhat.com on 2012-04-04 15:16:04 EDT ---

master commit: de1000f

--- Additional comment from fbrychta@redhat.com on 2012-06-01 11:06:56 EDT ---

I followed this scenario:
1. Configure the agent to have both server and service discovery periods set to an hour (3600s) and set rhq.agent.plugins.child-discovery.delay-secs to 120s
2. Import rhq-agent
3. Check all imported resources -> the platform's and rhq-agent's child resources were imported
4. After 120s I can see in agent.log 'Executing runtime discovery scan rooted at [platform]' -> after this, the child resources of the agent's child resources were imported

Example: right after I imported the agent, I could see the JVM resource as an agent's child. The JVM had no child resources. After 120s the JVM's child resources were imported.

According to the description of rhq.agent.plugins.child-discovery.delay-secs in agent-configuration.xml, I would expect all of the agent's child resources (including the JVM) to be imported only after 120s.

--- Additional comment from mazz@redhat.com on 2012-06-05 11:50:05 EDT ---

Bug 823942 has very recently changed the same area in InventoryManager that this patch changed. I'm not sure how it affected it, but it's possible.

See this commit:

http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commit;h=14d53ea73b219a85d1b54584457ed48a60e1a556

--- Additional comment from mazz@redhat.com on 2012-06-05 13:11:00 EDT ---

(In reply to comment #5)
> bug 823942 has very recently changed the same area in InventoryManager that
> this patch changed. I'm not sure how it affected it, but it's possible.
> 
> See this commit:
> 
> http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commit;h=14d53ea73b219a85d1b54584457ed48a60e1a556

That was Jay's commit. We still aren't sure if that commit has anything to do with this. Plus, the patch for this issue was very simple: it just added a new config option to avoid hardcoding the "5" in the code.

Jay and I aren't sure if there is anything wrong here. Looking at this further, but we may need more input from Lukas to see if this really is still broken or not. In fact, I'll add a NEEDINFO here for Lukas to chime in since it was his patch that introduced this new config option to fix the issue.

--- Additional comment from lkrejci@redhat.com on 2012-06-05 13:22:22 EDT ---

<lkrejci> mazz: i thought that was more of a documentation issue actually... i think filip's expectation was that *all* the child resources are going to get discovered after that delay
<lkrejci> but because our discovery is incremental per level, the delay is applied before *each* child discovery, at each level
<lkrejci> i think that was his complaint... and i think that it therefore works as designed, only the docs are not clear enough

Filip, can you confirm this is what you're seeing/expecting?

--- Additional comment from jshaughn@redhat.com on 2012-06-05 16:08:49 EDT ---


I think maybe he is wondering why the JVM child is discovered immediately and not deferred like the other children.

--- Additional comment from fbrychta@redhat.com on 2012-06-06 08:04:55 EDT ---

Yes Jay, I expected that even immediate children would be discovered after the defined delay. Lukas clarified that the following is the correct and expected behaviour:
1- after manual import of a resource, its immediate children are discovered immediately
2- children on the next level are discovered after rhq.agent.plugins.child-discovery.delay-secs
3- children on the level after that are discovered after another rhq.agent.plugins.child-discovery.delay-secs
4-... recursively
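
The per-level behaviour listed above can be sketched as a small calculation. This is an idealized model, not RHQ code: it assumes immediate children appear at offset 0 and every deeper level follows exactly one delay after the previous one, and it ignores how long each scan takes and any regularly scheduled discovery. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class DiscoverySchedule {
    // Hypothetical helper: for a hierarchy with the given number of child levels,
    // returns the offset in seconds (relative to the manual import) at which each
    // level would be discovered. Level 1 is the immediate children.
    static List<Long> offsets(int levels, long delaySecs) {
        List<Long> result = new ArrayList<>();
        for (int level = 1; level <= levels; level++) {
            // The delay is applied once before each level below the first.
            result.add((level - 1) * delaySecs);
        }
        return result;
    }

    public static void main(String[] args) {
        // With a 120s delay and three levels of children:
        System.out.println(offsets(3, 120)); // prints [0, 120, 240]
    }
}
```

With the 120s setting from the test scenario, this matches what was observed: the JVM (an immediate child of the imported agent) shows up right away, and the JVM's own children follow after 120s.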

--- Additional comment from lkrejci@redhat.com on 2012-06-06 08:57:45 EDT ---

Leaving in ON_DEV until we decide what JON version this is going to go in.

master http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=commitdiff;h=27109e402d86deb7804249b870e7c15de7263491
Author: Lukas Krejci <lkrejci@redhat.com>
Date:   Wed Jun 6 14:54:38 2012 +0200

    [BZ 802550] - rewording the docs on rhq.agent.plugins.child-discovery.delay-secs
    to better explain what it actually means.
Comment 1 Lukas Krejci 2012-06-06 10:24:58 EDT
Retargeting to JON 3.1.1.

We need to merge the commit 27109e402d86deb7804249b870e7c15de7263491 into JON 3.1.1 release branch.
Comment 2 Lukas Krejci 2012-08-07 06:02:17 EDT
release/jon3.1.x http://git.fedorahosted.org/cgit/rhq/rhq.git/diff/?id=a00038bf941b2c719a133beb7e61aeddefb3803d
Author: Lukas Krejci <lkrejci@redhat.com>
Date:   Wed Jun 6 14:54:38 2012 +0200

    [BZ 802550] - rewording the docs on rhq.agent.plugins.child-discovery.delay-secs
    to better explain what it actually means.
    (cherry picked from commit 27109e402d86deb7804249b870e7c15de7263491)
Comment 3 John Sanda 2012-08-13 22:16:33 EDT
Moving to ON_QA since the JON 3.1.1 ER2 build is available - https://brewweb.devel.redhat.com/buildinfo?buildID=228250
Comment 4 Lukas Krejci 2012-08-14 04:04:43 EDT
Moving to ON_QA
Comment 5 Filip Brychta 2012-08-14 07:34:13 EDT
Verified on JON 3.1.1 ER2
Comment 6 Heiko W. Rupp 2013-09-03 11:11:35 EDT
Bulk closing of old issues in VERIFIED state.
