Bug 723800

Summary: Enhance snmptrapd plugin
Product: [Other] RHQ Project Reporter: Heiko W. Rupp <hrupp>
Component: PluginsAssignee: Heiko W. Rupp <hrupp>
Status: CLOSED UPSTREAM QA Contact: Mike Foley <mfoley>
Severity: unspecified Docs Contact:
Priority: medium    
Version: 4.1CC: genman, hrupp
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-07-29 19:32:36 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Heiko W. Rupp 2011-07-21 08:18:34 UTC
By Elias Ross

his is my first contribution.

I was working on the JMX side but it was decided we needed to support
the functionality of SNMP more. I was attempting to use the RHQ SNMP
listener for Python but although the plugin does support SNMP traps,
it does not do the two-way "inform" messages as described here:
http://www.cisco.com/en/US/docs/ios/11_3/feature/guide/snmpinfm.html

I wanted this support so I added it. So I verified that it now works
with pySNMP V2 inform messages.

I also spent some time on adding support for test ability. There
doesn't seem to be a good plugin suite to leverage, so I started to
craft something called "ComponentTest". I would like to have this
reviewed and maybe put into a common test library. There doesn't seem
to be any testing for many of the plugins which I think is
unfortunate.

I also ran into issues with how events are queued and polled for over
60 seconds. Necessarily, I had to change the minimum poll value to 1
second so I can run the test in a reasonable timeframe. This interval
is now configurable as well, but defaults to 60 as before.

Some of the whitespace and other changes are inadvertent.

-----------


Comment by Mazz:


this looks OK to me. Most involves changes to the snmptrapd plugin - 
which Heiko knows more about. Let's see what he says.

the other stuff looks ok except I would reject this:

+     * Used to be 1 minute, but since poll intervals are configurable 
it stands that 1 second will suffice.
      */
-    int MINIMUM_POLLING_INTERVAL = 60; // 1 minute
+    int MINIMUM_POLLING_INTERVAL = 1; // 1 second

1) I don't like changing defaults for what amounts to arbitrary reasons 
- some code out in the community might not realize the change in 
default. so for backward compatibilty reasons, keep it the same.

2) it is just a default - if you don't like it, you can configure it, as 
the comment says

3) I am leery about making the event subsystem do any more work than 
what its doing now (we've seen potential performance problems if too 
many events flood the system). I would prefer we NOT have the event 
subsystem in the agent get activated every second - certainly not as the 
default.

But other than that, everything else I think looks good. Let's see what 
Heiko says. We can commit to master once we get his input.


--------------

Reply by Elias

There is a little confusion here. (I was confused by this too.)

First of all, this change is only to facilitate testing. But it's a
change that's necessary in that context.

Secondly, I noticed it's not a default, it's a strict minimum you
can't change by configuration:

modules/core/plugin-container/src/main/java/org/rhq/core/pc/event/EventContextImpl.java

       final int adjustedPollingInterval =
Math.max(EventContext.MINIMUM_POLLING_INTERVAL, pollingInterval);
...
               getEventManager().registerEventPoller(poller,
adjustedPollingInterval, resource, sourceLocation);

So, I *can* change the code here but the purpose of this value is
really to bound the lower end according to the comments so it seemed
harmless.

However, as you point out it does appear to be used here as a default
in *one* case:

modules/plugins/platform/src/main/java/org/rhq/plugins/platform/SyslogFileEventLogDelegate.java:
          getEventContext().registerEventPoller(this.poller,
EventContext.MINI

which is probably a mistake.

Everywhere else (you can scan with 'git grep registerEventPoller') you
can see various values like '60' or '63' are used. Example:

modules/helpers/pluginGen/src/main/resources/component.ftl:
eventContext.registerEventPoller(eventPoller, 60);
modules/plugins/twitter/src/main/java/org/rhq/plugins/twitter/FeedComponent.java:
      eventContext.registerEventPoller(eventPoller, 63);
modules/plugins/twitter/src/main/java/org/rhq/plugins/twitter/TwitterComponent.java:
      eventContext.registerEventPoller(eventPoller, 53);

'53' here wouldn't even work. (It'd be 60).

Probably it would be best if there were some sort of system property
setting something like that? I don't know the point of having a
Interface context for something that deserves some sort of
configuration.


3) I am leery about making the event subsystem do any more work than
what its doing now (we've seen potential performance problems if too
many events flood the system). I would prefer we NOT have the event
subsystem in the agent get activated every second - certainly not as the
default.

It's not changing the default, it's changing the minimum value for
testing ... anyway, that's the intent.

Thanks for looking things over.

Comment 1 Heiko W. Rupp 2011-07-21 09:24:59 UTC
Sorry, I did not look at it earlier, but I was knee down in some other stuff.
I've created a BZ for this patch - I would have added you in CC, but BZ is too
#*"§$€@!! to let me do that.

Questions:
* Why did you ?

-            // SNMP v3
-            else if (pdu instanceof ScopedPDU) {
-                ScopedPDU spdu = (ScopedPDU) pdu;
-            }

* log.trace() should be guarded by a if (log.isTraceEnabled()) ( or better by some previously 
initialized boolean) - as you do with log.debug()

Mazz: I think the polling interval is ok here, as it only means how often the PC goes out
to query the plugin for more events. It does not lower the interval for sending event reports
to the server.
As the plugin itself has a thread that gets called when an snmp trap comes in, and then just
adds the data to a vector, from where the poller then adds it to the internal queue to be sent to
the server, I wonder if the poll interval matters much anyway.

Comment 2 Elias Ross 2011-07-21 20:41:04 UTC
* Why did you ?
 .. because that code doesn't do anything. It does a cast and throws away the result

* Trace:

+        log.trace("poll");

this is the only trace, doesn't need to be wrapped as I'm just tracing execution and not composing a statement to print.


Polling interval...Yes agree with your findings. It's probably a level of indirection that's not needed? Or at least it's confusing to people doing coding.

The contribution under LGPL was approved using the Apple OSS contribution form.

Return-path: <prabhaka>
Received: from hyssop.apple.com ([17.128.115.95])
 by mail9.apple.com (Oracle Communications Messaging Exchange Server 7u4-20.01
 64bit (built Nov 21 2010)) with ESMTP id <0LN90063GES0TKJ0.com>;
 Thu, 23 Jun 2011 13:12:01 -0700 (PDT)
Received: from relay13.apple.com ([17.128.113.29])
 by postino.apple.com (Oracle Communications Messaging Exchange Server
 7u4-20.01 64bit (built Nov 21 2010))
 with ESMTP id <0LN9009HTEQYJI90.com>; Thu,
 23 Jun 2011 13:12:00 -0700 (PDT)
X-AuditID: 1180711d-b7c5fae000001427-5a-4e039e045f03
Received: from kencur (kencur.apple.com [17.151.62.38])
	by relay13.apple.com (Apple SCV relay) with SMTP id EA.80.05159.40E930E4; Thu,
 23 Jun 2011 13:11:48 -0700 (PDT)
Received: from selfserve.apple.com (selfserve.apple.com [17.203.40.11])
 by kencur.apple.com
 (Oracle Communications Messaging Exchange Server 7u4-20.01 64bit (built Nov 21
 2010)) with ESMTPSA id <0LN900A78ERZOF90.com>; Thu,
 23 Jun 2011 13:12:00 -0700 (PDT)
Subject: Re: OSS Postback Request: RHQ
MIME-version: 1.0 (Apple Message framework v1237.1)
Content-type: text/plain; charset=iso-8859-1
From: "Ernest N. Prabhakar, Ph.D." <prabhaka>
In-reply-to: <20110621205550.144A2DEC40.apple.com>
Date: Thu, 23 Jun 2011 13:12:05 -0700
Cc: darwin-mgt.com, tmcneil, ksaul
Content-transfer-encoding: quoted-printable
Message-id: <869A00D4-72B8-4461-8516-7BD5169D22F6>
References: <20110621205550.144A2DEC40.apple.com>
To: elias_ross, iBoon
X-Mailer: Apple Mail (2.1237.1)
X-Brightmail-Tracker:
 H4sIAAAAAAAAA+NgFlrBLMWRmVeSWpSXmKPExsUiON1OTZdlHrOfwYwD8haPPp9htDjZtIXR
	gclj68kfbAGMUVw2Kak5mWWpRfp2CVwZh7Z9ZC643MdU8fnMecYGxktXGbsYOTkkBEwkpky8
	zgRhi0lcuLeeDcQWEljJKHFspxiEPYVJonWhNIgtLKAm8WD7M1YQm1fAWKJ98WUWEJtZQEei
	9/s35i5Gdg42AXOJT9YgUU4BR4klv+8CVXNwsAioSsx7HQpR7CRx5cZ3JghbW+LJuwtQA20k
	9i1sZwQpFxJwkLh5xA8kLCKgKXHl/gY2iBvlJbZtesk8gVFgFpITZiE5YRaSqQsYmVcxChal
	5iRWGhrrJRYU5KTqJefnbmIEhV9DoewOxv0/+Q8xCnAwKvHwrkxg9hNiTSwrrsw9xCjBwawk
	wtuRDBTiTUmsrEotyo8vKs1JLT7EKM3BoiTOe3XNJ18hgfTEktTs1NSC1CKYLBMHp1QDI/vK
	XUnTbhxu0+C3KXt3NU9/2qwrJtJJxVve2j+5KRUQdbNwQmOezIcL908snJdSyXLuxtumZ1LP
	De+LaDZ7+qos/BmbdVKLd21SgHvT/ajCUv58ngePVQRzp0zpMs2TT07tnJU6jZeFlXOmmfeC
	ylvhBr/z9bc6iih/C3isavXj38T5G5sTlViKMxINtZiLihMBQRMsGzsCAAA=

Comment 3 Heiko W. Rupp 2011-07-22 10:05:29 UTC
Elias,
great.

This is 826233b0e8fc6 in master - thanks a lot.

Comment 4 Mike Foley 2011-07-29 19:32:36 UTC
marking this closed/upstream.  this plugin is not part of jon.