Bug 1488179
Summary: Heap Used metric disabled after automatic storage node re-inventory
Product: [JBoss] JBoss Operations Network
Component: Storage Node, Monitoring -- Other
Version: JON 3.3.8
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Target Milestone: CR01
Target Release: JON 3.3.10
Hardware: Unspecified
OS: Unspecified
Type: Bug
Keywords: Triaged
Reporter: Filip Brychta <fbrychta>
Assignee: Michael Burman <miburman>
QA Contact: Filip Brychta <fbrychta>
CC: mfoley, pyadav, spinder
Last Closed: 2018-02-16 03:16:33 UTC

Description
Filip Brychta
2017-09-04 14:55:48 UTC
Checked that the StorageNodeMaintenanceOperationsFailure alert definition is created correctly after automatic re-inventory.

It's because it is enabled by the AlertDefinitionServerPlugin, which creates the alerts. When it creates the alerts, it enables the necessary metrics. Since the templates are already in RHQ, there's no need to rerun this code and thus nothing enables them.

Another note: this is correctly enabled in the template when using master (RHQ).

On a standard installation (from the QE machines), the following RHQ CLI script makes everything work (this is what the plugin does):

    // Look up the "VM Memory System" resource type of the RHQStorage plugin,
    // including its metric definitions.
    var rt = new ResourceTypeCriteria()
    rt.addFilterPluginName("RHQStorage")
    rt.addFilterName("VM Memory System")
    rt.fetchMetricDefinitions(true)
    var defs = ResourceTypeManager.findResourceTypesByCriteria(rt)
    var rtVM = defs.get(0)

    // Collect the id of the Heap Used metric definition.
    var rtIds = []
    var ite = rtVM.getMetricDefinitions().iterator()
    while (ite.hasNext()) {
        var a = ite.next()
        if (a.name == "{HeapMemoryUsage.used}") {
            rtIds[0] = a.id
        }
    }

    // Enable the schedule for that metric on the resource type.
    MeasurementScheduleManager.enableSchedulesForResourceType(rtIds, false)

So it appears there's nothing wrong in the functions themselves in JON 3.3.9.

I guess I should've checked this before spending all this time tracking things down. I can't reproduce this bug on my own machine. If I unpack JON 3.3.0.GA, apply update 09 on top of that and install, it works correctly. Maybe there's something in the QE VM scripts, but clearly not in the default installation.

Back to this one. The main cause is the plugin update. When the plugin is updated, it restores the original settings (from the plugin). Our server plugin (alertdef-rhq) then has a setting that, by default, does not reset properties if the alerts were already defined. Fixing this requires either overriding any user-modified settings or making the storage node override the JMX plugin for these properties.
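To make the sequence described above easier to follow, here is a minimal, self-contained sketch of it in plain JavaScript. It is only a toy model, not RHQ code: `createServer`, `applyPluginUpdate`, `runAlertDefinitionPlugin`, the `forceMetricUpdate` flag and the alert name `HypotheticalHeapAlert` are all made up for illustration, and the model assumes that the plugin descriptor ships the Heap Used metric as disabled.

```javascript
// Toy model only: none of these names are RHQ or JON APIs.
var METRIC = "{HeapMemoryUsage.used}";

function createServer() {
  return {
    template: {},         // metric name -> enabled flag in the type template
    alertDefinitions: {}  // alert name -> true once the definition exists
  };
}

// Assumption: a plugin update restores the descriptor default,
// and that default has the Heap Used metric disabled.
function applyPluginUpdate(server) {
  server.template[METRIC] = false;
}

// Models the alert definition server plugin. The metric is enabled only
// when the alert definition is newly created, unless the (hypothetical)
// forceMetricUpdate flag asks for it to be enabled regardless.
function runAlertDefinitionPlugin(server, alertName, forceMetricUpdate) {
  var alreadyDefined = server.alertDefinitions[alertName] === true;
  if (!alreadyDefined) {
    server.alertDefinitions[alertName] = true;
  }
  if (!alreadyDefined || forceMetricUpdate) {
    server.template[METRIC] = true;
  }
  return server.template[METRIC];
}

var server = createServer();

// Fresh install: the alert is created and the metric gets enabled.
applyPluginUpdate(server);
console.log(runAlertDefinitionPlugin(server, "HypotheticalHeapAlert", false)); // true

// Plugin update followed by the default behaviour: the alert already
// exists, so nothing re-enables the metric.
applyPluginUpdate(server);
console.log(runAlertDefinitionPlugin(server, "HypotheticalHeapAlert", false)); // false

// Same state, but overriding the settings even though the alert exists.
console.log(runAlertDefinitionPlugin(server, "HypotheticalHeapAlert", true)); // true
```

The last call corresponds to the first remedy mentioned above (overriding settings even when the alerts are already defined). The other option, having the storage node override the JMX plugin for these properties, is not modelled here.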
Fixed in the master:

commit 0607f52e542de726cee5955b8c5a5e70b30a225e (HEAD -> master)
Author: Michael Burman <miburman>
Date: Wed Nov 22 17:34:32 2017 +0200

    [BZ 1488179] Force collection intervals to be updated even if alert definitions are not replaced

Moving to ON_QA as available for test with the latest build:

JON 3.3.10 DR01 artifacts are available for test from here:
http://download.eng.bos.redhat.com/brewroot/packages/org.jboss.on-jboss-on-parent/3.3.0.GA/164/maven/org/jboss/on/jon-server-patch/3.3.0.GA/jon-server-patch-3.3.0.GA.zip

*Note: jon-server-patch-3.3.0.GA.zip maps to DR01 build of jon-server-3.3.0.GA-update-10.zip.

https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=635136

Verified. All metrics enabled after re-inventory.

Created attachment 1376766
re-inventoried platform

I still see it in JON 3.3.10 -> moving back to assigned.

Prachi, could you please retest? In the attached screenshot in comment 11 I can see NaN values for the Heap Used metric. Is it the correct screenshot?

The fix should be in modules/org/rhq/server-startup/main/deployments/rhq.ear/rhq-serverplugins/serverplugin-alertdef-rhq-*.jar but this file is not part of update 10:

    find jon-server-3.3.0.GA-update-10/ -name '*serverplugin-alertdef-rhq*'
    jon-server-3.3.0.GA-update-10/jon-server-updates/docs/jon-licenses/org.rhq,serverplugin-alertdef-rhq,4.12.0.JON330GA,GNU Lesser General Public License v2 (or 2.1) or later.txt
    jon-server-3.3.0.GA-update-10/jon-server-updates/docs/jon-licenses/org.rhq,serverplugin-alertdef-rhq,4.12.0.JON330GA,GNU General Public License v2.0.txt

Simeon, could you please check why the fix did not make it to DR01?

I also noticed that the serverplugin-alertdef-rhq-4.12.0.JON330GA.jar file is in the .old dir:

    jon-server-3.3.0.GA/.patched/3.3.0.GA-update-10_01-09-18_09-48-35/.old/modules/org/rhq/server-startup/main/deployments/rhq.ear/rhq-serverplugins/serverplugin-alertdef-rhq-4.12.0.JON330GA.jar

Simeon, are you aware of the reason why it's on the remove list?

Moving to ON_QA.

JON 3.3.10 CR01 artifacts are available for test from here:
http://download.eng.bos.redhat.com/brewroot/packages/org.jboss.on-jboss-on-parent/3.3.0.GA/166/maven/org/jboss/on/jon-server-patch/3.3.0.GA/jon-server-patch-3.3.0.GA.zip

*Note: jon-server-patch-3.3.0.GA.zip maps to CR01 build of jon-server-3.3.0.GA-update-10.zip.

Verified in:
Version : 3.3.0.GA Update 10
Build Number : 800d329:2f3e0db

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0325