Bug 1230411 - Storage node results in 1000s of configuration changes filling up the database
Status: CLOSED ERRATA
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Plugin -- Other, Storage Node
Version: JON 3.3.2
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ER02
Target Release: JON 3.3.4
Assignee: Libor Zoubek
QA Contact: Sunil Kondkar
 
Reported: 2015-06-10 20:29 UTC by dsteigne
Modified: 2019-08-15 04:42 UTC (History)

Doc Type: Bug Fix
Last Closed: 2015-10-28 14:36:39 UTC
Type: Bug


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 1478383 | 0 | None | None | None | Never
Red Hat Product Errata | RHSA-2015:1947 | 0 | normal | SHIPPED_LIVE | Important: Red Hat JBoss Operations Network 3.3.4 update | 2015-10-28 18:36:15 UTC

Description dsteigne 2015-06-10 20:29:38 UTC
Description of problem:
Table RHQ_CONFIG_PROPERTY is not purged; see upstream: https://bugzilla.redhat.com/show_bug.cgi?id=1208510

Looking at the Configuration history report, a lot of items are listed for the Storage Node. Each line in the report adds a few hundred rows to the rhq_config_property table, and nothing is purged.

Version-Release number of selected component (if applicable):
3.3.2

How reproducible:
always


Comment 1 Larry O'Leary 2015-06-11 15:09:46 UTC
To be clear, the issue here is the configuration properties that exist in the storage node plug-in. For example, for whatever reason, we include the tokens assigned to the node as a configuration property. This property changes constantly. This results in configuration change detection which results in hundreds of rows of data being added to the RHQ_CONFIG_PROPERTY table.

It is not clear why this is a configuration property. It is read-only. Therefore, it is not a configuration option. Perhaps this should be a trait. There may be other properties that cause this behavior too. From the looks of it, most of the configuration is read-only. Therefore, it is not configuration.
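A minimal sketch of the effect described in this comment, under the assumption that each detected change stores a full copy of the resource's configuration as a new revision. The `ConfigHistory` class and the property names are hypothetical and are not RHQ's schema or API; they only illustrate why one constantly changing read-only value multiplies rows.

```python
class ConfigHistory:
    """Toy stand-in for a configuration history table (not RHQ's real schema)."""

    def __init__(self):
        self.rows = []      # one (revision, property, value) tuple per stored property
        self.revision = 0
        self.last = None    # most recently stored configuration

    def report(self, config):
        """Store a full copy of config as a new revision if anything changed."""
        if config == self.last:
            return False
        self.revision += 1
        for name, value in sorted(config.items()):
            self.rows.append((self.revision, name, value))
        self.last = dict(config)
        return True


def rows_after(token_samples, tokens_in_config):
    """Count stored rows after one change-detection pass per token sample."""
    history = ConfigHistory()
    for tokens in token_samples:
        # Hypothetical property names; only "tokens" changes between passes.
        config = {"cluster_name": "rhq", "partitioner": "Murmur3Partitioner"}
        if tokens_in_config:
            config["tokens"] = tokens
        history.report(config)
    return len(history.rows)


samples = ["t1", "t2", "t3"]
print(rows_after(samples, tokens_in_config=True))   # every pass stores a new revision
print(rows_after(samples, tokens_in_config=False))  # only the first pass stores rows
```

With the volatile value kept in the configuration, every pass writes all properties again; without it, only the first pass writes anything.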

Comment 2 dsteigne 2015-06-12 15:52:11 UTC
After having the customer look at their "Recent Operations", we found they were running snapshots of their 3 storage nodes every hour. This is what was causing so many rows in the rhq_config_property table.

Comment 3 Larry O'Leary 2015-07-09 19:25:09 UTC
To be clear, comment 2 only highlights one way of triggering this bug. In this case, the hourly snapshot was causing a flush, which in turn resulted in the configuration changes. The run state of the node should not alter the configuration. Configuration covers things that can be changed, tweaked, or "configured", while tokens, state, status, number of loaded classes, etc. are all "states" that can be treated either as a numeric measurement or as an identifiable trait.

Comment 4 Libor Zoubek 2015-09-18 13:26:50 UTC
I tried to reproduce this issue and I agree with Larry that we must move some configuration properties to traits.

I identified the suspect resource type as Storage Service.

Config Properties: Load Map, Ownership, Token to Endpoint Map.

Load Map - a Map<storage host, amount of stored data>. Changes *all the time* because we keep storing new metrics over time.
 - I'll remove this config property and create a "Load" trait which will only output the Load value for the current host.

Ownership - usually changes after snapshot/repair
Token to Endpoint Map - changes after snapshot/repair. I'm not sure what this is good for; maybe it can be removed. I don't see a good way of representing this as a trait metric.
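A rough sketch of the "Load" trait idea proposed in this comment: instead of exposing the whole map, report only the entry for the current host. The function name, host addresses, and values are hypothetical illustrations, not plugin code.

```python
def load_trait(load_map, current_host):
    """Return the load value for current_host only, or None if it is not in the map."""
    return load_map.get(current_host)


def trait_changed(old_map, new_map, current_host):
    """True only when this host's own value changed, not when other hosts' values did."""
    return load_trait(old_map, current_host) != load_trait(new_map, current_host)


# Hypothetical data: host address -> amount of stored data, as seen from one node.
before = {"10.0.0.1": "1.2 GB", "10.0.0.2": "1.3 GB", "10.0.0.3": "1.1 GB"}
after = {"10.0.0.1": "1.2 GB", "10.0.0.2": "1.4 GB", "10.0.0.3": "1.1 GB"}

print(load_trait(after, "10.0.0.1"))
print(trait_changed(before, after, "10.0.0.1"))  # another host grew, this one did not
print(before != after)                           # the full map still counts as changed
```

The trade-off is that each node then reports only its own value, so data about other hosts has to be read from those hosts' resources.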

Comment 5 Libor Zoubek 2015-09-18 14:19:01 UTC
Also, MessagingService contains lots of read-only config properties (usually counters). I'll convert those config properties to metrics.

Comment 6 Larry O'Leary 2015-09-18 14:50:55 UTC
(In reply to Libor Zoubek from comment #4)
> Ownership - usually changes after snapshot/repair
> Token to Endpoint Map - changes after snapshot/repair - not sure what is
> this good for, maybe it can be removed. I don't see a good way of
> representing this as a trait metric.

I am not sure what this data is good for either. Perhaps John S can comment? Moving this into operations could be a better solution than a trait. Of course, that is assuming that the return data will not exceed 2000 characters.

Comment 7 Libor Zoubek 2015-09-21 07:41:14 UTC
The 2000-character limit applies only to a simple operation result (where we store everything in one text field). The result of this operation would be a list of small maps, so I think we're OK here.
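To illustrate the distinction, here is a small sketch that assumes the limit applies per text value: a flattened map easily exceeds 2000 characters, while the same data as a list of small maps keeps every entry short. The helper names and the token data are made up for illustration and do not reflect the operation's actual result format.

```python
SIMPLE_RESULT_LIMIT = 2000  # character limit mentioned in the comments above


def fits_simple_result(text, limit=SIMPLE_RESULT_LIMIT):
    """True if text would fit into a single limited text field."""
    return len(text) <= limit


def as_structured_result(token_to_endpoint):
    """Turn the map into a list of small maps, one per token."""
    return [
        {"token": token, "endpoint": endpoint}
        for token, endpoint in sorted(token_to_endpoint.items())
    ]


# 256 hypothetical tokens spread over three hypothetical endpoints.
token_to_endpoint = {str(i * 10**15): "10.0.0.%d" % (i % 3 + 1) for i in range(256)}

flat = ", ".join("%s=%s" % item for item in sorted(token_to_endpoint.items()))
structured = as_structured_result(token_to_endpoint)

print(fits_simple_result(flat))
print(all(fits_simple_result(str(entry)) for entry in structured))
```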

Comment 8 Libor Zoubek 2015-09-30 13:43:32 UTC
branch:  master
link:    https://github.com/rhq-project/rhq/commit/6ad035892
time:    2015-09-30 15:39:47 +0200
commit:  6ad035892c83059494469072415259e180df1ecc
author:  Libor Zoubek - lzoubek
message: Bug 1230411 - Storage node results in 1000s of configuration changes
         filling up the database

         Several Configuration properties were transformed to metrics or
         operations.

         StorageService resourceType: LoadMap and Ownership config
         properties were removed (metrics already existed) we're
         "losing" information about other nodes (which can be obtained
         on those node resources). TokenToEndpointMap property was
         removed and transformed to operation called "View Token To Map
         Endpoint"

         MessagingService resourceType: DroppedMessages and
         RecentlyDroppedMessages transformed to operations, all other
         config properties transformed to metrics (no properties left)

Comment 9 Libor Zoubek 2015-09-30 17:34:16 UTC
branch:  release/jon3.3.x
link:    https://github.com/rhq-project/rhq/commit/59be2e606
time:    2015-09-30 19:33:16 +0200
commit:  59be2e6068a17799fb430332a2285222ff83bd36
author:  Libor Zoubek - lzoubek
message: Bug 1230411 - Storage node results in 1000s of configuration changes
         filling up the database

         Several Configuration properties were transformed to metrics or
         operations.

         StorageService resourceType: LoadMap and Ownership config
         properties were removed (metrics already existed) we're
         "losing" information about other nodes (which can be obtained
         on those node resources). TokenToEndpointMap property was
         removed and transformed to operation called "View Token To Map
         Endpoint"

         MessagingService resourceType: DroppedMessages and
         RecentlyDroppedMessages transformed to operations, all other
         config properties transformed to metrics (no properties left)

         (cherry picked from commit
         6ad035892c83059494469072415259e180df1ecc) Signed-off-by: Libor
         Zoubek <lzoubek>

Comment 10 Simeon Pinder 2015-10-09 04:40:22 UTC
Moving to ON_QA as available to test with the following build:
https://brewweb.devel.redhat.com/buildinfo?buildID=460382

 *Note: jon-server-patch-3.3.0.GA.zip maps to ER01 build of
 jon-server-3.3.0.GA-update-04.zip.

Comment 11 Simeon Pinder 2015-10-15 05:18:00 UTC
Moving target milestone to ER02 to retest after latest Cassandra changes.

Comment 12 Simeon Pinder 2015-10-15 05:22:37 UTC
Moving to ON_QA as available to test with the following build:
https://brewweb.devel.redhat.com//buildinfo?buildID=461043

 *Note: jon-server-patch-3.3.0.GA.zip maps to ER02 build of
 jon-server-3.3.0.GA-update-04.zip.

Comment 13 Sunil Kondkar 2015-10-19 16:03:08 UTC
Tested applying the patch to JBoss ON 3.3GA both before and after installation.

1)  When the patch is applied in advance:

Storage Node->Database Management Services->Storage Services does not have the configs 'LoadMap', 'Ownership' and 'Token To Endpoint Map'. Verified that 'Token To Endpoint Map' is now an operation "View Token To Endpoint Map".
 
Storage Node->Network Services->Messaging service does not have any configurations.
The previous configs 'Command Pending Tasks', 'Command Completed Tasks', 'Response Pending Tasks', 'Response Completed Tasks', 'Timeouts Per Host', 'Recent Timeouts Per Host' are now listed as metrics,

and the previous configs 'Dropped Messages' and 'Recently Dropped Messages' are now operations "List Dropped Messages" and "List recently Dropped Messages".

Verified that below operations are working: 
View Token To Endpoint Map
List Dropped Messages
List recently Dropped Messages

The operation 'Take Snapshot' results in an increase of around 10 rows in the rhq_config_property table.
---------------------------------------------------------------------------------

2) Applying patch to already installed JBoss ON 3.3GA:

When applying the patch to an already installed JBoss ON 3.3GA, I see the same result as described in Bug 1272528:
the storage node availability is down in the UI.

Steps to reproduce:
- Install and start JBoss ON 3.3GA
- Stop with rhqctl stop
- apply the ER02 patch (3.3.0.GA Update 04)
- Merge the 'cassandra-jvm.properties.new' with 'cassandra-jvm.properties'.

cp ./jon-server-3.3.0.GA/rhq-storage/conf/cassandra-jvm.properties.new ./jon-server-3.3.0.GA/rhq-storage/conf/cassandra-jvm.properties

- rhqctl start
- The storage node availability is down in the UI.

Comment 14 Sunil Kondkar 2015-10-19 16:13:26 UTC
Marking as ASSIGNED because of the storage node availability issue seen when applying the patch to an already installed JBoss ON 3.3GA.

Comment 16 Sunil Kondkar 2015-10-20 15:46:16 UTC
Thanks, Simeon. It works after running the operation 'Update all Plugins' at the agent, or 'Update Plugins on Agents' in administration page->agent plugins.
The storage node is up and the test passes.

Did some more testing as below:

1) Patch applied on JBoss ON 3.3GA which is installed but not started.

- ./rhqctl install
- Apply 3.3.0.GA Update 04
- Replace the JMX_OPTS value in cassandra-jvm.properties with JMX_OPTS="-Dcassandra.jmx.local.port=${jmx_port}"
  (the value of JMX_OPTS in cassandra-jvm.properties.new is JMX_OPTS="-Dcassandra.jmx.local.port=${jmx_port}")
- ./rhqctl start

The storage node is up and the test passes.

2) Patch applied on JBoss ON 3.3GA which is installed, started and then stopped.

- ./rhqctl install
- ./rhqctl start
- Import the resources in inventory
- ./rhqctl stop
- Apply 3.3.0.GA Update 04
- Replace the JMX_OPTS value in cassandra-jvm.properties with JMX_OPTS="-Dcassandra.jmx.local.port=${jmx_port}"
  (the value of JMX_OPTS in cassandra-jvm.properties.new is JMX_OPTS="-Dcassandra.jmx.local.port=${jmx_port}")
- ./rhqctl start
- The storage node availability was down
- Execute the operation 'Update all Plugins' at the agent, or 'Update Plugins on Agents' in administration page->agent plugins.

The storage node is up and the test passes.
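The JMX_OPTS replacement step used in both scenarios could be scripted roughly as follows. This is only a sketch: it assumes JMX_OPTS sits on a single line of cassandra-jvm.properties, and the helper name and the sample file content are hypothetical. Back up the real file before trying anything like this.

```python
def set_jmx_opts(text, value):
    """Return text with the JMX_OPTS line replaced (or appended if it is missing).

    Assumes the JMX_OPTS assignment occupies exactly one line.
    """
    new_line = 'JMX_OPTS="%s"' % value
    lines = text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if line.strip().startswith("JMX_OPTS="):
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical file content, for illustration only.
sample = 'heap_min=-Xms512M\nJMX_OPTS="-Dsome.old.option=1"\n'
print(set_jmx_opts(sample, "-Dcassandra.jmx.local.port=${jmx_port}"))
```

To apply it to a real installation, read the file's text, pass it through `set_jmx_opts`, and write the result back while the server is stopped.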

Comment 18 errata-xmlrpc 2015-10-28 14:36:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1947.html

