Bug 1115725 - Operation history continues to grow unbound with no easy way to purge old history
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Database, Operations
Version: JON 3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: DR01
Target Release: JON 3.3.0
Assignee: Jay Shaughnessy
QA Contact: Mike Foley
URL:
Whiteboard:
Keywords:
Depends On:
Blocks:
Reported: 2014-07-02 23:18 UTC by Larry O'Leary
Modified: 2018-12-06 17:09 UTC

It is now possible to purge operation history to improve database performance and reduce table space issues. This feature adds an operation history purge to the data purge job. A new system setting called "Delete Operation History Older Than" is added under Administration > Configuration > System Settings > Data Manager Configuration Properties. The default for this system setting is 0 days, which means disabled. The db-upgrade adds the new system setting (also set to disabled) so that upgrades do not automatically force an unexpected purge of operation history. Previously, auto-purge and retention configuration was available for alert history and event history, but operation history was excluded. The only existing option, going into each resource individually to track down operation history, was unacceptable. Customers can now better manage operation history in JBoss ON.
Clone Of:
Last Closed: 2014-12-11 14:01:07 UTC



Description Larry O'Leary 2014-07-02 23:18:07 UTC
Description of problem:
Each time an operation is invoked, the invocation information and result are stored as operation history. This history includes the result or error message and the final execution state of the operation. This information is very useful, but over time it brings the database to its knees.

We provide auto-purge and retention configuration for alert history and event history. But operation history does not seem to be treated the same way.

The existing option of going into each resource individually to track down operation history is unacceptable. Therefore we either need to purge this data as part of the normal data purge jobs and provide configurable retention options or we need a single operation history page that can provide operation history for all resources and filters so that history can be purged by date/time, resource, resource type, etc.

Version-Release number of selected component (if applicable):
3.2.0.GA

How reproducible:
Always

Steps to Reproduce:
1. Invoke the platform _get process list_ operation on a schedule of once every minute for a year.
2. Wait a year.
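
For a rough sense of scale, the steps above can be turned into a back-of-the-envelope row count. This is only a sketch: it assumes each invocation stores exactly one history row, which the report does not state, and the class and method names are hypothetical.

```java
// Back-of-the-envelope estimate; assumes one history row per invocation (an assumption).
final class OperationHistoryGrowth {

    /** Rows accumulated by a schedule firing perMinute times a minute for the given number of days. */
    static long rowsAfter(long perMinute, long days) {
        final long minutesPerDay = 60L * 24L;
        return perMinute * minutesPerDay * days;
    }

    public static void main(String[] args) {
        // One invocation per minute for a 365-day year, as in the steps above.
        System.out.println(rowsAfter(1, 365)); // prints 525600
    }
}
```

That is the count for a single scheduled operation on a single resource; several such schedules would multiply it accordingly.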

Actual results:
Operation history database table becomes very large and slow to respond. Data is kept forever.

Expected results:
Data is purged after a reasonable amount of time (default 90 days perhaps?)

Additional info:
This impacts performance of the database and leads to table space issues. Additionally, this is a huge usability issue, as there is no way to know which resources have operation history. Where do you even begin if you are willing to spend hours manually tracking down each resource in inventory that has had an operation invoked on it? Which resource types even support operations?

Therefore, two options seem reasonable:

 1) Provide a purge schedule, just like for other data such as drift, call time, events, alerts, etc.

 2) Provide a new UI widget that allows mass acknowledgment of operation status and history, including filtering based on type, date/time, status, result, and resource. This would provide the ability to delete/purge operation history in addition to seeing all operation failures in one place.

Comment 1 Jay Shaughnessy 2014-07-03 20:07:25 UTC
Looking at this to see if I can understand why we don't have an analogous purge here...

Comment 2 Jay Shaughnessy 2014-07-09 21:50:29 UTC
The consensus was that operation history amounts to auditing data, and automatically purging audit data seemed like a bad idea.  Also, it seemed unlikely that the operation history would grow large.

But many operations don't perform actions that really merit much in the way of auditing, and infinite growth can certainly always cause issues.

The proposal is to add a purge but to have it disabled by default.  Or to use a large initial purge age, like 1 year.

There is also the operation report view, which does allow a date range filter and a delete button.  It would be unwieldy for large removals but does exist today as a way to see/delete operation history cross-resource. A filter on operationName could be useful here.

Comment 3 Jay Shaughnessy 2014-07-16 16:30:47 UTC
master commit 37cd10a7edc8e50072dfd7801ab44562b3f0b402
Author: Jay Shaughnessy <jshaughn@redhat.com>
Date:   Wed Jul 16 12:27:48 2014 -0400

    Add operation history purge to the data purge job.  A new system setting
    is added with a default of 0 days, which means disabled. The db-upgrade
    will add the new setting, also set to 0=disabled so that upgrades don't
    automatically force an unexpected purge of operation history.
    Also:
    - cleaned up some I18N when adding the new properties, removing some
      duplicates, adding commented, missing translations, etc. Some touched
      files were only due to some removals of duplicate I18N properties.
    - added DatabaseType.getLimitClause() although in the end I didn't use it.
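
The "0 days means disabled" rule described in the commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: the class and method names are hypothetical, and treating negative values as disabled is an added assumption.

```java
import java.util.OptionalLong;
import java.util.concurrent.TimeUnit;

// Illustrative only: hypothetical names, not the code from the commit above.
final class OperationHistoryPurgeSketch {

    /**
     * Returns the timestamp before which history would be purged, or empty
     * when retentionDays is 0, which means the purge is disabled.
     * Negative values are also treated as disabled (an assumption).
     */
    static OptionalLong purgeCutoffMillis(int retentionDays, long nowMillis) {
        if (retentionDays <= 0) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(nowMillis - TimeUnit.DAYS.toMillis(retentionDays));
    }

    /** True if a history entry created at createdMillis is older than the cutoff. */
    static boolean shouldPurge(long createdMillis, int retentionDays, long nowMillis) {
        OptionalLong cutoff = purgeCutoffMillis(retentionDays, nowMillis);
        return cutoff.isPresent() && createdMillis < cutoff.getAsLong();
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long hundredDaysAgo = now - TimeUnit.DAYS.toMillis(100);
        System.out.println(shouldPurge(hundredDaysAgo, 0, now));  // false: purge disabled
        System.out.println(shouldPurge(hundredDaysAgo, 90, now)); // true: older than 90 days
    }
}
```

Keeping the disabled case as an empty cutoff, rather than a cutoff of "now", makes it hard for a setting of 0 to be misread as "purge everything", which matters for the upgrade scenario the commit message calls out.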

Comment 4 Thomas Segismont 2014-07-24 13:47:19 UTC
Additional commit in master

commit 584e51683a46cac10e417083a543b60b280b7b75
Author: Thomas Segismont <tsegismo@redhat.com>
Date:   Thu Jul 24 15:44:31 2014 +0200

Fix an issue on Oracle, no ID conversion needed, Hibernate does it

Also, some code cleanup

Comment 5 Thomas Segismont 2014-07-24 13:55:56 UTC
(In reply to Thomas Segismont from comment #4)
> Additional commit in master
> 
> commit 584e51683a46cac10e417083a543b60b280b7b75
> Author: Thomas Segismont <tsegismo@redhat.com>
> Date:   Thu Jul 24 15:44:31 2014 +0200

Cherry-picked over to release/jon3.3.x

commit c466c9e8910cb1d15132ce28d68f31632bb2b314
Author: Thomas Segismont <tsegismo@redhat.com>
Date:   Thu Jul 24 15:53:29 2014 +0200

Comment 6 Simeon Pinder 2014-07-31 15:51:57 UTC
Moving to ON_QA as available to test with brew build of DR01: https://brewweb.devel.redhat.com//buildinfo?buildID=373993

