Description of problem: C & U collection on the basis of tags was enabled, vim_performance_tag_values is growing too much. Version-Release number of selected component (if applicable): 5.7.3.2 How reproducible: In customer environment Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info: as an emergency measure customer switched off the C& U collector role to stop db growth. the vim_performance_tag_values table size is 100 GB now
Created attachment 1354213 [details] vim_performance_tag_values purge tool The table "vim_performance_tag_values" is used for grouping metrics by managed tag categories and tags for reporting purposes. The categories and tags that are rolled up to this tables is configurable for each category and is off by default for all categories. This table can grow very large very quickly if it is enabled for many categories and there are many objects for which metrics are being collected. The recommendation would be to disable this feature for any categories where is is not absolutely necessary. Disabling can be done under region level configuration managed tag categories UI by setting "Capture C & U Data" to false for each category where is has been enabled. After disabling all of the desired categories, the attached script should be run to purge the existing rolled up tag data for the categories that were disabled. This should reclaim the space in the DB. This is a replacement for the current file at /var/www/miq/vmdb/tools/purge_orphaned_tag_values.rb Running this file with no options will purge rows from the vim_performance_tag_values table which no longer have a corresponding metric. This new version adds an option to remove rows which map to a tag category which is no longer set to collect data from metrics. These rows could still be used in reports (as their metrics likely still exist), but if the C&U collection was enabled for many tags by mistake, this is how you can remove some of the data which would accumulate in this table. To delete both types of rows you can run the following from the vmdb directory: `tools/purge_orphaned_tag_values.rb --purge-disabled-tag-values`
On which appliance role should we executed this script? We have 3 appliances, console one for UI, CU one for CU, automation one for automation?
This should ideally be executed on your database owning appliance
script can't play: disk full /dev/mapper/VG_DATA-lv_data 176245 160230 16016 91% /data -> there is 16 GO free but it looks like it is not enough. [root@cplcfm-au-v01 vmdb]# ./tools/purge_orphaned_tag_values.rb --purge-disabled-tag-values Purging orphaned tag values... Finding known metric ids... /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/connection_adapters/postgresql_adapter.rb:598:in `async_exec': PG::DiskFull: ERROR: could not write block 2047859 of temporary file: No space left on device (ActiveRecord::StatementInvalid) : SELECT DISTINCT metric_type, metric_id FROM "vim_performance_tag_values" from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/connection_adapters/postgresql_adapter.rb:598:in `block in exec_no_cache' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/connection_adapters/abstract_adapter.rb:590:in `block in log' from /opt/rh/cfme-gemset/gems/activesupport-5.0.3/lib/active_support/notifications/instrumenter.rb:21:in `instrument' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/connection_adapters/abstract_adapter.rb:583:in `log' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/connection_adapters/postgresql_adapter.rb:598:in `exec_no_cache' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/connection_adapters/postgresql_adapter.rb:585:in `execute_and_clear' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/connection_adapters/postgresql/database_statements.rb:103:in `exec_query' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/connection_adapters/abstract/database_statements.rb:373:in `select' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/connection_adapters/abstract/database_statements.rb:41:in `select_all' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/connection_adapters/abstract/query_cache.rb:95:in `select_all' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/querying.rb:39:in `find_by_sql' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/relation.rb:702:in `exec_queries' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/relation.rb:583:in `load' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/relation.rb:260:in `records' from /opt/rh/cfme-gemset/gems/activerecord-5.0.3/lib/active_record/relation/delegation.rb:40:in `each' from ./tools/purge_orphaned_tag_values.rb:30:in `block in <main>' from /opt/rh/rh-ruby23/root/usr/share/ruby/benchmark.rb:308:in `realtime' from ./tools/purge_orphaned_tag_values.rb:29:in `<main>'
Not sure it is a good idea to select a 733782980 rows table without limit and put the results in a ruby hash: # Determine all of the known metric ids in the tag values table log("Finding known metric ids...") perf_ids = Hash.new { |h, k| h[k] = [] } # TODO: there is probably a way to do this without bringing the ids back t = Benchmark.realtime do VimPerformanceTagValue.select("metric_type, metric_id").distinct.order(nil).each { |v| perf_ids[v.metric_type] << v.metric_id } in postgres log: 2017-11-20 10:51:10.548 CET >STATEMENT: SELECT DISTINCT metric_type, metric_id FROM "vim_performance_tag_values"
It would be good to also check the status of the metrics purging to see if it's up to date. This can be done by using the following SQL query snd comparing the output to the currently configured metrics retention settings - On the DB appliance do psql -d vmdb_production select timestamp from metric_rollups order by timestamp asc limit 1;
This is the output of the query timestamp 2017-07-03 16:00:00
If I understand correctly, the customer has their metrics purge setting to 2.months. If that's the case then the purge has fallen behind. I think the first step would be to get the purging up to date. This can be done with the purge_metrics script in the tools directory on the appliance. Running the script as follows will give you the count of metrics to be purged - bin/rails r tools/purge_metrics.rb -- -h 2.months -d 2.months Running it with mode purge will actually do the purge bin/rails r tools/purge_metrics.rb -- -h 2.months -d 2.months -m purge
Preliminary PR to stop using vim_performance_tag_values to generate reports and charts - https://github.com/ManageIQ/manageiq/pull/16582 Another PR is needed to remove the code that generates this data during C&U processing.
PR removing code that generates vim_performance_tag_values - https://github.com/ManageIQ/manageiq/pull/16692
New commit detected on ManageIQ/manageiq/master: https://github.com/ManageIQ/manageiq/commit/d0bc3369c04d7196e97a54176505299109fb2107 commit d0bc3369c04d7196e97a54176505299109fb2107 Author: Gregg Tanzillo <gtanzill> AuthorDate: Tue Dec 19 15:36:36 2017 -0500 Commit: Gregg Tanzillo <gtanzill> CommitDate: Tue Jan 2 14:17:14 2018 -0500 Stop generating `vim_performance_tag_values` rows Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1510484 Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1514505 app/models/metric/ci_mixin/rollup.rb | 1 - app/models/storage.rb | 2 -- .../connection_adapters/miq_postgres_adapter.rb | 4 ---- .../connection_adapters/miq_postgres_legacy_adapter.rb | 4 ---- spec/models/vim_performance_tag_spec.rb | 2 -- tools/metrics_populate_retro_tags.rb | 13 ------------- 6 files changed, 26 deletions(-)