Bug 1537733 - Region size of 10,000 Objects Supportable for VMware Provider
Summary: Region size of 10,000 Objects Supportable for VMware Provider
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.8.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: GA
Target Release: 5.10.0
Assignee: Gregg Tanzillo
QA Contact: Tasos Papaioannou
URL:
Whiteboard:
Duplicates: 1547161
Depends On:
Blocks: 1553472 1553473
 
Reported: 2018-01-23 18:56 UTC by myoder
Modified: 2021-06-10 14:19 UTC
CC: 16 users

Fixed In Version: 5.10.0.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1553472 1553473
Environment:
Last Closed: 2019-02-11 14:06:31 UTC
Category: Bug
Cloudforms Team: VMware
Target Upstream Version:
Embargoed:



Comment 8 Keenan Brock 2018-02-09 21:22:28 UTC
Michael,

I see two problems: metrics (not metrics_rollups) and vim_performance_states.

vim_performance_states purging was just added to master, so that would explain it being too big.

commit 84ca04ce3c833cd94896ece5f6db662b4bb494ab
Author: Nick Carboni <ncarboni>
Date:   Fri Jan 5 11:28:38 2018 -0500

    Purge orphans from VimPerformanceState
    
    This uses the new "orphaned" purging mode to remove rows from
    vim_performance_states which have dangling pointers
    
    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1434918
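
To make "dangling pointers" concrete, here is a rough SQL illustration of what the orphan purge cleans up, assuming vim_performance_states uses the usual polymorphic resource_type/resource_id columns; this query is mine, not taken from the fix:

-- count vim_performance_states rows pointing at VMs that no longer exist
select count(*)
from vim_performance_states vps
left join vms v on vps.resource_type = 'VmOrTemplate' and vps.resource_id = v.id
where vps.resource_type = 'VmOrTemplate' and v.id is null;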


metrics purging is not as efficient as it could be, which would explain it timing out.
Once it starts timing out, it is difficult to get out of that problem.
I'll look into this.

Anything else you want to share to steer me in the right direction?

Thanks,
Keenan

Comment 9 Keenan Brock 2018-02-09 22:06:37 UTC
I stand corrected: metric_rollups is the less efficient one, and that one seems to be working for you.

metrics, on the other hand, is tuned pretty well.
I'm guessing table contention is the issue, so the batch size is probably your best bet.

That is controlled by performance.history.purge_window_size.
It defaults to 1000, which seems low; I'd change that to 10k.

I had expected that the default would be too high for you, causing table contention,
but a value that low will require too many updates. A bigger batch size should speed that up.
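
For anyone wanting to apply this, a minimal sketch of the corresponding Advanced Settings entry, assuming the stock nesting of the :performance section (the exact layout may differ by release):

:performance:
  :history:
    :purge_window_size: 10000  # default is 1000; larger batches mean fewer update round trips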

Comment 12 Keenan Brock 2018-02-15 21:40:24 UTC
Question:

Have you tried vacuuming the database?

Reasoning:

I went in to take a look at the tables and noticed that it took 14 seconds to do a count of just one of them.

explain analyze select count(*) from metrics_21;
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=496204.62..496204.64 rows=1 width=0) (actual time=14397.077..14397.077 rows=1 loops=1)
   ->  Seq Scan on metrics_21  (cost=0.00..480741.90 rows=6185090 width=0) (actual time=9.232..13290.674 rows=6185090 loops=1)
 Planning time: 0.782 ms
 Execution time: 14397.241 ms


The explain output for the truncate statement looks very large; I will look into reducing that.
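
For reference, a manual vacuum of the table in question can be run from psql; this is generic PostgreSQL, not a command taken from this bug:

vacuum (verbose, analyze) metrics_21;

-- then check dead-tuple counts to see whether the vacuum reclaimed anything
select relname, n_live_tup, n_dead_tup from pg_stat_user_tables where relname = 'metrics_21';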

Comment 28 Satoe Imaishi 2018-03-09 14:07:15 UTC
https://github.com/ManageIQ/manageiq/pull/17017

Comment 30 Marianne Feifer 2018-04-17 20:26:17 UTC
*** Bug 1547161 has been marked as a duplicate of this bug. ***

Comment 32 Tasos Papaioannou 2018-11-02 17:43:06 UTC
Verified on 5.10.0.22.

