Bug 1244905 - Scheduling of C&U Perf Collections exceeds 10 minutes causing indeterminate number of elements to not have C&U collections performed
Summary: Scheduling of C&U Perf Collections exceeds 10 minutes causing indeterminate n...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Performance
Version: 5.3.0
Hardware: x86_64
OS: Linux
high
high
Target Milestone: GA
: 5.7.0
Assignee: Keenan Brock
QA Contact: Pradeep Kumar Surisetty
URL:
Whiteboard: c&u:perf
Depends On:
Blocks: 1290080 1346988
TreeView+ depends on / blocked
 
Reported: 2015-07-20 18:07 UTC by Thomas Hennessy
Modified: 2019-08-06 20:04 UTC (History)
11 users (show)

Fixed In Version: 5.7.0.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1290080 1346988 (view as bug list)
Environment:
Last Closed: 2016-12-12 13:32:26 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
zip of evm.log for pid 30952 in which the C&U collection scheduling times out (5.99 MB, application/zip)
2015-07-20 18:07 UTC, Thomas Hennessy
no flags Details
one page pdf charting number of vms in since VC environment for two weeks before problem reported (331.54 KB, application/pdf)
2015-07-21 13:38 UTC, Thomas Hennessy
no flags Details

Description Thomas Hennessy 2015-07-20 18:07:39 UTC
Created attachment 1054009 [details]
zip of evm.log for pid 30952 in which the C&U collection scheduling times out

Description of problem: the message for the command 'Metric::Capture.perf_capture_timer' exceeds the scheduled 10 minute timeout period preventing C&U collections from being scheduled into miq_queue table


Version-Release number of selected component (if applicable):
Version: 5.2.5.3
Build:   20140811145418_a290c21

How reproducible: 


Steps to Reproduce:
1. create a vmware provider with 5400 vm instances
2. turn on C&U
3. monitor the time that the specified command takes to complete.

Actual results: the message encounters a timeout at about 600 seconds (10 minutes)


Expected results: all elements included in the message ought to be scheduled for C&U collections and the message should complete with a status of 'ok'


Additional info:
a sample of the process executing the scheduling of collections is provided (pid 30952) .  In this example about 1900 C&U Perf Collections were scheduled before the timeout value as reached.   There are about 5500 VMs in the vSphere in this customers environment.

Comment 2 Thomas Hennessy 2015-07-21 13:38:58 UTC
Created attachment 1054403 [details]
one page pdf charting number of vms in since VC environment for two weeks before problem reported

Comment 5 Keenan Brock 2016-09-06 13:58:11 UTC
There are ~10 PRs for this. Not sure I could track them down

2 of interest:

https://github.com/ManageIQ/manageiq/pull/9766
https://github.com/ManageIQ/manageiq/pull/9611
https://github.com/ManageIQ/manageiq/pull/9477

I have been given feedback by Alex Krzos that this has been fixed on 5.6.x (and master)

Comment 6 Archit Sharma 2016-12-12 13:15:53 UTC
single db appliance, 4 worker appliances, 0 recycled workers. 5k online vms from Vmware provider.
=======================================================================

showing output from log file: ~/perf_capture_timer_last_avg.log:

This reading was recorded on: 2016-12-12 08:12:08,268
...Reading logs from /var/www/miq/vmdb/log/evm.log
      and extracting last 100 events.
      Begin time: 2016-12-12T07:34
      End time: 2016-12-12T08:04
10 IDs found..

#	 | Timestamp			 | IDs | Metric::Capture.perf_capture_timer
----------------------------------------------------------------------
0	 | Mon Dec 12 07:34:52 EST 2016	 | 8667 | 20.486461867
1	 | Mon Dec 12 07:38:00 EST 2016	 | 10880 | 27.291034255
2	 | Mon Dec 12 07:41:09 EST 2016	 | 29407 | 32.840495136
3	 | Mon Dec 12 07:44:04 EST 2016	 | 37936 | 31.357679297
4	 | Mon Dec 12 07:47:00 EST 2016	 | 45778 | 27.009139387
5	 | Mon Dec 12 07:49:56 EST 2016	 | 53909 | 23.166915821
6	 | Mon Dec 12 07:52:54 EST 2016	 | 55273 | 19.152661771
7	 | Mon Dec 12 07:55:53 EST 2016	 | 56148 | 21.572165016
8	 | Mon Dec 12 07:58:56 EST 2016	 | 57016 | 21.522156136
9	 | Mon Dec 12 08:01:54 EST 2016	 | 57898 | 22.586555143
----------------------------------------------------------------------
▁▅█▇▅▃▁▂▂▂
Average: 24.69850000000000000000
----------------------------------------------------------------------
10 events were queued, out of which 10 events were delivered with [ok] status...


Note You need to log in before you can comment on or make changes to this bug.