Bug 1244905
| Summary: | Scheduling of C&U Perf Collections exceeds 10 minutes causing indeterminate number of elements to not have C&U collections performed | |||
|---|---|---|---|---|
| Product: | Red Hat CloudForms Management Engine | Reporter: | Thomas Hennessy <thenness> | |
| Component: | Performance | Assignee: | Keenan Brock <kbrock> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Pradeep Kumar Surisetty <psuriset> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 5.3.0 | CC: | arcsharm, cpelland, dmetzger, jdeubel, jhardy, jocarter, jprause, kbrock, mfeifer, obarenbo, simaishi | |
| Target Milestone: | GA | Keywords: | TestOnly, ZStream | |
| Target Release: | 5.7.0 | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | c&u:perf | |||
| Fixed In Version: | 5.7.0.0 | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1290080 1346988 (view as bug list) | Environment: | ||
| Last Closed: | 2016-12-12 13:32:26 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1290080, 1346988 | |||
| Attachments: | ||||
Created attachment 1054403 [details]
one page pdf charting number of vms in since VC environment for two weeks before problem reported
There are ~10 PRs for this. Not sure I could track them down 2 of interest: https://github.com/ManageIQ/manageiq/pull/9766 https://github.com/ManageIQ/manageiq/pull/9611 https://github.com/ManageIQ/manageiq/pull/9477 I have been given feedback by Alex Krzos that this has been fixed on 5.6.x (and master) single db appliance, 4 worker appliances, 0 recycled workers. 5k online vms from Vmware provider.
=======================================================================
showing output from log file: ~/perf_capture_timer_last_avg.log:
This reading was recorded on: 2016-12-12 08:12:08,268
...Reading logs from /var/www/miq/vmdb/log/evm.log
and extracting last 100 events.
Begin time: 2016-12-12T07:34
End time: 2016-12-12T08:04
10 IDs found..
# | Timestamp | IDs | Metric::Capture.perf_capture_timer
----------------------------------------------------------------------
0 | Mon Dec 12 07:34:52 EST 2016 | 8667 | 20.486461867
1 | Mon Dec 12 07:38:00 EST 2016 | 10880 | 27.291034255
2 | Mon Dec 12 07:41:09 EST 2016 | 29407 | 32.840495136
3 | Mon Dec 12 07:44:04 EST 2016 | 37936 | 31.357679297
4 | Mon Dec 12 07:47:00 EST 2016 | 45778 | 27.009139387
5 | Mon Dec 12 07:49:56 EST 2016 | 53909 | 23.166915821
6 | Mon Dec 12 07:52:54 EST 2016 | 55273 | 19.152661771
7 | Mon Dec 12 07:55:53 EST 2016 | 56148 | 21.572165016
8 | Mon Dec 12 07:58:56 EST 2016 | 57016 | 21.522156136
9 | Mon Dec 12 08:01:54 EST 2016 | 57898 | 22.586555143
----------------------------------------------------------------------
▁▅█▇▅▃▁▂▂▂
Average: 24.69850000000000000000
----------------------------------------------------------------------
10 events were queued, out of which 10 events were delivered with [ok] status...
|
Created attachment 1054009 [details] zip of evm.log for pid 30952 in which the C&U collection scheduling times out Description of problem: the message for the command 'Metric::Capture.perf_capture_timer' exceeds the scheduled 10 minute timeout period preventing C&U collections from being scheduled into miq_queue table Version-Release number of selected component (if applicable): Version: 5.2.5.3 Build: 20140811145418_a290c21 How reproducible: Steps to Reproduce: 1. create a vmware provider with 5400 vm instances 2. turn on C&U 3. monitor the time that the specified command takes to complete. Actual results: the message encounters a timeout at about 600 seconds (10 minutes) Expected results: all elements included in the message ought to be scheduled for C&U collections and the message should complete with a status of 'ok' Additional info: a sample of the process executing the scheduling of collections is provided (pid 30952) . In this example about 1900 C&U Perf Collections were scheduled before the timeout value as reached. There are about 5500 VMs in the vSphere in this customers environment.