Bug 1301587 - [scale] - hosts initialization taking too long (with 500 fake hosts and 10K vms)
Status: NEW
Product: ovirt-engine
Classification: oVirt
Component: Backend.Core
Version: 3.6.2
Hardware: x86_64 Linux
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.2.0
Target Release: ---
Assigned To: Martin Perina
QA Contact: eberman
Keywords: Performance
Depends On: 1364791
Blocks:
Reported: 2016-01-25 07:56 EST by Eldad Marciano
Modified: 2017-06-20 10:41 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rule-engine: ovirt‑4.2?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?


Attachments
engine thread dumps (54.94 KB, application/zip)
2016-01-25 08:00 EST, Eldad Marciano

Description Eldad Marciano 2016-01-25 07:56:51 EST
Description of problem:
Host initialization takes too long once the engine is restarted.
We used a loaded engine with 500 fake hosts and 10K VMs.

The fake hosts are powered by ovirt-vdsmfake.

Their latency is low (about 1 sec), and vdsmfake supports multiple concurrent requests, so it is probably not the bottleneck.


Version-Release number of selected component (if applicable):
rhevm 3.6.2.0-1

How reproducible:
100%

Steps to Reproduce:
1. Restart a loaded engine.


Actual results:
hosts initialization takes too long (more than 30 min).

Expected results:
faster results.

Additional info:
Comment 1 Eldad Marciano 2016-01-25 08:00 EST
Created attachment 1117980
engine thread dumps
Comment 2 Yaniv Kaul 2016-01-25 08:12:12 EST
Eldad,
- Can you provide engine logs?
- What is 'long', in the sense that how many are initialized per minute? (is it linear, is it slowing down?). Is it still same for 100 hosts, for example?
- Does the number change depending on the number of VMs? Is it the same without the VMs?
- Have you seen any difference between fake and real hosts?

Lastly, rhevm 3.6.2.0-1 is a bit old. While I don't think there were critical changes in this area, the latest is rhevm-3.6.2.6-0.1
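
One way to answer the "how many per minute, is it linear or slowing down" question is to bucket the times at which hosts finished initializing. The following is only a minimal sketch: it assumes the timestamps have already been extracted from the engine log by some other means (the log format is not shown in this report), and the function name `hosts_per_minute` and the sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime


def hosts_per_minute(up_times):
    """Count hosts that finished initializing in each minute since the first event.

    up_times: list of datetime objects, one per host (assumed input,
    extracted from the engine log by whatever means is available).
    """
    if not up_times:
        return []
    start = min(up_times)
    # Minute offset of each event relative to the first one.
    buckets = Counter(int((t - start).total_seconds() // 60) for t in up_times)
    # Include empty minutes so a slowdown shows up as zeros or shrinking counts.
    return [buckets.get(m, 0) for m in range(max(buckets) + 1)]


# Hypothetical sample: three hosts in the first minute, then progressively fewer.
sample = [datetime(2016, 1, 25, 8, 0, s) for s in (0, 10, 50)]
sample += [datetime(2016, 1, 25, 8, 1, 30), datetime(2016, 1, 25, 8, 3, 5)]
print(hosts_per_minute(sample))  # [3, 1, 0, 1]
```

A roughly constant list would suggest linear behavior; shrinking counts would suggest that initialization slows down over time.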
Comment 3 Eldad Marciano 2016-01-27 06:57:27 EST
(In reply to Yaniv Kaul from comment #2)
> Eldad,
> - Can you provide engine logs?
Yes, I will.

> - What is 'long', in the sense that how many are initialized per minute? (is
> it linear, is it slowing down?). Is it still same for 100 hosts, for example?
I didn't test that. I noticed the problem when I restarted the engine, and initialization seems to be serial.

> - Does the number change depending on the number of VMs? Is it the same
> without the VMs?
didn't test it.
> - Have you seen any difference between fake and real hosts?
yes, we have 37 real hosts vs 500 fake hosts.

> 
> Lastly, rhevm 3.6.2.0-1 is a bit old. While I don't think there were
> critical changes in this area, the latest is rhevm-3.6.2.6-0.1
we'll upgrade it ASAP.
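
As a rough sanity check on the "seems like it's serial" observation, the arithmetic can be sketched as below. This is a back-of-the-envelope model, not a description of how the engine actually works: `calls_per_host` (the number of round trips needed to initialize one host) and `workers` are hypothetical parameters that this report does not state. Only the 500 hosts and the roughly 1 sec latency come from the description.

```python
import math


def estimated_init_seconds(hosts, calls_per_host, latency_s, workers=1):
    """Estimate wall-clock time when each worker initializes its hosts one by one.

    calls_per_host and workers are assumptions for illustration only.
    """
    hosts_per_worker = math.ceil(hosts / workers)
    return hosts_per_worker * calls_per_host * latency_s


# Fully serial, one 1-second call per host: 500 s, a little over 8 minutes.
print(estimated_init_seconds(500, 1, 1.0))
# Fully serial, four calls per host (hypothetical): 2000 s, about 33 minutes.
print(estimated_init_seconds(500, 4, 1.0))
# The same work spread over 50 concurrent workers (hypothetical): 40 s.
print(estimated_init_seconds(500, 4, 1.0, workers=50))
```

Under this model, a single 1-second call per host would not account for the 30+ minutes reported, so if initialization really is serial, several round trips per host or other per-host overhead would have to be involved.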
Comment 4 Oved Ourfali 2016-01-27 09:01:25 EST
(In reply to Eldad Marciano from comment #3)
> (In reply to Yaniv Kaul from comment #2)
> > Eldad,
> > - Can you provide engine logs?
> Yes, I will.
> 
> > - What is 'long', in the sense that how many are initialized per minute? (is
> > it linear, is it slowing down?). Is it still same for 100 hosts, for example?
> I didn't test that. I noticed the problem when I restarted the engine, and
> initialization seems to be serial.
> 

I'd be interested in knowing exactly how long it takes.
Comment 5 Oved Ourfali 2016-01-27 09:02:25 EST
Currently targeting 3.6.4, but at this scale we might address it only in 4.0.
Comment 6 Red Hat Bugzilla Rules Engine 2016-01-27 09:11:51 EST
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.
Comment 7 Sandro Bonazzola 2016-05-02 05:59:46 EDT
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has been already released and bug is not ON_QA.
Comment 8 Yaniv Lavi (Dary) 2016-05-23 09:16:12 EDT
oVirt 4.0 beta has been released, moving to RC milestone.
Comment 9 Yaniv Lavi (Dary) 2016-05-23 09:20:00 EDT
oVirt 4.0 beta has been released, moving to RC milestone.
Comment 11 Red Hat Bugzilla Rules Engine 2016-05-25 10:04:01 EDT
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.
Comment 12 Eldad Marciano 2016-08-16 10:29:35 EDT
(In reply to Oved Ourfali from comment #4)
> (In reply to Eldad Marciano from comment #3)
> > (In reply to Yaniv Kaul from comment #2)
> > > Eldad,
> > > - Can you provide engine logs?
> > Yes, I will.
> > 
> > > - What is 'long', in the sense that how many are initialized per minute? (is
> > > it linear, is it slowing down?). Is it still same for 100 hosts, for example?
> > I didn't test that. I noticed the problem when I restarted the engine, and
> > initialization seems to be serial.
> > 
> 
> I'd be interested in knowing exactly how long it takes.

We currently don't have capacity for this scale; I'll update once we have it.
Comment 13 Eldad Marciano 2017-05-28 10:35:40 EDT
Is this bug still relevant in terms of topology?
We would like to reproduce it with 500 hosts and 10K VMs.
Comment 14 Martin Perina 2017-05-29 03:33:09 EDT
(In reply to Eldad Marciano from comment #13)
> Is this bug still relevant in terms of topology?
> We would like to reproduce it with 500 hosts and 10K VMs.

We haven't made any improvements to engine startup time, but I think the fixes for BZ1438497 might also help here. Please test with 500 hosts and 10K VMs; if there is still an issue, we will try to optimize.
