Bug 984287

Summary: [rhevm] - ovirt-engine service - slow stop/start of the service
Product: Red Hat Enterprise Virtualization Manager Reporter: David Botzer <dbotzer>
Component: ovirt-engine Assignee: Juan Hernández <juan.hernandez>
Status: CLOSED NOTABUG QA Contact: David Botzer <dbotzer>
Severity: high Docs Contact:
Priority: high    
Version: 3.3.0 CC: acathrow, bazulay, dbotzer, iheim, juan.hernandez, lpeer, pstehlik, Rhev-m-bugs, yeylon, ylavi
Target Milestone: --- Keywords: Triaged
Target Release: 3.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-07-22 08:50:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
slow-start-engine none

Description David Botzer 2013-07-14 13:45:02 UTC
Created attachment 773320
slow-start-engine

Description of problem:
ovirt-engine service - slow stop/start of the service.
After an ovirt-engine restart it takes at least 1:05 min before the reports portal can be reached.
It should take about 15 sec (as in 3.2/sf18.2).

Version-Release number of selected component (if applicable):
3.3/is5
jasperreports-server-pro-5.2.0-1.el6ev.noarch
rhevm-reports-3.3.0-5.el6ev.noarch
rhevm-dwh-3.3.0-4.el6ev.noarch

How reproducible:
always

Steps to Reproduce:
1.install rhevm+dwh+reports (local)
2.create VM/s
3.restart ovirt-engine service
4.Connect to engine portal
5.Connect to reports portal (login page)

Actual results:
It takes 1:05min to connect

Expected results:
Should take no more than about 15 sec
(Compare to 3.2/SF18.2)

Additional info:

Comment 1 Barak 2013-07-21 12:05:17 UTC
David,

As I understand from Yaniv.D, this differs only when stopping the service, not when starting it. Please confirm.

In addition, I would like to know (on the same host) how long the stop (and restart) operations take without the Reports packages installed.

Comment 2 David Botzer 2013-07-21 12:09:33 UTC
Hi,

Please describe the test matrix, in order to avoid misunderstanding.
Suggestion:

On a bare metal server
-----------------------
1. rhevm3.2 without reports
stop ovirt-engine = time ?
restart ovirt-engine = time ? 

2. rhevm3.2 with reports
stop ovirt-engine = time ?
restart ovirt-engine = time ?
connecting time to reports portal ?

3. rhevm3.3 without reports
stop ovirt-engine = time ?
restart ovirt-engine = time ?

4. rhevm3.3 with reports
stop ovirt-engine = time ?
restart ovirt-engine = time ?
connecting time to reports portal ?
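
One way to fill in a row of the matrix above without a stopwatch is sketched below in Python. It is only a sketch under stated assumptions: REPORTS_URL is a hypothetical placeholder (the thread does not give the address of the reports login page), the helper names are invented for illustration, the page is assumed to answer with HTTP 200 once it is up, and the certificate is assumed to be trusted by the client. It must run as root on the engine host.

```python
import http.client
import subprocess
import time
import urllib.request

# Hypothetical placeholder: the real address of the reports login page
# is not given in this bug report.
REPORTS_URL = "https://rhevm.example.com/rhevm-reports/"


def timed_command(argv, run=subprocess.run):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.monotonic()
    run(argv, check=True)
    return time.monotonic() - start


def page_is_up(url):
    """Return True if the URL currently answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, TLS failure, ...
        return False


def seconds_until(probe, timeout=300.0, interval=0.5):
    """Poll probe() until it returns True; return elapsed seconds or None."""
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        if probe():
            return time.monotonic() - start
        time.sleep(interval)
    return None


def measure_cell(url=None, run=subprocess.run, probe=page_is_up):
    """Measure one matrix row: restart, optional portal wait, then stop."""
    results = {}
    results["restart"] = timed_command(["service", "ovirt-engine", "restart"], run)
    if url is not None:
        # Time from issuing the restart until the login page answers.
        wait = seconds_until(lambda: probe(url))
        results["portal"] = None if wait is None else results["restart"] + wait
    results["stop"] = timed_command(["service", "ovirt-engine", "stop"], run)
    # Leave the service running again; this start is not timed.
    run(["service", "ovirt-engine", "start"], check=True)
    return results
```

For the rows without reports, measure_cell() would be called with no URL; for the rows with reports, measure_cell(REPORTS_URL). Without a URL the stop is issued right after the restart command returns, so the engine may not have finished starting; a pause before the stop may be needed to get comparable numbers.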

Comment 3 Juan Hernández 2013-07-22 08:50:32 UTC
Today we performed some tests with two identical machines (in terms of hardware power) and these are the results:

1. Machine with RHEV-M 3.2 and reports.

This machine needs 37 s from issuing "service ovirt-engine restart" until the reports login page is available in the browser. Of those 37 seconds, 13 correspond to the deployment of the application, as can be seen in server.log:

2013-07-22 10:29:37,791 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-3) JBAS015876: Starting deployment of "rhevm-reports.war"
2013-07-22 10:29:50,643 INFO  [org.jboss.as.server] (ServerService Thread Pool -- 17) JBAS018559: Deployed "rhevm-reports.war"

In this machine the service (including reports) takes 1.2 seconds to restart:

[root@aqua-vds1 ~]# time service ovirt-engine restart
Stopping engine-service: [  OK  ]
Starting engine-service: [  OK  ]

real	0m1.263s
user	0m0.142s
sys	0m0.094s

And approx the same to stop:

[root@aqua-vds1 ~]# time service ovirt-engine stop
Stopping engine-service: [  OK  ]

real	0m1.139s
user	0m0.087s
sys	0m0.048s

2. Machine with RHEV-M 3.3 and reports.

This machine needs 43 s from issuing "service ovirt-engine restart" until the reports login page is available in the browser. Of those 43 seconds, 15 correspond to the deployment of the application, as can be seen in server.log:

2013-07-22 10:24:35,071 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-9) JBAS015876: Starting deployment of "rhevm-reports.war"
2013-07-22 10:24:49,702 INFO  [org.jboss.as.server] (ServerService Thread Pool -- 19) JBAS018559: Deployed "rhevm-reports.war"
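
The deployment times quoted for both machines can be checked directly from the server.log timestamps. The following Python sketch does that arithmetic; the function names are invented for illustration, and it assumes every log line starts with a 23-character timestamp in the format shown in this comment.

```python
from datetime import datetime

# Timestamp layout used by the server.log lines quoted in this comment.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def log_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], LOG_TIME_FORMAT)


def deployment_seconds(start_line, end_line):
    """Seconds between the 'Starting deployment' and 'Deployed' lines."""
    return (log_timestamp(end_line) - log_timestamp(start_line)).total_seconds()


# RHEV-M 3.2 machine
START_32 = '2013-07-22 10:29:37,791 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-3) JBAS015876: Starting deployment of "rhevm-reports.war"'
END_32 = '2013-07-22 10:29:50,643 INFO  [org.jboss.as.server] (ServerService Thread Pool -- 17) JBAS018559: Deployed "rhevm-reports.war"'

# RHEV-M 3.3 machine
START_33 = '2013-07-22 10:24:35,071 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-9) JBAS015876: Starting deployment of "rhevm-reports.war"'
END_33 = '2013-07-22 10:24:49,702 INFO  [org.jboss.as.server] (ServerService Thread Pool -- 19) JBAS018559: Deployed "rhevm-reports.war"'

print(f"3.2: {deployment_seconds(START_32, END_32):.3f} s")  # 3.2: 12.852 s
print(f"3.3: {deployment_seconds(START_33, END_33):.3f} s")  # 3.3: 14.631 s
```

The results, 12.852 s and 14.631 s, round to the 13 and 15 seconds quoted for the two machines, so deployment of rhevm-reports.war accounts for under 2 seconds of the difference between them.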

In this machine the service (including reports) takes 1.3s to restart:

[root@aqua-vds8 ovirt-engine]# time service ovirt-engine restart
Stopping oVirt Engine:                                     [  OK  ]
Starting oVirt Engine:                                     [  OK  ]

real	0m1.318s
user	0m0.142s
sys	0m0.065s

And approx the same to stop:

[root@aqua-vds8 ovirt-engine]# time service ovirt-engine stop
Stopping oVirt Engine:                                     [  OK  ]

real	0m1.133s
user	0m0.018s
sys	0m0.006s

So the 3.3 machine is a bit slower than the 3.2 machine, approx. 6 seconds slower. Taking into account that this is a manual process and that I am connecting remotely, I think that this difference isn't meaningful. Also, even though I tested repeatedly, I didn't find any situation where the engine takes more than 2 seconds to stop. However, I remember seeing that situation during the previous tests that David and I did together, so I think it is worth trying to find what triggers it. These two environments are completely empty, with no hosts or VMs. It may be worth testing with environments that are populated with VMs, as it may happen that some of the threads that we start refuse to stop if busy, thus delaying the stop operation.

All in all, if we are able to reproduce the long stop time I can continue studying it; otherwise I can't make any progress.
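
A possible way to hunt for the long stop time is to repeat the start/stop cycle many times on a populated environment and flag any stop that exceeds the 2 seconds mentioned above. The Python sketch below is only an illustration: the function names and the 60-second settle time are assumptions, not values from this report, and it must run as root on the engine host.

```python
import subprocess
import time

# Threshold taken from the comment above: no stop took more than 2 seconds.
SLOW_STOP_SECONDS = 2.0


def stop_durations(rounds, run=subprocess.run, settle=60.0, sleep=time.sleep):
    """Start and stop the engine `rounds` times; return each stop duration.

    The service is left stopped after the last round.
    """
    durations = []
    for _ in range(rounds):
        run(["service", "ovirt-engine", "start"], check=True)
        # Assumed warm-up period so that the engine threads have work to do.
        sleep(settle)
        start = time.monotonic()
        run(["service", "ovirt-engine", "stop"], check=True)
        durations.append(time.monotonic() - start)
    return durations


def slow_stops(durations, threshold=SLOW_STOP_SECONDS):
    """Return (round number, seconds) for every stop above the threshold."""
    return [(i, d) for i, d in enumerate(durations, start=1) if d > threshold]
```

Running slow_stops(stop_durations(20)) would list the rounds, if any, in which the stop was slow, which could then be matched against engine.log and server.log for the same time window.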