Bug 1578996

Summary: [RHV] When Graph refresh is ON, RHV provider refresh time is longer
Product: Red Hat CloudForms Management Engine Reporter: Satoe Imaishi <simaishi>
Component: Providers Assignee: Boriso <bodnopoz>
Status: CLOSED ERRATA QA Contact: Ilanit Stein <istein>
Severity: high Docs Contact:
Priority: high    
Version: 5.9.0 CC: bodnopoz, cpelland, gblomqui, jfrey, jhardy, jprause, mperina, obarenbo
Target Milestone: GA Keywords: ZStream
Target Release: 5.9.3   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: rhev:graph refresh
Fixed In Version: 5.9.3.0 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1566468 Environment:
Last Closed: 2018-07-12 13:15:10 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: RHEVM Target Upstream Version:
Embargoed:
Bug Depends On: 1566468    
Bug Blocks:    
Attachments:
Description Flags
gr_off_evm_log.tgz none
gr_off_automation_log.tgz none
gr_on_evm_log.tgz none
gr_on_automation_log.tgz none

Comment 2 CFME Bot 2018-05-16 18:37:33 UTC
New commits detected on ManageIQ/manageiq-providers-ovirt/gaprindashvili:

https://github.com/ManageIQ/manageiq-providers-ovirt/commit/64e7cd3bdbd67f5c83e86abcb64eb092cc6e15ef
commit 64e7cd3bdbd67f5c83e86abcb64eb092cc6e15ef
Author:     Piotr Kliczewski <piotr.kliczewski>
AuthorDate: Mon May 14 04:37:30 2018 -0400
Commit:     Piotr Kliczewski <piotr.kliczewski>
CommitDate: Mon May 14 04:37:30 2018 -0400

    Merge pull request #237 from borod108/refresh_huge

    Performance improvements for graph refresh
    (cherry picked from commit d631734debc7a84e8aa8b8f32fd0f28c10e9adaf)

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1578996

 app/models/manageiq/providers/redhat/inventory_collection_default/infra_manager.rb | 6 +-
 spec/models/manageiq/providers/redhat/infra_manager/refresh/refresh_recording_modifier_spec.rb | 58 +
 spec/models/manageiq/providers/redhat/infra_manager/refresh/refresher_huge_async_spec.rb | 141 +
 spec/support/modify_refresh_yml_recording.rb | 214 +
 spec/vcr_cassettes/manageiq/providers/redhat/infra_manager/refresh/ovirt_sdk_refresh_recording_for_mod.yml | 2819 +
 5 files changed, 3236 insertions(+), 2 deletions(-)


https://github.com/ManageIQ/manageiq-providers-ovirt/commit/9f425673f033459f51761df80cd17ace5cedc188
commit 9f425673f033459f51761df80cd17ace5cedc188
Author:     Boris Od <boris.od>
AuthorDate: Wed May 16 02:12:08 2018 -0400
Commit:     Boris Od <boris.od>
CommitDate: Wed May 16 02:12:08 2018 -0400

    Merge pull request #240 from agrare/folder_bulk_connect

    Improve performance of EmsFolders and EmsClusters child save blocks
    (cherry picked from commit 4c764a9af5b8339ca0aa2c50490605e29c0260d2)

    https://bugzilla.redhat.com/show_bug.cgi?id=1578996

 app/models/manageiq/providers/redhat/inventory_collection_default/infra_manager.rb | 25 +-
 1 file changed, 7 insertions(+), 18 deletions(-)

Comment 3 Ilanit Stein 2018-05-23 13:58:25 UTC
Moving to verified based on the following test:


Testing details follow, as a step towards turning RHV Graph refresh ON by default.

Test purpose:
Measure the full refresh time with Graph refresh turned ON,
to evaluate the RHV Graph refresh optimizations added in CFME-5.9.3.

Tested parts:
2 CFME-5.9.3 machines (8 GB memory, 4 cores), located in the US:
1. With Graph refresh OFF (default)
2. With Graph refresh ON

Scale RHV-4.2.3.5-0.1.el7 (in RDU):
b01-h21-r620.rhev.openstack.engineering.redhat.com
Clusters          4
Hosts             403
Datastores        6
Virtual Machines  4059
Templates         13

Test results:
Graph refresh OFF:
1st refresh: 19 min; 1st removal: 17 min
2nd refresh: 16 min; 2nd removal: 1 min

Graph refresh ON:
1st refresh: 15 min; 1st removal: 41 min
2nd refresh: 15 min; 2nd removal: 1 min


Conclusion:
Graph refresh seems to improve refresh time: 15 min versus 19 min for the 1st refresh, and 15 min versus 16 min for the 2nd.

Issues:
The 1st removal of the RHV provider takes a long time: 17 min with GR OFF,
and even longer, 41 min, with GR ON.
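
To put the figures above in relative terms, here is a small arithmetic sketch. The numbers are copied from the test results in this comment; the variable and function names are only illustrative.

```python
# Times in minutes, copied from the test results in this comment.
refresh = {"off": [19, 16], "on": [15, 15]}
first_removal = {"off": 17, "on": 41}


def percent_faster(baseline, candidate):
    """Percentage by which candidate is shorter than baseline."""
    return (baseline - candidate) / baseline * 100


# Relative refresh-time gain of GR ON over GR OFF, per refresh run.
gains = [percent_faster(off, on) for off, on in zip(refresh["off"], refresh["on"])]

# How many times longer the 1st removal takes with GR ON.
removal_ratio = first_removal["on"] / first_removal["off"]

print([round(g, 2) for g in gains])
print(round(removal_ratio, 2))
```

By this calculation the refresh is about 21% faster on the 1st run and about 6% faster on the 2nd, while the 1st removal takes about 2.4 times as long with GR ON. These are single measurements, so the percentages should be read as rough indications only.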

Notes:
* Both RHV & CFME are located in the US, so these tests involve no significant latency.
* Refresh time is taken from evm.log (messages: "Refreshing targets for EMS..." and
  "Refreshing targets for EMS...complete").
* Removal time is taken from the CFME UI tasks page, task: "Destroying
  ManageIQ::Providers::Redhat::InfraManager with id: "
* The overall times are longer than those measured in previous
  testing, which used RHV at the same VM/host scale.
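
For anyone repeating the measurement, the refresh time could be extracted from evm.log along these lines. This is a rough sketch, not the method used for the numbers above: the timestamp layout in the regular expression and the sample log lines are assumptions about the evm.log format, and the function name is hypothetical. Only the two message strings come from this comment.

```python
import re
from datetime import datetime

# Assumed timestamp shape inside an evm.log line, e.g. "[2018-05-23T10:00:00.000000 #1234:abcdef]".
TIMESTAMP = re.compile(r"\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{1,6})")
START = "Refreshing targets for EMS..."
END = "Refreshing targets for EMS...complete"


def refresh_durations(lines):
    """Return the duration in seconds of each start/complete pair, in log order."""
    durations = []
    started_at = None
    for line in lines:
        match = TIMESTAMP.search(line)
        if match is None or START not in line:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
        # START is a prefix of END, so the completion message is checked first.
        if END in line:
            if started_at is not None:
                durations.append((stamp - started_at).total_seconds())
                started_at = None
        else:
            started_at = stamp
    return durations


# Hypothetical sample lines, made up for illustration only.
sample = [
    "[----] I, [2018-05-23T10:00:00.000000 #1234:abcdef]  INFO -- : EMS: [rhv] Refreshing targets for EMS...",
    "[----] I, [2018-05-23T10:15:00.500000 #1234:abcdef]  INFO -- : EMS: [rhv] Refreshing targets for EMS...complete",
]
print(refresh_durations(sample))
```

The sketch pairs each start message with the next completion message, so it assumes a single provider refreshing at a time; logs with interleaved refreshes from several providers would need to be filtered by EMS first.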

Testing recommendation:
Repeat this test with high latency.

Comment 4 Ilanit Stein 2018-05-24 07:27:24 UTC
Created attachment 1440964 [details]
gr_off_evm_log.tgz

Comment 5 Ilanit Stein 2018-05-24 08:50:47 UTC
Created attachment 1440979 [details]
gr_off_automation_log.tgz

Comment 6 Ilanit Stein 2018-05-24 08:51:24 UTC
Created attachment 1440980 [details]
gr_on_evm_log.tgz

Comment 7 Ilanit Stein 2018-05-24 08:51:57 UTC
Created attachment 1440982 [details]
gr_on_automation_log.tgz

Comment 9 errata-xmlrpc 2018-07-12 13:15:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2184