Bug 1304844 - [scale] Long delays in updating of web admin "events" pane after many long running storage operations
Summary: [scale] Long delays in updating of web admin "events" pane after many long ru...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Frontend.WebAdmin
Version: 3.6.2.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Martin Perina
QA Contact: Pavel Stehlik
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-02-04 18:41 UTC by mlehrer
Modified: 2022-06-30 08:06 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-07-04 12:17:49 UTC
oVirt Team: Infra
sbonazzo: ovirt-4.2-


Attachments
Example of lag between events messages after long running operations (59.33 KB, image/png)
2016-02-04 18:41 UTC, mlehrer


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker RHV-46770 0 None None None 2022-06-30 08:06:34 UTC

Description mlehrer 2016-02-04 18:41:59 UTC
Created attachment 1121178 [details]
Example of lag between events messages after long running operations

Description of problem:

After multiple long-running storage domain operations, the Events pane in the web admin shows event messages late. The lag between the "last message" shown in the web admin pane and the actual event occurrence or status can last for several minutes.

e.g., see the attachment for a screenshot of a host whose status differs from what the Events pane shows for up to several minutes.



Version-Release number of selected component (if applicable):

vdsm-hook-vmfex-dev-4.17.17-0.el7ev.noarch
vdsm-python-4.17.17-0.el7ev.noarch
vdsm-yajsonrpc-4.17.17-0.el7ev.noarch
vdsm-4.17.17-0.el7ev.noarch
vdsm-xmlrpc-4.17.17-0.el7ev.noarch
vdsm-jsonrpc-4.17.17-0.el7ev.noarch
vdsm-cli-4.17.17-0.el7ev.noarch
vdsm-infra-4.17.17-0.el7ev.noarch
rhevm-*-3.6.2.5-0.1

Env details:
-------------------
in 1 Cluster:
50 total SDs of which 
   21 are iSCSI
   30 are NFS

15 running VMs
2  Hosts



How reproducible:

Requires long running operations, and happens over time.
Not easily reproducible within a few clicks.

Steps to Reproduce:

1. Perform some long-running storage operations such as domain attachment, domain creation, creation of VMs from pools, or disk migration of large disks.

2. After several long-running operations, notice the Events pane lagging several minutes behind the events actually occurring. The difference between the pane's updates and the actual changes should become noticeable at that point.

Actual results:

The last messages in the Events pane aren't current and lag by several minutes.

Expected results:

The Events pane shows up-to-date status within 15 seconds of a change, or less.
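
To make the 15-second expectation measurable, the lag could be quantified roughly as below. This is only a sketch: the function names and the sample tuples are hypothetical, and it assumes you have recorded, for each event, the time it occurred and the time it first appeared in the Events pane. It does not query the engine or the web admin.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the expected result above (15 seconds).
MAX_LAG = timedelta(seconds=15)


def event_lag(event_time, displayed_time):
    """Return how long after it occurred an event appeared in the pane."""
    return displayed_time - event_time


def lagging_events(samples, max_lag=MAX_LAG):
    """Return (description, lag) for every sample whose lag exceeds max_lag.

    samples: iterable of (description, event_time, displayed_time) tuples.
    These are manually recorded observations (an assumption of this sketch).
    """
    late = []
    for description, event_time, displayed_time in samples:
        lag = event_lag(event_time, displayed_time)
        if lag > max_lag:
            late.append((description, lag))
    return late


if __name__ == "__main__":
    # Hypothetical observations, for illustration only.
    t0 = datetime(2016, 2, 4, 18, 0, 0, tzinfo=timezone.utc)
    observations = [
        ("Host status changed", t0, t0 + timedelta(seconds=5)),
        ("Storage domain attached", t0, t0 + timedelta(minutes=3)),
    ]
    for description, lag in lagging_events(observations):
        print(f"{description}: shown {lag.total_seconds():.0f}s late")
```

An event shown exactly 15 seconds after it occurred is treated as acceptable, matching "within 15 seconds of a change, or less".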

Additional info:

Noticed this while focusing on scale storage scenarios.

Comment 1 Oved Ourfali 2016-02-29 10:02:50 UTC
Mordechai - not sure the use case of using half FC half iSCSI is reflecting a real use-case.
Do you see the same when there are a lot of storage operations in general?

Comment 2 mlehrer 2016-02-29 13:24:44 UTC
(In reply to Oved Ourfali from comment #1)
> Mordechai - not sure the use case of using half FC half iSCSI is reflecting
> a real use-case.

The half *NFS* / half iSCSI domains were suggested to cover domain scale testing built from existing customer issues, with a small growth factor for simulated customer datasets. I agree the number of domains is high, but this was intended.

> Do you see the same when there are a lot of storage operations in general?

Currently I don't have data from an environment that has only a few SDs but is also heavy in storage operations. The environments that we checked under heavy storage operations also contained multiple domains as described above.

It seems that simply having many (50) domains won't reproduce this delayed event behavior; it's necessary to have executed some long-running storage operations in addition to having many domains. Further investigation would be necessary to see the effect of many long-running operations in an environment with fewer SDs.

Comment 3 Sandro Bonazzola 2016-05-02 09:56:36 UTC
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has been already released and bug is not ON_QA.

Comment 4 Yaniv Lavi 2016-05-23 13:17:49 UTC
oVirt 4.0 beta has been released, moving to RC milestone.

Comment 5 Yaniv Lavi 2016-05-23 13:24:30 UTC
oVirt 4.0 beta has been released, moving to RC milestone.

Comment 7 Oved Ourfali 2017-07-04 12:17:49 UTC
I don't see us prioritizing this at the moment.
Closing as wontfix.

