Bug 1304844 - [scale] Long delays in updating of web admin "events" pane after many long running storage operations
Status: CLOSED WONTFIX
Product: ovirt-engine
Classification: oVirt
Component: Frontend.WebAdmin
Version: 3.6.2.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.2.0
Target Release: ---
Assigned To: Martin Perina
QA Contact: Pavel Stehlik
Depends On:
Blocks:
Reported: 2016-02-04 13:41 EST by mlehrer
Modified: 2017-07-04 08:17 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-07-04 08:17:49 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
oourfali: ovirt‑4.2?
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?


Attachments
Example of lag between events messages after long running operations (59.33 KB, image/png)
2016-02-04 13:41 EST, mlehrer

Description mlehrer 2016-02-04 13:41:59 EST
Created attachment 1121178
Example of lag between events messages after long running operations

Description of problem:

After multiple long-running storage domain operations, the Events pane in the web admin shows event messages late. The lag between an event occurring (or a status changing) and the corresponding "last message" appearing in the web admin pane can last several minutes.

E.g., see the attached screenshot of a host whose status differs from what the Events pane shows for up to several minutes.



Version-Release number of selected component (if applicable):

vdsm-hook-vmfex-dev-4.17.17-0.el7ev.noarch
vdsm-python-4.17.17-0.el7ev.noarch
vdsm-yajsonrpc-4.17.17-0.el7ev.noarch
vdsm-4.17.17-0.el7ev.noarch
vdsm-xmlrpc-4.17.17-0.el7ev.noarch
vdsm-jsonrpc-4.17.17-0.el7ev.noarch
vdsm-cli-4.17.17-0.el7ev.noarch
vdsm-infra-4.17.17-0.el7ev.noarch
rhevm-*-3.6.2.5-0.1

Env details:
-------------------
in 1 Cluster:
50 total SDs of which 
   21 are iSCSI
   30 are NFS

15 running VMs
2  Hosts



How reproducible:

Requires long running operations, and happens over time.
Not easily reproducible within a few clicks.

Steps to Reproduce:

1. Perform some long-running storage operations, such as domain attachment, domain creation, creation of VMs from pools, or disk migration of large disks.

2. After several long-running operations, notice the Events pane in the UI lagging several minutes behind the actual events. The difference between the pane's updates and the actual changes should become noticeable at that point.

Actual results:

The Events pane's last messages aren't current and lag by several minutes.

Expected results:

The Events pane shows up-to-date status within 15 seconds of a change, or less.
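
The 15-second expectation above can be checked mechanically. The following is a minimal sketch and not part of the original report: it assumes that, for each event, you can record the time it occurred on the engine and the time it first appeared in the Events pane. The function names, the tuple layout, and the sample timestamps are all hypothetical, and how the timestamps are collected is left open.

```python
from datetime import datetime, timedelta

# Threshold taken from the expected result: 15 seconds or less.
MAX_LAG = timedelta(seconds=15)


def event_lag(occurred_at: datetime, shown_at: datetime) -> timedelta:
    """Return how long after it occurred an event appeared in the pane."""
    return shown_at - occurred_at


def lagging_events(samples, max_lag=MAX_LAG):
    """Return (event_id, lag) pairs whose lag exceeds max_lag.

    samples is an iterable of (event_id, occurred_at, shown_at) tuples
    (a hypothetical layout chosen for this sketch).
    """
    result = []
    for event_id, occurred_at, shown_at in samples:
        lag = event_lag(occurred_at, shown_at)
        if lag > max_lag:
            result.append((event_id, lag))
    return result


# Made-up timestamps for illustration only; these are not measurements.
samples = [
    ("evt-1", datetime(2016, 2, 4, 13, 0, 0), datetime(2016, 2, 4, 13, 0, 5)),
    ("evt-2", datetime(2016, 2, 4, 13, 1, 0), datetime(2016, 2, 4, 13, 4, 30)),
]

for event_id, lag in lagging_events(samples):
    print(f"{event_id} lagged by {lag.total_seconds():.0f} seconds")
```

With the illustrative data, only the second event is reported, because its lag of 3.5 minutes exceeds the threshold. A lag of exactly 15 seconds is treated as acceptable.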

Additional info:

Noticed this while focusing on scale storage scenarios.
Comment 1 Oved Ourfali 2016-02-29 05:02:50 EST
Mordechai - not sure the use case of using half FC half iSCSI is reflecting a real use-case.
Do you see the same when there are a lot of storage operations in general?
Comment 2 mlehrer 2016-02-29 08:24:44 EST
(In reply to Oved Ourfali from comment #1)
> Mordechai - not sure the use case of using half FC half iSCSI is reflecting
> a real use-case.

The half *NFS* / half iSCSI split was suggested to cover domain scale testing built from existing customer issues, with a small growth factor for simulated customer datasets. I agree the number of domains is high, but this was intended.

> Do you see the same when there are a lot of storage operations in general?

Currently I don't have data from an environment that has only a few SDs but is also heavy in storage operations. The environments that we checked using heavy storage operations also contained multiple domains, as described above.

It seems that simply having many (50) domains won't reproduce this delayed event behavior; it's necessary to have executed some long-running storage operations in addition to having many domains. Further investigation would be necessary to see the effect of many long-running operations in an environment with fewer SDs.
Comment 3 Sandro Bonazzola 2016-05-02 05:56:36 EDT
Moving from 4.0 alpha to 4.0 beta since 4.0 alpha has been already released and bug is not ON_QA.
Comment 4 Yaniv Lavi 2016-05-23 09:17:49 EDT
oVirt 4.0 beta has been released, moving to RC milestone.
Comment 5 Yaniv Lavi 2016-05-23 09:24:30 EDT
oVirt 4.0 beta has been released, moving to RC milestone.
Comment 7 Oved Ourfali 2017-07-04 08:17:49 EDT
I don't see us prioritizing this at the moment.
Closing as wontfix.
