Bugzilla will be upgraded to version 5.0. The upgrade date is tentatively scheduled for 2 December 2018, pending final testing and feedback.
Bug 1056064 - [SCALE] Events are being pulled from audit_log in a very inefficient way
[SCALE] Events are being pulled from audit_log in a very inefficient way
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine (Show other bugs)
3.3.0
Unspecified Unspecified
urgent Severity high
: ---
: 3.4.0
Assigned To: Liran Zelkha
Eldad Marciano
infra
: ZStream
Depends On:
Blocks: GSS_RHEV_33_BETA 1057561 1061165 rhev3.4beta 1142926
  Show dependency treegraph
 
Reported: 2014-01-21 08:45 EST by Tomas Dosek
Modified: 2016-02-10 14:28 EST (History)
14 users (show)

See Also:
Fixed In Version: ovirt-3.4.0-beta2
Doc Type: Bug Fix
Doc Text:
Previously, an inefficient query caused information pulled from the database to take a long time. This resulted in the query taking longer than expected to display results. Now, an updated query has fixed the problem. As a result, information is displayed more quickly.
Story Points: ---
Clone Of:
: 1057561 1061165 (view as bug list)
Environment:
Last Closed: 2014-06-09 11:09:09 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
Postgres log (7.74 MB, text/plain)
2014-01-21 08:45 EST, Tomas Dosek
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 23567 None None None Never
oVirt gerrit 24048 None None None Never
oVirt gerrit 24094 None None None Never
Red Hat Product Errata RHSA-2014:0506 normal SHIPPED_LIVE Moderate: Red Hat Enterprise Virtualization Manager 3.4.0 update 2014-06-09 14:55:38 EDT

  None (edit)
Description Tomas Dosek 2014-01-21 08:45:30 EST
Created attachment 853237 [details]
Postgres log

Description of problem:

Events displayed in UI (used by reports and API as well) are being pulled from database by very inefficient query.

This was found during scalability testing done by strategic EMEA customer.

The query is (taken from postgres logs with enhanced verbosity level):
[pid 3766 127.0.0.1(39426)]DEBUG:  00000: parse <unnamed>: SELECT * FROM (SELECT * FROM audit_log WHERE ( audit_log_id IN (SELECT audit_log.audit_log_id FROM  audit_log   WHERE  audit_log.severity = '10' ))  ORDER BY audit_log_id DESC ) as T1 OFFSET (1 -1) LIMIT 100


Version-Release number of selected component (if applicable):
is29 (basically any RHEV-M until now)

How reproducible:
100 %

Steps to Reproduce:
1. enhance log verbosity of postgres (in postgresql.conf):
        log_error_verbosity = verbose
        log_min_messages = debug2
        log_line_prefix = '[pid %p %r]'
2. restart postgres
3. go to events in portal, or fetch event via api

Actual results:
The query used is very inefficient and takes very long time to finish.

Expected results:
We should most likely use just one SELECT (ain't ;))

Additional info:
Hopefully not needed. But still for the record attaching the postgres log
Comment 2 Itamar Heim 2014-01-21 17:15:58 EST
liran/eli - didn't we fix something in these for 3.4?
Comment 3 Eli Mesika 2014-01-22 04:07:13 EST
(In reply to Itamar Heim from comment #2)
> liran/eli - didn't we fix something in these for 3.4?

As far as I remember Liran handled that for a diffrent query for 3.4 
Liran ?
Comment 4 Eli Mesika 2014-01-22 04:14:37 EST
Actually , looking again in the query described in BZ description , it seems that this is the query that populate the Alerts TAB ( WHERE  audit_log.severity = '10' ) => 10 is the severity of Alerts 

So, we had optimized the regular query that populates the Events TAB but not that one
Comment 6 Tareq Alayan 2014-02-17 05:17:38 EST
what is considered to be OK time  ?
now it takes ~2seconds is it ok?

tested on ovirt-engine-3.4.0-0.7.beta2.el6.noarch
Comment 7 Tomas Dosek 2014-02-17 05:39:37 EST
That depends on the amount events you have in your audit_log. Is that a clear environment or some long-termly used one?
Comment 8 Tareq Alayan 2014-02-17 06:30:55 EST
clear environment
Comment 9 Tomas Dosek 2014-02-17 06:35:28 EST
In that case 2 seconds is quite a lot, but we should try to feed the audit log with some entries to see how much UI limits this and what is the performance degradation per 100, 1000, 10 000 lines of audit log (10k should be the upper limit where we start cleaning up the audit_log table automatically so there's no need to test more than that).
Comment 10 Tareq Alayan 2014-02-17 06:43:06 EST
functional testing point of view: i see that the audit_log and event_log are viewed ok. 

Gil, do u see apropriate to do scale testing here by Yuri's team? 
Or shall i move it to verify?
Comment 11 Gil Klein 2014-02-17 11:18:41 EST
Eldad, would you be able to verify this on the scale lab?
Comment 13 Eldad Marciano 2014-05-13 07:03:08 EDT
Not reproduced 

not in scale setup and even small one,

tested on 3.3 and also 3.4
Comment 14 errata-xmlrpc 2014-06-09 11:09:09 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-0506.html

Note You need to log in before you can comment on or make changes to this bug.