Bug 1120796 - audit_log getting huge in size which causing slowness of rhev GUI [NEEDINFO]
Summary: audit_log getting huge in size which causing slowness of rhev GUI
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 3.5.0
Assignee: Eli Mesika
QA Contact:
URL:
Whiteboard: infra
Depends On:
Blocks:
Reported: 2014-07-17 17:50 UTC by Shishir Prakash
Modified: 2016-02-10 19:11 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-07-27 13:51:11 UTC
oVirt Team: Infra
lzelkha: needinfo? (sprakash)


Description Shishir Prakash 2014-07-17 17:50:10 UTC
Description of problem:

We have set up Nagios monitoring that logs in to RHEV and collects the health status. Each login creates an entry in the audit_log table in the database.
The table has now reached 400 MB, and logging in to the rhev-manager GUI has become very slow, almost impossible.

Looking at the log, it says "java.lang.OutOfMemoryError: GC overhead limit exceeded", which gives no clue that the audit_log table may be causing the problem.

"2014-07-17 05:17:44,200 ERROR [org.ovirt.engine.core.bll.adbroker.GSSAPIDirContextAuthenticationStrategy] (ajp-/127.0.0.1:8702-20) Kerberos error: java.lang.OutOfMemoryError: GC overhead limit exceeded"

There are a few things I would like to improve in audit_log:
1) If we are logging this information in the audit log, we should also log the source IP, which would give us some clue about where the logins come from.

2) If the audit_log table becomes huge, a specific alert should be generated to let the admin know that it is time for a cleanup, or the cleanup could be automated based on a maximum audit_log table size.
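
As a rough illustration of suggestion 2, the size check could look something like the sketch below. It assumes the engine database is PostgreSQL, where pg_total_relation_size() reports the on-disk size of a table including its indexes. The function name, the constant names, and the 400 MB limit are hypothetical and are not part of ovirt-engine.

```python
# Sketch only: names and the limit below are assumptions for illustration.

# PostgreSQL query returning the size of audit_log (table plus indexes) in bytes.
SIZE_QUERY = "SELECT pg_total_relation_size('audit_log')"

# Assumed limit, taken from the 400 MB observed in this report.
MAX_AUDIT_LOG_BYTES = 400 * 1024 * 1024


def audit_log_alert(size_bytes, limit_bytes=MAX_AUDIT_LOG_BYTES):
    """Return an alert message once the table reaches the limit, else None."""
    if size_bytes < limit_bytes:
        return None
    size_mb = size_bytes / (1024 * 1024)
    limit_mb = limit_bytes / (1024 * 1024)
    return "audit_log is %.0f MB (limit %.0f MB): cleanup needed" % (size_mb, limit_mb)
```

The value passed as size_bytes would come from running SIZE_QUERY against the engine database; the same check could trigger an automated cleanup instead of an alert.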


Version-Release number of selected component (if applicable):
3.4

How reproducible:
Perform at least 1,000,000 logins to reproduce the issue. Once the audit_log table has grown that large, the rhev-manager UI becomes extremely slow.


shishir-

Comment 5 Liran Zelkha 2014-07-20 06:17:06 UTC
How much memory was allocated to the heap? Can you send a heap dump (using jmap)?
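
For reference, a minimal sketch of taking such a dump, assuming the JDK's jmap tool is on the PATH and the engine's Java process ID is known. The helper names and the default output file name are hypothetical.

```python
import subprocess


def jmap_dump_command(pid, out_file="engine-heap.hprof"):
    """Build the jmap command line that writes a binary heap dump of a JVM."""
    return ["jmap", "-dump:format=b,file=%s" % out_file, str(pid)]


def dump_heap(pid, out_file="engine-heap.hprof"):
    """Run jmap against the given process; raises CalledProcessError on failure."""
    subprocess.check_call(jmap_dump_command(pid, out_file))
```

jmap normally has to be run as the same user that owns the target Java process.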

Comment 6 Eli Mesika 2014-07-24 09:56:44 UTC
Tested with 1,000,000 records of login/logout events in the audit_log table.
I had no performance issues and no OOM exceptions.

Comment 7 Eli Mesika 2014-07-27 13:51:11 UTC
Added to the previous test (comment 6) a script that periodically makes API calls (log in, do some GETs on DC/Cluster/Host, and log out).

Did not succeed in reproducing the problem with 1,000,000 records in the audit_log table and the script running.
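
A polling script of the kind described could look roughly like the sketch below. It is not the script used in this test. It assumes the REST API is rooted at /api with datacenters, clusters and hosts collections, and that each request authenticates separately with HTTP basic auth. The host name, credentials, function names and interval are hypothetical, and certificate handling is omitted.

```python
import base64
import time
import urllib.request

# Collections fetched on every cycle (assumed paths under the API root).
RESOURCES = ("datacenters", "clusters", "hosts")


def build_requests(base_url, user, password):
    """Build one basic-auth GET request per resource collection."""
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    requests = []
    for name in RESOURCES:
        req = urllib.request.Request("%s/%s" % (base_url.rstrip("/"), name))
        req.add_header("Authorization", "Basic " + token)
        req.add_header("Accept", "application/xml")
        requests.append(req)
    return requests


def poll_forever(base_url, user, password, interval_seconds=60):
    """Fetch every collection, sleep, and repeat until interrupted."""
    while True:
        for req in build_requests(base_url, user, password):
            with urllib.request.urlopen(req) as response:
                response.read()  # body is discarded; only the API call matters
        time.sleep(interval_seconds)
```

It would be started with something like poll_forever("https://engine.example.com/api", "admin@internal", "secret").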

I am closing this as CURRENTRELEASE (approved by Oved)

