Bug 1120796

Summary: audit_log getting huge in size, which is causing slowness of the RHEV GUI
Product: Red Hat Enterprise Virtualization Manager Reporter: Shishir Prakash <sprakash>
Component: ovirt-engine Assignee: Eli Mesika <emesika>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: high    
Version: 3.4.0 CC: acathrow, dwysocha, ecohen, emesika, gklein, iheim, lpeer, npatil, oourfali, pstehlik, Rhev-m-bugs, sprakash, tdosek, yeylon
Target Milestone: --- Keywords: Triaged
Target Release: 3.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-07-27 13:51:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Shishir Prakash 2014-07-17 17:50:10 UTC
Description of problem:

We have set up Nagios monitoring that logs in to RHEV and collects the health status. Each login creates an entry in the audit_log table in the database.
The size of this table has now reached 400 MB, and logging in to the rhev-manager GUI has become very slow, almost impossible.

Looking at the log, it says "java.lang.OutOfMemoryError: GC overhead limit exceeded", which gives no clue that the audit_log table may be causing the problem.

"2014-07-17 05:17:44,200 ERROR [org.ovirt.engine.core.bll.adbroker.GSSAPIDirContextAuthenticationStrategy] (ajp-/127.0.0.1:8702-20) Kerberos error: java.lang.OutOfMemoryError: GC overhead limit exceeded"

There are a few things I would like to improve in audit_log:
1) If we are logging the information in the audit log, then we should also log the source IP, which could give us some clue.

2) If the audit_log size is huge, a specific alert should be generated to let the admin know that it is time for a cleanup, or the cleanup could be automated based on a maximum limit for the audit_log table size.
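
To illustrate suggestion 2, here is a minimal sketch of a size-based alert. It is not part of the engine: it assumes the engine database is PostgreSQL (so that `pg_total_relation_size` is available) and that the caller supplies a DB-API cursor connected to it. The function names and the 300 MB default limit are hypothetical placeholders.

```python
# Hypothetical size check for the audit_log table (illustrative only).

# PostgreSQL's pg_total_relation_size() reports the table size in bytes,
# including its indexes and TOAST data.
SIZE_QUERY = "SELECT pg_total_relation_size('audit_log')"

MB = 1024 * 1024


def audit_log_alert(size_bytes, limit_bytes=300 * MB):
    """Return an alert message if the table exceeds the limit, else None.

    The 300 MB default is an arbitrary placeholder, not a recommended value.
    """
    if size_bytes <= limit_bytes:
        return None
    return "audit_log is %d MB (limit %d MB): cleanup recommended" % (
        size_bytes // MB, limit_bytes // MB)


def check(cursor, limit_bytes=300 * MB):
    """Run the size query on a DB-API cursor connected to the engine database."""
    cursor.execute(SIZE_QUERY)
    (size_bytes,) = cursor.fetchone()
    return audit_log_alert(size_bytes, limit_bytes)
```

With the 400 MB table reported above, `audit_log_alert(400 * MB)` returns an alert message, while a table under the limit yields `None`. An automated cleanup could hang off the same check, but what to delete and how long to retain entries is a policy decision this sketch leaves open.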


Version-Release number of selected component (if applicable):
3.4

How reproducible:
Perform at least 1000000 logins to reproduce the issue. Once the audit_log table has grown that large, you will find the rhev-manager UI extremely slow.


shishir-

Comment 5 Liran Zelkha 2014-07-20 06:17:06 UTC
How much memory was allocated to the heap? Can you send a heap dump (using jmap)?
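
For reference, a heap dump of the kind requested can be taken with the JDK's `jmap -dump:format=b,file=<file> <pid>`. The sketch below only wraps that command line; the helper names and the default output file name are hypothetical, and the PID of the engine JVM has to be looked up separately.

```python
import subprocess


def jmap_dump_command(pid, out_file="engine-heap.hprof"):
    """Build the jmap command line that writes a binary heap dump for `pid`."""
    return ["jmap", "-dump:format=b,file=%s" % out_file, str(pid)]


def dump_heap(pid, out_file="engine-heap.hprof"):
    """Run jmap; it generally has to run as the user that owns the JVM process."""
    subprocess.check_call(jmap_dump_command(pid, out_file))
```

The resulting `.hprof` file can be large (on the order of the heap size), so it should be written to a location with enough free space.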

Comment 6 Eli Mesika 2014-07-24 09:56:44 UTC
Tested with 1,000,000 records of login/logout events in the audit_log table.
I had no performance issues and no OOM exceptions.

Comment 7 Eli Mesika 2014-07-27 13:51:11 UTC
Added to the previous test (comment 6) a script that periodically makes API calls (log in, do some GETs on DC/Cluster/Host, and log out).
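
The actual test script is not attached to this bug. As a rough sketch of what such a loop might look like, the code below issues GET requests with HTTP Basic credentials against three REST collections. The collection paths, the host name in the usage note, and the assumption that each Basic-auth request without a persistent session is logged as its own login/logout are all assumptions, not taken from this report.

```python
import base64
import time
import urllib.request

# Assumed REST collection paths for a 3.x engine; adjust to the deployment.
PATHS = ("/api/datacenters", "/api/clusters", "/api/hosts")


def build_requests(base_url, user, password):
    """Build one GET request per collection, each carrying Basic credentials."""
    raw = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    requests = []
    for path in PATHS:
        req = urllib.request.Request(base_url.rstrip("/") + path)
        req.add_header("Authorization", "Basic " + token)
        req.add_header("Accept", "application/xml")
        requests.append(req)
    return requests


def run_forever(base_url, user, password, interval_seconds=60):
    """Repeat the GET cycle periodically (TLS certificate handling omitted)."""
    while True:
        for req in build_requests(base_url, user, password):
            with urllib.request.urlopen(req) as resp:
                resp.read()  # drain the response; the content is not used
        time.sleep(interval_seconds)
```

A run would be started with something like `run_forever("https://rhevm.example.com", "admin@internal", "secret")`, where the host and credentials are placeholders.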

Did not succeed in reproducing the problem with 1000000 records in the audit_log table while the script was running.

I am closing this as CURRENTRELEASE (approved by Oved)

Comment 8 Red Hat Bugzilla 2023-09-14 02:11:40 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days