Bug 1528988

Summary: Audits table queries cause high load in large deployments
Product: Red Hat Satellite Reporter: Tomer Brisker <tbrisker>
Component: Audit LogAssignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA QA Contact: Ales Dujicek <adujicek>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.2.0CC: adujicek, inecas
Target Milestone: 6.4.0Keywords: Performance, Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-16 19:25:42 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Tomer Brisker 2017-12-25 14:36:29 UTC
Regularly in our foreman + puppet environment, we see this type of query hitting mysql:

<pre>
SELECT MAX(`audits`.`version`) AS max_id FROM `audits`  WHERE `audits`.`auditable_id` = 7555 AND `audits`.`auditable_type` = 'LookupKey';
</pre>

With the default indexes that ship with foreman, these queries can take upwards of 13 seconds to process, and under periods of heavy load (i.e. when foreman is cleaning up its reports/logs table), this can take even longer:

<pre>
+--------+
| max_id |
+--------+
| 117292 |
+--------+
1 row in set (13.44 sec)
</pre>

Our environment has quite a few audits from 2064 hosts:

<pre>
select count(*) from audits;
+----------+
| count(*) |
+----------+
| 29872974 |
+----------+
1 row in set (13.18 sec)
</pre>

After analyzing the query, I realized that the index that ships by default with foreman is not optimized for the query that is being performed; for a max() where a = x and b = y, you'd expect a perfect index to be (a, b, max_column). So I added that to our foreman db and saw these results:

<pre>
+--------+
| max_id |
+--------+
| 117292 |
+--------+
1 row in set (0.00 sec)
</pre>

And as can be seen from the query analysis, it no longer has to hit the table at all to get the single result this query delivers:

<pre>
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables optimized away |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
1 row in set (0.02 sec)
</pre>

Fixing this should just be a case of updating the index to be:

<pre>
`auditable_index` (`auditable_id`,`auditable_type`,`version`)
</pre>

Comment 1 Tomer Brisker 2017-12-25 14:36:35 UTC
Created from redmine issue http://projects.theforeman.org/issues/22068

Comment 7 Bryan Kearney 2018-10-16 19:25:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2927