Bug 1656470

Summary:	Available Errata report performs poorly for some filters
Product:	Red Hat Satellite	Reporter:	Lukáš Hellebrandt <lhellebr>
Component:	Errata Management	Assignee:	John Mitsch <jomitsch>
Status:	CLOSED ERRATA	QA Contact:	Lukáš Hellebrandt <lhellebr>
Severity:	low	Docs Contact:
Priority:	unspecified
Version:	6.5.0	CC:	ehelms, inecas, jomitsch, jturel, lhellebr, mhulan, oprazak
Target Milestone:	6.5.0	Keywords:	Triaged
Target Release:	Unused
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	tfm-rubygem-katello-3.10.0.24-1	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2019-05-14 12:39:23 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Lukáš Hellebrandt 2018-12-05 15:20:38 UTC

Description of problem:
I've tested performance of the report with the following results:

*a) 10000 hosts with these channels synced (but no applicable errata): rhel-7-server-extras-rpms, rhel-7-server-optional-rpms, rhel-7-server-rpms, rhel-7-server-satellite-capsule-6.4-rpms, rhel-7-server-satellite-tools-6.4-rpms - 80 s
*b) The same as a) but with filter to 0 hosts: 1 s
*c) The same as a) but added 63 hosts with 10 000 applicable errata: 140 s
*d) The same as c) but filtered to only 7 hosts: 16 s
*e) The same as c) but errata filtered to 0 (search in form 'nonexistentname'): 16 m
*f) The same as c) but errata filtered to 0 (search in form 'id="nonexistentname"'): 90 s
*g) The same as c) but errata filtered to 1: 110 s
*h) The same as a) but with filters from a) and g): 11 s

Notice the 16 *minutes* in e) compared to 90 *seconds* in f), although the two cases differ only in the form of the filter. This may be expected behavior (the form in e) compares against different data and may be that much more expensive), but I think it is worth reporting and checking.
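
To put the gap in numbers, here is a quick sketch in Python that uses only the timings listed above (16 m is converted to 960 s; the helper name `slowdown` is made up for this illustration):

```python
# Timings from the scenarios above, in seconds (e: 16 minutes = 960 s).
timings = {
    "a": 80, "b": 1, "c": 140, "d": 16,
    "e": 16 * 60, "f": 90, "g": 110, "h": 11,
}

def slowdown(slow: str, fast: str) -> float:
    """Return how many times longer scenario `slow` took than scenario `fast`."""
    return timings[slow] / timings[fast]

ratio_e_f = slowdown("e", "f")
print(f"e) took {ratio_e_f:.1f}x as long as f)")  # 960 / 90, about 10.7
```

So the unqualified form in e) took roughly ten times as long as the explicit form in f) on the same data.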

Tested on a RHEL7 machine with Intel Xeon, 4 sockets, 32 cores, hyperthreading, 2.7 GHz, 128 GB RAM.

Version-Release number of selected component (if applicable):
Sat 6.5 snap 5

How reproducible:
Deterministic but depends on configuration

Steps to Reproduce:
1. Monitor -> Report Templates
2. In the Available Errata report's row, click Generate
3. Fill in the search in form 'id="nonexistentname"'
4. Submit
5. Do the same as in 1..4 but with search 'nonexistentname'

Actual results:
The second report generation takes much longer

Expected results:
Not sure. Perhaps a lesser difference.

Comment 3 Marek Hulan 2018-12-07 13:20:06 UTC

This basically means errata allow searching on too many columns without specifying :only_explicit => true, so the searchable fields will need to be limited. An unqualified search term is currently matched against all fields, such as the updated timestamp, severity, issued dates, all linked packages, CVEs, etc.

Moving to errata management to find out whether this could be limited to a smaller set of fields. I'm happy to send the patch, but I'm not sure whether the broad search is being used somewhere.
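
A rough sketch of the effect, in Python rather than the actual Ruby code: it only models the idea that an unqualified term is compared against every field not marked explicit-only. The field names, the `SearchField` class, and the choice of which fields to restrict are all assumptions for illustration, not Katello's real definitions or the eventual patch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SearchField:
    name: str                    # illustrative column name, not Katello's schema
    only_explicit: bool = False  # True: used only in "name = value" queries

def free_text_columns(fields):
    """Columns an unqualified term such as 'nonexistentname' is compared against."""
    return [f.name for f in fields if not f.only_explicit]

def free_text_where(fields):
    """Illustrative WHERE clause generated for one unqualified term."""
    return " OR ".join(f"{col} LIKE ?" for col in free_text_columns(fields))

NAMES = ["title", "severity", "issued", "updated", "package_name", "cve"]

# No field is explicit-only: the term fans out over every column.
unrestricted = [SearchField(n) for n in NAMES]
# Hypothetical restriction: only "title" stays searchable by a bare term.
restricted = [SearchField(n, only_explicit=(n != "title")) for n in NAMES]

print(free_text_where(unrestricted))
print(free_text_where(restricted))
```

In this model, the more fields (and joined tables such as packages and CVEs) an unqualified term fans out over, the more work each report query has to do.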

Comment 4 John Mitsch 2019-02-11 17:56:26 UTC

Created redmine issue https://projects.theforeman.org/issues/26030 from this bug

Comment 5 Bryan Kearney 2019-02-11 19:03:42 UTC

Upstream bug assigned to jomitsch

Comment 7 Jonathon Turel 2019-02-11 21:38:59 UTC

Hey Lukáš,

I'm testing John's fix for this. Do you still have the machine in Beaker with the data set up? I'd like to apply his patch and do some comparisons if possible.

Comment 8 Bryan Kearney 2019-02-13 15:02:29 UTC

Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/26030 has been resolved.

Comment 9 Lukáš Hellebrandt 2019-02-18 08:44:24 UTC

Hi Jonathon,
No, I don't have the machine anymore.

Comment 12 Lukáš Hellebrandt 2019-03-12 18:23:28 UTC

Results:
a) 75s
b) 0s
c) 120s
d) 4s
e) 0s
f) Field 'id' not recognized for searching!
g) 90s
h) 4s
i) The same as c), but filtered to 0 errata: 145s

Comments/questions:
e) is fixed, I call that an improvement!
f) started failing but I suppose this is intentional. Just to make sure, someone confirm it, please.
i) is an issue I newly discovered: when filtering to 0 errata, I would expect the query to run much quicker but it does the opposite - it takes a long time. Do you want me to file a separate bug about it or fail this BZ based on it?

Comment 13 John Mitsch 2019-03-12 20:13:19 UTC

Hey Lukáš,

I am following your tables. Let me make sure I understand: generating an errata report for 10000 hosts with an unspecified search query like 'nonexistent' went from 16 minutes to 0 seconds?

If so, that doesn't seem right. Were there any errors? I would expect it to take less time, but not zero seconds. What happens when you run an unspecified query that actually should match errata or a host?

The "Field 'id' not recognized for searching!" sounds like a valid bug, though it doesn't seem like the changes I've made would introduce that error. I'll check upstream as well.

What is the scenario for i? I'm having trouble following which letter is which.

Comment 14 Lukáš Hellebrandt 2019-03-13 10:13:02 UTC

Hi John,

I've looked into it once again and you were right. In the case of e), I mistakenly entered the filter into the Hosts field rather than the Errata field, which resulted in the 0s time.
The correct time for e) is: 140s
... which is still a huge improvement!
Also, I haven't noticed any tracebacks or errors, and the data seems correct: no entries for e), and (hosts X errata) entries for filters that match something.

As for the "Field 'id' not recognized for searching!" error in f), exactly the same mistake happened: this error actually occurs for the host filter, which is most likely unrelated to this BZ, and I doubt it is a bug at all.

It also seems that i) was actually the correct way to run e), and the measured results confirm it.

As a result, I think you can completely disregard my previous comment. However, I noticed another odd thing: an errata filter of the form "id=<existing_erratum>" matches the erratum, while a filter of the form "<existing_erratum>" matches nothing. For example, the filter "id=RHBA-2018:3014" shows hosts where erratum RHBA-2018:3014 can be applied, while the filter "RHBA-2018:3014" shows empty results. The same happens when searching for errata on the Errata page, so it might be correct behavior. Either way, I once again need you to confirm that this is the expected behavior.

Comment 15 John Mitsch 2019-03-13 14:58:52 UTC

Thanks for clarifying Lukáš,

Currently, the errata id needs to be explicitly specified. This looks like it has been the case for a while, though I'm not sure why that was decided. Either way, it is currently the expected behavior.
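
As a rough model of that behavior (a Python sketch, not the actual search code; the record, the `ONLY_EXPLICIT` set, and the `matches` helper are hypothetical), an explicit-only field is consulted only when the query names it:

```python
# Hypothetical erratum record and field configuration, for illustration only.
ERRATUM = {"id": "RHBA-2018:3014", "title": "example bug fix update"}
ONLY_EXPLICIT = {"id"}  # fields matched only when named in the query

def matches(query: str, record: dict) -> bool:
    """Return True if `record` matches `query`.

    A 'field=value' query compares that one field exactly; a bare term is
    compared against every field that is not explicit-only.
    """
    field, sep, value = query.partition("=")
    if sep and field in record:
        return record[field] == value
    return any(
        query in str(v) for k, v in record.items() if k not in ONLY_EXPLICIT
    )

print(matches("id=RHBA-2018:3014", ERRATUM))  # True: id is named explicitly
print(matches("RHBA-2018:3014", ERRATUM))     # False: a bare term skips id
```

This mirrors what was observed: "id=RHBA-2018:3014" finds the erratum, while the bare "RHBA-2018:3014" returns nothing.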

Let me know if you need anything else from me!

Comment 16 Lukáš Hellebrandt 2019-03-13 15:07:49 UTC

Verified with Sat 6.5 snap 18

Comment 19 errata-xmlrpc 2019-05-14 12:39:23 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:1222