Bug 1544447

Summary: Scaling limitations in Host Search
Product: Red Hat Satellite
Reporter: Beat Rubischon <brubisch>
Component: Hosts
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Version: 6.2.14
CC: inecas, peter.vreman, sshtein, tbrisker
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: x86_64
OS: Linux
Last Closed: 2018-02-19 13:38:41 UTC
Type: Bug
Bug Blocks: 1122832    

Description Beat Rubischon 2018-02-12 14:23:13 UTC
Description of problem:

Complex search terms on the Hosts page tend to generate very large SQL statements on the backend, sometimes forcing PostgreSQL to spill to temporary disk space.

Version-Release number of selected component (if applicable):

Satellite 6.2

How reproducible:

Reproducibility depends on the number of managed hosts

Steps to Reproduce:
1. Click on "Good host reports in the last 35 minutes" in the Satellite dashboard
2. Watch space used in /var/lib/pgsql
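An easier way to observe this than watching /var/lib/pgsql directly is PostgreSQL's own per-database temp-file counters. A minimal sketch, assuming psql access to the Satellite's "foreman" database (database name is an assumption here):

  -- temp_files / temp_bytes count what PostgreSQL spilled to disk for sorts, hashes, etc.
  SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_size
  FROM pg_stat_database
  WHERE datname = 'foreman';

If these counters jump while the dashboard search is running, the generated query is spilling to temporary files.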

Actual results:

In large environments, the database consumes a large amount of temporary space

Expected results:

Queries complete quickly and without excessive temporary space usage

Additional info:

Comment 7 Tomer Brisker 2018-02-19 13:38:41 UTC
This appears to be a duplicate of BZ 1447958, which was fixed in a 6.2.9 async erratum.

*** This bug has been marked as a duplicate of bug 1447958 ***

Comment 8 Peter Vreman 2018-02-19 14:01:28 UTC
Tomer,

The issue was found on 6.2.9, which already had the patch from BZ 1447958, so the duplicate is not correct. Also, there is no OOM kill being seen.

Peter

Comment 9 Peter Vreman 2018-02-19 14:23:27 UTC
Small correction from my side. We were running Sat 6.2.9 with only the patch from https://github.com/Katello/katello/pull/6774/files applied, to fix the blocking OOM issue.

Do the other patches included in RHBA-2017:1234 also reduce the memory consumption in Postgres?

Peter

Comment 11 Tomer Brisker 2018-02-20 11:08:43 UTC
Hi Peter,

The other issues in the errata are unrelated. 

However, the logs you provided indicate that the patch was not in effect on the affected system. Perhaps the patch got reverted somehow?
Specifically, this part of the query is only present on unpatched 6.2.9 systems:

LEFT OUTER JOIN "katello_content_facets" ON "katello_content_facets"."host_id" = "hosts"."id"
LEFT OUTER JOIN "katello_content_views" ON "katello_content_views"."id" = "katello_content_facets"."content_view_id"
LEFT OUTER JOIN "katello_content_facets" "content_facets_hosts_join" ON "content_facets_hosts_join"."host_id" = "hosts"."id"
LEFT OUTER JOIN "katello_environments" ON "katello_environments"."id" = "content_facets_hosts_join"."lifecycle_environment_id"
LEFT OUTER JOIN "katello_subscription_facets" ON "katello_subscription_facets"."host_id" = "hosts"."id"
LEFT OUTER JOIN "katello_content_facets" "content_facets_hosts_join_2" ON "content_facets_hosts_join_2"."host_id" = "hosts"."id"
LEFT OUTER JOIN "katello_content_facet_errata" ON "katello_content_facet_errata"."content_facet_id" = "content_facets_hosts_join_2"."id"
LEFT OUTER JOIN "katello_errata" ON "katello_errata"."id" = "katello_content_facet_errata"."erratum_id"
LEFT OUTER JOIN "katello_content_facets" "content_facets_hosts" ON "content_facets_hosts"."host_id" = "hosts"."id"
LEFT OUTER JOIN "katello_content_facet_repositories" ON "katello_content_facet_repositories"."content_facet_id" = "content_facets_hosts"."id"
LEFT OUTER JOIN "katello_repositories" ON "katello_repositories"."id" = "katello_content_facet_repositories"."repository_id"
LEFT OUTER JOIN "katello_content_facet_errata" "content_facet_errata_katello_content_facets_join" ON "content_facet_errata_katello_content_facets_join"."content_facet_id" = "content_facets_hosts"."id"
LEFT OUTER JOIN "katello_errata" "applicable_errata_katello_content_facets" ON "applicable_errata_katello_content_facets"."id" = "content_facet_errata_katello_content_facets_join"."erratum_id"
LEFT OUTER JOIN "katello_content_views" "content_views_katello_content_facets" ON "content_views_katello_content_facets"."id" = "content_facets_hosts"."content_view_id"
LEFT OUTER JOIN "katello_environments" "lifecycle_environments_katello_content_facets" ON "lifecycle_environments_katello_content_facets"."id" = "content_facets_hosts"."lifecycle_environment_id"

In the other BZ, the OOM killer kicked in either before disk space was exhausted by the query, or after the query had completed, while processing the results.
Both, however, are symptoms of the same sub-optimal query being generated, which produced a huge join table for certain searches.
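To illustrate the row multiplication (a simplified sketch using the table names above, not the exact generated query): two independent one-to-many joins hanging off the same content facet repeat every host row once per errata/repository combination:

  SELECT COUNT(*)
  FROM "hosts"
  LEFT OUTER JOIN "katello_content_facets"
    ON "katello_content_facets"."host_id" = "hosts"."id"
  LEFT OUTER JOIN "katello_content_facet_errata"
    ON "katello_content_facet_errata"."content_facet_id" = "katello_content_facets"."id"
  LEFT OUTER JOIN "katello_content_facet_repositories"
    ON "katello_content_facet_repositories"."content_facet_id" = "katello_content_facets"."id";
  -- A host with 500 applicable errata and 100 enabled repositories contributes
  -- 500 * 100 = 50,000 rows to the intermediate result before any DISTINCT/aggregation.

Sorting or de-duplicating an intermediate set like that is what pushes PostgreSQL past work_mem and onto temporary files on disk.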

Comment 12 Peter Vreman 2018-02-20 11:36:19 UTC
Tomer,

Thanks for confirming that the query we were seeing is the issue mentioned for 6.2.9.

I can confirm that it was my mistake. I checked my history of what I did on the affected Sat6 instance and noted that in Jan-2018 I had to apply a new patchset for interfaces-fact parsing. I used my latest patchset there, based on 6.2.12, and that 6.2.12 patchset no longer included the mentioned fix for the long query above.

That also explains why I did not see the problem on that Sat6 instance during the previous OS patching rounds in 2017.


Thanks for taking the time to explain,
Peter