Bug 1651297

Summary: Very slow query when using facts on user roles as filters
Product: Red Hat Satellite Reporter: Karl Abbott <kabbott>
Component: FactAssignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA QA Contact: Radovan Drazny <rdrazny>
Severity: high Docs Contact:
Priority: high    
Version: 6.3.4CC: andrew.schofield, aruzicka, baitken, inecas, kabbott, mhulan, pmoravec, tbrisker
Target Milestone: 6.8.0Keywords: Performance, Triaged
Target Release: UnusedFlags: kabbott: needinfo-
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: foreman-1.23 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-27 12:58:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Karl Abbott 2018-11-19 16:12:27 UTC
Description of problem:

We have a host filter in place for a role in which we look to filter hosts by certain facts. The filter is (we also have an edit_filter):

Resource: Hosts
Permissions: view_hosts
Filter: facts.domain="example.net" and facts.my_networkzone  = "durham"


Version-Release number of selected component (if applicable):

6.3.5

How reproducible:

Always

Steps to Reproduce:
1. Run with the above filter.

Actual results:

This Query is slow:

SELECT  "hosts".* FROM "hosts"
INNER JOIN fact_values fact_values_1 ON (hosts.id = fact_values_1.host_id)
INNER JOIN fact_names fact_names_1 ON (fact_names_1.id = fact_values_1.fact_name_id)
INNER JOIN fact_values fact_values_2 ON (hosts.id = fact_values_2.host_id)
INNER JOIN fact_names fact_names_2 ON (fact_names_2.id = fact_values_2.fact_name_id)
WHERE "hosts"."type" IN ('Host::Managed')
AND "hosts"."organization_id" = 1
AND "hosts"."location_id" IN (14, 23, 93, 151, 158, 38, 37, 154, 155, 157, 156, 94, 185, 35, 95, 33, 96, 24, 15, 172, 173, 16)
AND (((fact_names_1.name = 'domain' AND fact_values_1.value = 'example.net') OR (fact_names_2.name = 'my_networkzone' AND fact_values_2.value = 'durham')))
ORDER BY "hosts"."name" ASC LIMIT 30 offset 0;

Explain plan:

 Limit  (cost=13014883.83..13014883.84 rows=1 width=1155) (actual time=419820.592..419820.594 rows=30 loops=1)
   ->  Sort  (cost=13014883.83..13014883.84 rows=1 width=1155) (actual time=419820.590..419820.590 rows=30 loops=1)
         Sort Key: hosts.name
         Sort Method: top-N heapsort  Memory: 40kB
         ->  Hash Join  (cost=731195.68..13014883.82 rows=1 width=1155) (actual time=18253.359..419239.876 rows=633410 loops=1)
               Hash Cond: (fact_values_1.host_id = hosts.id)
               Join Filter: ((((fact_names_1.name)::text = 'domain'::text) AND (fact_values_1.value = 'example.net'::text)) OR (((fact_names_2.name)::t
ext = 'my_networkzone'::text) AND (fact_values_2.value = 'durham'::text)))
               Rows Removed by Join Filter: 826154562
               ->  Hash Join  (cost=4780.73..313547.30 rows=6170559 width=47) (actual time=40.338..5008.211 rows=6160688 loops=1)
                     Hash Cond: (fact_values_1.fact_name_id = fact_names_1.id)
                     ->  Seq Scan on fact_values fact_values_1  (cost=0.00..120077.59 rows=6170559 width=22) (actual time=0.012..1126.942 rows=6160688
 loops=1)
                     ->  Hash  (cost=3015.77..3015.77 rows=86877 width=33) (actual time=40.210..40.210 rows=87337 loops=1)
                           Buckets: 2048  Batches: 8  Memory Usage: 710kB
                           ->  Seq Scan on fact_names fact_names_1  (cost=0.00..3015.77 rows=86877 width=33) (actual time=0.008..19.710 rows=87337 loo
ps=1)
               ->  Hash  (cost=465417.40..465417.40 rows=1602284 width=1202) (actual time=10549.655..10549.655 rows=1989164 loops=1)
                     Buckets: 1024  Batches: 16384 (originally 2048)  Memory Usage: 1184kB
                     ->  Hash Join  (cost=8226.39..465417.40 rows=1602284 width=1202) (actual time=65.430..8132.544 rows=1989164 loops=1)
                           Hash Cond: (fact_values_2.host_id = hosts.id)
                           ->  Hash Join  (cost=4780.73..313547.30 rows=6170559 width=47) (actual time=38.697..4522.089 rows=6160688 loops=1)
                                 Hash Cond: (fact_values_2.fact_name_id = fact_names_2.id)
                                 ->  Seq Scan on fact_values fact_values_2  (cost=0.00..120077.59 rows=6170559 width=22) (actual time=0.006..1093.904
rows=6160688 loops=1)
                                 ->  Hash  (cost=3015.77..3015.77 rows=86877 width=33) (actual time=38.565..38.565 rows=87337 loops=1)
                                       Buckets: 2048  Batches: 8  Memory Usage: 710kB
                                       ->  Seq Scan on fact_names fact_names_2  (cost=0.00..3015.77 rows=86877 width=33) (actual time=0.011..17.898 ro
ws=87337 loops=1)
                           ->  Hash  (cost=2583.07..2583.07 rows=5487 width=1155) (actual time=26.313..26.313 rows=5262 loops=1)
                                 Buckets: 1024  Batches: 8  Memory Usage: 227kB
                                 ->  Seq Scan on hosts  (cost=0.00..2583.07 rows=5487 width=1155) (actual time=0.020..20.112 rows=5262 loops=1)
                                       Filter: (((type)::text = 'Host::Managed'::text) AND (organization_id = 1) AND (location_id = ANY ('{14,23,93,15
1,158,38,37,154,155,157,156,94,185,35,95,33,96,24,15,172,173,16}'::integer[])))
                                       Rows Removed by Filter: 15869
 Total runtime: 419822.889 ms
(30 rows)

Expected results:

Query should execute much faster.

Additional info:

Comment 7 Pavel Moravec 2019-03-27 13:07:01 UTC
Imho this is a duplicate of:

https://bugzilla.redhat.com/show_bug.cgi?id=1511254

that is linked to:

https://projects.theforeman.org/issues/21625

that explicitly speaks about slow query (and also duplicate results).

Worth to close?

Comment 8 Karl Abbott 2019-05-28 12:21:44 UTC
Pavel,

I'm going to ask Blair on that one, but from my recollection, we're probably good to close this one and continue work in 1511254.

Cheers,
Karl

Comment 9 Pavel Moravec 2019-07-03 12:49:35 UTC
Linking support cases from the closed duplicated BZ and making this BZ public (havent found nothing secret in either public comment).

Comment 10 Pavel Moravec 2019-07-03 12:52:28 UTC
Reviewing the upstream issue and this bug description, they dont match - unlinking.

Comment 14 Radovan Drazny 2020-09-03 06:34:12 UTC
Tested on Sat 6.8 Snap 13.1 using the reproducer from the original record. Using a Satellite with more than 30k hosts registered. An user has been created with a custom user role with a host filter using facts enabled. The initial listing of filtered hosts took about four seconds, browsing the list was quick and without issues.

Comment 17 errata-xmlrpc 2020-10-27 12:58:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.8 release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:4366

Comment 18 Red Hat Bugzilla 2023-09-15 00:14:03 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days