Bug 2158849

Summary: Search query on host collections using != retrieves wrong result
Product: Red Hat Satellite Reporter: Paul Dudley <pdudley>
Component: Host CollectionsAssignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED DUPLICATE QA Contact: Satellite QE Team <sat-qe-bz-list>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 6.12.0CC: lufu, mkushwah
Target Milestone: UnspecifiedKeywords: Reopened, Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-06-29 15:08:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Paul Dudley 2023-01-06 19:02:13 UTC
Description of problem:
Using the query of `host_collection = test1 and host_collection != test3` retrieves only the hosts from the first collection mentioned

For example, I have these two collections;
─────────────────────────────────────
[root@iridium ~]# hammer host list --search 'host_collection = test1' --thin true
---|-------------|------------------|------------|--------------|----------------------
ID | NAME        | OPERATING SYSTEM | HOST GROUP | CONTENT VIEW | LIFECYCLE ENVIRONMENT
---|-------------|------------------|------------|--------------|----------------------
58 | dockertest1 |                  |            |              |
57 | dockertest2 |                  |            |              |
---|-------------|------------------|------------|--------------|----------------------
[root@iridium ~]# hammer host list --search 'host_collection = test3' --thin true
---|-------------|------------------|------------|--------------|----------------------
ID | NAME        | OPERATING SYSTEM | HOST GROUP | CONTENT VIEW | LIFECYCLE ENVIRONMENT
---|-------------|------------------|------------|--------------|----------------------
57 | dockertest2 |                  |            |              |
59 | dockertest3 |                  |            |              |
---|-------------|------------------|------------|--------------|----------------------
─────────────────────────────────────

If I run a query to select for all hosts in collection test1 minus the hosts in collection test3 I still only see all hosts in collection test1;
─────────────────────────────────────
[root@iridium ~]# hammer host list --search 'host_collection = test1 and host_collection != test3' --thin true
---|-------------|------------------|------------|--------------|----------------------
ID | NAME        | OPERATING SYSTEM | HOST GROUP | CONTENT VIEW | LIFECYCLE ENVIRONMENT
---|-------------|------------------|------------|--------------|----------------------
58 | dockertest1 |                  |            |              |                      
57 | dockertest2 |                  |            |              |                      
---|-------------|------------------|------------|--------------|----------------------
─────────────────────────────────────



How reproducible:
With this query, all the time using host collections


Steps to Reproduce:
1. add hosts in collection 1
2. add some of the same hosts in collection 2, with others
3. try to use query to retrieve hosts from one collection, minus hosts in the other collection

Additional info:
Using 'and not' seems to work okay, though this is also not flawless. For example;
The simple query from above can be made to work;
─────────────────────────────────────
[root@iridium ~]# hammer host list --search 'host_collection = test1 and not host_collection = test3' --thin true
---|-------------|------------------|------------|--------------|----------------------
ID | NAME        | OPERATING SYSTEM | HOST GROUP | CONTENT VIEW | LIFECYCLE ENVIRONMENT
---|-------------|------------------|------------|--------------|----------------------
58 | dockertest1 |                  |            |              |                      
---|-------------|------------------|------------|--------------|----------------------
(this is the correct output)
─────────────────────────────────────

But adding an *or* to add more collections makes it fail;
─────────────────────────────────────
[root@iridium ~]# hammer host list --search 'host_collection = test1 or host_collection = test2 and not host_collection = test3' --thin true
---|-------------|------------------|------------|--------------|----------------------
ID | NAME        | OPERATING SYSTEM | HOST GROUP | CONTENT VIEW | LIFECYCLE ENVIRONMENT
---|-------------|------------------|------------|--------------|----------------------
58 | dockertest1 |                  |            |              |                      
57 | dockertest2 |                  |            |              |                      
60 | dockertest4 |                  |            |              |                      
---|-------------|------------------|------------|--------------|----------------------
(This should only retrieve dockertest1 and 4, 2 should be excluded from the list as it is a part of the test3 collection)
─────────────────────────────────────

Comment 1 Paul Dudley 2023-01-06 19:07:00 UTC
From the queries made it's possible this errant double quote is the problem;
Query generated for an UNSUCCESSFUL '!=' run;
─────────────────────────────────────
SELECT COUNT(*) FROM (SELECT DISTINCT "hosts"."id" FROM "hosts" LEFT OUTER JOIN "katello_host_collection_hosts"
 ON "katello_host_collection_hosts"."host_id" = "hosts"."id" LEFT OUTER JOIN "katello_host_collections"
 ON "katello_host_collections"."id" = "katello_host_collection_hosts"."host_collection_id" WHERE "hosts"."type" = $1 AND ((("hosts"."id" IN (SELECT "hosts"."id" FROM "hosts"
          INNER JOIN "katello_host_collection_hosts"
                  ON "hosts"."id" = "katello_host_collection_hosts"."host_id"
                  INNER JOIN "katello_host_collections"
                  ON "katello_host_collection_hosts"."host_collection_id" = "katello_host_collections"."id"
         WHERE "katello_host_collections"."name" = 'test1' )) AND ("hosts"."id" IN (SELECT "hosts"."id" FROM "hosts"
          INNER JOIN "katello_host_collection_hosts"
                  ON "hosts"."id" = "katello_host_collection_hosts"."host_id"
                  INNER JOIN "katello_host_collections"
                  ON "katello_host_collection_hosts"."host_collection_id" = "katello_host_collections"."id"
         WHERE "katello_host_collections"."name" <> 'test3' ))))) subquery_for_count
SELECT DISTINCT "hosts"."name" AS alias_0, "hosts"."id" FROM "hosts" LEFT OUTER JOIN "katello_host_collection_hosts"
 ON "katello_host_collection_hosts"."host_id" = "hosts"."id" LEFT OUTER JOIN "katello_host_collections"
 ON "katello_host_collections"."id" = "katello_host_collection_hosts"."host_collection_id" WHERE "hosts"."type" = $1 AND ((("hosts"."id" IN (SELECT "hosts"."id" FROM "hosts"
          INNER JOIN "katello_host_collection_hosts"
                  ON "hosts"."id" = "katello_host_collection_hosts"."host_id"
                  INNER JOIN "katello_host_collections"
                  ON "katello_host_collection_hosts"."host_collection_id" = "katello_host_collections"."id"
         WHERE "katello_host_collections"."name" = 'test1' )) AND ("hosts"."id" IN (SELECT "hosts"."id" FROM "hosts"
          INNER JOIN "katello_host_collection_hosts"
                  ON "hosts"."id" = "katello_host_collection_hosts"."host_id"
                  INNER JOIN "katello_host_collections"
                  ON "katello_host_collection_hosts"."host_collection_id" = "katello_host_collections"."id"
         WHERE "katello_host_collections"."name" <> 'test3' )))) ORDER BY "hosts".""name ASC LIMIT $2 OFFSET $3  <--- here you can see "hosts".""name where the " is likely misplaced and this should instead be "hosts"."name"
SELECT DISTINCT "hosts"."id", "hosts"."name" FROM "hosts" LEFT OUTER JOIN "katello_host_collection_hosts"
 ON "katello_host_collection_hosts"."host_id" = "hosts"."id" LEFT OUTER JOIN "katello_host_collections"
 ON "katello_host_collections"."id" = "katello_host_collection_hosts"."host_collection_id" WHERE "hosts"."type" = $1 AND ((("hosts"."id" IN (SELECT "hosts"."id" FROM "hosts"
          INNER JOIN "katello_host_collection_hosts"
                  ON "hosts"."id" = "katello_host_collection_hosts"."host_id"
                  INNER JOIN "katello_host_collections"
                  ON "katello_host_collection_hosts"."host_collection_id" = "katello_host_collections"."id"
         WHERE "katello_host_collections"."name" = 'test1' )) AND ("hosts"."id" IN (SELECT "hosts"."id" FROM "hosts"
          INNER JOIN "katello_host_collection_hosts"
                  ON "hosts"."id" = "katello_host_collection_hosts"."host_id"
                  INNER JOIN "katello_host_collections"
                  ON "katello_host_collection_hosts"."host_collection_id" = "katello_host_collections"."id"
         WHERE "katello_host_collections"."name" <> 'test3' )))) AND "hosts"."id" IN ($2, $3) ORDER BY "hosts"."name" ASC
─────────────────────────────────────

Comment 2 Lucy Fu 2023-04-25 22:39:48 UTC

*** This bug has been marked as a duplicate of bug 2023281 ***

Comment 3 Lucy Fu 2023-06-29 14:37:54 UTC
Hey Paul,

I don't see the format issue that you got:
   13d81baf |  WHERE "katello_host_collections"."name" <> 'test3' ))) ORDER BY "hosts"."name" ASC LIMIT $5 OFFSET $6  [["type", "Token::Build"], ["type", "Host::Managed"], ["organization_id", 1], ["location_id", 2], ["LIMIT", 20], ["OFFSET", 0]]
The query looks good to me.

Tested the same queries in web UI and got the same results. But they are correct.

--search 'host_collection = test1 or host_collection = test2 and not host_collection = test3' could be read as 
  'host_collection = test1 or (host_collection = test2 and not host_collection = test3)' since the logical-AND operator has higher precedence than the logical-OR operator.

You will get the result you expect if you try --search '(host_collection = test1 or host_collection = test2) and not host_collection = test3'.

So this BZ really is the same as bz 2023281.

Comment 4 Paul Dudley 2023-06-29 15:08:54 UTC
Lucy,

In the intervening time my Satellite version is now 6.13, though I'm unsure if what we saw before was due to version or just a gremlin. Running it again with everything being the same I no longer receive the formatting issue on the query being built. Thanks for your time and testing, and the clarification on the operator hierarchy. Closing again as duplicate of 2023281.

Thanks
Paul

*** This bug has been marked as a duplicate of bug 2023281 ***