Bug 1667647 - getting host details takes several times more for non-admin user
Summary: getting host details takes several times more for non-admin user
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Hosts
Version: 6.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: 6.9.0
Assignee: Tomer Brisker
QA Contact: tstrych
URL:
Whiteboard:
Depends On:
Blocks: 1353215
 
Reported: 2019-01-19 11:15 UTC by Pavel Moravec
Modified: 2024-03-25 15:12 UTC (History)

Fixed In Version: foreman-2.3.0-0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1899317
Environment:
Last Closed: 2021-04-21 13:11:45 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Foreman Issue Tracker | 31024 | 0 | High | Closed | Host show api endpoint is slow for non-admin user | 2021-02-19 00:33:32 UTC
Red Hat Knowledge Base (Solution) | 3818401 | 0 | None | None | None | 2019-01-19 13:04:11 UTC
Red Hat Product Errata | RHSA-2021:1313 | 0 | None | None | None | 2021-04-21 13:12:12 UTC

Description Pavel Moravec 2019-01-19 11:15:29 UTC
Description of problem:
When a non-admin user fetches host details, it takes much longer than for an admin user. This is similar to:

https://bugzilla.redhat.com/show_bug.cgi?id=1598855

but only for getting host info, not editing.

When the upstream patch from bz1598855 was applied to Sat 6.4.1, no performance improvement was noticed. So this is a separate issue (or that patch is not general enough to cover both scenarios).

This bug is critical when Satellite is an Inventory source for Ansible Tower, where the foreman.py script (sequentially) queries each and every Host in Satellite - ideally as a non-admin user with read-only permissions. That is where the performance problems show up.


Version-Release number of selected component (if applicable):
6.4.1, seen also on 6.3


How reproducible:
100%


Steps to Reproduce:
1. Have a Satellite with some Hosts (I think the count does not matter)
2. Have a non-admin user with all rights but not Admin checkbox.
3. Run:

time curl -X GET -s -k -u nonadmin:password https://$(hostname -f)/api/v2/hosts/529 > /dev/null

4. Select Admin checkbox for the user.
5. Run the curl command again.


Actual results:
Step 3 takes several times longer than step 5.


Expected results:
Step 3 takes a time comparable to step 5. A somewhat higher value is understandable, but not several times more.
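
A minimal sketch of the comparison in steps 3 and 5, assuming curl's `%{time_total}` write-out variable is used instead of the shell's `time`. The helper names `fetch_time` and `slowdown_ratio` are hypothetical, and the credentials and host id in the usage comment are the placeholders from the steps above.

```shell
# Hypothetical helper: print the total request time in seconds for one GET.
# -k skips certificate verification, as in the reproducer above.
fetch_time () {
    local creds=$1 url=$2
    curl -s -k -o /dev/null -u "$creds" -w '%{time_total}\n' "$url"
}

# Hypothetical helper: ratio of two timings (first / second), two decimals.
slowdown_ratio () {
    awk -v a="$1" -v b="$2" 'BEGIN { if (b > 0) printf "%.2f\n", a / b }'
}

# Usage (not run here, needs a live Satellite):
#   url="https://$(hostname -f)/api/v2/hosts/529"
#   slowdown_ratio "$(fetch_time nonadmin:password "$url")" \
#                  "$(fetch_time admin:password "$url")"
```

A ratio well above 1 would correspond to the "several times higher" result reported here.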


Additional info:

Comment 3 Pavel Moravec 2019-01-19 12:09:12 UTC
A similar outcome is observed when fetching the host's facts via:

time curl -X GET -s -k -u nonadmin:password https://$(hostname -f)/api/v2/hosts/529/facts > /dev/null

Comment 4 Pavel Moravec 2019-01-19 12:59:26 UTC
And yet another HUGE performance degradation - still very relevant to integration with Ansible Tower:

compare (for >250 systems registered):

time curl -s -u nonadmin:password -H "Accept:application/json,version=2" -H "Content-Type:application/json" "https://$(hostname -f)/api/v2/hosts?per_page=250&page=1" > /dev/null 2>&1

with admin and non-admin user. 5+ times slower for non-admin user, per my testing.

Comment 7 Evgeni Golov 2019-01-29 07:47:15 UTC
I wonder if this is related to #1527277

Comment 8 Pavel Moravec 2019-01-29 08:08:15 UTC
(In reply to Evgeni Golov from comment #7)
> I wonder if this is related to #1527277

Could be - I can try removing all tasks on my reproducer to see whether that improves performance.

But I rather suspect that taxonomy and permissions validation is over-complex; with SQL debugging enabled for a single host request, I see many times more queries for the non-admin user than for the admin user:

[root@sat6 ~]# wc host_details.*log
    147    7019  116457 host_details.admin.log
   3702  113273 1111255 host_details.nonadmin.log
   3849  120292 1227712 total
[root@sat6 ~]# grep -c -e permissions -e taxonomies host_details.*
host_details.admin.log:8
host_details.nonadmin.log:3495
[root@sat6 ~]# grep -c -e taxonomies host_details.*
host_details.admin.log:6
host_details.nonadmin.log:3249
[root@sat6 ~]# grep -c -e permissions host_details.*
host_details.admin.log:2
host_details.nonadmin.log:300
[root@sat6 ~]# 


Why 3k queries to taxonomies? All such queries took similar time, like:

2019-01-29 08:21:40 d615b633 [sql] [D]   Taxonomy Load (0.6ms)  SELECT "taxonomies"."id" FROM "taxonomies" WHERE (("taxonomies"."ancestry" ILIKE '3/4/5/%' OR "taxonomies"."ancestry" = '3/4/5') OR "taxonomies"."id" = 5)  ORDER BY "taxonomies"."title" ASC

but could the traversal of the taxonomy hierarchy (see "taxonomies"."ancestry" = '3/4/5' or similar) be the performance killer?

(Anyway, with a "flat" taxonomy of 1 org and 1 loc, I still see a host details fetch taking twice as long for a non-admin user on another Sat 6.4.1 of mine; so the above probably isn't the only cause.)
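
The per-table counts in this comment can be gathered in one pass. This is a minimal sketch, assuming the SQL debug output of each request has been saved to its own file as above; `count_table_queries` is a hypothetical helper name. Like `grep -c`, it counts matching lines, not individual occurrences.

```shell
# Hypothetical helper: for each table name, print how many log lines mention it.
# Usage: count_table_queries LOGFILE TABLE...
count_table_queries () {
    local log=$1 table
    shift
    for table in "$@"; do
        # grep -c prints 0 (and exits non-zero) when nothing matches
        printf '%s:%s\n' "$table" "$(grep -c -e "$table" "$log")"
    done
}

# Usage:
#   count_table_queries host_details.nonadmin.log taxonomies permissions
```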

Comment 16 Pavel Moravec 2020-08-20 13:44:41 UTC
Trivial reproducer (that was seen at some customer - I will try to confirm if this direction of the reproducer is the only / most relevant one):

summary of the reproducer: the more taxonomies are assigned to the user, the worse the results

reproducer itself:
- have Satellite with even very few hosts
- add a user and assign all roles to it; do not make the user an administrator
- create tens of locations and organisations and assign them to all users
- test both the non-admin user and admin:

time curl -Ss -u user:password https://$(hostname -f)/api/v2/hosts/4/

- grant Administer privileges to the non-admin user and try the same

- conclusion: the time to get host details for a non-admin user is linear with respect to the number of locations and/or organizations, even though the host is still assigned to the default org/loc (or unassigned, it doesn't matter)

Possible testing script (just customise the credentials and hostid on the first three lines, where creds = credentials of the non-admin user):

adminPwd=hesloNereknu
creds=viewer:password
hostid=4

hmr="hammer -u admin -p $adminPwd "
userids=$($hmr user list | grep '^[0-9]' | awk '{ print $1 }' | tr '\n' ',')
userids=${userids::-1}
tests=20

run_test () {
        locs=$(su - postgres -c "psql foreman -c \"copy (select count(*) from taxonomies where type='Location') to stdout\"")
        orgs=$(su - postgres -c "psql foreman -c \"copy (select count(*) from taxonomies where type='Organization') to stdout\"")
        f=results.${locs}.${orgs}.txt
        (for i in $(seq 1 $tests); do time curl -Ss -u $creds https://$(hostname -f)/api/v2/hosts/${hostid}/ > /dev/null 2>&1; done) 2>&1 | grep real > $f
        avg=$(cat $f | awk '{ print $2 }' | sed "s/m/ /g" | sed "s/s//g" | awk -v tests=$tests '{ sum += 60*$1 + $2 } END { print sum/tests }')
        echo "locations:${locs} organizations:${orgs} time:${avg}"
}

run_test
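# Each batch adds 10 locations, measures, then adds 10 organizations and measures again.
# Distinct loop variables keep the inner loops from overwriting the batch counter.
for batch in $(seq 0 9); do
        for n in $(seq $((1+10*batch)) $((10+10*batch))); do
                $hmr location create --name test_loc_${n} --organization-id 1 --user-ids $userids > /dev/null 2>&1
        done
        run_test
        for n in $(seq $((1+10*batch)) $((10+10*batch))); do
                $hmr organization create --name test_org_${n} --user-ids $userids > /dev/null 2>&1
        done
        run_test
done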
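
The averaging step inside run_test chains cat, awk and two sed calls. A minimal sketch of the same calculation in a single awk call, assuming the input is the `real` lines printed by bash's `time` (for example `real 0m1.027s`); `avg_real_seconds` is a hypothetical helper name, and it divides by the number of lines actually read, not by a fixed test count.

```shell
# Hypothetical helper: average the "real" timings read from stdin, in seconds.
avg_real_seconds () {
    awk '$1 == "real" {
             split($2, t, /[ms]/)          # "0m1.027s" -> t[1]="0", t[2]="1.027"
             sum += 60 * t[1] + t[2]
             n++
         }
         END { if (n) printf "%.3f\n", sum / n }'
}

# Usage:
#   avg_real_seconds < results.81.81.txt
```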


Test output:

locations:1 organizations:1 time:1.0277
locations:11 organizations:1 time:1.78055
locations:11 organizations:11 time:2.50255
locations:21 organizations:11 time:3.36025
locations:21 organizations:21 time:4.12175
locations:31 organizations:21 time:4.8261
locations:31 organizations:31 time:5.6183
locations:41 organizations:31 time:6.30105
locations:41 organizations:41 time:7.10295
locations:51 organizations:41 time:7.89735
locations:51 organizations:51 time:8.30545
locations:61 organizations:51 time:8.87265
locations:61 organizations:61 time:9.38515
locations:71 organizations:61 time:11.5637
locations:71 organizations:71 time:12.4431
locations:81 organizations:71 time:13.2324
locations:81 organizations:81 time:13.9409


After granting Administer privileges to the user, it takes 0.45s to get the host details - even with 80 locations and 80 organizations.
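
To put a number on the linear growth, here is a minimal sketch that computes the added time per taxonomy from result lines in the format above. It takes a simple slope between the first and last measurement, not a least-squares fit, and `per_taxonomy_cost` is a hypothetical helper name.

```shell
# Hypothetical helper: reads "locations:N organizations:M time:T" lines on stdin
# and prints (last time - first time) / (last taxonomy count - first count).
per_taxonomy_cost () {
    awk -F'[: ]' '
        { n = $2 + $4; t = $6 }            # taxonomy count and time of this line
        NR == 1 { n0 = n; t0 = t }         # remember the first measurement
        END { if (NR > 1 && n != n0) printf "%.4f\n", (t - t0) / (n - n0) }'
}

# Usage:
#   printf '%s\n' 'locations:1 organizations:1 time:1.0277' \
#                 'locations:81 organizations:81 time:13.9409' | per_taxonomy_cost
```

Fed the first and last lines of the test output above, this prints 0.0807, i.e. roughly 0.08s of extra request time per additional location or organization.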

Comment 17 Marek Hulan 2020-08-24 16:35:35 UTC
So I tried with 100 locations on the upstream and I can't reproduce the issue. The time is the same under non-admin user. I will retest on 6.8 beta instance, but then I may need you to create the reproducer on my environment.

Comment 18 Marek Hulan 2020-08-24 16:53:10 UTC
Ok I could reproduce on 6.8, just with much lower numbers. But I see the time growing with number of taxonomies and have a better understanding why.

Comment 19 Ondřej Ezr 2020-10-07 20:44:32 UTC
Created redmine issue https://projects.theforeman.org/issues/31024 from this bug

Comment 20 Bryan Kearney 2020-10-26 17:02:50 UTC
Upstream bug assigned to tbrisker

Comment 21 Bryan Kearney 2020-10-26 17:02:52 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/31024 has been resolved.

Comment 23 tstrych 2020-12-02 23:25:34 UTC
Verified with 6.9.0 snap 3 

based on comment #16

Reproducer:
1. 50 locations were created, nested in one another: the first location has no parent, the second has the first as its parent, the third has the second, and so on.
2. A host was created and associated with location 50.
3. A new user was created with all roles added (the user should not have admin rights), and all organizations and locations were assigned to that user.


I basically ran run_test from comment #16 100 times against only the new host:
time curl -Ss -u test:<password> https://$(hostname -f)/api/v2/hosts/<new_host_id>/


The result was:
6.8.2 snap 0
locations:52 organizations:1 time:1.16433

6.9.0 snap 3
locations:52 organizations:1 time:1.14409

6.8.1 snap 4 (the last snap without the speed-up)
locations:52 organizations:1 time:2.01513

There is a nice difference in performance.

Comment 24 tstrych 2020-12-02 23:29:02 UTC
typo - 6.8.2 snap 1

Comment 27 errata-xmlrpc 2021-04-21 13:11:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.9 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1313

