Bug 1667647
Summary: | getting host details takes several times more for non-admin user | |||
---|---|---|---|---|
Product: | Red Hat Satellite | Reporter: | Pavel Moravec <pmoravec> | |
Component: | Hosts | Assignee: | Tomer Brisker <tbrisker> | |
Status: | CLOSED ERRATA | QA Contact: | tstrych | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 6.4 | CC: | aeladawy, ahumbe, aperotti, bkearney, cbolz, daniele, egolov, inecas, jbhatia, jentrena, kgaikwad, ktordeur, mhulan, milan.zelenka, mmccune, pdwyer, pmoravec | |
Target Milestone: | 6.9.0 | Keywords: | Performance, PrioBumpField, PrioBumpGSS, PrioBumpPM, PrioBumpQA, Triaged | |
Target Release: | Unused | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | foreman-2.3.0-0 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1899317 | Environment: | |
Last Closed: | 2021-04-21 13:11:45 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1353215 |
Description
Pavel Moravec
2019-01-19 11:15:29 UTC
A similar outcome is observed when fetching the host's facts via:

    time curl -X GET -s -k -u nonadmin:password https://$(hostname -f)/api/v2/hosts/529/facts > /dev/null

And yet another HUGE performance degradation, still very relevant to integration with Ansible Tower: compare (for >250 registered systems):

    time curl -s -u nonadmin:password -H "Accept:application/json,version=2" -H "Content-Type:application/json" "https://$(hostname -f)/api/v2/hosts?per_page=250&page=1" > /dev/null 2>&1

run as an admin and as a non-admin user. It is 5+ times slower for the non-admin user, per my testing.

I wonder if this is related to #1527277

(In reply to Evgeni Golov from comment #7)
> I wonder if this is related to #1527277

Could be. I can try removing all tasks on my reproducer to see whether that improves performance. But I rather suspect that taxonomy and permissions validation is over-complex: with SQL debugging enabled for a host request, I see many times more queries for the non-admin user than for the admin user:

    [root@sat6 ~]# wc host_details.*log
        147    7019  116457 host_details.admin.log
       3702  113273 1111255 host_details.nonadmin.log
       3849  120292 1227712 total
    [root@sat6 ~]# grep -c -e permissions -e taxonomies host_details.*
    host_details.admin.log:8
    host_details.nonadmin.log:3495
    [root@sat6 ~]# grep -c -e taxonomies host_details.*
    host_details.admin.log:6
    host_details.nonadmin.log:3249
    [root@sat6 ~]# grep -c -e permissions host_details.*
    host_details.admin.log:2
    host_details.nonadmin.log:300
    [root@sat6 ~]#

Why 3k queries to taxonomies? All such queries took a similar time, for example:

    2019-01-29 08:21:40 d615b633 [sql] [D] Taxonomy Load (0.6ms) SELECT "taxonomies"."id" FROM "taxonomies" WHERE (("taxonomies"."ancestry" ILIKE '3/4/5/%' OR "taxonomies"."ancestry" = '3/4/5') OR "taxonomies"."id" = 5) ORDER BY "taxonomies"."title" ASC

but the traversal of the taxonomy hierarchy (see "taxonomies"."ancestry" = '3/4/5' or similar) looks like the performance killer?
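The admin versus non-admin comparison above was done with ad-hoc grep -c calls. A minimal sketch of the same check as reusable shell functions follows. The function names count_queries and query_ratio are hypothetical helpers introduced here for illustration (they are not part of Satellite or Foreman), and the input is assumed to be SQL debug logs like the host_details.*.log files shown above.

```shell
# Hypothetical helpers (not part of Satellite/Foreman) for comparing
# how often a table name shows up in two SQL debug logs.

# Print "FILE:COUNT" for each log file, counting lines that match PATTERN.
# Usage: count_queries PATTERN FILE...
count_queries() {
    local pattern="$1"
    shift
    local f
    for f in "$@"; do
        # grep -c prints the number of matching lines; it exits non-zero
        # when nothing matches, so the exit status is ignored here.
        printf '%s:%s\n' "$f" "$(grep -c -e "$pattern" "$f" || true)"
    done
}

# Print how many times more matching lines the non-admin log contains.
# Usage: query_ratio PATTERN ADMIN_LOG NONADMIN_LOG
query_ratio() {
    local pattern="$1"
    local admin_log="$2"
    local nonadmin_log="$3"
    local a n
    a=$(grep -c -e "$pattern" "$admin_log" || true)
    n=$(grep -c -e "$pattern" "$nonadmin_log" || true)
    awk -v a="$a" -v n="$n" 'BEGIN {
        if (a == 0) print "inf"; else printf "%.1f\n", n / a
    }'
}
```

For example, query_ratio taxonomies host_details.admin.log host_details.nonadmin.log would print 541.5 for the counts quoted above (6 and 3249 matching lines).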
(Anyway, even with a "flat" taxonomy of 1 org and 1 loc, I see a host details fetch take twice as long for a non-admin user on another Sat 6.4.1 of mine; the above probably isn't the only cause.)

Trivial reproducer (this was seen at a customer; I will try to confirm whether this direction of the reproducer is the only / most relevant one):

Summary of the reproducer: the more taxonomies are assigned to the user, the worse the results.

Reproducer itself:
- have a Satellite with even very few hosts
- add a user and add all roles to it; don't set the user as an administrator
- create tens of locations and organizations and assign them to all users
- test both the non-admin user and admin: time curl -Ss -u user:password https://$(hostname -f)/api/v2/hosts/4/
- grant Administer privileges to the non-admin user and try the same
- conclusion: the time to get host details for a non-admin user is linear in the number of locations and/or organizations, even though the host is still assigned to the default org/loc (or unassigned, it doesn't matter)

Possible testing script (just customise hostid and the credentials on the first three lines, where creds = credentials of the non-admin user):

    adminPwd=hesloNereknu
    creds=viewer:password
    hostid=4

    hmr="hammer -u admin -p $adminPwd "
    # comma-separated IDs of all users, with the trailing comma stripped
    userids=$($hmr user list | grep '^[0-9]' | awk '{ print $1 }' | tr '\n' ',')
    userids=${userids::-1}
    tests=20

    # time $tests host-details requests and print the average for the current taxonomy counts
    run_test () {
      locs=$(su - postgres -c "psql foreman -c \"copy (select count(*) from taxonomies where type='Location') to stdout\"")
      orgs=$(su - postgres -c "psql foreman -c \"copy (select count(*) from taxonomies where type='Organization') to stdout\"")
      f=results.${locs}.${orgs}.txt
      (for i in $(seq 1 $tests); do
         time curl -Ss -u $creds https://$(hostname -f)/api/v2/hosts/${hostid}/ > /dev/null 2>&1
       done) 2>&1 | grep real > $f
      avg=$(cat $f | awk '{ print $2 }' | sed "s/m/ /g" | sed "s/s//g" | awk -v tests=$tests '{ sum += 60*$1 + $2 } END { print sum/tests }')
      echo "locations:${locs} organizations:${orgs} time:${avg}"
    }

    run_test
    # add 10 locations, measure, add 10 organizations, measure; repeat
    # (separate loop variables so the inner loops do not clobber the outer counter)
    for batch in $(seq 0 9); do
      for n in $(seq $((1+10*batch)) $((10+10*batch))); do
        $hmr location create --name test_loc_${n} --organization-id 1 --user-ids $userids > /dev/null 2>&1
      done
      run_test
      for n in $(seq $((1+10*batch)) $((10+10*batch))); do
        $hmr organization create --name test_org_${n} --user-ids $userids > /dev/null 2>&1
      done
      run_test
    done

Test output:

    locations:1 organizations:1 time:1.0277
    locations:11 organizations:1 time:1.78055
    locations:11 organizations:11 time:2.50255
    locations:21 organizations:11 time:3.36025
    locations:21 organizations:21 time:4.12175
    locations:31 organizations:21 time:4.8261
    locations:31 organizations:31 time:5.6183
    locations:41 organizations:31 time:6.30105
    locations:41 organizations:41 time:7.10295
    locations:51 organizations:41 time:7.89735
    locations:51 organizations:51 time:8.30545
    locations:61 organizations:51 time:8.87265
    locations:61 organizations:61 time:9.38515
    locations:71 organizations:61 time:11.5637
    locations:71 organizations:71 time:12.4431
    locations:81 organizations:71 time:13.2324
    locations:81 organizations:81 time:13.9409

After granting Administer privileges to the user, it takes 0.45s to get the host details, even with 80 locations and 80 organizations.

So I tried with 100 locations on upstream and I can't reproduce the issue. The time is the same under a non-admin user. I will retest on a 6.8 beta instance, but then I may need you to create the reproducer in my environment.

OK, I could reproduce on 6.8, just with much lower numbers. But I see the time growing with the number of taxonomies and have a better understanding of why.

Created redmine issue https://projects.theforeman.org/issues/31024 from this bug

Upstream bug assigned to tbrisker

Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/31024 has been resolved.

Verified with 6.9.0 snap 3 based on comment #16

Reproducer:
1. 50 locations were created, nested in each other: the first location has no parent, the second has the first as its parent, the third has the second, and so on.
2. A host was created and associated with location 50.
3. A new user was created and all roles were added to it (the user should not have admin rights); all organizations and locations were assigned to it.

I basically ran run_test from comment #16 100 times with only the new host:

    time curl -Ss -u test:<password> https://$(hostname -f)/api/v2/hosts/<new_host_id>/

The result was:

    6.8.2 snap 0
    locations:52 organizations:1 time:1.16433

    6.9.0 snap 3
    locations:52 organizations:1 time:1.14409

    6.8.1 snap 4 (the last snap without the speed-up)
    locations:52 organizations:1 time:2.01513

There is a nice difference in performance.

typo - 6.8.2 snap 1

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Satellite 6.9 Release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1313
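The run_test output used throughout this report has the form "locations:N organizations:M time:T". As a rough way to quantify the linear growth claimed in the reproducer, the sketch below fits a least-squares line of time against the total number of taxonomies (locations plus organizations). The function name fit_slope is a hypothetical helper introduced here, and treating locations and organizations as equally expensive is a simplifying assumption, not something the report establishes.

```shell
# Hypothetical helper: least-squares fit of request time versus total
# taxonomy count. Reads "locations:N organizations:M time:T" lines on stdin.
fit_slope() {
    awk '
    NF > 0 {
        # Split every "key:value" field and remember the value by key.
        for (i = 1; i <= NF; i++) {
            split($i, kv, ":")
            v[kv[1]] = kv[2]
        }
        # Assumption: locations and organizations cost the same, so the
        # predictor is simply their sum.
        x = v["locations"] + v["organizations"]
        y = v["time"]
        n++; sx += x; sy += y; sxx += x * x; sxy += x * y
    }
    END {
        d = n * sxx - sx * sx
        if (n < 2 || d == 0) { print "not enough data"; exit 1 }
        slope = (n * sxy - sx * sy) / d
        intercept = (sy - slope * sx) / n
        printf "slope=%.4f s/taxonomy intercept=%.4f s\n", slope, intercept
    }'
}
```

It can be fed one of the saved outputs, for example fit_slope < test_output.txt, where test_output.txt is an assumed file holding the "locations:... organizations:... time:..." lines. A slope that stays clearly above zero for the non-admin user and near zero for an admin would be consistent with the behaviour described in this bug.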