Bug 1339696
Summary: Importing manifest gets slow with increasing number of organizations
Product: Red Hat Satellite
Reporter: Roman Plevka <rplevka>
Component: Subscription Management
Assignee: Justin Sherrill <jsherril>
Status: CLOSED ERRATA
QA Contact: Roman Plevka <rplevka>
Severity: low
Priority: medium
Version: 6.2.0
CC: abalakht, bbuckingham, bcourt, cdonnell, egolov, ehelms, jcallaha, jsherril, mstead, redakkan, rplevka, sghai, skallesh, tomckay, vrjain, zhunting
Target Milestone: Unspecified
Keywords: PrioBumpGSS, PrioBumpQA, Reopened, Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Fixed In Version: tfm-rubygem-katello-3.4.4
Clones: 1340229, 1340245, 1655981
Last Closed: 2018-02-21 16:59:05 UTC
Type: Bug
Bug Blocks: 1339766, 1340229

Description
Roman Plevka
2016-05-25 16:01:09 UTC
This will be fixed in a future release. See https://bugzilla.redhat.com/show_bug.cgi?id=1340229 for where it will be fixed.

Could you please provide more information about the manifest that you are importing, or the manifest itself if you still have it?

Created attachment 1171924
time measurements of the import task
# attached file contains import task time measurements up to 130 orgs. I'll just plot first 30 orgs for a better resolution.
gnuplot> plot 'times_all.csv' every ::0::20 using 1:3 with points title columnhead
[ASCII plot of 'times_all.csv': import task time (y-axis, 0 to 50) against Org # (x-axis, 0 to 25). The points rise steadily, from roughly 5 for the first organizations to roughly 50 at the right edge of the plot.]
Created attachment 1171926
plot of the time measurements
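
One way to put a number on the trend in such measurements is a least-squares line through the (org index, import time) pairs. The sketch below is only an illustration: the function name `fit_line` and the sample points are hypothetical and are not taken from the attached CSV.

```python
def fit_line(points):
    """Return (slope, intercept) of the least-squares line through (x, y) pairs.

    Here x is meant to be the organization index and y the import time
    in seconds; the slope is then the extra time per added organization.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical measurements, not the data from the attachment.
slope, intercept = fit_line([(1, 5.0), (2, 7.0), (3, 9.0)])
```

A slope close to zero would indicate that the import time does not depend on the number of organizations.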
Thanks Roman. Would you mind describing your exact reproducer steps?

Created attachment 1171953
candlepin.log - ELQpercZTmZ org sample
Attaching a part of the candlepin log, taken when 100+ organizations had already been created.
This log captures the creation of org ELQpercZTmZ and the import of its manifest.
This action triggers about 1054 (!) GET requests to candlepin, many of them duplicated.
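
A count like the one above can be reproduced with a short script that tallies GET requests per URI. This is a minimal sketch: the log line shape (`verb=GET, uri=...`) is an assumption about the request log format and may need adjusting to the actual candlepin.log, and the helper names are hypothetical.

```python
import re
from collections import Counter

# ASSUMPTION: each request line contains "verb=<METHOD>, uri=<path>".
# Adjust the pattern if the real log lines look different.
REQUEST_RE = re.compile(r"verb=(?P<verb>[A-Z]+),\s*uri=(?P<uri>[^\s,]+)")


def count_get_requests(lines):
    """Count GET requests per URI in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group("verb") == "GET":
            counts[match.group("uri")] += 1
    return counts


def duplicated(counts):
    """Return only the URIs that were requested more than once."""
    return {uri: n for uri, n in counts.items() if n > 1}
```

Usage would be along the lines of `counts = count_get_requests(open("candlepin.log"))`, after which `sum(counts.values())` gives the total number of GET requests and `duplicated(counts)` lists the repeated ones.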
(In reply to Roman Plevka from comment #8)
> Created attachment 1171953
> candlepin.log - ELQpercZTmZ org sample
>
> Attaching a part of candlepin log, where there are already 100+
> organizations created.
> This log captures a creation of org: ELQpercZTmZ and importing its manifest.
> This action triggers about 1054 (!) GET requests to candlepin - many of them
> are duplicated.

According to this, I don't think the issue lies inside the Candlepin component. It looks like the problem is in Katello.

(In reply to Michael Stead from comment #7)
> Thanks Roman. Would you mind describing your exact reproducer steps?

I used Satellite 6.2 beta (GA17.0). On a clean installation I performed the following in a loop:
- create an organization using the Satellite API
- start timer
- upload a manifest using the Satellite API
- stop timer

There are 2 datasets in the CSV I've attached. The first one contains the measurements taken through Nailgun (a Python framework for the Satellite API). It has worse resolution, because Nailgun checks the status of tasks at 5-second intervals, so the measurements are rounded to 5 seconds. The second dataset contains the times (in seconds) extracted from the `hammer task list` started_at and ended_at times.

Anyway, this won't be reproducible in Candlepin on its own, as I think it's Katello which sends many GET requests to Candlepin (see my previous comments and attached logs).

Looking at the import manifest code in Katello, there are indeed many places where Candlepin is being hit for multiple orgs.
https://github.com/Katello/katello/blob/master/app/models/katello/glue/candlepin/candlepin_object.rb#L37
Part of this, I believe, is because Katello needs to keep its data state absolutely in sync with Candlepin. There may be cases for optimization here, but the risk versus reward of refactoring this area needs to be considered carefully. Overall, I would agree that this is a Katello issue and not a Candlepin one.

Moving 6.2 bugs out to sat-backlog.
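
The reproducer loop described in this thread (create an organization, then time only the manifest upload) can be sketched as follows. This is not the script that was actually used: `create_org` and `upload_manifest` are hypothetical callables standing in for the Satellite API calls, and the timestamp format passed to `task_duration_seconds` is an assumption about how started_at and ended_at are printed.

```python
import time
from datetime import datetime


def measure_imports(org_names, create_org, upload_manifest, clock=time.monotonic):
    """Time the manifest upload for each newly created organization.

    create_org and upload_manifest are hypothetical callables supplied by
    the caller; upload_manifest must block until the import task finishes.
    Returns a list of (org_number, seconds) tuples.
    """
    results = []
    for number, name in enumerate(org_names, start=1):
        org = create_org(name)      # organization creation is not timed
        started = clock()           # start timer
        upload_manifest(org)        # timed step
        results.append((number, clock() - started))  # stop timer
    return results


def task_duration_seconds(started_at, ended_at, fmt="%Y-%m-%d %H:%M:%S"):
    """Duration of a task from its start and end timestamps.

    ASSUMPTION: fmt matches the timestamp strings being parsed; pass a
    different format if the tool prints them differently.
    """
    delta = datetime.strptime(ended_at, fmt) - datetime.strptime(started_at, fmt)
    return delta.total_seconds()
```

Measuring inside the loop is subject to the polling resolution of whatever client waits for the task, which is why the second helper derives the duration from the task's own timestamps instead.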
Small comment after investigating some automation results; I hope this will be helpful: not only manifest import, but also the manifest refresh and manifest delete commands get slower as the number of organizations increases, although the root cause is most likely the same.

By the way, the same issue applies to creating products. I believe the root cause is the same.

Created redmine issue http://projects.theforeman.org/issues/20233 from this bug

Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/20233 has been resolved.

VERIFIED on sat6.3.0-16

[ASCII plot 'times' using 1:2: time (y-axis, 0 to 20) against Orgs # (x-axis, 0 to 100). The points stay at about 10 across the whole range.]

This now seems to be constant. I've also measured the time taken for deleting the manifest and got identical results.

- created org
- if orgs# % 10 == 0:
  - start timer
  - upload manifest
  - stop timer
  - start timer
  - delete manifest
  - stop timer

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2018:0336