Red Hat Satellite engineering is moving the tracking of its product development work on Satellite to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "Satellite project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs will be migrated starting at the end of May. If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "Satellite project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/SAT-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1339696 - Importing manifest gets slow with increasing number of organizations
Summary: Importing manifest gets slow with increasing number of organizations
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Subscription Management
Version: 6.2.0
Hardware: Unspecified
OS: Unspecified
medium
low
Target Milestone: Unspecified
Assignee: Justin Sherrill
QA Contact: Roman Plevka
URL:
Whiteboard:
Depends On:
Blocks: 1339766 1340229
TreeView+ depends on / blocked
 
Reported: 2016-05-25 16:01 UTC by Roman Plevka
Modified: 2019-09-26 16:20 UTC (History)
16 users (show)

Fixed In Version: tfm-rubygem-katello-3.4.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1340229 1340245 1655981 (view as bug list)
Environment:
Last Closed: 2018-02-21 16:59:05 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
time measurements of the import task (2.64 KB, text/plain)
2016-06-24 11:23 UTC, Roman Plevka
no flags Details
plot of the time measurements (36.77 KB, image/png)
2016-06-24 11:42 UTC, Roman Plevka
no flags Details
candlepin.log - ELQpercZTmZ org sample (50.30 KB, application/x-gzip)
2016-06-24 12:38 UTC, Roman Plevka
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Foreman Issue Tracker 20233 0 Normal Closed Importing manifest gets slow with increasing number of organizations 2021-01-21 05:20:34 UTC
Red Hat Bugzilla 1097446 0 unspecified CLOSED hammer errors out on large manifest import, candlepin.log shows import completed 2021-02-22 00:41:40 UTC

Internal Links: 1097446

Description Roman Plevka 2016-05-25 16:01:09 UTC
Description of problem:
It looks like candlepin polls every org and every activation key every time a manifest is being uploaded.

this is the section where my manifest has been uploaded to Default Organization, which was the only org on the system:
http://pastebin.com/RTZnhL85

here I created a new organization my_org_1 and did the same thing:
http://pastebin.com/nPZwdryD
(notice, there is polling to Default Organization done too).

I've also extracted the CI machine's candlepin log around the manifest import phase:
http://pastebin.com/1VhrEx1m
This took about 11 minutes

I realize 1000+ is a bit extreme and is not likely to be used in a production, however this causes many timeout issues to our automation. Also it is good to demonstrate the increasing time needed to process the manifest.

Also it looks like even a few organizations with many Activation Keys might cause the issue.

Version-Release number of selected component (if applicable):
6.2.0 beta (GA12.1)

Steps to Reproduce:
1. get multiple organizations, optionally with many activation keys
2. try to upload the manifest
3. experiment by adding more organizations and activation keys

Expected results:
The time taken to import a manifest should be constant and shouldn't exceed 1 minute (subjective)

Additional info:

Comment 2 Barnaby Court 2016-06-07 14:02:05 UTC
This will be fixed in a future release. See https://bugzilla.redhat.com/show_bug.cgi?id=1340229 for where it will be fixed.

Comment 3 Michael Stead 2016-06-21 18:52:00 UTC
Could you please provide more info about the manifest that you are importing? or the manifest itself if you still have it.

Comment 5 Roman Plevka 2016-06-24 11:23:09 UTC
Created attachment 1171924 [details]
time measurements of the import task

# attached file contains import task time measurements up to 130 orgs. I'll just plot first 30 orgs for a better resolution.
gnuplot> plot 'times_all.csv' every ::0::20 using 1:3 with points title columnhead  



  50 +-+-----------+-------------+-------------+-------------+--A--------+-+
     +             +             +             +             A             +
  45 +-+..................................................A..............+-+
     |             :             :             :     A A     :             |
  40 +-+..........................................A......................+-+
  35 +-+....................................A..A.........................+-+
     |             :             :       A     :             :             |
  30 +-+...............................A.................................+-+
     |             :             :  A          :             :             |
  25 +-+.........................A.......................................+-+
     |             :       A  A  :             :             :             |
  20 +-+.................A...............................................+-+
     |             :  A          :             :             :             |
  15 +-+........A..A.....................................................+-+
  10 +-+.................................................................+-+
     |     A A     :             :             :             :             |
   5 +-+.................................................................+-+
     +  A          +             +             +             +             +
   0 +-+-----------+-------------+-------------+-------------+-----------+-+
     0             5             10            15            20            25
                                      Org #

Comment 6 Roman Plevka 2016-06-24 11:42:22 UTC
Created attachment 1171926 [details]
plot of the time measurements

Comment 7 Michael Stead 2016-06-24 12:37:25 UTC
Thanks Roman. Would you mind describing your exact reproducer steps?

Comment 8 Roman Plevka 2016-06-24 12:38:45 UTC
Created attachment 1171953 [details]
candlepin.log - ELQpercZTmZ org sample

Attaching a part of candlepin log, where there are already 100+ organizations created.
This log captures a creation of org: ELQpercZTmZ and importing its manifest.
This action triggers about 1054 (!) GET requests to candlepin - many of them are duplicated.

Comment 9 Roman Plevka 2016-06-24 12:39:41 UTC
(In reply to Roman Plevka from comment #8)
> Created attachment 1171953 [details]
> candlepin.log - ELQpercZTmZ org sample
> 
> Attaching a part of candlepin log, where there are already 100+
> organizations created.
> This log captures a creation of org: ELQpercZTmZ and importing its manifest.
> This action triggers about 1054 (!) GET requests to candlepin - many of them
> are duplicated.

According to this - I don't think the issue lies inside Candlepin component. Looks like the problem is in Katello.

Comment 10 Roman Plevka 2016-06-24 12:44:53 UTC
(In reply to Michael Stead from comment #7)
> Thanks Roman. Would you mind describing your exact reproducer steps?

I used satellite 6.2 beta (GA17.0).
On a clean installation I performed the following in a loop:

- create an organization using satellite API
- start timer
- uploaded a manifest using satellite API
- stopped timer

there are 2 datasets in the CSV i've attached. the first one contains measurements of the nailgun (python framework for Satellite API) measurements. this is of worse resolution, as Nailgun uses 5-second intervals for checking the status of the tasks, so the measurements are rounded to 5 seconds.

the second dataset contains the times (in seconds) extracted from `hammer task list` started_at and ended_at times.
- Anyway, this won't be reproducible in Candlepin ot its own as I think it's katello which sends many GET requests to Candlepin (see my previous comments and attached logs)

Comment 11 Tom McKay 2016-06-24 13:31:37 UTC
Looking at the import manifest code in katello there are indeed many places where candlepin is being hit for multiple orgs.

https://github.com/Katello/katello/blob/master/app/models/katello/glue/candlepin/candlepin_object.rb#L37

Part of this, I believe, is because katello needs to keep its data state absolutely in sync with candlepin. There may be cases for optimization here but the risk versus reward of refactoring this area needs to be considered carefully.

Overall, I would agree that this is a katello issue and not a candlepin one.

Comment 12 Bryan Kearney 2016-07-26 18:57:42 UTC
Moving 6.2 bugs out to sat-backlog.

Comment 13 Andrii Balakhtar 2017-01-20 10:24:19 UTC
Small comment after investigating some automation results, hope this will be helpful:
not only manifest importing, but also manifest refresh/manifest delete commands have decreased performance with increased number of organizations, although the root cause is most likely the same.

Comment 17 Roman Plevka 2017-06-01 13:14:19 UTC
btw, the same issue applies to creating products. I believe the root cause is the same.

Comment 19 Justin Sherrill 2017-07-06 14:40:02 UTC
Created redmine issue http://projects.theforeman.org/issues/20233 from this bug

Comment 20 Satellite Program 2017-07-10 16:03:20 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/20233 has been resolved.

Comment 21 Satellite Program 2017-08-03 22:03:33 UTC
Moving this bug to POST for triage into Satellite 6 since the upstream issue http://projects.theforeman.org/issues/20233 has been resolved.

Comment 22 Roman Plevka 2017-09-20 16:10:49 UTC
VERIFIED
on sat6.3.0-16

  20 +-+-----------+-------------+-------------+-------------+-----------+-+   
     +             +             +             +             +             +   
     |                                           'times' using 1:2    A    |   
     |                                                                     |   
  15 +-+                                                                 +-+   
     |                                                                     |   
     |                                                                     |   
     |                                                                     |   
     |                                                                     |   
  10 +-+    A      A             A             A      A             A    +-A   
     A                    A                                  A             |   
     |                                  A                                  |   
     |                                                                     |   
   5 +-+                                                                 +-+   
     |                                                                     |   
     |                                                                     |   
     |                                                                     |   
     +             +             +             +             +             +   
   0 +-+-----------+-------------+-------------+-------------+-----------+-+   
     0             20            40            60            80           100  
                                     Orgs #                                    
   

This now seems to be constant. I've also measured a time taken for deleting the manifest, got the identical results.

- created org
- if orgs# % 10 == 0:
  - start timer
  - upload manifest
  - stop timer

  - start timer
  - delete manifest
  - stop timer

Comment 23 Bryan Kearney 2018-02-21 16:59:05 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0336


Note You need to log in before you can comment on or make changes to this bug.