Bug 1282838 - Extremely slow channel.software.syncErrata API leads to spacewalk-clone-by-date traceback
Extremely slow channel.software.syncErrata API leads to spacewalk-clone-by-da...
Product: Spacewalk
Classification: Community
Component: API (Show other bugs)
All Linux
medium Severity medium
: ---
: ---
Assigned To: Grant Gainey
Red Hat Satellite QA List
: Patch
Depends On:
Blocks: space27
  Show dependency treegraph
Reported: 2015-11-17 10:22 EST by Grant Gainey
Modified: 2017-09-28 14:08 EDT (History)
9 users (show)

See Also:
Fixed In Version: spacewalk-java-2.5.12-1
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1220487
Last Closed: 2016-06-16 03:48:10 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments (Terms of Use)

  None (edit)
Description Grant Gainey 2015-11-17 10:22:07 EST
+++ This bug was initially created as a clone of Bug #1220487 +++

Description of problem:

Run spacewalk-clone-by-date for a large base channel, and it errors out when calling the channel.software.syncErrata API:

# spacewalk-clone-by-date -y -d "2015-05-01" -u admin -p XXXXXX --channels=rhel-x86_64-server-6 clone-rhel-x86_64-server-6
Reading repository information.
Using issue_date.

By continuing the following will be cloned:
  rhel-x86_64-server-6 -> clone-rhel-x86_64-server-6  (2899/2899 Errata)
Cloning Errata into clone-rhel-x86_64-server-6 (2899):
######################################## - complete
Traceback (most recent call last):
  File "/usr/bin/spacewalk-clone-by-date", line 412, in <module>
    sys.exit(abs(main() or 0))
  File "/usr/bin/spacewalk-clone-by-date", line 402, in main
    return cloneByDate.main(args)
  File "/usr/share/rhn/utils/cloneByDate.py", line 282, in main
  File "/usr/share/rhn/utils/cloneByDate.py", line 464, in clone
  File "/usr/share/rhn/utils/cloneByDate.py", line 621, in process
  File "/usr/share/rhn/utils/cloneByDate.py", line 730, in clone
  File "/usr/share/rhn/utils/cloneByDate.py", line 872, in sync_errata
    self.client.channel.software.syncErrata(self.auth_token, to_label)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1243, in request
xmlrpclib.ProtocolError: <ProtocolError for localhost/rpc/api: 503 Service Temporarily Unavailable>

Alternatively, run an API script that calls channel.software.syncErrata for a clone channel directly.

# cat api-sync-errata.py 
import xmlrpclib

SATELLITE_URL = "http://localhost/rpc/api"

client = xmlrpclib.Server(SATELLITE_URL, verbose=0)


print client.channel.software.syncErrata(key, "clone-rhel-x86_64-server-6")

# ./api-sync-errata.py
Traceback (most recent call last):
  File "./api-sync-errata.py", line 12, in <module>
    print client.channel.software.syncErrata(key, "clone-rhel-x86_64-server-6-4")
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1243, in request
xmlrpclib.ProtocolError: <ProtocolError for localhost/rpc/api: 503 Service Temporarily Unavailable>

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:

See description above.

Actual results:

503 error when running spacewalk-clone-by-date, due to slow channel.software.syncErrata API.

Expected results:

No error, syncErrata API finishes in a reasonable amount of time.

Additional info:

The API runs extremely slowly for a few reasons:

1.) ChannelSoftwareHandler.syncErrata() calls ChannelManager.listErrataNeedingResync() and ChannelManager.listErrataPackagesForResync(), which call the datasource queries 
list_errata_needing_sync and list_packages_needing_sync in Errata_queries.xml, respectively, which return a List of ErrataOverview or PackageOverview objects. These objects are only used to get the corresponding errata or package id's. If there are many errata in the cloned channel that need to be synced, then creating all of these objects just to get their id's is inefficient, and we could replace these queries with ones that return the id's in the first place.


    public Integer syncErrata(User loggedInUser, String channelLabel) {
        List<ErrataOverview> errata = ChannelManager.listErrataNeedingResync(channel,
        List<Long> eids = new ArrayList<Long>();
        for (ErrataOverview e : errata) {

        List<PackageOverview> packages = ChannelManager
                .listErrataPackagesForResync(channel, loggedInUser);
        List<Long> pids = new ArrayList<Long>();
        for (PackageOverview p : packages) {

        ChannelEditor.getInstance().addPackages(loggedInUser, channel, pids);

        for (Long eid : eids) {
            Errata e = ErrataManager.lookupErrata(eid, loggedInUser);
            if (e.isPublished() && e.isCloned()) {
                ErrataFactory.syncErrataDetails((PublishedClonedErrata) e);
            else {
                log.fatal("Tried to sync errata with id " + eid +
                        " But it was not published or was not cloned");

2.) The queries used by list_errata_needing_sync and list_packages_needing_sync queries themselves are extremely slow. In my testing on Satellite 5.7 w/ embedded postgresql db, for a clone of rhel-x86_64-server-6 that already has all the errata synced, the queries ran for up to an hour without completing. The results can be obtained much more quickly without the inner joins on rhnErrata, rhnChannelErrata, etc. In my testing, the syncErrata API completed in 0.5 seconds, rather then timing out and never completing, with the new queries (see attached patch).

--- Additional comment from Grant Gainey on 2015-10-19 13:30:22 EDT ---

The patch submitted with this BZ does have a significant impact on the syncErrata call - I see it going from ~1.3s to ~0.5s. But syncErrata isn't the "long pole" in the channel-cloning process - indeed, even in its current suboptimal state it's a tiny part of the total time.

Some examples pre-patch:

[root@foo-011 conf]# time spacewalk-clone-by-date -y -d "2015-07-01" -u admin -p <pwd> --channels=rhel-x86_64-server-6 clone-rhel-x86_64-server-6 --no-errata-sync
real	81m38.495s
user	47m33.629s
sys	0m21.989s
[root@foo-011 conf]# time ./api-sync-errata.py 

real	0m1.118s
user	0m0.057s
sys	0m0.089s
[root@foo-011 conf]# 

and post-patch:

[root@foo-011 conf]# time spacewalk-clone-by-date -y -d "2015-07-01" -u admin -p <pwd> --channels=rhel-x86_64-server-6 clone-rhel-x86_64-server-6 --no-errata-sync
real	75m18.503s
user	45m48.657s
sys	0m22.711s
[root@foo-011 conf]# time ./api-sync-errata.py 

real	0m0.554s
user	0m0.031s
sys	0m0.025s
[root@foo-011 conf]#
Comment 1 Grant Gainey 2015-11-17 11:22:28 EST
spacewalk.github: 2be48080365ed26f12de18818c3327e84af0dcf9
Comment 2 Tomáš Kašpárek 2016-06-16 03:48:10 EDT
Spacewalk 2.5 was released.
Comment 3 Michael Calmer 2016-09-06 05:31:15 EDT
Hi Grant: would you explain why you reverted this fix in git and what is the replacement for it?
Comment 4 Grant Gainey 2016-09-06 17:30:09 EDT
I reverted it because a) it didn't help much, and b) the queries were broken in ways that broke errata-cloning.

See https://bugzilla.redhat.com/show_bug.cgi?id=1367915 for more details.

There have been a couple of 'fixes' to clone-by-date recently that actually broke required functionality - see https://bugzilla.redhat.com/show_bug.cgi?id=1369185 for another example.
Comment 5 Eric Herget 2017-09-28 14:08:02 EDT
This BZ closed some time during 2.5, 2.6 or 2.7.  Adding to 2.7 tracking bug.

Note You need to log in before you can comment on or make changes to this bug.