+++ This bug was initially created as a clone of Bug #1220487 +++ Description of problem: Run spacewalk-clone-by-date for a large base channel, and it errors out when calling the channel.software.syncErrata API: **** # spacewalk-clone-by-date -y -d "2015-05-01" -u admin -p XXXXXX --channels=rhel-x86_64-server-6 clone-rhel-x86_64-server-6 Reading repository information. Using issue_date. By continuing the following will be cloned: rhel-x86_64-server-6 -> clone-rhel-x86_64-server-6 (2899/2899 Errata) Cloning Errata into clone-rhel-x86_64-server-6 (2899): ________________________________________ ######################################## - complete Traceback (most recent call last): File "/usr/bin/spacewalk-clone-by-date", line 412, in <module> sys.exit(abs(main() or 0)) File "/usr/bin/spacewalk-clone-by-date", line 402, in main return cloneByDate.main(args) File "/usr/share/rhn/utils/cloneByDate.py", line 282, in main cloner.clone(options.skip_depsolve) File "/usr/share/rhn/utils/cloneByDate.py", line 464, in clone cloner.process() File "/usr/share/rhn/utils/cloneByDate.py", line 621, in process self.clone() File "/usr/share/rhn/utils/cloneByDate.py", line 730, in clone self.remote_api.sync_errata(self.to_label) File "/usr/share/rhn/utils/cloneByDate.py", line 872, in sync_errata self.client.channel.software.syncErrata(self.auth_token, to_label) File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__ return self.__send(self.__name, args) File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request verbose=self.__verbose File "/usr/lib64/python2.6/xmlrpclib.py", line 1243, in request headers xmlrpclib.ProtocolError: <ProtocolError for localhost/rpc/api: 503 Service Temporarily Unavailable> Alternatively, run an API script that calls channel.software.syncErrata for a clone channel directly. **** # cat api-sync-errata.py #!/usr/bin/python import xmlrpclib SATELLITE_URL = "http://localhost/rpc/api" SATELLITE_LOGIN = "admin" SATELLITE_PASSWORD = "XXXXXX" client = xmlrpclib.Server(SATELLITE_URL, verbose=0) key = client.auth.login(SATELLITE_LOGIN, SATELLITE_PASSWORD) print client.channel.software.syncErrata(key, "clone-rhel-x86_64-server-6") client.auth.logout(key) # ./api-sync-errata.py Traceback (most recent call last): File "./api-sync-errata.py", line 12, in <module> print client.channel.software.syncErrata(key, "clone-rhel-x86_64-server-6-4") File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__ return self.__send(self.__name, args) File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request verbose=self.__verbose File "/usr/lib64/python2.6/xmlrpclib.py", line 1243, in request headers xmlrpclib.ProtocolError: <ProtocolError for localhost/rpc/api: 503 Service Temporarily Unavailable> **** Version-Release number of selected component (if applicable): spacewalk-java-2.3.8-102.el6sat.noarch How reproducible: 100% Steps to Reproduce: See description above. Actual results: 503 error when running spacewalk-clone-by-date, due to slow channel.software.syncErrata API. Expected results: No error, syncErrata API finishes in a reasonable amount of time. Additional info: The API runs extremely slowly for a few reasons: 1.) ChannelSoftwareHandler.syncErrata() calls ChannelManager.listErrataNeedingResync() and ChannelManager.listErrataPackagesForResync(), which call the datasource queries list_errata_needing_sync and list_packages_needing_sync in Errata_queries.xml, respectively, which return a List of ErrataOverview or PackageOverview objects. These objects are only used to get the corresponding errata or package id's. If there are many errata in the cloned channel that need to be synced, then creating all of these objects just to get their id's is inefficient, and we could replace these queries with ones that return the id's in the first place. **** ./com/redhat/rhn/frontend/xmlrpc/channel/software/ChannelSoftwareHandler.java: public Integer syncErrata(User loggedInUser, String channelLabel) { [...] List<ErrataOverview> errata = ChannelManager.listErrataNeedingResync(channel, loggedInUser); List<Long> eids = new ArrayList<Long>(); for (ErrataOverview e : errata) { eids.add(e.getId()); } List<PackageOverview> packages = ChannelManager .listErrataPackagesForResync(channel, loggedInUser); List<Long> pids = new ArrayList<Long>(); for (PackageOverview p : packages) { pids.add(p.getId()); } ChannelEditor.getInstance().addPackages(loggedInUser, channel, pids); for (Long eid : eids) { Errata e = ErrataManager.lookupErrata(eid, loggedInUser); if (e.isPublished() && e.isCloned()) { ErrataFactory.syncErrataDetails((PublishedClonedErrata) e); } else { log.fatal("Tried to sync errata with id " + eid + " But it was not published or was not cloned"); } } **** 2.) The queries used by list_errata_needing_sync and list_packages_needing_sync queries themselves are extremely slow. In my testing on Satellite 5.7 w/ embedded postgresql db, for a clone of rhel-x86_64-server-6 that already has all the errata synced, the queries ran for up to an hour without completing. The results can be obtained much more quickly without the inner joins on rhnErrata, rhnChannelErrata, etc. In my testing, the syncErrata API completed in 0.5 seconds, rather then timing out and never completing, with the new queries (see attached patch). --- Additional comment from Grant Gainey on 2015-10-19 13:30:22 EDT --- The patch submitted with this BZ does have a significant impact on the syncErrata call - I see it going from ~1.3s to ~0.5s. But syncErrata isn't the "long pole" in the channel-cloning process - indeed, even in its current suboptimal state it's a tiny part of the total time. Some examples pre-patch: === [root@foo-011 conf]# time spacewalk-clone-by-date -y -d "2015-07-01" -u admin -p <pwd> --channels=rhel-x86_64-server-6 clone-rhel-x86_64-server-6 --no-errata-sync ... real 81m38.495s user 47m33.629s sys 0m21.989s [root@foo-011 conf]# time ./api-sync-errata.py 1 real 0m1.118s user 0m0.057s sys 0m0.089s [root@foo-011 conf]# === and post-patch: === [root@foo-011 conf]# time spacewalk-clone-by-date -y -d "2015-07-01" -u admin -p <pwd> --channels=rhel-x86_64-server-6 clone-rhel-x86_64-server-6 --no-errata-sync ... real 75m18.503s user 45m48.657s sys 0m22.711s [root@foo-011 conf]# time ./api-sync-errata.py 1 real 0m0.554s user 0m0.031s sys 0m0.025s [root@foo-011 conf]# ===
spacewalk.github: 2be48080365ed26f12de18818c3327e84af0dcf9
Spacewalk 2.5 was released.
Hi Grant: would you explain why you reverted this fix in git and what is the replacement for it?
I reverted it because a) it didn't help much, and b) the queries were broken in ways that broke errata-cloning. See https://bugzilla.redhat.com/show_bug.cgi?id=1367915 for more details. There have been a couple of 'fixes' to clone-by-date recently that actually broke required functionality - see https://bugzilla.redhat.com/show_bug.cgi?id=1369185 for another example.
This BZ closed some time during 2.5, 2.6 or 2.7. Adding to 2.7 tracking bug.