Bug 796264 - Katello task lookup request times out waiting for RHEL mirrors to promote
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: WebUI
Version: 6.0.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: Unspecified
Assignee: Justin Sherrill
QA Contact: Og Maciel
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-02-22 15:19 UTC by Og Maciel
Modified: 2019-09-26 15:55 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-08-22 18:28:46 UTC
Target Upstream Version:


Attachments
Web UI displaying error (91.33 KB, image/png)
2012-02-22 15:19 UTC, Og Maciel
Traceback (3.79 KB, text/x-log)
2012-02-22 15:20 UTC, Og Maciel

Description Og Maciel 2012-02-22 15:19:09 UTC
Created attachment 565005
Web UI displaying error

Description of problem:

Katello's task lookup request times out during a repository promotion if the task takes too long to run. I'm not sure what the limit is in Katello, but promoting 3 RHEL repositories takes more than 1-2 hours. The web UI tells me that my promotion has failed, but pulp.log eventually shows that the promotion succeeded.

Version-Release number of selected component (if applicable):

* candlepin-0.5.20-1.el6.noarch
* candlepin-tomcat6-0.5.20-1.el6.noarch
* katello-0.1.238-4.el6.noarch
* katello-all-0.1.238-4.el6.noarch
* katello-certs-tools-1.0.2-2.el6.noarch
* katello-cli-0.1.54-2.el6.noarch
* katello-cli-common-0.1.54-2.el6.noarch
* katello-common-0.1.238-4.el6.noarch
* katello-configure-0.1.64-5.el6.noarch
* katello-glue-candlepin-0.1.238-4.el6.noarch
* katello-glue-foreman-0.1.238-4.el6.noarch
* katello-glue-pulp-0.1.238-4.el6.noarch
* katello-httpd-ssl-key-pair-1.0-1.noarch
* katello-qpid-broker-key-pair-1.0-1.noarch
* katello-repos-0.1.5-1.el6.noarch
* katello-selinux-0.1.5-2.el6.noarch
* katello-trusted-ssl-cert-1.0-1.noarch
* pulp-0.0.265-1.el6.noarch
* pulp-admin-0.0.265-1.el6.noarch
* pulp-client-lib-0.0.265-1.el6.noarch
* pulp-common-0.0.265-1.el6.noarch
* pulp-selinux-server-0.0.265-1.el6.noarch

How reproducible:


Steps to Reproduce:
1. Created Seattle organization with Dev1, QA1 and GA1 environments
2. Added custom provider with a single product and 2 repositories pointing to el6-se and el6-tools content from latest puddle
3. Promoted all of the product to all environments, one at a time
4. Added a manifest and selected:
* Red Hat Enterprise Linux 6 Server RPMs x86_64 6Server
* Red Hat Enterprise Linux 6 Server RPMs x86_64 6.2
* Red Hat Enterprise Linux 6 Server RPMs x86_64 6.1
5. Synchronized all of the selected RHEL repositories
6. Added new filter and added httpd against all available RHEL repositories
7. Added new promotion and promoted only the RHEL content

Actual results:

After a long time, the web UI showed the following error message (see attached screenshot):

Failed to promote changeset 'promo-1-1'. Check notices for more details

Expected results:


Additional info:

When this problem happened, the following message was displayed in pulp.log:

pulp.server.api.synchronizers:INFO: synchronizers:829 Running createrepo, this may take a few minutes to complete.

katello/production.log had the following error (full traceback attached):

Pulp::Task: Request Timeout

Comment 1 Og Maciel 2012-02-22 15:20:18 UTC
Created attachment 565006
Traceback

Comment 2 Og Maciel 2012-02-22 15:23:14 UTC
delayed_jobs.log:

2012-02-22T09:47:51-0500: [Worker(delayed_job host:qetello03.usersys.redhat.com pid:11767)] Changeset#promote_content failed with RestClient::RequestTimeout: Pulp::Task: Request Timeout  (GET /pulp/api/tasks/?state=archived&state=current&id=1cd9785c-5d5b-11e1-959c-5254002b8762) - 0 failed attempts
2012-02-22T09:47:51-0500: [Worker(delayed_job host:qetello03.usersys.redhat.com pid:11767)] PERMANENTLY removing Changeset#promote_content because of 1 consecutive failures.
Pulp::Task: Request Timeout  (GET /pulp/api/tasks/?state=archived&state=current&id=1cd9785c-5d5b-11e1-959c-5254002b8762)

Comment 3 Justin Sherrill 2012-03-02 20:45:43 UTC
Trying something here:

17578c00fc216f8c84dc0c2ce1f4461fa468876b

Now if we get a timeout during a status check in promotion, we wait an extra 50 seconds to give the pulp server some breathing room.

After 10 consecutive timeouts, we still fail. If Pulp is still throwing timeouts after 10 minutes, most likely something bad is going on.

Let me know if this improves anything.
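
For readers following along, here is a minimal sketch of the retry behavior described in this comment. It is not the code from the commit above. The names TaskLookupTimeout and wait_for_task, the "finished"/"error" state strings, and the 1-second poll interval are all hypothetical; only the 50-second wait and the limit of 10 consecutive timeouts come from the comment. TaskLookupTimeout stands in for the RestClient::RequestTimeout seen in the logs.

```ruby
# Hypothetical stand-in for RestClient::RequestTimeout.
class TaskLookupTimeout < StandardError; end

MAX_CONSECUTIVE_TIMEOUTS = 10 # from the comment above
TIMEOUT_BACKOFF_SECONDS  = 50 # from the comment above

# Polls a task by calling the given block until it returns a terminal
# state. The block is assumed to return a state string or raise
# TaskLookupTimeout. The sleeper argument exists only so the wait can
# be replaced in tests.
def wait_for_task(max_timeouts: MAX_CONSECUTIVE_TIMEOUTS,
                  backoff: TIMEOUT_BACKOFF_SECONDS,
                  poll_interval: 1,
                  sleeper: ->(seconds) { sleep(seconds) })
  consecutive_timeouts = 0
  loop do
    begin
      state = yield
      # Any successful lookup resets the timeout counter.
      consecutive_timeouts = 0
      return state if %w[finished error].include?(state)
      sleeper.call(poll_interval)
    rescue TaskLookupTimeout
      consecutive_timeouts += 1
      # Give up only after the configured number of timeouts in a row.
      raise if consecutive_timeouts >= max_timeouts
      # Otherwise give the server some breathing room before retrying.
      sleeper.call(backoff)
    end
  end
end
```

The counter resets on every successful lookup, so a long-running promotion with occasional slow responses keeps polling, while a server that times out on every request still fails after the tenth attempt.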

Comment 5 Mike McCune 2012-03-07 23:43:21 UTC
mass move ON_QA after brewing

Comment 6 Og Maciel 2012-03-09 15:40:18 UTC
Verified:
* candlepin-0.5.24-1.el6.noarch
* candlepin-tomcat6-0.5.24-1.el6.noarch
* katello-0.1.303-1.el6.noarch
* katello-all-0.1.303-1.el6.noarch
* katello-candlepin-cert-key-pair-1.0-1.noarch
* katello-certs-tools-1.0.4-1.el6.noarch
* katello-cli-0.1.102-1.el6.noarch
* katello-cli-common-0.1.102-1.el6.noarch
* katello-common-0.1.303-1.el6.noarch
* katello-configure-0.1.104-1.el6.noarch
* katello-glue-candlepin-0.1.303-1.el6.noarch
* katello-glue-foreman-0.1.303-1.el6.noarch
* katello-glue-pulp-0.1.303-1.el6.noarch
* katello-qpid-broker-key-pair-1.0-1.noarch
* katello-qpid-client-key-pair-1.0-1.noarch
* katello-selinux-0.1.8-1.el6.noarch
* pulp-1.0.0-4.el6.noarch
* pulp-common-1.0.0-4.el6.noarch
* pulp-selinux-server-1.0.0-4.el6.noarch

