Bug 773088 - During content sync, UI unresponsive and Pulp::Repository: Request Timeout (GET /pulp/api/repositories/)
Summary: During content sync, UI unresponsive and Pulp::Repository: Request Timeout (GET /pulp/api/repositories/)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Content Management
Version: 6.0.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: Unspecified
Assignee: Mike McCune
QA Contact: Katello QA List
URL:
Whiteboard:
Duplicates: 758502
Depends On: 784659
Blocks: katello-blockers
 
Reported: 2012-01-10 22:01 UTC by James Laska
Modified: 2019-09-26 13:27 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-08-22 18:18:11 UTC
Target Upstream Version:
Embargoed:


Attachments
pulp-logs.tgz (241.01 KB, application/x-gzip)
2012-01-10 22:01 UTC, James Laska

Description James Laska 2012-01-10 22:01:26 UTC
Created attachment 551953 [details]
pulp-logs.tgz

Description of problem:

While syncing "Red Hat Enterprise Linux 6 Server RPMs 6.1 x86_64", along with 3 other smaller repositories, I get periodic Pulp::Repository: Request Timeout errors when attempting to access the Katello web UI.

Perhaps anecdotal, but the content sync seems to be taking much longer than I'm accustomed to (on this same hardware).

Version-Release number of selected component (if applicable):
 * candlepin-0.5.8-1.el6.noarch
 * pulp-0.0.256-1.el6.noarch
 * katello-0.1.171-2.el6.noarch


Steps to Reproduce:
1. Import a Red Hat Manifest (manifests_stageSamTest20Nov2011.zip used)
2. Enable RHEL6.1 Server x86_64
3. Add a custom provider and product with several repositories (CloudEngine brew and puddle, and VMWare tools)
4. Start a content sync from the katello web interface
  
Actual results:

The sync appears to be taking much longer than I'm accustomed to on the same hardware+setup. Additionally, the web interface seems fairly unresponsive and sluggish. Often, requests are met with a Pulp error:

  Pulp::Repository: Request Timeout (GET /pulp/api/repositories/)

Expected results:

 * A responsive web interface even during content sync

Additional info:

 * Attaching requested log files, which include the following:
drwxr-xr-x apache/apache     0 2012-01-10 08:53 var/log/pulp/
-rw-r--r-- root/root      2859 2012-01-10 08:53 var/log/pulp/db.log
-rw-r--r-- apache/apache     0 2012-01-10 08:53 var/log/pulp/events.log
-rw-r--r-- apache/apache 1173517 2012-01-10 10:48 var/log/pulp/pulp.log
-rw-r--r-- apache/apache 2486624 2012-01-10 11:52 var/log/pulp/grinder.log
drwxr-xr-x katello/katello     0 2012-01-10 08:54 var/log/katello/
-rw-r--r-- root/root         664 2012-01-10 08:54 var/log/katello/thin-log.5004.log
-rw-r--r-- root/root         664 2012-01-10 08:54 var/log/katello/thin-log.5002.log
-rw-r--r-- katello/root        0 2012-01-10 08:53 var/log/katello/production_delayed_jobs.log
-rw-r--r-- katello/katello   173 2012-01-10 08:54 var/log/katello/delayed_job.log
-rw-r--r-- root/root         664 2012-01-10 08:54 var/log/katello/thin-log.5000.log
-rw-r--r-- root/root         664 2012-01-10 08:54 var/log/katello/thin-log.5001.log
-rw-r--r-- katello/katello    44 2012-01-10 08:54 var/log/katello/jobs-startup.log
-rw-r--r-- root/root         664 2012-01-10 08:54 var/log/katello/thin-log.5003.log
-rw-r--r-- katello/root   348588 2012-01-10 11:52 var/log/katello/production.log
lrwxrwxrwx katello/katello     0 2012-01-10 08:51 var/log/katello/katello-configure -> katello-configure-20120110-085148
drwxr-xr-x katello/katello     0 2012-01-10 08:53 var/log/katello/katello-configure-20120110-085148/
-rw-r--r-- katello/root      204 2012-01-10 08:54 var/log/katello/katello-configure-20120110-085148/db_seed.log
-rw-r--r-- root/root          12 2012-01-10 08:52 var/log/katello/katello-configure-20120110-085148/create-postgresql-katello-user.log
-rw-r--r-- root/root          12 2012-01-10 08:52 var/log/katello/katello-configure-20120110-085148/create-postgresql-candlepin-user.log
-rw-r--r-- root/root          16 2012-01-10 08:52 var/log/katello/katello-configure-20120110-085148/create-postgresql-candlepin-database.log
-rw-r--r-- katello/root    29321 2012-01-10 08:53 var/log/katello/katello-configure-20120110-085148/db_migrate.log
-rw-r--r-- katello/katello     0 2012-01-10 08:51 var/log/katello/katello-configure-20120110-085148/katello-configure.conf
-rw-r--r-- katello/katello 95725 2012-01-10 08:54 var/log/katello/katello-configure-20120110-085148/main.log
-rw-r--r-- katello/root     2143 2012-01-10 08:53 var/log/katello/katello-configure-20120110-085148/db_env.log
-rw-r--r-- root/root         276 2012-01-10 08:53 var/log/katello/katello-configure-20120110-085148/cpsetup.log
-rw-r--r-- root/root          16 2012-01-10 08:52 var/log/katello/katello-configure-20120110-085148/create-postgresql-katello-database.log

Comment 1 Mike McCune 2012-01-11 02:00:59 UTC
FYI, this only appears to happen on single-CPU instances. I can't get it to reproduce on guests with 2+ CPUs or on physical machines with more than one core.

Comment 2 Mike McCune 2012-01-11 02:15:46 UTC
I can reproduce this fairly easily now that I've narrowed it down: it only errors out on a single-CPU machine.

Comment 4 John Matthews 2012-01-11 16:16:16 UTC
Next time this is seen, please grab /var/log/httpd and attach the log files there, along with what's under /var/log/pulp.

Comment 5 John Matthews 2012-01-11 19:38:48 UTC
I'm able to duplicate a similar environment on a 2-CPU EC2 Large instance.
Mike told me that Katello is configured to display a timeout if the request takes more than 30 seconds.

Setup is Pulp with 500 repos created:
Synced 4 repos consisting of RHEL 5 & 6, both i386 and x86_64.
Then created 500 dummy repos that are empty.

Next used a script to hit https://127.0.0.1/pulp/api/repositories/ in a loop.
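
For reference, the loop was roughly of this shape (a minimal Python sketch; the credentials, cert handling, and poll interval are assumptions, not the actual script):

import time
import requests

URL = "https://127.0.0.1/pulp/api/repositories/"

while True:
    start = time.time()
    # verify=False assumed because the test box uses a self-signed certificate;
    # ("admin", "admin") is a placeholder for whatever basic-auth pair was used.
    resp = requests.get(URL, auth=("admin", "admin"), verify=False)
    print("GET -> %s in %.2f seconds" % (resp.status_code, time.time() - start))
    time.sleep(1)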

With only 4 repos this took about 0.4 seconds.
With ~500 repos it took about 1.2 seconds.

Re-ran this fetch of 'https://127.0.0.1/pulp/api/repositories/' in a loop while syncing 1 repo.
Saw timings between 1.8 seconds on the low side and 4 seconds on the high side.
Once I went to 2-4 repos syncing, I saw timings of around 6 to 20 seconds for that one fetch.

When syncing, I tested with clients running a foreground sync.

The final test was:
1 thread looping on: https://127.0.0.1/pulp/api/repositories/
4 pulp-admin clients polling repo sync status every 0.25 seconds.

Here's an example of what I saw from the "GET" handler in our repository.py REST API. I added a start/end timer to the method and output the seconds in the log (a sketch of that wrapper follows the log excerpt below).

2012-01-11 14:29:37,485 13921:140541117327104: pulp.server.webservices.controllers.repositories:INFO: repositories:208 repositories GET took: 11.8994979858 seconds
2012-01-11 14:29:49,109 13921:140541287184128: pulp.server.webservices.controllers.repositories:INFO: repositories:208 repositories GET took: 10.1845879555 seconds
2012-01-11 14:30:03,038 13921:140541297673984: pulp.server.webservices.controllers.repositories:INFO: repositories:208 repositories GET took: 12.9555640221 seconds
2012-01-11 14:30:16,325 13921:140541297673984: pulp.server.webservices.controllers.repositories:INFO: repositories:208 repositories GET took: 11.7111308575 seconds
2012-01-11 14:30:33,161 13921:140541085857536: pulp.server.webservices.controllers.repositories:INFO: repositories:208 repositories GET took: 15.4414169788 seconds
2012-01-11 14:30:56,569 13921:140541117327104: pulp.server.webservices.controllers.repositories:INFO: repositories:208 repositories GET took: 21.3497471809 seconds
2012-01-11 14:31:29,294 13921:140541096347392: pulp.server.webservices.controllers.repositories:INFO: repositories:208 repositories GET took: 30.4423940182 seconds
2012-01-11 14:31:52,192 13921:140541096347392: pulp.server.webservices.controllers.repositories:INFO: repositories:208 repositories GET took: 19.2887630463 seconds
2012-01-11 14:32:13,752 13921:140541297673984: pulp.server.webservices.controllers.repositories:INFO: repositories:208 repositories GET took: 20.3157689571 seconds
2012-01-11 14:32:35,615 1392
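
The timer itself was roughly this (a sketch, not the actual repository.py code; the decorator name, logger name, and usage are placeholders):

import time
import logging

log = logging.getLogger("pulp.server.webservices.controllers.repositories")

def timed(handler):
    # Log the wall-clock duration of the wrapped REST handler, in seconds.
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return handler(*args, **kwargs)
        finally:
            log.info("repositories GET took: %s seconds", time.time() - start)
    return wrapper

# Hypothetical usage on the collection handler:
# class Repositories(...):
#     @timed
#     def GET(self):
#         ...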

Comment 6 Mike McCune 2012-01-25 17:38:36 UTC
Added a short-term patch while we work on this bug, bumping the REST client timeout between Katello -> pulp/cp from 30 to 120 seconds (an illustrative sketch follows the commit message below).

commit 7bdef9d52a938dbcc77510599752ae1a55096e5e
Author: Mike McCune <mmccune>
Date:   Wed Jan 25 09:37:13 2012 -0800

    773088 - short term bump of the REST client timeout to 120
    
    This is a short term fix while we work on the real solution
    to 773088 and the timeout issues we are seeing.  By no means is this the
    right or long term thing to do.
    
    Will move back to 30 after we have fixed the above bug
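
The patched code is Katello's Ruby REST client, so it isn't reproduced here; purely as an analogy, the knob being turned is the per-request read timeout, which looks like this in Python terms (endpoint and credentials are placeholders):

import requests

TIMEOUT_SECONDS = 120  # short-term bump; was 30

# The point is the timeout kwarg: the client now waits up to 120 seconds
# for a response before raising a timeout error.
resp = requests.get(
    "https://127.0.0.1/pulp/api/repositories/",
    auth=("admin", "admin"),
    verify=False,
    timeout=TIMEOUT_SECONDS,
)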

Comment 7 Mike McCune 2012-01-26 22:21:47 UTC
*** Bug 758502 has been marked as a duplicate of this bug. ***

Comment 8 John Matthews 2012-02-16 14:18:49 UTC
Please make the change below to the Katello installer so that pulp.conf has the following (a quick verification sketch follows the diff):
 task_weight: 2
 threads: 5


http://git.fedorahosted.org/git/?p=pulp.git;a=commitdiff;h=3c82b24230f16ccc45757d0325275d290b18de29

index d40f463..6c75b0a 100644
--- a/etc/pulp/pulp.conf
+++ b/etc/pulp/pulp.conf
@@ -79,7 +79,8 @@ lifetime: 180
 config: /etc/pulp/logging/basic.cfg
 
 [yum]
-threads: 15
+task_weight: 2
+threads: 5
 # True/False to flag if we should remove older rpms
 remove_old_versions: false
 # Integer to specify how many old packages to keep.
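
A quick way to confirm the installed config picked up the change (a sketch using only the Python standard library; the path and expected values are taken from the diff above):

try:
    from configparser import ConfigParser   # Python 3
except ImportError:
    from ConfigParser import ConfigParser   # Python 2, as shipped on RHEL 6

cfg = ConfigParser()
cfg.read("/etc/pulp/pulp.conf")
# Values come back as strings; compare against the settings in the diff.
assert cfg.get("yum", "task_weight") == "2"
assert cfg.get("yum", "threads") == "5"
print("pulp.conf [yum] has the expected task_weight and threads")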

Comment 10 Mike McCune 2012-02-22 19:03:10 UTC
We made the appropriate sync config changes. Moving to ON_QA, since this was part of the last puddle on Feb 15.

Comment 11 Corey Welton 2012-02-28 20:31:30 UTC
QA verified - at the very least, the changes have been made to the installer, and I have not (yet) seen timeouts. We can reopen if it shows up again.

Comment 14 Lukas Zapletal 2012-04-06 12:31:31 UTC
*** Bug 809234 has been marked as a duplicate of this bug. ***

Comment 16 Lukas Zapletal 2012-04-10 08:16:58 UTC
Mike says, "To be clear the original bug saw timeouts during rendering of the
sync_mgmt page but the new bug you closed as a dupe is a timeout during the POST call to clone"

So it's the same error message, but a different bug. It's probably not a clone, so I'm clearing the link.

