Bug 816188

Summary: Katello timeouts when syncing very small repo
Product: Red Hat Satellite
Reporter: Lukas Zapletal <lzap>
Component: WebUI
Assignee: Lukas Zapletal <lzap>
Status: CLOSED CURRENTRELEASE
QA Contact: Katello QA List <katello-qa-list>
Severity: unspecified
Priority: unspecified
Version: Nightly
CC: bkearney, cwelton, inecas, mmccune
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-07-02 14:06:02 UTC

Description Lukas Zapletal 2012-04-25 13:15:55 UTC
Description of problem:

In our nightly Katello tests, we hit the following error while syncing a small repo:

Pulp::Task: Request Timeout  (GET /pulp/api/tasks/?state=archived&state=current&id=4e38025e-8ec8-11e1-8c90-525400847278)

The repo I sync is: http://lzap.fedorapeople.org/fakerepos/zoo5/

This is perhaps a regression in Pulp; we have been facing this issue for several weeks now.

This is similar to https://bugzilla.redhat.com/show_bug.cgi?id=784659 because it has the same consequence (Katello times out during sync), but the request is different (a GET of the task status instead of a GET of the repo list).

Please note that it only happens ONCE: if you sync another repository on the same box, the issue won't happen again.

Version-Release number of selected component (if applicable):

pulp-1.0.0-8.el6.noarch
grinder-0.0.144-1.el6.noarch

Steps to Reproduce:
1. Install katello testing.
2. Run our CLI test, testsuite "changeset"; the very first sync fails.
  
I think it could also be simulated using an API call followed by a waiting loop, but installing Katello is easier.
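A minimal sketch of such a waiting loop, for anyone who wants to try it without the test suite. The helper name poll_task, the state words (finished, error) and the return codes are assumptions made for illustration, not taken from Pulp; the caller supplies whatever command queries the task status and prints a single state word.

```shell
# poll_task ATTEMPTS DELAY CMD [ARGS...]
# Runs CMD repeatedly until it prints "finished" (hypothetical state word).
# Return codes (illustrative): 0 finished, 1 task error,
# 2 the status command itself failed, 3 gave up after ATTEMPTS polls.
poll_task() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1 state
  while [ "$i" -le "$attempts" ]; do
    # A failing status request (e.g. a timeout) is reported separately.
    state=$("$@") || return 2
    case "$state" in
      finished) return 0 ;;
      error)    return 1 ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  return 3
}
```

Usage would look like `poll_task 60 1 my_status_command`, where my_status_command is a placeholder for the actual status request against the task URL shown in the error above.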

I will attach all relevant log files.

Comment 2 Lukas Zapletal 2012-04-25 13:32:21 UTC
One additional note: if you run the changeset suite, the error does not ALWAYS show. I had to apply some I/O stress to see it again. I simulated this by running yum clean all; yum makecache in a bash loop.
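The stress loop could look roughly like the sketch below. The helper name stress_loop and the fixed number of rounds are assumptions; the original loop was not posted, so this only illustrates the idea of repeating the two yum commands in the background while the suite runs.

```shell
# stress_loop ROUNDS CMD [ARGS...]
# Runs CMD the given number of times, discarding its output and ignoring
# failures, purely to generate load on the box.
stress_loop() {
  local rounds="$1"
  shift
  local i=1
  while [ "$i" -le "$rounds" ]; do
    "$@" >/dev/null 2>&1 || true
    i=$((i + 1))
  done
}

# Example (needs root and yum, so it is left commented out):
# stress_loop 100 sh -c 'yum clean all; yum makecache' &
```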

Comment 4 Lukas Zapletal 2012-04-25 14:25:05 UTC
I noticed that when I comment out this configuration variable:

#post_sync_url = https://localhost/katello/api/repositories/sync_complete

I don't see the issue anymore.

Comment 5 Ivan Necas 2012-04-25 14:38:28 UTC
There was already a bug causing this trouble, see

https://bugzilla.redhat.com/show_bug.cgi?id=807720

but it seemed to be resolved in recent Pulp builds.

Comment 8 Lukas Zapletal 2012-04-25 15:13:30 UTC
I have another reproducer: just create a product with 10+ repos (even small ones) in it and sync it. A timeout is guaranteed:

URL=http://lzap.fedorapeople.org/fakerepos/zoo5/
alias kk='/usr/bin/katello -u admin -p admin'
kk client remember --option org --value ACME_Corporation
kk provider create --name provider
kk product create --name product --provider provider
kk repo create --url $URL --product product --name zoo_A
kk repo create --url $URL --product product --name zoo_B
kk repo create --url $URL --product product --name zoo_C
kk repo create --url $URL --product product --name zoo_D
kk repo create --url $URL --product product --name zoo_E
kk repo create --url $URL --product product --name zoo_F
kk repo create --url $URL --product product --name zoo_G
kk repo create --url $URL --product product --name zoo_H
kk product synchronize --name product
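
To get to 10+ repos without repeating the line by hand, the repo creation can be generated in a loop. This is only a sketch: the function name create_repos, the KK variable and the numeric repo names (zoo_1, zoo_2, ...) are assumptions, and by default the commands are only printed (dry run) rather than executed. The katello flags are the ones used in the script above.

```shell
URL=http://lzap.fedorapeople.org/fakerepos/zoo5/
# Dry run by default: prefix the command with echo.
# Set KK='/usr/bin/katello -u admin -p admin' to really run it.
KK=${KK:-echo /usr/bin/katello -u admin -p admin}

# create_repos COUNT: emit one "repo create" call per repo, zoo_1..zoo_COUNT.
create_repos() {
  local count="$1" i=1
  while [ "$i" -le "$count" ]; do
    # $KK is intentionally unquoted so it splits into command and arguments.
    $KK repo create --url "$URL" --product product --name "zoo_$i"
    i=$((i + 1))
  done
}

create_repos 12
```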

Comment 10 Lukas Zapletal 2012-04-25 16:09:31 UTC
OK, to successfully reproduce this, you need to have only ONE THIN instance configured. It works like a charm with TWO+ THIN instances.

This could be considered a workaround, but I guess we should still fix this somehow. I can imagine that with two thins, two users could sync in parallel and hit it. The same goes for N users.

Comment 11 Lukas Zapletal 2012-04-26 12:20:15 UTC
Assigning to myself; I will tune our installer to set up a minimum of 2 instances.

Comment 12 Lukas Zapletal 2012-04-26 13:46:04 UTC
So I confirm that

pulp-1.1.4-1.el6.noarch

fixes this issue, but I will configure two thins as the minimum anyway.

Comment 13 Bryan Kearney 2014-01-21 19:08:05 UTC
Moving to Sat6 to be tracked there. Upstream bugs are moving to redmine.

Comment 16 Corey Welton 2014-05-19 16:02:49 UTC
This is a really old bug, but I am considering it verified in Satellite-6.0.3-RHEL-6-20140508.1.

I synced/promoted a repo containing only one rpm and it seemed to be OK.

Comment 17 Bryan Kearney 2014-07-02 14:06:02 UTC
This was delivered with 6.0.3, which is the Satellite 6 Beta.