Bug 787010

Summary: Repos inconsistent if elasticsearch index fails due to pulp timeout
Product: Red Hat Satellite
Reporter: James Laska <jlaska>
Component: Infrastructure
Assignee: Justin Sherrill <jsherril>
Status: CLOSED CURRENTRELEASE
QA Contact: Corey Welton <cwelton>
Severity: medium
Docs Contact:
Priority: medium
Version: 6.0.0
CC: bbuckingham, bkearney, cwelton, ehelms, jsherril, jturner
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-07-02 14:07:50 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description                        Flags
/var/log/pulp/pulp.log             none
/var/log/katello/delayed_job.log   none

Description James Laska 2012-02-02 21:42:01 UTC
Created attachment 559141 [details]
/var/log/pulp/pulp.log

Description of problem:

I managed to get the system into a state where all promotions were failing, regardless of whether the content was synced or unsynced.

16:25:40  ehelms: so the issue I saw in your logs, which looks like we should file a bug on, if during or after a sync, the reindex for elasticsearch fails due to a pulp timeout, the data within katello for repos will be inconsistent
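
To illustrate the failure mode described in that comment, here is a minimal sketch, not katello's actual code. All names (`PulpTimeout`, `sync_then_reindex`, `fetch_packages`, the `db` and `index` dictionaries, the repo label) are hypothetical stand-ins. It assumes the sync result is recorded before the search index is rebuilt, so a timeout during the reindex leaves the two stores disagreeing.

```python
class PulpTimeout(Exception):
    """Hypothetical stand-in for a timeout while fetching repo data from pulp."""


def sync_then_reindex(repo, db, index, fetch_packages):
    # Step 1 (assumed ordering): the sync result is recorded first.
    db[repo] = "synced"
    # Step 2: the search index is rebuilt from pulp data; this call may time out.
    # If it raises, step 1 is not rolled back and the index is never updated.
    index[repo] = fetch_packages(repo)


def demo():
    db, index = {}, {}

    def slow_pulp(repo):
        # Simulate pulp not answering in time.
        raise PulpTimeout(repo)

    try:
        sync_then_reindex("rhel-6-server", db, index, slow_pulp)
    except PulpTimeout:
        pass  # The error is swallowed here only to inspect the leftover state.
    return db, index


if __name__ == "__main__":
    db, index = demo()
    print("db:", db)
    print("index:", index)
```

In this sketch the database says the repo is synced while the index has no entry for it, which is the kind of inconsistency that could make later operations such as promotion fail.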

Version-Release number of selected component (if applicable):
 * katello-0.1.211-2.el6.noarch
 * pulp-0.0.263-1.el6.noarch

Steps to Reproduce:
1. Import manifest, enable repositories, and sync some of the enabled repositories
2. Attempt a promotion of the product (which includes some unsynced repositories)
3. Sync remaining unsynced repositories
4. Attempt another promotion
  
Actual results:

All promotions failed (see screenshot).

Expected results:

No failed promotions

Additional info:

Comment 1 James Laska 2012-02-02 21:42:26 UTC
Created attachment 559142 [details]
/var/log/katello/delayed_job.log

Comment 4 Lukas Zapletal 2014-03-12 10:12:49 UTC
I *think* this is no longer relevant, please re-evaluate.

Comment 5 Brad Buckingham 2014-05-23 22:13:19 UTC
Justin, this is a mighty old one.  Given all of the changes that have gone in since (e.g. pulp 2.4, dynflow, enginification, etc.), have you seen anything like this occurring?

In the past, a big contributor to the timeout was resource constraints (e.g. memory).

Comment 6 Justin Sherrill 2014-05-27 02:41:30 UTC
I believe it is still possible for this behaviour to occur, but it is much less likely with pulp 2.4. I am guessing this will not be reproducible. Will move to ON_QA.

Comment 7 Corey Welton 2014-06-24 18:01:33 UTC
Considering this verified; I cannot get it to occur in the latest builds.

Verified in Satellite-6.0.3-RHEL-6-20140619.0

Comment 8 Bryan Kearney 2014-07-02 14:07:50 UTC
This was delivered with 6.0.3, which is the Satellite 6 Beta.