Bug 806940

Summary: RHEL 6.2 not completing sync
Product: Red Hat Satellite 6
Reporter: Steve Reichard <sreichar>
Component: Configuration Management
Assignee: Katello Bug Bin <katello-bugs>
Status: CLOSED ERRATA
QA Contact: Og Maciel <omaciel>
Severity: unspecified
Priority: unspecified
Version: 6.0.0
CC: achan, cpelland, dmacpher, jliberma, jmatthew, lzap, mmccune, msuchy, omaciel, scollier, sghai
Target Milestone: Unspecified
Keywords: Reopened, Triaged, ZStream
Target Release: --
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Doc Text:
When syncing repositories, the sync intermittently hung and never completed. The Grinder component looped over a poll of a socket that returned no data; the root cause was a problem on the remote server. This fix sets Grinder to abort connections that transfer fewer than 1000 bytes over 5 minutes, so repository syncing no longer hangs indefinitely.
Story Points: ---
Clone Of:
: 827543 (view as bug list) Environment:
Last Closed: 2012-12-04 14:43:49 EST
Bug Depends On:    
Bug Blocks: 827543    
Attachments:
Description                                 Flags
katello debug                               none
UI raised a notification when sync failed   none

Description Steve Reichard 2012-03-26 10:46:17 EDT
Created attachment 572773 [details]
katello debug

Description of problem:

After installing Beta6, creating an org, and uploading a manifest, I started a sync of RHEL 6.2, cf tools, and CE.  RHEL 6.2 is shown at 100% but not complete.  I started on Saturday; it's now Monday.

katello-debug attached


Version-Release number of selected component (if applicable):


beta 6


[root@cf-se6 pulp]# /pub/scripts/post_install_configuration_scripts/cf-se-versions 
Red Hat Enterprise Linux Server release 6.2 (Santiago)
Linux cf-se6.cloud.lab.eng.bos.redhat.com 2.6.32-220.7.1.el6.x86_64 #1 SMP Fri Feb 10 15:22:22 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
PyYAML-3.09-14.el6_1.x86_64
facter-1.5.9-1.el6.noarch
js-1.8.5-6.el6.x86_64
mongodb-1.8.2-3.el6.x86_64
mongodb-server-1.8.2-3.el6.x86_64
puppet-2.6.14-1.el6.noarch
pymongo-1.9-8.el6_1.x86_64
tomcat6-6.0.24-35.el6_1.noarch
ruby-1.8.7.352-6.el6.x86_64
grinder-0.0.139-1.el6.noarch
postgresql-server-8.4.9-1.el6_1.1.x86_64
postgresql-8.4.9-1.el6_1.1.x86_64
candlepin-0.5.26-1.el6.noarch
pulp-1.0.0-4.el6.noarch
katello-0.1.306-1.el6.noarch
katello-all-0.1.306-1.el6.noarch
katello-cli-0.1.107-1.el6.noarch
katello-configure-0.1.104-1.el6.noarch
[root@cf-se6 pulp]# 



How reproducible:

Unknown at this time


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
Comment 1 Mike McCune 2012-03-26 16:43:10 EDT
will flag this once we reproduce and diagnose
Comment 2 Sachin Ghai 2012-03-27 06:02:42 EDT
I think I found a similar issue with the following build:

pulp-1.0.0-6.el6.noarch
katello-0.1.306-1.el6.noarch
katello-glue-candlepin-0.1.306-1.el6.noarch
candlepin-0.5.26-1.el6.noarch
katello-cli-0.1.107-1.el6.noarch


I uploaded a manifest and started the sync for the rhel6.2 (x86_64) repo, and the sync failed. Please see the attachment.
Comment 3 Sachin Ghai 2012-03-27 06:03:45 EDT
Created attachment 573008 [details]
UI raised a notification when sync failed
Comment 4 Sachin Ghai 2012-03-27 06:05:20 EDT
Grinder log says **54 items had errors**

[root@scroponok Packages]# cat /var/log/pulp/grinder.log | grep error
2012-03-26 20:11:17,491 5072:140688498882304: grinder.BaseFetch:INFO: activeobject:160 Symlink missing in repo directory. Creating link /var/lib/pulp//repos/ACME_Corporation/Library/content/dist/rhel/server/6/6.2/x86_64/os//Packages/python-weberror-0.10.2-1.el6.noarch.rpm to ../../../../../../../../../../../../packages/python-weberror/0.10.2/1.el6/noarch/a380af7a5deae0ea7118bda3fc0630bd794e3ae2600ffc68e9c2d108b0359902/python-weberror-0.10.2-1.el6.noarch.rpm
2012-03-26 20:50:27,433 5072:140688580650752: grinder.ParallelFetch:INFO: ParallelFetch:242 ParallelFetch: 6937 items successfully processed, 3408 downloaded, 54 items had errors
2012-03-26 20:50:40,939 5072:140688580650752: grinder.RepoFetch:INFO: RepoFetch:187 Processed <>,<https://cdn.redhat.com/content/dist/rhel/server/6/6.2/x86_64/os> with <0> items in [9205] seconds. Report: 6937 successes, 3408 downloads, 54 errors
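
For illustration only, the summary counts in a line like the ParallelFetch one above could be pulled out with a short script rather than read by eye. This is a sketch: the regular expression is inferred from the single log line shown in this comment, and `parse_fetch_summary` is a hypothetical helper, not part of grinder.

```python
import re

# Pattern inferred from the one ParallelFetch summary line quoted in this bug;
# other grinder versions may word the line differently (assumption).
SUMMARY_RE = re.compile(
    r"(\d+) items successfully processed, (\d+) downloaded, (\d+) items had errors"
)


def parse_fetch_summary(line):
    """Return (processed, downloaded, errors) as ints, or None if no match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def summaries_from_log(path="/var/log/pulp/grinder.log"):
    """Yield a (processed, downloaded, errors) tuple for each summary line."""
    with open(path) as log:
        for line in log:
            counts = parse_fetch_summary(line)
            if counts is not None:
                yield counts
```

Unlike `grep error`, this skips unrelated lines that merely contain the word "error" in a package name, such as the python-weberror symlink message.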
Comment 5 Steve Reichard 2012-03-27 08:23:44 EDT
I have a second SE which completed syncing successfully.

I can make the first available if desired.
Comment 6 Brad Buckingham 2012-03-27 08:40:23 EDT
Hi Steve, please send me the details for accessing that first env offline.   We attempted to repro and were unable to.  Thanks!
Comment 9 John Matthews 2012-03-27 15:59:24 EDT
Investigating the hung sync, we saw that one package had been downloading for over a day.  Looking at the process, we could see that grinder was in a loop polling a socket, yet the socket was not returning any data.

The root cause appears to be a problem on the remote server side.

A change has been made to grinder to avoid hanging indefinitely in situations like this.  If Grinder does not receive at least 1000 bytes over 5 minutes it will abort the connection.

Commit is here:
http://git.fedorahosted.org/git/?p=grinder.git;a=commitdiff;h=c8ee8d94c566c3b68b17664647e8a19c50094f5d
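
As a rough illustration of the abort rule described in this comment (at least 1000 bytes per 5 minutes, otherwise drop the connection), here is a minimal standard-library sketch. It is not the code from the commit; `read_with_low_speed_abort` and `TransferTooSlow` are hypothetical names, and it uses fixed back-to-back windows rather than whatever measurement grinder actually performs.

```python
import socket
import time


class TransferTooSlow(Exception):
    """Raised when a transfer falls below the minimum throughput."""


def read_with_low_speed_abort(sock, min_bytes=1000, window_seconds=300.0,
                              chunk_size=4096):
    """Read from sock until EOF; abort if a window sees fewer than min_bytes."""
    received = bytearray()
    window_start = time.monotonic()
    window_bytes = 0
    while True:
        remaining = window_seconds - (time.monotonic() - window_start)
        if remaining <= 0:
            # Window is over: either the peer was fast enough or we give up.
            if window_bytes < min_bytes:
                raise TransferTooSlow(
                    "received %d bytes in %.1f seconds" % (window_bytes, window_seconds)
                )
            window_start = time.monotonic()
            window_bytes = 0
            continue
        # Never block past the end of the current window.
        sock.settimeout(remaining)
        try:
            chunk = sock.recv(chunk_size)
        except socket.timeout:
            continue  # the next iteration evaluates the expired window
        if not chunk:
            return bytes(received)  # peer closed the connection: transfer done
        received.extend(chunk)
        window_bytes += len(chunk)
```

With the defaults, a peer that accepts the connection and then sends nothing causes `TransferTooSlow` after five minutes instead of an indefinite poll loop, while a transfer that finishes (EOF) before the window ends returns normally even if it is smaller than 1000 bytes.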
Comment 11 Lukas Zapletal 2012-05-31 09:03:45 EDT
@Mike - this bug has the grinder component but it's assigned to katello-list. I was confused about ownership, but I am putting this to ON_DEV at least. It's upstream; the only question is whether to brew it. I am in favor, as the fix seems to be small.
Comment 14 Og Maciel 2012-09-28 11:46:57 EDT
Verified using:

* candlepin-0.7.8-1.el6cf.noarch
* candlepin-selinux-0.7.8-1.el6cf.noarch
* candlepin-tomcat6-0.7.8-1.el6cf.noarch
* katello-1.1.12-9.el6cf.noarch
* katello-all-1.1.12-9.el6cf.noarch
* katello-candlepin-cert-key-pair-1.0-1.noarch
* katello-certs-tools-1.1.8-1.el6cf.noarch
* katello-cli-1.1.8-5.el6cf.noarch
* katello-cli-common-1.1.8-5.el6cf.noarch
* katello-common-1.1.12-9.el6cf.noarch
* katello-configure-1.1.9-4.el6cf.noarch
* katello-glue-candlepin-1.1.12-9.el6cf.noarch
* katello-glue-pulp-1.1.12-9.el6cf.noarch
* katello-qpid-broker-key-pair-1.0-1.noarch
* katello-qpid-client-key-pair-1.0-1.noarch
* katello-selinux-1.1.1-1.el6cf.noarch
* pulp-1.1.12-1.el6cf.noarch
* pulp-common-1.1.12-1.el6cf.noarch
* pulp-selinux-server-1.1.12-1.el6cf.noarch
Comment 17 errata-xmlrpc 2012-12-04 14:43:49 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-1543.html