Bug 806940 - RHEL 6.2 not completing sync
Summary: RHEL 6.2 not completing sync
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Configuration Management
Version: 6.0.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: Unspecified
Assignee: Katello Bug Bin
QA Contact: Og Maciel
URL:
Whiteboard:
Depends On:
Blocks: 827543
 
Reported: 2012-03-26 14:46 UTC by Steve Reichard
Modified: 2019-09-26 15:54 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
When syncing repositories, the sync intermittently hung and never completed. The grinder component was looping on a poll of a socket that returned no data; the root cause appeared to be a problem with the remote server. This fix sets Grinder to abort connections that receive fewer than 1000 bytes over 5 minutes, so repository syncing no longer hangs indefinitely.
Clone Of:
: 827543 (view as bug list)
Environment:
Last Closed: 2012-12-04 19:43:49 UTC
Target Upstream Version:
Embargoed:


Attachments
katello debug (728.95 KB, application/x-gzip)
2012-03-26 14:46 UTC, Steve Reichard
UI raised a notification when sync failed (83.22 KB, image/png)
2012-03-27 10:03 UTC, Sachin Ghai


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2012:1543 0 normal SHIPPED_LIVE Important: CloudForms System Engine 1.1 update 2012-12-05 00:39:57 UTC

Description Steve Reichard 2012-03-26 14:46:17 UTC
Created attachment 572773 [details]
katello debug

Description of problem:

After installing Beta6, creating an org, and uploading a manifest, I started a sync of RHEL 6.2, cf tools, and CE.  RHEL 6.2 shows 100% but is not complete.  I started on Saturday; it's now Monday.

katello-debug attached


Version-Release number of selected component (if applicable):


beta 6


[root@cf-se6 pulp]# /pub/scripts/post_install_configuration_scripts/cf-se-versions 
Red Hat Enterprise Linux Server release 6.2 (Santiago)
Linux cf-se6.cloud.lab.eng.bos.redhat.com 2.6.32-220.7.1.el6.x86_64 #1 SMP Fri Feb 10 15:22:22 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
PyYAML-3.09-14.el6_1.x86_64
facter-1.5.9-1.el6.noarch
js-1.8.5-6.el6.x86_64
mongodb-1.8.2-3.el6.x86_64
mongodb-server-1.8.2-3.el6.x86_64
puppet-2.6.14-1.el6.noarch
pymongo-1.9-8.el6_1.x86_64
tomcat6-6.0.24-35.el6_1.noarch
ruby-1.8.7.352-6.el6.x86_64
grinder-0.0.139-1.el6.noarch
postgresql-server-8.4.9-1.el6_1.1.x86_64
postgresql-8.4.9-1.el6_1.1.x86_64
candlepin-0.5.26-1.el6.noarch
pulp-1.0.0-4.el6.noarch
katello-0.1.306-1.el6.noarch
katello-all-0.1.306-1.el6.noarch
katello-cli-0.1.107-1.el6.noarch
katello-configure-0.1.104-1.el6.noarch
[root@cf-se6 pulp]# 



How reproducible:

Unknown at this time


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Mike McCune 2012-03-26 20:43:10 UTC
will flag this once we reproduce and diagnose

Comment 2 Sachin Ghai 2012-03-27 10:02:42 UTC
I think I found a similar issue with the following build:

pulp-1.0.0-6.el6.noarch
katello-0.1.306-1.el6.noarch
katello-glue-candlepin-0.1.306-1.el6.noarch
candlepin-0.5.26-1.el6.noarch
katello-cli-0.1.107-1.el6.noarch


I uploaded a manifest and started the sync for the rhel6.2 (x86_64) repo, and the sync failed. Please see the attachment.

Comment 3 Sachin Ghai 2012-03-27 10:03:45 UTC
Created attachment 573008 [details]
UI raised a notification when sync failed

Comment 4 Sachin Ghai 2012-03-27 10:05:20 UTC
Grinder log says **54 items had errors**

[root@scroponok Packages]# cat /var/log/pulp/grinder.log | grep error
2012-03-26 20:11:17,491 5072:140688498882304: grinder.BaseFetch:INFO: activeobject:160 Symlink missing in repo directory. Creating link /var/lib/pulp//repos/ACME_Corporation/Library/content/dist/rhel/server/6/6.2/x86_64/os//Packages/python-weberror-0.10.2-1.el6.noarch.rpm to ../../../../../../../../../../../../packages/python-weberror/0.10.2/1.el6/noarch/a380af7a5deae0ea7118bda3fc0630bd794e3ae2600ffc68e9c2d108b0359902/python-weberror-0.10.2-1.el6.noarch.rpm
2012-03-26 20:50:27,433 5072:140688580650752: grinder.ParallelFetch:INFO: ParallelFetch:242 ParallelFetch: 6937 items successfully processed, 3408 downloaded, 54 items had errors
2012-03-26 20:50:40,939 5072:140688580650752: grinder.RepoFetch:INFO: RepoFetch:187 Processed <>,<https://cdn.redhat.com/content/dist/rhel/server/6/6.2/x86_64/os> with <0> items in [9205] seconds. Report: 6937 successes, 3408 downloads, 54 errors
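
Note that `grep error` also matches the unrelated symlink line above, because the package name python-weberror contains "error". As a rough sketch (not part of grinder or pulp), the counts can be pulled from the ParallelFetch summary line directly. The function name `parse_fetch_summary` and the regular expression are illustrative assumptions based only on the log line quoted in this comment.

```python
import re

# Pattern modeled on the single ParallelFetch summary line quoted above;
# other grinder versions may word this line differently (assumption).
SUMMARY_RE = re.compile(
    r"(?P<processed>\d+) items successfully processed, "
    r"(?P<downloaded>\d+) downloaded, "
    r"(?P<errors>\d+) items had errors"
)


def parse_fetch_summary(line):
    """Return the summary counts as a dict of ints, or None if no match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


if __name__ == "__main__":
    sample = (
        "2012-03-26 20:50:27,433 5072:140688580650752: "
        "grinder.ParallelFetch:INFO: ParallelFetch:242 ParallelFetch: "
        "6937 items successfully processed, 3408 downloaded, "
        "54 items had errors"
    )
    print(parse_fetch_summary(sample))
```

Lines that only mention a package such as python-weberror do not match the pattern, so they are skipped.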

Comment 5 Steve Reichard 2012-03-27 12:23:44 UTC
I have a second SE which completed syncing successfully.

I can make the first available if desired.

Comment 6 Brad Buckingham 2012-03-27 12:40:23 UTC
Hi Steve, Please do send me offline the details to access that first env.   We attempted to repro and were unable to.  Thanks!

Comment 9 John Matthews 2012-03-27 19:59:24 UTC
Investigating the hung sync, we saw that one package had been downloading for over a day.  Looking at the process, we could see that grinder was in a loop polling a socket, yet the socket was not returning any data.

The root cause appears to be a problem on the remote server side.

A change has been made to grinder to avoid hanging indefinitely in situations like this.  If Grinder does not receive at least 1000 bytes over 5 minutes it will abort the connection.

Commit is here:
http://git.fedorahosted.org/git/?p=grinder.git;a=commitdiff;h=c8ee8d94c566c3b68b17664647e8a19c50094f5d
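
The abort rule described in this comment (give up if fewer than 1000 bytes arrive over 5 minutes) can be illustrated with a small standalone sketch. This is not the code from the commit above; the names `LowSpeedGuard` and `TransferStalled` are hypothetical, and the fixed (non-sliding) measurement window is an assumption made to keep the example short.

```python
import time


class TransferStalled(Exception):
    """Raised when too few bytes arrive within one measurement window."""


class LowSpeedGuard:
    """Abort helper: require at least min_bytes every window_seconds.

    Hypothetical illustration only; defaults mirror the 1000 bytes /
    5 minutes figures quoted in the comment.
    """

    def __init__(self, min_bytes=1000, window_seconds=300, clock=time.monotonic):
        self.min_bytes = min_bytes
        self.window_seconds = window_seconds
        self.clock = clock
        self.window_start = clock()
        self.bytes_in_window = 0

    def record(self, nbytes):
        """Call after every read or poll; nbytes may be 0 when no data came."""
        self.bytes_in_window += nbytes
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            if self.bytes_in_window < self.min_bytes:
                raise TransferStalled(
                    "received %d bytes in %d seconds"
                    % (self.bytes_in_window, self.window_seconds)
                )
            # Enough data arrived; start a fresh window.
            self.window_start = now
            self.bytes_in_window = 0
```

A download loop would call `guard.record(len(chunk))` after each read (or `guard.record(0)` after a poll that returned nothing) and treat `TransferStalled` as a signal to close the connection rather than keep polling.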

Comment 11 Lukas Zapletal 2012-05-31 13:03:45 UTC
@Mike - this bug has the grinder component but it's assigned to katello-list. I was confused about ownership, but I am putting this to ON_DEV at least. It's upstream; the only question is whether to brew it. I am in favor, as the fix seems to be small.

Comment 14 Og Maciel 2012-09-28 15:46:57 UTC
Verified using:

* candlepin-0.7.8-1.el6cf.noarch
* candlepin-selinux-0.7.8-1.el6cf.noarch
* candlepin-tomcat6-0.7.8-1.el6cf.noarch
* katello-1.1.12-9.el6cf.noarch
* katello-all-1.1.12-9.el6cf.noarch
* katello-candlepin-cert-key-pair-1.0-1.noarch
* katello-certs-tools-1.1.8-1.el6cf.noarch
* katello-cli-1.1.8-5.el6cf.noarch
* katello-cli-common-1.1.8-5.el6cf.noarch
* katello-common-1.1.12-9.el6cf.noarch
* katello-configure-1.1.9-4.el6cf.noarch
* katello-glue-candlepin-1.1.12-9.el6cf.noarch
* katello-glue-pulp-1.1.12-9.el6cf.noarch
* katello-qpid-broker-key-pair-1.0-1.noarch
* katello-qpid-client-key-pair-1.0-1.noarch
* katello-selinux-1.1.1-1.el6cf.noarch
* pulp-1.1.12-1.el6cf.noarch
* pulp-common-1.1.12-1.el6cf.noarch
* pulp-selinux-server-1.1.12-1.el6cf.noarch

Comment 17 errata-xmlrpc 2012-12-04 19:43:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-1543.html

