Bug 1212024

Summary: Intermittent error downloading a large RPM with pulp 2.6.0 (Sat 6.1 compose from 3/17)
Product: Red Hat Satellite Reporter: John Matthews <jmatthew>
Component: PulpAssignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA QA Contact: jcallaha
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.1.0CC: bbuckingham, bkearney, cwelton, ipanova, jcallaha, jkim, jmatthew, mhrivnak, tsanders
Target Milestone: UnspecifiedKeywords: Reopened, Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-07-27 09:15:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1212602, 1258596    
Attachments:
Description Flags
sudo journalctl SYSLOG_IDENTIFIER=pulp + SYSLOG_IDENTIFIER=celery + SYSLOG_IDENTIFIER=httpd &> pulp.logs
none
JSON of Katello Sync Report
none
Screen shot from katello of the error report
none
Katello screenshot of the sync status with the 1 package error none

Description John Matthews 2015-04-15 12:24:58 UTC
Created attachment 1014714 [details]
sudo journalctl SYSLOG_IDENTIFIER=pulp + SYSLOG_IDENTIFIER=celery + SYSLOG_IDENTIFIER=httpd &> pulp.logs

Description of problem:

There may be an issue in Pulp downloading large RPMs, below is info I've seen while testing RHCI scenarios.

We are seeing intermittent but somewhat frequent issues downloading large RPMs from this repo:
  "Red Hat Enterprise Virtualization Manager 3.5 RPMs x86_64 6Server"

I'm seeing this occur perhaps once every 5-7 runs.  

The process is:
 - import a manifest and sync ~50GB of packages (RHEL 6 + RHEV 3.5)
 - some runs everything syncs successfully, yet it's fairly common to see a sync error in the RHEVM repo.
 - I've seen errors with the below packages this week:
    jasperreports-server-pro, "expected_size"=>473115020, "actual_size"=>312700
    rhev-guest-tools-iso, 'expected_size': 261383164, 'actual_size': 88080384
 - a re-sync is successful


Attached are:
 - screenshots from katello
 - json of the sync report from the katello task
 - output of:  sudo journalctl SYSLOG_IDENTIFIER=pulp + SYSLOG_IDENTIFIER=celery + SYSLOG_IDENTIFIER=httpd &> pulp.logs

Version-Release number of selected component (if applicable):

Using Sat 6.1 Compose from 3/17/15.
RPMs are:



Below are the RPMs on a Sat 6.1 setup we are using.

# rpm -qa | grep pulp
pulp-katello-0.3-4.el7sat.noarch
pulp-rpm-plugins-2.6.0-0.7.beta.el7sat.noarch
pulp-server-2.6.0-0.7.beta.1.el7sat.noarch
python-pulp-puppet-common-2.6.0-0.7.beta.el7sat.noarch
python-pulp-common-2.6.0-0.7.beta.1.el7sat.noarch
pulp-nodes-parent-2.6.0-0.7.beta.1.el7sat.noarch
pulp-puppet-tools-2.6.0-0.7.beta.el7sat.noarch
pulp-selinux-2.6.0-0.7.beta.1.el7sat.noarch
python-pulp-rpm-common-2.6.0-0.7.beta.el7sat.noarch
python-kombu-3.0.24-5.pulp.el7sat.noarch
python-pulp-bindings-2.6.0-0.7.beta.1.el7sat.noarch
pulp-docker-plugins-0.2.2-1.el7sat.noarch
pulp-nodes-common-2.6.0-0.7.beta.1.el7sat.noarch
rubygem-smart_proxy_pulp-1.0.1.1-1.el7.noarch
python-pulp-docker-common-0.2.2-1.el7sat.noarch
python-isodate-0.5.0-4.pulp.el7sat.noarch
pulp-puppet-plugins-2.6.0-0.7.beta.el7sat.noarch

# rpm -qa | grep celery
python-celery-3.1.11-1.el7sat.noarch

# rpm -qa | grep qpid
qpid-dispatch-router-0.4-2.20150223.el7.x86_64
libqpid-dispatch-0.4-2.20150223.el7.x86_64
fusor.example.com-qpid-router-client-1.0-1.noarch
qpid-qmf-0.30-5.el7.x86_64
fusor.example.com-qpid-broker-1.0-1.noarch
qpid-cpp-server-0.30-7.proton.0.9.el7.x86_64
fusor.example.com-qpid-router-server-1.0-1.noarch
qpid-tools-0.30-4.el7.noarch
fusor.example.com-qpid-client-cert-1.0-1.noarch
python-qpid-0.30-5.el7.noarch
qpid-cpp-client-devel-0.30-7.proton.0.9.el7.x86_64
qpid-java-common-0.30-3.el7.noarch
qpid-cpp-client-0.30-7.proton.0.9.el7.x86_64
qpid-java-client-0.30-3.el7.noarch
qpid-proton-c-0.9-1.20150223.el7.x86_64
python-qpid-qmf-0.30-5.el7.x86_64
python-gofer-qpid-2.6.1-1.el7sat.noarch
ruby193-rubygem-qpid_messaging-0.30.0-1.el7sat.x86_64
qpid-cpp-server-linearstore-0.30-7.proton.0.9.el7.x86_64


# rpm -qa | grep katello
katello-certs-tools-2.2.1-1.el7sat.noarch
pulp-katello-0.3-4.el7sat.noarch
katello-default-ca-1.0-1.noarch
ruby193-rubygem-katello-2.2.0.24-1.el7sat.noarch
rubygem-hammer_cli_katello-0.0.7.4-1.el7sat.noarch
katello-installer-2.3.5-1.el7sat.noarch
katello-server-ca-1.0-1.noarch
katello-2.2.0.8-1.el7sat.noarch
katello-installer-base-2.3.5-1.el7sat.noarch
katello-common-2.2.0.8-1.el7sat.noarch
katello-debug-2.2.0.8-1.el7sat.noarch




How reproducible:

We seen this every few runs in RHCI testing, perhaps 1 in 3 to 1 in 7... it varies.  The impacted RPM changes.


Steps to Reproduce:
1. Import a manifest into Katello
2. Sync RHEL 6 x86_64 and RHEV 3.5 repos

Comment 1 John Matthews 2015-04-15 12:25:26 UTC
Created attachment 1014715 [details]
JSON of Katello Sync Report

Comment 2 John Matthews 2015-04-15 12:25:51 UTC
Created attachment 1014717 [details]
Screen shot from katello of the error report

Comment 3 John Matthews 2015-04-15 12:26:15 UTC
Created attachment 1014718 [details]
Katello screenshot of the sync status with the 1 package error

Comment 4 RHEL Program Management 2015-04-15 12:33:15 UTC
Since this issue was entered in Red Hat Bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

Comment 6 Michael Hrivnak 2015-04-15 12:57:05 UTC
I've not heard of anyone else seeing this problem, but it's concerning that it seems so reproducible for you.

Have you been able to reproduce this from more than one client-side network or physical machine?

On what network has the client been running when you see these errors?

Does the "actual_size" tend to vary a lot, or does it consistently stop in the same place?

Comment 7 John Matthews 2015-04-15 13:59:56 UTC
I've seen this happen from installs at my house and from installs in the RDU Lab, so 2 different network setups.

I'm unsure on the actual_size, I saw this happen several times before I started to capture the data.

From what I can tell this is limited to large files, and seems to happen to only 1 file at a time, yet that 1 file does vary.  I've never seen more than 1 file be impacted in a run, and when impacted the file has always been largish as in 100MB+.

Comment 8 Brad Buckingham 2015-04-15 14:47:18 UTC
I have also observed the error on download.  In my case, Satellite is always in the process of syncing the entire set of repos needed to support RHEV and from the CDN (not an internal repo).  Although the server is configured to throttle the work, perhaps it is still too much.

The set of repos involved include:

       - :product_name: "Red Hat Enterprise Linux Server"
        :repository_set_name: "Red Hat Enterprise Linux 6 Server (RPMs)"
        :basearch: "x86_64"
        :releasever: "6Server"

      - :product_name: "Red Hat Enterprise Linux Server"
        :repository_set_name: "Red Hat Enterprise Linux 6 Server (Kickstart)"
        :basearch: "x86_64"
        :releasever: "6.6"

      - :product_name: "Red Hat Enterprise Linux Server"
        :repository_set_name: "Red Hat Enterprise Linux 6 Server - RH Common (RPMs)"
        :basearch: "x86_64"
        :releasever: "6Server"

      - :product_name: "Red Hat Enterprise Linux Server"
        :repository_set_name: "Red Hat Enterprise Linux 6 Server - Supplementary (RPMs)"
        :basearch: "x86_64"
        :releasever: "6Server"

      - :product_name: "Red Hat Enterprise Virtualization"
        :repository_set_name: "Red Hat Enterprise Virtualization Hypervisor (RPMs)"
        :basearch: "x86_64"
        :releasever: "6Server"

      - :product_name: "Red Hat Enterprise Virtualization"
        :repository_set_name: "Red Hat Enterprise Virtualization Management Agents (RPMs)"
        :basearch: "x86_64"
        :releasever: "6Server"

      - :product_name: "Red Hat Enterprise Virtualization"
        :repository_set_name: "Red Hat Enterprise Virtualization Manager 3.5 (RPMs)"
        :basearch: "x86_64"
        :releasever: "6Server"

      - :product_name: "JBoss Enterprise Application Platform"
        :repository_set_name: "JBoss Enterprise Application Platform 6 (RHEL 6 Server) (RPMs)"
        :basearch: "x86_64"
        :releasever: "6Server"

Comment 9 John Matthews 2015-04-20 23:19:20 UTC
Saw this again on 4/20/15.
Same repo:  "Red Hat Enterprise Virtualization Manager 3.5 RPMs x86_64 6Server",


Below is the package that failed to download.

{"expected_size"=>119814460,
               "error_code"=>"size_mismatch",
               "actual_size"=>1110812,
               "unit_key"=>
                {"name"=>"rhevm-webadmin-portal",
                 "checksum"=>"fbfa34e6cf6dc4259bc02dacb41c9a40cf88e3cf",
                 "epoch"=>"0",
                 "version"=>"3.5.0",
                 "release"=>"0.32.el6ev",
                 "arch"=>"noarch",
                 "checksumtype"=>"sha1"}}

Comment 10 pulp-infra@redhat.com 2015-05-14 14:29:58 UTC
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.

Comment 11 pulp-infra@redhat.com 2015-05-14 14:30:00 UTC
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.

Comment 13 John Matthews 2015-07-17 15:47:20 UTC
Saw this again on 7/17:

"expected_size"=>333826592, "error_code"=>"size_mismatch",   "actual_size"=>299892736, "unit_key"=>{"name"=>"rhel-guest-image-6"




2015-07-17 11:36:11 [I] Processing by Katello::Api::V2::RepositoriesController#sync_complete as JSON
2015-07-17 11:36:11 [I]   Parameters: {"call_report"=>"[FILTERED]", "event_type"=>"repo.sync.finish", "payload"=>{"importer_id"=>"yum_importer", "exception"=>nil, "repo_id"=>"Default_Organization-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_RPMs_x86_64_6Server", "traceback"=>nil, "started"=>"2015-07-17T14:02:42Z", "_ns"=>"repo_sync_results", "completed"=>"2015-07-17T15:36:10Z", "importer_type_id"=>"yum_importer", "error_message"=>nil, "summary"=>{"content"=>{"state"=>"FINISHED"}, "comps"=>{"state"=>"FINISHED"}, "distribution"=>{"state"=>"FINISHED"}, "errata"=>{"state"=>"FINISHED"}, "metadata"=>{"state"=>"FINISHED"}}, "added_count"=>18160, "result"=>"success", "updated_count"=>31, "details"=>{"content"=>{"size_total"=>27597224364, "items_left"=>0, "items_total"=>15038, "state"=>"FINISHED", "size_left"=>0, "details"=>{"rpm_total"=>15038, "rpm_done"=>15038, "drpm_total"=>0, "drpm_done"=>0}, "error_details"=>[{"expected_size"=>333826592, "error_code"=>"size_mismatch", "actual_size"=>299892736, "unit_key"=>{"name"=>"rhel-guest-image-6", "checksum"=>"e97f84a5ab223d7e63f29a73eb615cba3c509f47", "epoch"=>"0", "version"=>"6.6", "release"=>"20141222.0.el6_6", "arch"=>"noarch", "checksumtype"=>"sha1"}}]}, "comps"=>{"state"=>"FINISHED"}, "distribution"=>{"items_total"=>0, "state"=>"FINISHED", "error_details"=>[], "items_left"=>0}, "errata"=>{"state"=>"FINISHED"}, "metadata"=>{"state"=>"FINISHED"}}, "id"=>"55a920ea30b0425aaec32347", "removed_count"=>0}, "token"=>"LdsGNyXcL9TS76mPUpeVp4dYmwnMerNC", "api_version"=>"v2", "repository"=>{}}

Comment 14 John Matthews 2015-07-23 12:13:57 UTC
Reopening since this is coming up more.

Saw this again on 7/22, saw it on 5 of 5 attempts of running a deployment. Of the 5 deployments I'd see either one or both of the below.

Error #1
{"url"=>"https://cdn.redhat.com/content/dist/rhel/server/6/6Server/x86_64/rh-common/os/Packages/rhel-guest-image-6-6.5-20140411
.0.el6_5.noarch.rpm", "errors"=>["('Connection aborted.', BadStatusLine(\"''\",))"]}]}, 

Manually downloaded from CDN and file was fine.




Error #2:

{"expected_size"=>297488612, "error_code"=>"size_mismatch", "actual_size"=>29360128, "unit_key"=>{"name"=>"rhel-guest-image-6", "checksum"=>"451e28979d05c2887182c9cd38b7fdee213d2bbc52575ecd6cd1aa35815850a1", "epoc
h"=>"0", "version"=>"6.5", "release"=>"20140116.1.el6_5", "arch"=>"noarch", "checksumtype"=>"sha256"}}, 

I manually downloaded from CDN and file checks out OK.

$ ls -lart rhel-guest-image-6-6.5-20140116.1.el6_5.noarch.rpm 
-rw-r----- 1 jmatthews jmatthews 297488612 Jul 23 08:06 rhel-guest-image-6-6.5-20140116.1.el6_5.noarch.rpm


$ sha256sum rhel-guest-image-6-6.5-20140116.1.el6_5.noarch.rpm 
451e28979d05c2887182c9cd38b7fdee213d2bbc52575ecd6cd1aa35815850a1  rhel-guest-image-6-6.5-20140116.1.el6_5.noarch.rpm










2015-07-22 15:29:47 [I] Processing by Katello::Api::V2::RepositoriesController#sync_complete as JSON
2015-07-22 15:29:47 [I]   Parameters: {"call_report"=>"[FILTERED]", "event_type"=>"repo.sync.finish", "payload"=>{"importer_id"=>"yum_importer", "exception"=>nil, "repo_id"=>"Default_Organization-Red_Hat_Enterprise_Linux_Server-Red
_Hat_Enterprise_Linux_6_Server_-_RH_Common_RPMs_x86_64_6Server", "traceback"=>nil, "started"=>"2015-07-22T19:24:05Z", "_ns"=>"repo_sync_results", "completed"=>"2015-07-22T19:29:46Z", "importer_type_id"=>"yum_importer", "error_messa
ge"=>nil, "summary"=>{"content"=>{"state"=>"FINISHED"}, "comps"=>{"state"=>"FINISHED"}, "distribution"=>{"state"=>"FINISHED"}, "errata"=>{"state"=>"FINISHED"}, "metadata"=>{"state"=>"FINISHED"}}, "added_count"=>129, "result"=>"succ
ess", "updated_count"=>0, "details"=>{"content"=>{"size_total"=>4677973184, "items_left"=>0, "items_total"=>106, "state"=>"FINISHED", "size_left"=>0, "details"=>{"rpm_total"=>106, "rpm_done"=>106, "drpm_total"=>0, "drpm_done"=>0}, 
"error_details"=>[{"expected_size"=>297488612, "error_code"=>"size_mismatch", "actual_size"=>29360128, "unit_key"=>{"name"=>"rhel-guest-image-6", "checksum"=>"451e28979d05c2887182c9cd38b7fdee213d2bbc52575ecd6cd1aa35815850a1", "epoc
h"=>"0", "version"=>"6.5", "release"=>"20140116.1.el6_5", "arch"=>"noarch", "checksumtype"=>"sha256"}}, {"url"=>"https://cdn.redhat.com/content/dist/rhel/server/6/6Server/x86_64/rh-common/os/Packages/rhel-guest-image-6-6.5-20140411
.0.el6_5.noarch.rpm", "errors"=>["('Connection aborted.', BadStatusLine(\"''\",))"]}]}, "comps"=>{"state"=>"FINISHED"}, "distribution"=>{"items_total"=>0, "state"=>"FINISHED", "error_details"=>[], "items_left"=>0}, "errata"=>{"stat
e"=>"FINISHED"}, "metadata"=>{"state"=>"FINISHED"}}, "id"=>"55afef2ad4ebf31fa2a37418", "removed_count"=>0}, "token"=>"SzDC8mYXKbkohXWCMiP9fg8Mva8RzAs7", "api_version"=>"v2", "repository"=>{}}
2015-07-22 15:29:47 [I] Sync_complete called for Red Hat Enterprise Linux 6 Server - RH Common RPMs x86_64 6Server, running after_sync.

Comment 15 pulp-infra@redhat.com 2015-09-01 09:30:23 UTC
The Pulp upstream bug status is at CLOSED - DUPLICATE. Updating the external tracker on this bug.

Comment 16 Ina Panova 2015-09-01 13:32:49 UTC
The upstream fix for this bug is https://pulp.plan.io/issues/1210

Comment 18 Brian Bouterse 2015-10-02 13:37:08 UTC
The upstream bug for this is fixed. See its commit details here: https://pulp.plan.io/issues/1210

Comment 19 jcallaha 2015-11-10 20:37:50 UTC
Verified in Upstream. Sync'd numerous large repositories with no issues encountered.

  RELEASE: Red Hat Enterprise Linux Server release 7.1 (Maipo)
  FOREMAN: 1.11.0-develop
     RUBY: ruby 2.0.0p598 (2014-11-13) [x86_64-linux]
   PUPPET: 3.8.4

Comment 22 errata-xmlrpc 2016-07-27 09:15:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1501