Bug 1410649

Summary: Sync fails due to error - PulpExecutionException: Importer indicated a failed response
Product: Red Hat Satellite
Reporter: Paul Dudley <pdudley>
Component: Pulp
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA
QA Contact: Bruno Rocha <rochacbruno>
Severity: high
Priority: high
Version: 6.2.4
CC: aagrawal, ahuchcha, bbuckingham, bill.scherer, bkearney, bmbouter, daviddavis, dkliban, egolov, ehelms, ggainey, ggatward, gpatil, gpayelka, hmore, ipanova, jcallaha, jfoots, kabbott, katello-qa-list, ktordeur, mhrivnak, michael.hanson, mmccune, mmithaiw, mverma, omaciel, patricia.moeller, pcreech, pdudley, pdwyer, pmoravec, pmorey, rchan, rchauhan, rhbgs.10.bigi_gigi, schamilt, sghai, slutade, tasander, ttereshc, wpinheir, xdmoon
Target Milestone: Unspecified
Keywords: Triaged
Hardware: x86_64
OS: Linux
Fixed In Version: pulp-rpm-2.8.7.9-1
Cloned As: 1429671 (view as bug list)
Last Closed: 2017-05-01 13:57:51 UTC
Type: Bug
Bug Blocks: 1429671

Description Paul Dudley 2017-01-06 01:19:23 UTC
Description of problem:

Sync of the RHEL 7 EPEL repo fails with the following traceback in the sync task:
    Traceback (most recent call last):
      File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
        R = retval = fun(*args, **kwargs)
      File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 473, in __call__
        return super(Task, self).__call__(*args, **kwargs)
      File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 103, in __call__
        return super(PulpTask, self).__call__(*args, **kwargs)
      File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 437, in __protected_call__
        return self.run(*args, **kwargs)
      File "/usr/lib/python2.7/site-packages/pulp/server/controllers/repository.py", line 810, in sync
        raise pulp_exceptions.PulpExecutionException(_('Importer indicated a failed response'))
    PulpExecutionException: Importer indicated a failed response
...
      errata:
        state: FAILED
        error: command document too large
...

How reproducible:
So far, only reproducible on the customer's system.

Repo available at URL: https://dl.fedoraproject.org/pub/epel/7Server/x86_64/


Additional info:

This repo on the customer's system has synced successfully in the past; the failing sync would be an incremental sync, diffing the currently synced packages against the newer packages that are available.

Comment 4 Tanya Tereshchenko 2017-01-09 16:31:24 UTC
After some investigation, the reported issue was reproduced under very specific circumstances.
To provide a proper fix for the customer case and to confirm our hypothesis, could we receive a dump of certain collections from the customer database?
Those collections should not contain sensitive information; they are the list of repositories and the contents of the errata collection.

One can make a dump with the following command:
mongodump --db database_name --collection collection_name --out directory_to_save_your_dump

Depending on your configuration, you may need to specify other options such as host/port, username/password, or SSL settings; check the docs here:
https://docs.mongodb.com/manual/reference/program/mongodump/#iddup.mongodump

The collections we are interested in are:
 - "units_erratum" (contains the errata themselves)
 - "repos" (contains the repository names, the number of units in each, and other general info about the repositories)

The database name defaults to "pulp_database", but it can be changed, so check the database section of /etc/pulp/server.conf.
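
For example, assuming the default database name and an illustrative output directory, the two dumps could look like:

mongodump --db pulp_database --collection units_erratum --out /tmp/pulp-dump
mongodump --db pulp_database --collection repos --out /tmp/pulp-dump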

Comment 10 Tanya Tereshchenko 2017-02-05 20:15:49 UTC
Thanks everyone for the dumps, they are extremely helpful!
(Sorry for the late response, DevConf + flu afterwards).

The current issue won't be resolved by https://pulp.plan.io/issues/723; that is a different problem (huge RPM metadata).
This BZ is only about the DocumentTooLarge error for errata, and since an advisory can't legitimately be that big, it's a bug.
Thanks to the provided dumps, the root cause is now known; I will file the upstream Pulp issue, see the details there.
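
For background, "command document too large" comes from MongoDB's 16 MB per-document BSON limit, which the erratum document can exceed when its pkglist keeps growing across syncs (the behavior exercised in the verification in comment 49). As an illustrative diagnostic, not part of the upstream fix, errata approaching the limit can be listed in the mongo shell:

// Illustrative diagnostic (assumption): print errata whose BSON size is
// approaching MongoDB's 16 MB per-document limit.
db.units_erratum.find().forEach(function (doc) {
    var size = Object.bsonsize(doc);  // size of this erratum document, in bytes
    if (size > 12 * 1024 * 1024) {
        print(doc.errata_id + ": " + size + " bytes");
    }
});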

While waiting for the fix, no good workaround is available, only this one:
remove all repositories containing the affected erratum, then delete orphaned content, and re-create and sync the same repositories (a command-level sketch follows below).
The easiest case is when the customer has only one repo with the affected erratum and no copies: that single repo can be deleted, orphaned content cleaned up, and the repo re-created and synced.
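
A command-level sketch of that workaround on the Satellite host (illustrative only; verify the exact hammer options and rake task against your Satellite version, and substitute real organizations, IDs, names, and URLs):

# Illustrative sketch; adjust organization, IDs, names, and URLs
hammer repository delete --id <repo_id>           # repeat for each repo containing the affected erratum
foreman-rake katello:delete_orphaned_content      # clean up orphaned content
hammer repository create --organization "<org>" --product "<product>" \
    --name "<repo_name>" --content-type yum --url "<repo_url>"
hammer repository synchronize --id <new_repo_id>  # sync the re-created repo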

Comment 11 pulp-infra@redhat.com 2017-02-05 21:01:17 UTC
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.

Comment 12 pulp-infra@redhat.com 2017-02-05 21:01:21 UTC
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.

Comment 13 pulp-infra@redhat.com 2017-02-05 21:31:15 UTC
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.

Comment 14 pulp-infra@redhat.com 2017-02-08 02:02:01 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 15 pulp-infra@redhat.com 2017-02-10 18:01:19 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 16 pulp-infra@redhat.com 2017-02-10 18:31:18 UTC
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.

Comment 18 pulp-infra@redhat.com 2017-02-16 20:32:04 UTC
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.

Comment 19 pulp-infra@redhat.com 2017-02-23 21:01:40 UTC
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.

Comment 29 Bryan Kearney 2017-03-06 19:54:33 UTC
Moving to 6.2.9.

Comment 36 Michael Hrivnak 2017-03-23 12:36:53 UTC
Requested backporting has been completed.

Comment 39 pulp-infra@redhat.com 2017-03-30 01:33:38 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 40 pulp-infra@redhat.com 2017-03-30 01:33:46 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

Comment 41 pulp-infra@redhat.com 2017-03-30 23:10:09 UTC
The Pulp upstream bug status is at CLOSED - COMPLETE. Updating the external tracker on this bug.

Comment 42 Mike McCune 2017-04-06 19:24:57 UTC
=== HOTFIX INSTRUCTIONS FOR SATELLITE 6.2.8 ONLY ===


See the hotfix package and instructions outlined here:

https://bugzilla.redhat.com/show_bug.cgi?id=1388296#c49

Comment 49 Bruno Rocha 2017-04-28 22:13:37 UTC
Verified in:
[root@cloud-qe-09 ~]# rpm -q satellite
satellite-6.2.9-7.0.el7sat.noarch


Steps:

1) Created a new product and a new repo for EPEL [0] and synced.

2) In the mongo shell (`$ mongo pulp_database`):

   a) Saved sample_collection

   var erratum = db.units_erratum.find(
     {"errata_id":"FEDORA-EPEL-2016-f057025262"})[0]
   var sample_collection = erratum.pkglist[0]
   sample_collection.packages.length
   729

   b) Updated the erratum (pulled the "kf5-kplotting-devel" packages from the pkglist):
   db.units_erratum.update(
     {"errata_id": "FEDORA-EPEL-2016-f057025262"}, 
     {"$unset": {"pkglist.0._pulp_repo_id": ""}, 
      "$pull": {"pkglist.0.packages": {"name": "kf5-kplotting-devel"}}})  

   db.units_erratum.find(
      {errata_id: "FEDORA-EPEL-2016-f057025262"})[0].pkglist[0].packages.length
   726
    
   c) Made the erratum bigger (synced 20 times in the UI, then ran the loop below):

    for (i = 0; i < 100; i += 1) {
      db.units_erratum.update(
        {"errata_id": "FEDORA-EPEL-2016-f057025262"},
        {"$push": {"pkglist": sample_collection}})
    }
    WriteResult({
        "nMatched" : 0,
        "nUpserted" : 0,
        "nModified" : 0,
        "writeError" : {
            "code" : 10334,
            "errmsg" : "BSONObj size: 16987942 (0x1033726) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: \"1ca18b56-9fe9-4180-a893-2e977d3026db\""
        }
    })
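
    The push fails because the resulting document would be 16987942 bytes, above MongoDB's 16 MB (16793600-byte) BSON document cap, the same limit behind the original "command document too large" sync error. For an illustrative size check (not part of the original verification steps):

    Object.bsonsize(db.units_erratum.findOne(
        {"errata_id": "FEDORA-EPEL-2016-f057025262"}))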

3) Resynced many times using the UI, then checked the size again; the erratum does not grow after sync:

  > db.units_erratum.find(
    {errata_id: "FEDORA-EPEL-2016-f057025262"})[0].pkglist[0].packages.length
  726

No error!

[0] https://dl.fedoraproject.org/pub/epel/7/x86_64/

Comment 51 errata-xmlrpc 2017-05-01 13:57:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1191

Comment 52 Michael Hrivnak 2017-05-22 13:19:11 UTC
*** Bug 1446358 has been marked as a duplicate of this bug. ***

Comment 53 Patricia Moeller 2017-06-06 10:32:47 UTC
Still valid on 6.2.9 while trying to sync EPEL 7:

    "progress_report"=>
     {"yum_importer"=>
       {"content"=>
         {"items_total"=>0,
          "state"=>"FINISHED",
          "error_details"=>[],
          "details"=>
           {"rpm_total"=>0, "rpm_done"=>0, "drpm_total"=>0, "drpm_done"=>0},
          "size_total"=>0,
          "size_left"=>0,
          "items_left"=>0},
        "comps"=>{"state"=>"NOT_STARTED"},
        "purge_duplicates"=>{"state"=>"NOT_STARTED"},
        "distribution"=>
         {"items_total"=>0,
          "state"=>"FINISHED",
          "error_details"=>[],
          "items_left"=>0},
        "errata"=>{"state"=>"FAILED", "error"=>"command document too large"},
        "metadata"=>{"state"=>"FINISHED"}}},

Comment 54 Tanya Tereshchenko 2017-06-06 11:39:18 UTC
Take a look at https://bugzilla.redhat.com/show_bug.cgi?id=1437150#c9; it will help until the issue is fully fixed.

Comment 55 Patricia Moeller 2017-06-06 11:55:32 UTC
Thanks, but I cannot find any solution in there. What is that orphan cleanup about? Is it katello::reimport on Sat6?

Comment 56 Patricia Moeller 2017-06-06 12:00:24 UTC
The only 'workaround' that helped here is to delete all repositories with the same reference and recreate them afterwards. But this causes the clients' repository associations to be lost. Is that what you mean?

Comment 57 Patricia Moeller 2017-06-06 12:08:35 UTC
Sorry, we have verified that even the above workaround does not help. By the way, we have no Capsule, but a lot of Orgs and Repos.

Comment 58 Tanya Tereshchenko 2017-06-06 12:21:25 UTC
You can apply the temporary solution (to which I linked) to the main Satellite as well; the Capsule case is just the more common one.

Comment 59 Patricia Moeller 2017-06-06 12:24:30 UTC
As mentioned, I cannot find a solution in your link. It only points to the upstream BZs for the re-implementation. It also has no comment 9, which you linked, or that comment is not public.

Comment 60 Patricia Moeller 2017-06-06 18:44:43 UTC
So what's the workaround, Tanya?

Comment 61 Waldirio M Pinheiro 2017-06-06 19:34:18 UTC
Hello Patricia, good afternoon

Please take a look at the KCS article below; you will find all the information related to the issue and a script to fix it.

https://access.redhat.com/solutions/3071151


Best Regards
Waldirio M Pinheiro | Senior Software Maintenance Engineer

Comment 62 Patricia Moeller 2017-06-06 19:44:27 UTC
Great, thanks for your help. I have now executed the script and will monitor whether it helps. I think reimplementing the whole process is the right decision. It's a bit sad that the issue has been known from the beginning (2 years now), but hopefully we can finally make it right.

Comment 63 Waldirio M Pinheiro 2017-06-06 20:55:59 UTC
Hi Patricia,

Good to hear from you; I believe everything will be OK.

If you need any further assistance, I really recommend opening a support case; it will be faster, and we will help you in the same way.

By the way, I recommend keeping your environment up to date; to see everything related to the fix, have a look at [1].

Wishing you an amazing day.

Best Regards
Waldirio M Pinheiro | Senior Software Maintenance Engineer

[1]. https://access.redhat.com/articles/1365633

Comment 64 Red Hat Bugzilla 2023-09-15 00:00:57 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days