Bug 2018888

Summary: Actions::Pulp3::CapsuleContent::RefreshDistribution fails with NoMethodError: undefined method `pulp_href' for nil:NilClass
Product: Red Hat Satellite
Reporter: Jan Hutař <jhutar>
Component: Pulp
Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED DUPLICATE
QA Contact: Lai <ltran>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 6.10.0
CC: jsherril, ttereshc
Target Milestone: Unspecified
Keywords: Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: python-pulp-rpm-3.14.7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-11-12 08:39:29 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description: Error from first sub-task (Flags: none)

Description Jan Hutař 2021-11-01 08:01:37 UTC
Created attachment 1838806
Error from first sub-task

Description of problem:
I'm testing sync to a capsule. Some of the sub-tasks failed (they were in the "error" state) before the weekend, but now, on Monday, when I refreshed the page, I see them as "skipped". The error is still there, so I assume some cleanup job just changed their state.


Version-Release number of selected component (if applicable):
satellite-6.10.0-3.el7sat.noarch


How reproducible:
Tried the capsule sync once; the error occurred in 24 of the 210 sub-tasks of the main sync task.


Steps to Reproduce:
1. Set up with: https://github.com/redhat-performance/satperf/blob/3c5df3ff6175070602ba5ed656dbee09a38f4024/scripts/create-big-setup.sh
2. Run this twice: https://github.com/redhat-performance/satperf/blob/3c5df3ff6175070602ba5ed656dbee09a38f4024/scripts/create-big-setup-update.sh
3. Switch to "Any org" and "Any loc" and trigger optimized sync on a capsule


Actual results:
Id: bdaa5fa5-5c99-469b-8817-01502c8b5d91
Label: Actions::Katello::CapsuleContent::Sync
Status: stopped
Result: warning
Started at: 2021-10-29 23:05:36 UTC
Ended at: 2021-10-30 00:50:44 UTC
3: Actions::Pulp3::ContentGuard::Refresh (success) [ 0.29s / 0.29s ]
5: Actions::Pulp3::Orchestration::Repository::RefreshRepos (success) [ 87.20s / 87.20s ]
8: Actions::Pulp3::CapsuleContent::Sync (success) [ 33.60s / 1.41s ]
10: Actions::Pulp3::CapsuleContent::GenerateMetadata (success) [ 0.09s / 0.09s ]
12: Actions::Pulp3::CapsuleContent::RefreshDistribution (skipped) [ 710.30s / 0.55s ]
14: Actions::Pulp3::CapsuleContent::Sync (success) [ 33.97s / 1.36s ]
16: Actions::Pulp3::CapsuleContent::GenerateMetadata (success) [ 0.06s / 0.06s ]
18: Actions::Pulp3::CapsuleContent::RefreshDistribution (skipped) [ 710.32s / 0.56s ]
20: Actions::Pulp3::CapsuleContent::Sync (success) [ 736.60s / 12.74s ]
22: Actions::Pulp3::CapsuleContent::GenerateMetadata (success) [ 0.02s / 0.02s ]
24: Actions::Pulp3::CapsuleContent::RefreshDistribution (success) [ 2.29s / 1.13s ] 
[...]


Expected results:
All subtasks should pass.


Additional info:
foreman-debug is here: http://perf54.perf.lab.eng.bos.redhat.com/pub/foreman-debug-fHI2j.tar.xz

Comment 1 Tanya Tereshchenko 2021-11-01 09:31:47 UTC
Could this be investigated on the katello side first? Thanks.

Comment 2 Justin Sherrill 2021-11-01 13:20:51 UTC
Would you happen to have the foreman-debug from the capsule?

(Or would you be able to give access to the boxes?)

Comment 5 Justin Sherrill 2021-11-01 16:19:18 UTC
It appears that a number of syncs failed, and those failures cascaded into the 'undefined method pulp_href' traceback you see. We should try to make this clearer (by not trying to refresh distributions when a sync fails), but the root cause is that the syncs failed with:

pulp [3505d97d-a5e2-44a8-93cc-b2a700f51042]: pulp_rpm.app.tasks.synchronizing:INFO: Synchronizing: repository=7-org4-ccv-rhel8-max-org4-le3-9fa86a96-548a-40ba-b21c-07add284b32
Task 567cdef3-12c1-4e7c-97c6-40b6708717e6 failed (get() returned more than one UpdateRecord  -- it returned 2!)
  File "/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py", line 317, in _perform_task
    result = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/pulp_rpm/app/tasks/synchronizing.py", line 490, in synchronize
    version = dv.create()
  File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/declarative_version.py", line 151, in create
    loop.run_until_complete(pipeline)
  File "/usr/lib/python3.6/site-packages/pulpcore/app/models/repository.py", line 963, in __exit__
    repository.finalize_new_version(self)
  File "/usr/lib/python3.6/site-packages/pulp_rpm/app/models/repository.py", line 353, in finalize_new_version
    resolve_advisories(new_version, previous_version)
  File "/usr/lib/python3.6/site-packages/pulp_rpm/app/advisory.py", line 87, in resolve_advisories
    previous_advisory = previous_advisories.get(id=advisory_id)
  File "/usr/lib/python3.6/site-packages/django/db/models/query.py", line 412, in get
    (self.model._meta.object_name, num)
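
For readers unfamiliar with the error: Django's QuerySet.get() expects exactly one matching row and raises MultipleObjectsReturned when it finds more. The following is only a minimal plain-Python sketch of that behaviour, not the pulp_rpm code; the helper get_exactly_one, the stand-in exception classes, and the advisory ids and digests are all hypothetical.

```python
class DoesNotExist(Exception):
    """Stand-in for Django's exception of the same name."""


class MultipleObjectsReturned(Exception):
    """Stand-in for Django's exception of the same name."""


def get_exactly_one(records, **filters):
    """Mimic QuerySet.get(): return the single matching record or raise."""
    matches = [
        record
        for record in records
        if all(record.get(key) == value for key, value in filters.items())
    ]
    if not matches:
        raise DoesNotExist("no record matches %r" % (filters,))
    if len(matches) > 1:
        raise MultipleObjectsReturned(
            "get() returned more than one record -- it returned %d!" % len(matches)
        )
    return matches[0]


# Hypothetical data: two advisories in the previous version share one id.
previous_advisories = [
    {"id": "ADVISORY-1", "digest": "aaa"},
    {"id": "ADVISORY-1", "digest": "bbb"},
    {"id": "ADVISORY-2", "digest": "ccc"},
]

if __name__ == "__main__":
    try:
        get_exactly_one(previous_advisories, id="ADVISORY-1")
    except MultipleObjectsReturned as exc:
        print("lookup failed:", exc)
```

Under this assumption, any lookup keyed only on the advisory id fails as soon as two records with the same id exist in the previous repository version.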


The traceback looks like this Pulp issue: https://pulp.plan.io/issues/9519

which is fixed in 3.16 and backported to pulp-rpm 3.14.7. Looking at your capsule, 3.14.6 is installed.
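
The idea mentioned at the start of this comment, not refreshing distributions when the sync failed, could look roughly like the sketch below. It is written in Python for illustration only and is not Katello code; SyncResult, distribution_href (standing in for whatever object exposes pulp_href) and refresh_distribution are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SyncResult:
    """Hypothetical outcome of one repository sync on the capsule."""

    repository: str
    succeeded: bool
    # Stands in for the pulp_href the refresh step needs; None after a failed sync.
    distribution_href: Optional[str] = None


def refresh_distribution(result: SyncResult) -> str:
    """Refresh only when the sync produced something to refresh."""
    if not result.succeeded or result.distribution_href is None:
        # Report the real cause instead of failing later on a missing href.
        return "skipped: sync of %s failed" % result.repository
    return "refreshed %s" % result.distribution_href


if __name__ == "__main__":
    print(refresh_distribution(SyncResult("repo-a", True, "/hypothetical/href/1/")))
    print(refresh_distribution(SyncResult("repo-b", False)))
```

With a guard like this, the refresh step would point at the failed sync rather than surfacing as a NoMethodError on nil.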

Comment 6 Justin Sherrill 2021-11-01 16:29:07 UTC
This may be a dupe of https://bugzilla.redhat.com/show_bug.cgi?id=2013320

Comment 7 Jan Hutař 2021-11-03 08:36:18 UTC
Hello. Yes, it is possible that this bug is a duplicate of 2013320, as I have hit that one as well (in the same job); see bug 2013320 comment #14.

Also, the numbers of occurrences are suspiciously similar: bug 2018888 occurred 24 times and bug 2013320 12 times. What is strange is that one bug caused errors on some sub-tasks, while the other bug caused errors on different sub-tasks. If they were dupes, I would expect both to happen on the same sub-tasks. I may try the patch from bug 2013320; if it resolves this, I'll close this bug.

Comment 8 Jan Hutař 2021-11-12 08:39:29 UTC
Yes, I think this is a duplicate of bug 2013320. Thank you for looking into this!

*** This bug has been marked as a duplicate of bug 2013320 ***