Bug 2031154 - After upgrading to Satellite 6.10, Repository sync randomly fails if a ReservedResource exists in core_taskreservedresource table of pulpcore DB.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.10.1
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: 6.11.0
Assignee: satellite6-bugs
QA Contact: Lai
URL:
Whiteboard:
Duplicates: 2033568
Depends On:
Blocks:
Reported: 2021-12-10 16:03 UTC by Sayan Das
Modified: 2022-09-28 17:14 UTC (History)
CC List: 13 users

Fixed In Version: pulpcore-3.16.5
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2059394
Environment:
Last Closed: 2022-07-05 14:31:01 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github pulp pulpcore issues | 2101 | 0 | None | closed | Clean up TaskReservedResources/task-table at migration to new-tasking-system | 2022-03-30 17:41:43 UTC
Red Hat Knowledge Base (Solution) | 6563341 | 0 | None | None | None | 2021-12-10 16:04:40 UTC
Red Hat Product Errata | RHSA-2022:5498 | 0 | None | None | None | 2022-07-05 14:31:13 UTC

Description Sayan Das 2021-12-10 16:03:54 UTC
Description of problem:

After upgrading to Satellite 6.10, repository sync randomly fails if a ReservedResource exists in the core_taskreservedresource table of the pulpcore DB.


Version-Release number of selected component (if applicable):

Red Hat Satellite 6.10.1


How reproducible:

Reproducible for some customers who upgraded from Satellite 6.9 to 6.10 while the core_taskreservedresource table still held a ReservedResource entry.


Steps to Reproduce:
1. Install Satellite 6.9 and set up content.
2. Perform the pulp2-pulp3 migration.
3. Upgrade to Satellite 6.10.1 and clear old pulp2 data.
4. Perform a repository sync.


Actual results:

The repository sync fails randomly with the following traceback/errors.

Dec  3 08:57:32 satellite pulpcore-worker-8: pulp [10f1c08a-0af6-4194-96ba-3cf07640ecb9]: pulp_rpm.app.tasks.synchronizing:INFO: Synchronizing: repository=Red_Hat_Enterprise_Linux_8_for_x86_64_-_BaseOS_RPMs_8-76619 remote=Red_Hat_Enterprise_Linux_8_for_x86_64_-_BaseOS_RPMs_8-78352
..
..
Dec  3 09:00:13 satellite pulpcore-worker-8: pulp [None]: pulpcore.tasking.pulpcore_worker:INFO: Clean offline worker 9156.xx.
Dec  3 09:00:13 satellite pulpcore-api: pulp [10f1c08a-0af6-4194-96ba-3cf07640ecb9]:  - - [03/Dec/2021:14:00:13 +0000] "GET /pulp/api/v3/tasks/f82de63e-6e82-4b0c-b078-1d12b8909b54/ HTTP/1.1" 200 1155 "-" "OpenAPI-Generator/3.14.1/ruby"
Dec  3 09:00:39 satellite pulpcore-worker-5: pulp [None]: pulpcore.tasking.pulpcore_worker:INFO: Cleaning up task f82de63e-6e82-4b0c-b078-1d12b8909b54 and marking as failed. Reason: Worker has gone missing.
..
..
Dec  3 09:01:15 satellite pulpcore-worker-8: Process Process-1:
Dec  3 09:01:15 satellite pulpcore-worker-8: Traceback (most recent call last):
Dec  3 09:01:15 satellite pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/declarative_version.py", line 151, in create
Dec  3 09:01:15 satellite pulpcore-worker-8: loop.run_until_complete(pipeline)
..
..
Dec  3 09:01:15 satellite pulpcore-worker-8: field.remote_field.on_delete(self, field, sub_objs, self.using)
Dec  3 09:01:15 satellite pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/django/db/models/deletion.py", line 27, in PROTECT
Dec  3 09:01:15 satellite pulpcore-worker-8: sub_objs
..
..
Dec  3 09:01:15 satellite pulpcore-worker-8: django.db.models.deletion.ProtectedError: ("Cannot delete some instances of model 'ReservedResource' because they are referenced through a protected foreign key: 'TaskReservedResource.resource'", <QuerySet [<TaskReservedResource: pk=100093eb-bb01-45fb-ad0e-1994966511d3>]>)
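
To check whether other log files show the same pattern, the two signature messages above can be searched for mechanically. The following is only a minimal sketch: the regular expressions are derived from the lines quoted in this report, the function name scan_log is made up for illustration, and the second sample line is shortened.

```python
import re

# Patterns derived from the log lines quoted in this report.
MISSING_WORKER = re.compile(
    r"Cleaning up task (?P<task>[0-9a-f-]{36}) and marking as failed\. "
    r"Reason: Worker has gone missing\."
)
PROTECTED = re.compile(r"django\.db\.models\.deletion\.ProtectedError")


def scan_log(lines):
    """Return (task IDs failed because of a missing worker, ProtectedError line count)."""
    failed_tasks = []
    protected_errors = 0
    for line in lines:
        match = MISSING_WORKER.search(line)
        if match:
            failed_tasks.append(match.group("task"))
        if PROTECTED.search(line):
            protected_errors += 1
    return failed_tasks, protected_errors


# Sample input: two lines copied from the excerpt above (the second is shortened).
sample = [
    "Dec  3 09:00:39 satellite pulpcore-worker-5: pulp [None]: "
    "pulpcore.tasking.pulpcore_worker:INFO: Cleaning up task "
    "f82de63e-6e82-4b0c-b078-1d12b8909b54 and marking as failed. "
    "Reason: Worker has gone missing.",
    "Dec  3 09:01:15 satellite pulpcore-worker-8: "
    "django.db.models.deletion.ProtectedError: (\"Cannot delete some instances "
    "of model 'ReservedResource' ...\")",
]
failed, errors = scan_log(sample)
print(failed, errors)
```

In practice the input would be the lines of the system log rather than the inline sample; the log location is not stated in this report, so it is left to the reader.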


Expected results:

No such errors.


Additional info:

This happens because a zombie worker/resource appears to have been left over from before the upgrade.
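
The failure mode in the traceback (a row cannot be deleted while another row still references it through a protected foreign key) can be reproduced in isolation. The sketch below is an illustration only: it uses SQLite's ON DELETE RESTRICT as a stand-in for Django's application-level PROTECT, and the two simplified tables are assumptions, not the real pulpcore schema. It is not the documented workaround; see the Solution link for that.

```python
import sqlite3

RESOURCE_ID = "37c3d3b2-e077-49b4-8cfd-fe73cbf16368"
WORKER_ID = "0b7b2a70-4a1e-4a35-90e3-c1b2922eea45"
TASK_RES_ID = "100093eb-bb01-45fb-ad0e-1994966511d3"
TASK_ID = "f5f5cc16-dc36-4f54-8ded-336fce5df444"

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
# Simplified stand-ins for the two tables involved (not the real schema).
conn.executescript(
    """
    CREATE TABLE reservedresource (pulp_id TEXT PRIMARY KEY, worker_id TEXT);
    CREATE TABLE taskreservedresource (
        pulp_id TEXT PRIMARY KEY,
        resource_id TEXT NOT NULL
            REFERENCES reservedresource (pulp_id) ON DELETE RESTRICT,
        task_id TEXT
    );
    """
)
conn.execute("INSERT INTO reservedresource VALUES (?, ?)", (RESOURCE_ID, WORKER_ID))
conn.execute(
    "INSERT INTO taskreservedresource VALUES (?, ?, ?)",
    (TASK_RES_ID, RESOURCE_ID, TASK_ID),
)
conn.commit()


def try_delete_resource(connection, resource_id):
    """Return True if the resource row was deleted, False if the FK blocked it."""
    try:
        connection.execute(
            "DELETE FROM reservedresource WHERE pulp_id = ?", (resource_id,)
        )
        connection.commit()
        return True
    except sqlite3.IntegrityError:
        connection.rollback()
        return False


# Blocked while the leftover referencing row exists.
blocked_first = try_delete_resource(conn, RESOURCE_ID)

# Once the referencing row is gone, the same delete succeeds.
conn.execute("DELETE FROM taskreservedresource WHERE pulp_id = ?", (TASK_RES_ID,))
conn.commit()
deleted_after = try_delete_resource(conn, RESOURCE_ID)
print(blocked_first, deleted_after)
```

This mirrors why worker cleanup fails in the traceback: the leftover TaskReservedResource row keeps the ReservedResource from being removed.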


# echo "select pulp_id,resource_id,task_id from core_taskreservedresource;" | su - postgres -c "psql pulpcore"
              pulp_id                |             resource_id              |               task_id
--------------------------------------+--------------------------------------+--------------------------------------
 100093eb-bb01-45fb-ad0e-1994966511d3 | 37c3d3b2-e077-49b4-8cfd-fe73cbf16368 | f5f5cc16-dc36-4f54-8ded-336fce5df444
(1 row)

# sudo -u pulp PULP_SETTINGS='/etc/pulp/settings.py' DJANGO_SETTINGS_MODULE='pulpcore.app.settings' pulpcore-manager shell <<EOF
from pulpcore.app.models import ReservedResource, Worker
# IDs of the workers that are currently online
online = {w.pulp_id for w in Worker.objects.online_workers()}
# Report every ReservedResource whose owning worker is not online
for rr in ReservedResource.objects.all():
    if rr.worker_id not in online:
        print(f'Worker {rr.worker_id} owns ReservedResource {rr.pulp_id} and is not in online_workers!!')
EOF

Worker 0b7b2a70-4a1e-4a35-90e3-c1b2922eea45 owns ReservedResource 37c3d3b2-e077-49b4-8cfd-fe73cbf16368 and is not in online_workers!!
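
The same check can be expressed as a pure function over exported rows, which makes it easy to try without a Satellite. This is a sketch under assumptions: the function name orphaned_by_worker and the second reservation row are hypothetical, and only the first row uses IDs from this report.

```python
from collections import defaultdict


def orphaned_by_worker(reservations, online_worker_ids):
    """Group reserved-resource IDs by owning worker, for workers that are not online.

    reservations: iterable of (resource_id, worker_id) pairs.
    """
    online = set(online_worker_ids)
    orphans = defaultdict(list)
    for resource_id, worker_id in reservations:
        if worker_id not in online:
            orphans[worker_id].append(resource_id)
    return dict(orphans)


reservations = [
    # IDs taken from the output above.
    ("37c3d3b2-e077-49b4-8cfd-fe73cbf16368", "0b7b2a70-4a1e-4a35-90e3-c1b2922eea45"),
    # Hypothetical row owned by a worker that is still online.
    ("hypothetical-resource", "hypothetical-online-worker"),
]
result = orphaned_by_worker(reservations, ["hypothetical-online-worker"])
print(result)
```

Grouping by worker keeps every leftover resource visible even when one offline worker owns several.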


Solution: https://access.redhat.com/solutions/6563341

Comment 2 Grant Gainey 2022-01-10 15:55:52 UTC
*** Bug 2033568 has been marked as a duplicate of this bug. ***

Comment 3 Daniel Alley 2022-03-26 02:06:43 UTC
Fixed in pulpcore 3.14.15 (actually the previous release, but 6.11 ought to ship with 3.14.15+)

Comment 4 Daniel Alley 2022-03-26 02:11:46 UTC
(I meant 6.10.5 should have pulpcore 3.14.15+)

Comment 5 Griffin Sullivan 2022-04-26 15:07:05 UTC
Verified in Satellite 6.11 snap 17

Repository sync works after upgrading from 6.9.

Steps to Reproduce:
1. Install Satellite 6.9 and set up content.
2. Perform the pulp2-pulp3 migration.
3. Upgrade to Satellite 6.10.
4. Clear old pulp2 data.
5. Update to 6.11.
6. Perform a repository sync.

Expected Results:

Repository sync and content view publish work without issues.

Actual Results:

Repository sync and content view publish work without issues.

Comment 8 errata-xmlrpc 2022-07-05 14:31:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5498

