Description of problem:
After upgrading to Satellite 6.10, repository syncs randomly fail if a stale ReservedResource entry still exists in the core_taskreservedresource table of the pulpcore database.
Version-Release number of selected component (if applicable):
Red Hat Satellite 6.10.1
How reproducible:
Reproducible by some customers who have upgraded from Satellite 6.9 to 6.10 while the core_taskreservedresource table still holds a ReservedResource entry.
Steps to Reproduce:
1. Install Satellite 6.9 and set up content.
2. Perform the pulp2-to-pulp3 migration.
3. Upgrade to Satellite 6.10.1 and clear the old pulp2 data.
4. Perform a repository sync (e.g. via hammer, as sketched below).
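For example (the organization name and repository ID below are placeholders; use values from your own environment):
# hammer repository list --organization "Default Organization"
# hammer repository synchronize --id <REPO_ID>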
Actual results:
The repository sync fails randomly with the following traceback/errors.
Dec 3 08:57:32 satellite pulpcore-worker-8: pulp [10f1c08a-0af6-4194-96ba-3cf07640ecb9]: pulp_rpm.app.tasks.synchronizing:INFO: Synchronizing: repository=Red_Hat_Enterprise_Linux_8_for_x86_64_-_BaseOS_RPMs_8-76619 remote=Red_Hat_Enterprise_Linux_8_for_x86_64_-_BaseOS_RPMs_8-78352
..
..
Dec 3 09:00:13 satellite pulpcore-worker-8: pulp [None]: pulpcore.tasking.pulpcore_worker:INFO: Clean offline worker 9156.xx.
Dec 3 09:00:13 satellite pulpcore-api: pulp [10f1c08a-0af6-4194-96ba-3cf07640ecb9]: - - [03/Dec/2021:14:00:13 +0000] "GET /pulp/api/v3/tasks/f82de63e-6e82-4b0c-b078-1d12b8909b54/ HTTP/1.1" 200 1155 "-" "OpenAPI-Generator/3.14.1/ruby"
Dec 3 09:00:39 satellite pulpcore-worker-5: pulp [None]: pulpcore.tasking.pulpcore_worker:INFO: Cleaning up task f82de63e-6e82-4b0c-b078-1d12b8909b54 and marking as failed. Reason: Worker has gone missing.
..
..
Dec 3 09:01:15 satellite pulpcore-worker-8: Process Process-1:
Dec 3 09:01:15 satellite pulpcore-worker-8: Traceback (most recent call last):
Dec 3 09:01:15 satellite pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/declarative_version.py", line 151, in create
Dec 3 09:01:15 satellite pulpcore-worker-8: loop.run_until_complete(pipeline)
..
..
Dec 3 09:01:15 satellite pulpcore-worker-8: field.remote_field.on_delete(self, field, sub_objs, self.using)
Dec 3 09:01:15 satellite pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/django/db/models/deletion.py", line 27, in PROTECT
Dec 3 09:01:15 satellite pulpcore-worker-8: sub_objs
..
..
Dec 3 09:01:15 satellite pulpcore-worker-8: django.db.models.deletion.ProtectedError: ("Cannot delete some instances of model 'ReservedResource' because they are referenced through a protected foreign key: 'TaskReservedResource.resource'", <QuerySet [<TaskReservedResource: pk=100093eb-bb01-45fb-ad0e-1994966511d3>]>)
Expected results:
No such errors.
Additional info:
It is happening because a zombie worker/resource reservation is left over; the protected foreign key on it then blocks deletion, as reproduced below.
# echo "select pulp_id,resource_id,task_id from core_taskreservedresource;" | su - postgres -c "psql pulpcore"
pulp_id | resource_id | task_id
--------------------------------------+--------------------------------------+--------------------------------------
100093eb-bb01-45fb-ad0e-1994966511d3 | 37c3d3b2-e077-49b4-8cfd-fe73cbf16368 | f5f5cc16-dc36-4f54-8ded-336fce5df444
(1 row)
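The ProtectedError in the worker log is Django's PROTECT constraint firing: the core_taskreservedresource row above still references the ReservedResource, so the worker cleanup cannot delete it. This can be reproduced directly from a pulpcore-manager shell (a sketch; the pulp_id is the resource_id shown above):
# sudo -u pulp PULP_SETTINGS='/etc/pulp/settings.py' DJANGO_SETTINGS_MODULE='pulpcore.app.settings' pulpcore-manager shell <<EOF
from django.db.models.deletion import ProtectedError
from pulpcore.app.models import ReservedResource
# Deleting a reservation that a TaskReservedResource row still references
# raises the same ProtectedError seen in the worker traceback above.
try:
    ReservedResource.objects.get(pulp_id='37c3d3b2-e077-49b4-8cfd-fe73cbf16368').delete()
except ProtectedError as exc:
    print(exc)
EOF
To confirm that the owning worker is no longer online: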
# sudo -u pulp PULP_SETTINGS='/etc/pulp/settings.py' DJANGO_SETTINGS_MODULE='pulpcore.app.settings' pulpcore-manager shell <<EOF
from pulpcore.app.models import ReservedResource, Worker
# Map each worker holding a reservation to the ReservedResource it holds
worker_to_res = {}
for rr in ReservedResource.objects.all():
    worker_to_res[rr.worker_id] = rr.pulp_id
# Any reservation whose worker is not in online_workers() is a leftover
workers = [w.pulp_id for w in Worker.objects.online_workers()]
for rwork in worker_to_res:
    if rwork not in workers:
        print(f'Worker {rwork} owns ReservedResource {worker_to_res[rwork]} and is not in online_workers!!')
EOF
Worker 0b7b2a70-4a1e-4a35-90e3-c1b2922eea45 owns ReservedResource 37c3d3b2-e077-49b4-8cfd-fe73cbf16368 and is not in online_workers!!
Solution: https://access.redhat.com/solutions/6563341
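The linked article documents the supported remediation. Purely as an illustration of what the cleanup amounts to (a sketch, not the verbatim KCS steps; it assumes TaskReservedResource is importable alongside ReservedResource, as the traceback suggests), the protecting join rows have to be removed before the orphaned reservations themselves:
# sudo -u pulp PULP_SETTINGS='/etc/pulp/settings.py' DJANGO_SETTINGS_MODULE='pulpcore.app.settings' pulpcore-manager shell <<EOF
from pulpcore.app.models import ReservedResource, TaskReservedResource, Worker
# Reservations held by workers that are no longer online are leftovers.
online = set(Worker.objects.online_workers().values_list('pulp_id', flat=True))
for rr in ReservedResource.objects.exclude(worker_id__in=online):
    # Delete the protecting TaskReservedResource rows first, then the resource.
    TaskReservedResource.objects.filter(resource=rr).delete()
    rr.delete()
EOF
Re-running the check above should then print nothing, and the sync no longer hits the ProtectedError.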
Verified in Satellite 6.11 snap 17
Repository sync works after upgrading from 6.9.
Steps to Reproduce:
1. Install Satellite 6.9 and set up content.
2. Perform the pulp2-to-pulp3 migration.
3. Upgrade to Satellite 6.10.
4. Clear the old pulp2 data.
5. Upgrade to 6.11.
6. Perform a repository sync.
Expected Results:
Repository sync and content view publish work without issues.
Actual Results:
Repository sync and content view publish work without issues.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Satellite 6.11 Release) and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:5498