Steps to retest: 1) Apply a patch to add auxiliary logging and force worker to sleep at the bad place: --- /usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py.orig 2022-04-11 12:54:11.374669136 +0200 +++ /usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py 2022-04-13 11:39:11.590025573 +0200 @@ -17,6 +17,7 @@ from multiprocessing import Process from tempfile import TemporaryDirectory import django +import time os.environ.setdefault("DJANGO_SETTINGS_MODULE", "pulpcore.app.settings") django.setup() @@ -189,6 +190,9 @@ class NewPulpWorker: r, w, x = select.select( [self.sentinel, connection.connection], [], [], self.heartbeat_period ) + _logger.info(_("bz2050440: Worker %s artificial sleep before heartbeat"), self.name) + time.sleep(2) + _logger.info(_("bz2050440: Worker %s artificial slept before heartbeat"), self.name) self.beat() if connection.connection in r: connection.connection.poll() (you can make the code changes directly to the "pulpcore_worker.py" file and then run "cat /usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py | patch -p1" to apply the patch) 2) Let just 1 worker running (just to ease the reproducer): systemctl stop pulpcore-worker@*.service systemctl start pulpcore-worker 3) "tail -f /var/log/messages" to monitor the "Worker * artificial sle.." logs on another terminal 4) Have prepared a curl command to e.g. create a publication or distribution or delete either of them. certs="--cacert /etc/pki/katello/certs/katello-server-ca.crt --cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key" hname=$(hostname -f) # create repo reponame="repo_bz2050440" repohref=$(curl -s -X POST -H "Content-Type: application/json" -d "{\"name\":\"$reponame\"}" $certs https://${hname}/pulp/api/v3/repositories/rpm/rpm/ | json_reformat | grep pulp_href | cut -d\" -f4) 5) Once the *first* log "artificial sleep" is printed, immediately fire the API request that will invoke a task. # prepare below command to fire at step 4 curl -s -H "Accept:application/json" -H "Content-Type: application/json" -X POST -d "{\"repository\": \"${repohref}\"}" $certs https://${hname}/pulp/api/v3/publications/rpm/rpm/ | json_reformat 6) Ensure the task is created while the worker is sleeping, like: (example) Apr 13 12:01:42 pmoravec-sat610-bz2050440 pulpcore-worker-1: pulp [None]: pulpcore.tasking.pulpcore_worker:INFO: bz2050440: Worker bz2050440.satotest.gsslab.brq2.redhat.com">28305.gsslab.brq2.redhat.com artificial sleep before heartbeat Apr 13 12:01:43 pmoravec-sat610-bz2050440 pulpcore-api: pulp [6d4ba28511c34d1db0cc8f7c06c2f69f]: - - [13/Apr/2022:10:01:43 +0000] "POST /pulp/api/v3/publications/rpm/rpm/ HTTP/1.1" 202 67 "-" "curl/7.29.0" Apr 13 12:01:44 pmoravec-sat610-bz2050440 pulpcore-worker-1: pulp [None]: pulpcore.tasking.pulpcore_worker:INFO: bz2050440: Worker bz2050440.satotest.gsslab.brq2.redhat.com">28305.gsslab.brq2.redhat.com artificial slept before heartbeat 7) Wait for Godot to have the task executed. 8) check that the task is suddenly executed: (example) .. Apr 13 12:03:20 pmoravec-sat610-bz2050440 pulpcore-worker-1: pulp [None]: pulpcore.tasking.pulpcore_worker:INFO: bz2050440: Worker bz2050440.satotest.gsslab.brq2.redhat.com">28305.gsslab.brq2.redhat.com artificial slept before heartbeat Apr 13 12:03:25 pmoravec-sat610-bz2050440 pulpcore-worker-1: pulp [None]: pulpcore.tasking.pulpcore_worker:INFO: bz2050440: Worker bz2050440.satotest.gsslab.brq2.redhat.com">28305.gsslab.brq2.redhat.com artificial sleep before heartbeat Apr 13 12:03:27 pmoravec-sat610-bz2050440 pulpcore-worker-1: pulp [None]: pulpcore.tasking.pulpcore_worker:INFO: bz2050440: Worker bz2050440.satotest.gsslab.brq2.redhat.com">28305.gsslab.brq2.redhat.com artificial slept before heartbeat Apr 13 12:03:27 pmoravec-sat610-bz2050440 pulpcore-worker-1: pulp [6d4ba28511c34d1db0cc8f7c06c2f69f]: pulpcore.tasking.pulpcore_worker:INFO: Starting task b7110104-a52c-49f3-af78-58d47a662634 Apr 13 12:03:27 pmoravec-sat610-bz2050440 pulpcore-worker-1: pulp [6d4ba28511c34d1db0cc8f7c06c2f69f]: pulp_rpm.app.tasks.publishing:INFO: Publishing: repository=repo_bz2050440_ts1649842649_737576601, version=0 Apr 13 12:03:27 pmoravec-sat610-bz2050440 pulpcore-worker-1: pulp [6d4ba28511c34d1db0cc8f7c06c2f69f]: pulp_rpm.app.tasks.publishing:INFO: Publication: 0a35af25-a374-4aa8-912c-bb2d3a4e0e99 created Apr 13 12:03:27 pmoravec-sat610-bz2050440 pulpcore-worker-1: pulp [6d4ba28511c34d1db0cc8f7c06c2f69f]: pulpcore.tasking.pulpcore_worker:INFO: Task completed b7110104-a52c-49f3-af78-58d47a662634 Optionally: 9) check that the task is executed or not with the following, replacing the "tasks/60aac..." with the pulpcore task that is displayed for you curl $certs https://${hname}/pulp/api/v3/tasks/60aac23b-29fd-4ced-9c7d-bdce6d4b880c/ | json_reformat Expected: The task should executed and completed successfully and should not be idle Actual: The task executed and completed successfully and isn't idle. { "pulp_href": "/pulp/api/v3/tasks/60aac23b-29fd-4ced-9c7d-bdce6d4b880c/", "pulp_created": "2022-05-25T22:48:49.752292Z", "state": "completed", "name": "pulp_rpm.app.tasks.publishing.publish", "logging_cid": "8c9c54f73fe24061bdea34373363a82b", "started_at": "2022-05-25T22:48:50.703106Z", "finished_at": "2022-05-25T22:48:50.967578Z", "error": null, "worker": "/pulp/api/v3/workers/b7008577-4fc2-4213-9653-64d3a11aed8e/", "parent_task": null, "child_tasks": [ ], "task_group": null, "progress_reports": [ { "message": "Generating repository metadata", "code": "publish.generating_metadata", "state": "completed", "total": 1, "done": 1, "suffix": null } ], "created_resources": [ "/pulp/api/v3/publications/rpm/rpm/b587447a-4467-4498-9ab5-67dd00b980e4/" ], "reserved_resources_record": [ "/pulp/api/v3/repositories/rpm/rpm/e4623177-8cfe-40de-a340-66ed73630f84/" ] } Verified on 6.10.6 snap 3 with python3-pulpcore-3.14.16-1.el7pc.noarch This is quite difficult to test. It worked out fine and I did also check that the codebase is in the build.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Satellite 6.10.6 Async Bug Fix Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:4850