Bug 1961779 - Task still hangs after a celery worker process abruptly terminates
Summary: Task still hangs after a celery worker process abruptly terminates
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: 6.9.3
Assignee: satellite6-bugs
QA Contact: Lai
URL:
Whiteboard:
Duplicates: 1962815
Depends On:
Blocks:
 
Reported: 2021-05-18 16:55 UTC by Tanya Tereshchenko
Modified: 2021-07-01 14:57 UTC (History)

Fixed In Version: pulp-2.21.5.2-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-01 14:56:52 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Pulp Redmine | 8251 | 0 | Normal | MODIFIED | Traceback in 2.21.5 when logging during task.on_failure() | 2021-05-18 17:41:13 UTC
Red Hat Product Errata | RHBA-2021:2636 | 0 | None | Closed | IPI installer timing out after 30 minutes | 2022-05-03 13:01:21 UTC

Description Tanya Tereshchenko 2021-05-18 16:55:48 UTC
Description of problem:

The problem was not fully fixed in BZ#1889795.
The fix also requires this patch: https://github.com/pulp/pulp/pull/4019/files.

Steps to Reproduce:
1. Start a large CV publish/promote (a content view containing many repos)
2. While there are pulp celery workers processing the sync/publish tasks, kill some of them via "kill -SIGHUP <pid>"
3. Check /var/log/messages
4. Check CV publish/promote task

The process hangs and never completes the publishing task. Here are the logs:


Feb 11 18:10:34 dhcp-2-174 pulp: py.warnings:WARNING: [ccf4ab5a] (33775-20800)   "MongoClient opened before fork. Create MongoClient "
Feb 11 18:10:34 dhcp-2-174 pulp: py.warnings:WARNING: [ccf4ab5a] (33775-20800)
Feb 11 18:10:34 dhcp-2-174 pulp: pulp.server.async.tasks:INFO: [ccf4ab5a] Task failed : [18ac52ec-9f3e-47bc-b34b-98734bee3656] : Worker terminated abnormally while processing task 18ac52ec-9f3e-47bc-b34b-98734bee3656.  Check the logs for details
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800) Task pulp.server.async.tasks._release_resource[ccf4ab5a-af22-4d41-84c9-255085b7eded] raised unexpected: UnboundLocalError("local variable 'original_formatted_traceback' referenced before assignment",)
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800) Traceback (most recent call last):
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 367, in trace_task
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800)     R = retval = fun(*args, **kwargs)
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 108, in __call__
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800)     return super(PulpTask, self).__call__(*args, **kwargs)
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 622, in __protected_call__
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800)     return self.run(*args, **kwargs)
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 376, in _release_resource
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800)     new_task.on_failure(exception, task_id, (), {}, MyEinfo)
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 779, in on_failure
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800)     _logger.debug(original_formatted_traceback)
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:ERROR: [ccf4ab5a] (33775-20800) UnboundLocalError: local variable 'original_formatted_traceback' referenced before assignment
Feb 11 18:10:34 dhcp-2-174 pulp: celery.worker.strategy:INFO: Received task: pulp.server.async.tasks._release_resource[308a2cd5-f900-434c-8a02-d2aeb3e86992]
Feb 11 18:10:34 dhcp-2-174 pulp: celery.app.trace:INFO: [fe61a18e] Task pulp.server.managers.repo.unit_association.associate_from_repo[fe61a18e-5a65-4a40-a282-eb614e5e64ef] succeeded in 0.0295602829992s: {'units_successful': [], 'units_failed_signature_filter': []}


Version-Release number of selected component (if applicable):
pulp-server-2.21.5-2.el7sat.noarch
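
The traceback in the logs ends in an UnboundLocalError for `original_formatted_traceback` inside `on_failure`. The sketch below illustrates the general pattern that produces this kind of error. It is not the Pulp source and not the referenced patch: the function names, the `einfo` parameter, and the branch condition are assumptions chosen for illustration, since the report does not show under which condition the name is left unbound.

```python
import logging

_logger = logging.getLogger("unbound_local_sketch")


def on_failure_unguarded(exc, einfo=None):
    """Bind the traceback text on one branch only, then log it unconditionally."""
    if einfo is not None:
        original_formatted_traceback = str(einfo)
    # Raises UnboundLocalError whenever the branch above was skipped,
    # because the argument is evaluated before the logging call is made.
    _logger.debug(original_formatted_traceback)


def on_failure_guarded(exc, einfo=None):
    """Bind a default first so the logging call is safe on every path."""
    original_formatted_traceback = None
    if einfo is not None:
        original_formatted_traceback = str(einfo)
    if original_formatted_traceback is not None:
        _logger.debug(original_formatted_traceback)
    return original_formatted_traceback
```

Calling `on_failure_unguarded(RuntimeError("worker died"))` raises UnboundLocalError, while the guarded variant returns without logging. An exception raised this way inside a failure handler can stop the rest of the handler from running, which may be why the task still hangs.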

Comment 2 pulp-infra@redhat.com 2021-05-18 17:41:11 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 3 pulp-infra@redhat.com 2021-05-18 17:41:12 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

Comment 4 pulp-infra@redhat.com 2021-05-18 18:27:08 UTC
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.

Comment 6 Brad Buckingham 2021-06-04 17:21:37 UTC
*** Bug 1962815 has been marked as a duplicate of this bug. ***

Comment 7 Lai 2021-06-07 18:44:13 UTC
Steps to test:

1. Start a large CV publish/promote (a content view containing many repos)
2. While there are pulp celery workers processing the sync/publish tasks, kill some of them via "kill -SIGHUP <pid>"
3. Check /var/log/messages
4. Verify in the web UI that the foreman task errors out
5. Verify that the task can be resumed and completes successfully

Expected result:
3. The log should not show an error message or traceback
4. The Foreman task should error out
5. The task should complete successfully

Actual Result:
3. The log still shows a traceback, but it does not affect anything.
4. Task does show that it errors out as expected.
5. Task completes successfully

Verified on 6.9.3 with pulp-server-2.21.5.2-1.el7sat.noarch
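
The log check in step 3 could be automated along these lines. This is a hedged sketch rather than an established tool: `find_traceback_lines` is a hypothetical helper, and it assumes pulp entries in /var/log/messages carry the `pulp:` tag seen in the logs from the description.

```python
import re

# Markers taken from the log excerpt in the description of this bug.
TRACEBACK_MARKERS = re.compile(
    r"Traceback \(most recent call last\)|UnboundLocalError"
)


def find_traceback_lines(log_lines):
    """Return the pulp log lines that contain a traceback marker."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if "pulp:" in line and TRACEBACK_MARKERS.search(line)
    ]
```

A file object can be passed directly, for example `with open("/var/log/messages") as fh: hits = find_traceback_lines(fh)`. An empty result would correspond to the expected outcome of step 3.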

Comment 13 errata-xmlrpc 2021-07-01 14:56:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Satellite 6.9.3 Async Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2636

