Bug 767271 - Need better error handling for delayed_jobs
Summary: Need better error handling for delayed_jobs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: API
Version: 6.0.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: Unspecified
Assignee: Tomas Strachota
QA Contact: Katello QA List
URL:
Whiteboard:
Depends On:
Blocks: katello-blockers
Reported: 2011-12-13 17:15 UTC by Mike McCune
Modified: 2019-09-26 13:26 UTC

Fixed In Version: katello-cli-0.1.29-1, katello-0.1.145-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-08-22 18:13:31 UTC
Target Upstream Version:
Embargoed:



Description Mike McCune 2011-12-13 17:15:09 UTC
If an async job such as a promotion fails, the only trace we have is a record in the task_statuses table; nothing is written to a logfile, and no notification or email is sent.

For example, if a user starts a promotion of content and something goes wrong (pulp error, disk full, etc.), they have no idea it failed, and the only thing they can do is look in the database itself for errors using this procedure:

https://fedorahosted.org/katello/wiki/TaskStatuses

We need a few things to better handle this:

1) First, we should dump all errors and exceptions into a logfile: /var/log/katello/delayed_job.log would be a fine location. This would eliminate the need to query the DB for the stacktrace.

2) We should consider logging all exceptions as a notification to all users in an Org so they would see a red Error in the UI

If we could get (1) ASAP, that would help greatly with debugging.
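
As an illustration of item (1), here is a minimal sketch of logging a failed job's exception and stacktrace before re-raising it. It uses only Ruby's standard Logger; the helper name run_logged and the job name are hypothetical and are not taken from the Katello code base, and a StringIO stands in for the logfile so the sketch is self-contained.

```ruby
require "logger"
require "stringio"

# Hypothetical helper (not Katello code): runs a job block and logs any
# exception with its backtrace before re-raising, so the failure is
# visible without querying the database.
def run_logged(logger, job_name)
  yield
rescue StandardError => e
  logger.error("#{job_name} failed: #{e.class}: #{e.message}")
  (e.backtrace || []).each { |line| logger.error("  #{line}") }
  raise
end

# In production the target would be a file such as
# /var/log/katello/delayed_job.log; a StringIO is used here instead.
LOG_OUTPUT = StringIO.new
JOB_LOGGER = Logger.new(LOG_OUTPUT)

begin
  run_logged(JOB_LOGGER, "promotion") { raise IOError, "disk full" }
rescue IOError
  # The caller still sees the failure; it has also been written to the log.
end
```

Re-raising after logging keeps the existing failure handling intact, so the task record in the database is still updated as before.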

Comment 1 Lukas Zapletal 2011-12-15 08:50:15 UTC
I believe Tomas just fixed this.

Comment 2 Tomas Strachota 2011-12-15 09:25:58 UTC
We are now raising an exception when any of the promotion subtasks fails.

1542d8c2 async tasks - raising exception when a task fails while waiting until it finishes
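
The behaviour described in that commit message can be sketched roughly as follows. This is not the code from commit 1542d8c2: SubTask, SubTaskFailed, wait_for_tasks, and the :running / :finished / :error states are all assumed names chosen for illustration.

```ruby
# Hypothetical task record; the real Katello task status model is not
# shown in this report.
SubTask = Struct.new(:name, :state, :error)

class SubTaskFailed < StandardError; end

# Polls until every task is finished, raising as soon as one reports an
# error. `refresh` is a caller-supplied callable returning the current
# list of tasks.
def wait_for_tasks(refresh, interval: 0.0, max_polls: 100)
  max_polls.times do
    tasks = refresh.call
    failed = tasks.find { |t| t.state == :error }
    raise SubTaskFailed, "#{failed.name}: #{failed.error}" if failed
    return tasks if tasks.all? { |t| t.state == :finished }
    sleep(interval)
  end
  raise SubTaskFailed, "timed out waiting for tasks"
end
```

Raising from inside the wait loop means the parent job fails too, instead of reporting success while one of its subtasks has silently failed.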

I just realized that logging of delayed job exceptions is currently enabled in the development environment only. I'll enable it for production.

One more issue is in the CLI command 'product promote', where we don't check the result of the promotion and blindly print 'success'. I'm taking this one as well.

I'm leaving notifications to the UI folks to implement in their controller.

Comment 3 Tomas Strachota 2011-12-15 16:10:13 UTC
Fixed the above two backend issues in the following commits:

fc8f2931
767271 - logging for delayed jobs enabled in all environments

cc45e423
767271 - message after 'product promote' takes promotion failure into account
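
The idea behind the second commit, checking the promotion result before printing a message, can be sketched as below. The actual CLI code is not shown in this report; the function names and the result hash with :state and :error keys are assumptions made purely for illustration.

```ruby
# Hypothetical result shape: a hash with a :state key and an optional
# :error message. Neither is taken from the actual CLI.
def promotion_message(product, result)
  if result[:state] == :finished
    "Product '#{product}' promoted"
  else
    "Product '#{product}' promotion failed: #{result[:error] || 'unknown error'}"
  end
end

# Prints the message and returns a shell-style exit code so that scripts
# calling the command can detect the failure as well.
def report_promotion(product, result, out: $stdout)
  out.puts(promotion_message(product, result))
  result[:state] == :finished ? 0 : 1
end
```

For example, report_promotion("zoo", { state: :error, error: "disk full" }) prints the failure message and returns 1 rather than reporting success.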


I created a new BZ for the UI part of this story (#768047).
Moving this one ON_QA.

Comment 4 Mike McCune 2011-12-15 19:42:28 UTC
Nice work guys, thanks for the fast turnaround.

Comment 6 Corey Welton 2012-02-14 02:59:12 UTC
QA Verified via the UI, in part through verifying bug #768047

