Bug 1400568 - Canceled jobs get processed anyway
Summary: Canceled jobs get processed anyway
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Candlepin
Classification: Community
Component: candlepin
Version: 0.9.51
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: candlepin-bugs
QA Contact: Katello QA List
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-12-01 14:01 UTC by Shayne Riley
Modified: 2016-12-08 16:26 UTC

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-12-08 16:26:39 UTC



Description Shayne Riley 2016-12-01 14:01:43 UTC
Description of problem:

If I cancel a job, I do not expect it to run later. However, this is exactly what happens.

I discovered this after canceling several thousand jobs in the queue. When I came back later, I looked at the list of jobs and saw that the number of canceled jobs was lower than before. I looked up the status of a job I *knew* I had canceled, and it is now listed as "FINISHED".


Version-Release number of selected component (if applicable):


How reproducible:

Always, though if your job queue is processed quickly, it may be hard to cancel a job before it gets processed.


Steps to Reproduce:
1. Create a job, for example by refreshing an owner's pools. Make a note of the job id.
2. Using the job id, cancel the job before it gets processed. The job status should be "CANCELED".
3. Wait for the job queue to catch up. If you retrieve the job details at the right times, the job status will switch to "RUNNING" and then to "FINISHED".
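
A minimal sketch of how step 3 could be checked automatically, assuming the job status has already been polled into a list of strings in the order observed. The polling code is left out, and the helper name `first_revival` is hypothetical, not part of Candlepin; only the status names come from this report.

```python
from typing import Iterable, Optional, Tuple

# Status name as it appears in this report.
CANCELED = "CANCELED"


def first_revival(statuses: Iterable[str]) -> Optional[Tuple[int, str]]:
    """Return (index, status) of the first non-CANCELED status observed
    after the job was seen as CANCELED, or None if the job stayed canceled
    (or was never canceled at all)."""
    canceled_seen = False
    for index, status in enumerate(statuses):
        if status == CANCELED:
            canceled_seen = True
        elif canceled_seen:
            # The job left the CANCELED state: this is the reported symptom.
            return (index, status)
    return None


# Statuses as described in step 3, in the order they were polled.
observed = ["CANCELED", "CANCELED", "RUNNING", "FINISHED"]
print(first_revival(observed))
```

A `None` result means the canceled job stayed canceled for the whole observation window, which is the expected behavior.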

Actual results:

A canceled job will be processed.


Expected results:

A canceled job doesn't get processed and remains canceled.

Comment 1 Shayne Riley 2016-12-01 14:24:14 UTC
Hmm. On further evaluation, this is not as reproducible as I thought. Perhaps the Quartz state of our Candlepin instance was messed up enough that it was processing the canceled jobs.

As it stands, my canceled jobs are now staying canceled, which is good.

If the project maintainer wants to mark this bug as "could not reproduce", I'm fine with that for now.

Comment 2 Chris Snyder 2016-12-05 15:20:49 UTC
Do you have any more details on what caused the strange Quartz state in this instance? For example, was there heavy load on the Candlepin instance? Perhaps a lot of jobs were created and canceled quickly?

Comment 3 Shayne Riley 2016-12-05 15:52:23 UTC
This came about because none of the tasks, or so it seemed, were getting processed. Querying the database directly, we saw that several jobs were executing but never seemed to finish. Meanwhile, there were 6000+ jobs waiting to run, many of them over a week old.

Unfortunately, it's a bit of a mystery why it got backed up the way it did. I have a (weak) theory that a few orgs (with many subscriptions and consumers) had refresh pools tasks that clogged up the works, but that's a shot in the dark.

What I ended up doing was canceling those several thousand jobs. Eventually, we reset Tomcat so that we could enable some advanced logging in the app and in Quartz, and then the queue started getting processed, including the canceled jobs. Thankfully, they all processed fast enough that the queue emptied in a matter of hours.

Comment 4 Kevin Howell 2016-12-08 15:23:03 UTC
I'm inclined to close this without a reproducer. With no easy way to reproduce, there's not much we can do here. I am curious, though: which environment was this in? (If it was a dev environment, I'm less concerned.)

Comment 6 Kevin Howell 2016-12-08 16:26:39 UTC
Since this was a non-prod environment, I'm closing for now. Feel free to reopen if you see this again, or if you come up with a reproducer.

