Bug 1526437

Summary: Tasks stuck in waiting after restart of pulp services
Product: Red Hat Satellite Reporter: Chris Duryee <cduryee>
Component: Pulp Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA QA Contact: jcallaha
Severity: high Docs Contact:
Priority: high    
Version: 6.3.0 CC: ajoseph, andrew.schofield, bkearney, bmbouter, brubisch, cduryee, daviddavis, dkliban, ggainey, ipanova, jentrena, kabbott, ktordeur, mhrivnak, mmccune, pcreech, peter.vreman, pmorey, rchan, sthirugn, ttereshc, zhunting
Target Milestone: Unspecified Keywords: FieldEngineering, PrioBumpField, Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: pulp-2.8.7.20-1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1552118 (view as bug list) Environment:
Last Closed: 2018-05-21 20:16:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1122832, 1552118    
Attachments:
Description: verification screenshot Flags: none

Description Chris Duryee 2017-12-15 13:18:48 UTC
Description of problem:

(cloned from https://pulp.plan.io/issues/2835)

Summary
=======

Restarting a worker that is currently executing a task will leave that worker in a broken state. This issue can be reproduced on both Celery 3.1.x and Celery 4.x, but only while using Qpid as the broker. I was not able to reproduce it with RabbitMQ as the broker on either version of Celery, and I was also not able to reproduce it on versions of Pulp prior to 2.13. The means of shutting down the workers does not appear to matter: "systemctl restart" and "pkill -9 celery; prestart" both lead to the same result.

Repro
=====
1. Start pulp
2. Begin a task (e.g. sync)
3. While the task is running, restart the pulp worker running the task
4. After the worker has restarted, begin another task
5. Observe that the tasks are perpetually stuck in waiting
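
Step 5 can be checked mechanically rather than by eye. The following is a minimal sketch in Python, assuming the task states have already been collected into periodic snapshots that map a task ID to its state string. How the snapshots are gathered is left out, and the function name, the snapshot shape, and the poll threshold are illustrative assumptions, not part of Pulp:

```python
from typing import Dict, List


def stuck_in_waiting(snapshots: List[Dict[str, str]], min_polls: int = 3) -> List[str]:
    """Return IDs of tasks that were "waiting" in each of the last min_polls snapshots.

    snapshots is a hypothetical structure: one dict per poll, oldest first,
    mapping task ID -> state string (e.g. "waiting", "running", "finished").
    """
    if min_polls < 1:
        raise ValueError("min_polls must be at least 1")
    if len(snapshots) < min_polls:
        # Not enough observations yet to call anything stuck.
        return []

    recent = snapshots[-min_polls:]
    # Start from the tasks waiting in the oldest relevant poll, then keep
    # only those that are still waiting in every later poll.
    candidates = {tid for tid, state in recent[0].items() if state == "waiting"}
    for snapshot in recent[1:]:
        candidates &= {tid for tid, state in snapshot.items() if state == "waiting"}
    return sorted(candidates)


if __name__ == "__main__":
    # Made-up task IDs for illustration: task-a was interrupted by the worker
    # restart, task-b was started afterwards and never leaves "waiting".
    polls = [
        {"task-a": "running"},
        {"task-a": "running", "task-b": "waiting"},
        {"task-a": "running", "task-b": "waiting"},
        {"task-a": "running", "task-b": "waiting"},
    ]
    print(stuck_in_waiting(polls, min_polls=3))
```

A task that stays in "waiting" across several consecutive polls after the worker has come back up matches the symptom described in step 5. A task that is merely queued behind a long-running job would look the same, so the threshold needs to be chosen with the expected queue depth in mind.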


Version-Release number of selected component (if applicable): 6.3 beta, pulp-server-2.13.4.4


How reproducible: every time

Comment 2 pulp-infra@redhat.com 2017-12-15 13:32:46 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 3 pulp-infra@redhat.com 2017-12-15 13:32:48 UTC
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.

Comment 7 pulp-infra@redhat.com 2017-12-18 15:03:12 UTC
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.

Comment 8 pulp-infra@redhat.com 2018-01-22 18:13:59 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 9 pulp-infra@redhat.com 2018-01-29 17:02:31 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 10 pulp-infra@redhat.com 2018-01-29 17:32:09 UTC
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.

Comment 12 pulp-infra@redhat.com 2018-02-20 18:32:55 UTC
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.

Comment 13 pulp-infra@redhat.com 2018-02-28 02:32:33 UTC
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.

Comment 14 sthirugn@redhat.com 2018-03-06 14:25:22 UTC
Created https://bugzilla.redhat.com/show_bug.cgi?id=1552118 for 6.3

Comment 21 jcallaha 2018-05-21 04:13:58 UTC
Verified in Satellite 6.2.15 Snap 3.

Started a repo sync.
Waited until it was syncing packages.
Performed a "service pulp_workers stop".

Observed that the remaining steps were skipped and the task was stopped with a warning that the task was cancelled. See attached screenshot for verification.

Comment 22 jcallaha 2018-05-21 04:14:29 UTC
Created attachment 1439365
verification screenshot

Comment 25 errata-xmlrpc 2018-05-21 20:16:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1672