Bug 1096289 - Restarting pulp_celerybeat sometimes fails
Status: CLOSED CURRENTRELEASE
Product: Pulp
Classification: Community
Component: async/tasks
Version: 2.4.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 2.4.0
Assigned To: Brian Bouterse
QA Contact: Preethi Thomas
Keywords: Triaged
Depends On:
Blocks: 950743
Reported: 2014-05-09 11:14 EDT by Justin Sherrill
Modified: 2014-08-19 23:31 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Last Closed: 2014-08-09 02:55:36 EDT
Type: Bug

Attachments
Traceback (5.39 KB, text/plain)
2014-05-09 11:14 EDT, Justin Sherrill

Description Justin Sherrill 2014-05-09 11:14:58 EDT
Created attachment 894068: Traceback

Description of problem:

About 20-30% of the time, pulp_celerybeat will fail to start.

Version-Release number of selected component (if applicable):
pulp-server-2.4.0-0.13.beta.el6.noarch

This is on CentOS 6.5 (I assume it occurs on RHEL 6.5 as well).

How reproducible:
20-30% of the time


Steps to Reproduce:
1. Restart the pulp_celerybeat service
2. Check 'ps aux | grep celerybeat'
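
Step 2 can be hard to read by eye because 'ps aux | grep celerybeat' also matches the grep process itself. A minimal sketch of an alternative check, assuming the pidfile path passed via --pidfile in this report and assuming it is run as root (kill -0 fails on another user's process otherwise). The function name celerybeat_running is made up for illustration and is not part of Pulp:

```shell
#!/bin/sh
# Hypothetical helper, not part of Pulp: report whether the process recorded
# in the celerybeat pidfile is alive.
# PIDFILE defaults to the --pidfile value quoted in this bug report.
PIDFILE="${PIDFILE:-/var/run/pulp/celerybeat.pid}"

celerybeat_running() {
    # No readable pidfile: treat the service as not running.
    [ -r "$PIDFILE" ] || return 1
    pid=$(cat "$PIDFILE")
    # Reject empty or non-numeric pidfile contents.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # kill -0 sends no signal; it only tests that the PID exists
    # and that we are allowed to signal it.
    kill -0 "$pid" 2>/dev/null
}
```

Usage: 'celerybeat_running && echo up || echo down'. A stale pidfile whose PID has been reused by another process would still report "up", so this is only a rough check.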


Actual results:
Sometimes it won't be running

Expected results:
It is always running


Additional info:


Attaching the traceback I see when it fails to start.
Comment 1 Justin Sherrill 2014-05-09 11:16:44 EDT
Also of note: I NEVER see this error when running the process in the foreground with:

sudo -u apache /usr/bin/python /usr/bin/celery beat --app=pulp.server.async.app --workdir=/var/lib/pulp/celery/ -f /var/log/pulp/celerybeat.log -l INFO --pidfile=/var/run/pulp/celerybeat.pid
Comment 2 Randy Barlow 2014-05-12 10:43:36 EDT
I believe this bug could be related to #1096539[0], but I am not sure. I think Brian would be more effective at investigating this than I am.

For what it's worth, I had a hard time reproducing this problem on Fedora Rawhide, so it could also be a difference in init system or the version of qpidd.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1096539
Comment 3 Justin Sherrill 2014-05-12 10:54:48 EDT
Randy: I don't think it's related. When restarting just pulp_celerybeat 10 times in a row, it would fail 2-3 times with that traceback.

qpid and mongo were both up the entire time. The failures were not consecutive, either.
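
For anyone who wants to repeat the "10 restarts in a row" test, here is a rough sketch of a loop that records which attempts left the service down. The variable names, the pgrep-based default check, and the 5-second settle time are all assumptions for illustration, not something taken from Pulp's init scripts:

```shell
#!/bin/sh
# Hypothetical reproduction loop: restart a service N times and list the
# attempts after which the check command reported it as not running.
# All three settings below are illustrative defaults and can be overridden.
RESTART_CMD="${RESTART_CMD:-service pulp_celerybeat restart}"
CHECK_CMD="${CHECK_CMD:-pgrep -f celerybeat}"
SETTLE_SECONDS="${SETTLE_SECONDS:-5}"   # assumed wait before checking

count_failed_restarts() {
    attempts=$1
    failed=""
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Commands are intentionally unquoted so they split into words.
        $RESTART_CMD >/dev/null 2>&1
        # Give the detached process time to start (or die).
        sleep "$SETTLE_SECONDS"
        if ! $CHECK_CMD >/dev/null 2>&1; then
            failed="$failed $i"
        fi
        i=$((i + 1))
    done
    echo "failed attempts:${failed:- none}"
}
```

Usage: 'count_failed_restarts 10' as root. Printing the attempt numbers rather than just a count shows whether the failures are consecutive. Note that 'pgrep -f celerybeat' would also match this script if its own file name contains "celerybeat".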
Comment 4 Brian Bouterse 2014-05-15 11:13:44 EDT
I also don't think it's related.  I'd like to retest using this PR:

https://github.com/pulp/pulp/pull/967
Comment 5 Brian Bouterse 2014-05-15 11:27:24 EDT
Merged
Comment 6 Randy Barlow 2014-05-15 13:00:12 EDT
The fix for this bug is included in the pulp-2.4.0-0.14.beta builds.
Comment 7 Preethi Thomas 2014-07-01 06:31:51 EDT
Verified.
[root@mgmt3 ~]# rpm -q pulp-server
pulp-server-2.4.0-0.23.beta.el6.noarch

[root@mgmt3 ~]# service pulp_celerybeat restart
celery init v10.0.
Using configuration: /etc/default/pulp_workers, /etc/default/pulp_celerybeat
Restarting celery periodic task scheduler
Stopping pulp_celerybeat... OK
Starting pulp_celerybeat...
[root@mgmt3 ~]# ps aux | grep celerybeat
apache   21626  5.6  0.3 727540 25512 ?        Sl   06:27   0:00 /usr/bin/python /usr/bin/celery beat --scheduler=pulp.server.async.scheduler.Scheduler --workdir=/var/lib/pulp/celery/ -f /var/log/pulp/celerybeat.log -l INFO --detach --pidfile=/var/run/pulp/celerybeat.pid
root     21635  0.0  0.0 103252   816 pts/3    S+   06:27   0:00 grep celerybeat
[root@mgmt3 ~]#
Comment 8 Randy Barlow 2014-08-09 02:55:36 EDT
This has been fixed in Pulp 2.4.0-1.
