Bug 1110818

Summary: pulp_* services may fail to start up if mongo is not fully started
Product: [Retired] Pulp
Reporter: Justin Sherrill <jsherril>
Component: z_other
Assignee: Brian Bouterse <bmbouter>
Status: CLOSED CURRENTRELEASE
QA Contact: Preethi Thomas <pthomas>
Severity: unspecified
Docs Contact:
Priority: medium
Version: 2.4.0
CC: bmbouter, pthomas, skarmark
Target Milestone: ---
Keywords: Triaged
Target Release: 2.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1110854
Environment:
Last Closed: 2014-08-09 06:55:45 UTC
Type: Bug
Bug Depends On:    
Bug Blocks: 1110854    

Description Justin Sherrill 2014-06-18 13:59:55 UTC
Description of problem:

It appears that on system reboot, mongo starts just before the pulp_* services (celerybeat, workers, resource_manager) but may not have finished initializing by the time the pulp services start.

As a result, the server comes up with none of those pulp_* services running.

In the logs you'll see:


Jun 17 20:00:01 box pulp: celery.beat:CRITICAL: beat raised exception <class 'pymongo.errors.ConnectionFailure'>: ConnectionFailure('could not connect to localhost:27017: [Errno 111] Connection refused',)
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL: Traceback (most recent call last):
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:   File "/usr/lib/python2.6/site-packages/celery/apps/beat.py", line 112, in start_scheduler
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:     beat.start()
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:   File "/usr/lib/python2.6/site-packages/celery/beat.py", line 453, in start
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL: humanize_seconds(self.scheduler.max_interval))
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:   File "/usr/lib/python2.6/site-packages/kombu/utils/__init__.py", line 322, in __get__
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:     value = obj.__dict__[self.__name__] = self.__get(obj)
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:   File "/usr/lib/python2.6/site-packages/celery/beat.py", line 491, in scheduler
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:     return self.get_scheduler()
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:   File "/usr/lib/python2.6/site-packages/celery/beat.py", line 486, in get_scheduler
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:     lazy=lazy)
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:   File "/usr/lib/python2.6/site-packages/celery/utils/imports.py", line 53, in instantiate
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:     return symbol_by_name(name)(*args, **kwargs)
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:   File "/usr/lib/python2.6/site-packages/pulp/server/async/scheduler.py", line 282, in __init__
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:     super(Scheduler, self).__init__(*args, **kwargs)
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:   File "/usr/lib/python2.6/site-packages/celery/beat.py", line 184, in __init__
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL: self.setup_schedule()
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:   File "/usr/lib/python2.6/site-packages/pulp/server/async/scheduler.py", line 328, in setup_schedule
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL: db_connection.initialize()
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:   File "/usr/lib/python2.6/site-packages/pulp/server/db/connection.py", line 67, in initialize
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:     _CONNECTION = pymongo.MongoClient(seeds, **connection_kwargs)
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:   File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 337, in __init__
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL:     raise ConnectionFailure(str(e))
Jun 17 20:00:01 box pulp: celery.beat:CRITICAL: ConnectionFailure: could not connect to localhost:27017: [Errno 111] Connection refused
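
Errno 111 in the last line is ECONNREFUSED on Linux: nothing was accepting connections on localhost:27017 when celerybeat tried to connect. Below is a minimal sketch for checking that condition by hand, written for Python 3 with only the standard library. The helper name port_is_open is hypothetical and is not part of Pulp or pymongo. An open TCP port only shows that something is listening, not that mongod has finished initializing.

```python
import socket


def port_is_open(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds.

    Hypothetical helper for illustration; not part of Pulp or pymongo.
    """
    try:
        # create_connection raises OSError (e.g. ECONNREFUSED) on failure.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    sock.close()
    return True
```

Calling port_is_open() while mongod is stopped should return False; once mongod is listening it should return True.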



Version-Release number of selected component (if applicable):
pulp-server-2.4.0-0.20.beta.el6.noarch

How reproducible:
always

Steps to Reproduce:
1.  On a relatively slow box, install and configure pulp
2.  Reboot the box
3.  On startup, check whether the pulp_* services are running.  If they are, repeat step 2.


Another way:

1.  Shut down mongod
2.  Restart pulp_celerybeat
3.  Start mongod
4.  Check to see if pulp_celerybeat is running

Comment 1 Brian Bouterse 2014-06-18 14:10:45 UTC
I believe httpd should also be looked into, in addition to the pulp_* services, to ensure all services do the following:

1. Start normally without mongod running
2. Are resilient against mongod being stopped for long periods of time
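
One way to meet both requirements is to retry the database connection with a capped backoff instead of exiting on the first failure. The following is a minimal Python 3 sketch under that assumption; it is not the change made for this bug. connect_with_retry and its parameters are hypothetical names, and connect stands in for whatever call opens the connection (for example the pymongo.MongoClient(...) call seen in the traceback).

```python
import time


def connect_with_retry(connect, retry_on=(Exception,), initial_delay=1.0,
                       max_delay=32.0, max_attempts=None, sleep=time.sleep):
    """Call connect() until it succeeds, backing off between failures.

    Hypothetical helper for illustration; not Pulp's actual code.
    max_attempts=None retries forever, which suits a daemon that must
    survive the database being down for a long time.
    """
    delay = initial_delay
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except retry_on:
            # Give up only when an attempt limit was set and reached.
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(delay)
            # Double the wait each time, up to max_delay.
            delay = min(delay * 2, max_delay)
```

With pymongo this might be called as connect_with_retry(lambda: pymongo.MongoClient(seeds), retry_on=(pymongo.errors.ConnectionFailure,)). Capping the delay keeps the service from waiting too long to reconnect once mongod is back.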

Comment 2 Brian Bouterse 2014-06-23 23:00:58 UTC
PR available at:  https://github.com/pulp/pulp/pull/1019

Comment 3 Brian Bouterse 2014-06-24 20:09:32 UTC
Merged to 2.4, but not merged to master because of a merge conflict. rbarlow indicated he would resolve the conflict and merge 2.4 to master.

Comment 4 Brian Bouterse 2014-06-25 15:27:20 UTC
A new PR was introduced to modify the behavior of PR 1019. The new PR is:

https://github.com/pulp/pulp/pull/1023

PR 1023 has been merged to pulp-2.4, and pulp-2.4 has been merged to master.

Comment 5 Randy Barlow 2014-06-25 23:24:16 UTC
Fixed in 2.4.0-0.23.beta.

Comment 6 Preethi Thomas 2014-06-30 11:36:52 UTC
verified

[root@mgmt3 yum.repos.d]# rpm -qa pulp-server
pulp-server-2.4.0-0.23.beta.el6.noarch
[root@mgmt3 yum.repos.d]# 

[root@mgmt3 yum.repos.d]# service pulp_celerybeat start
celery init v10.0.
Using configuration: /etc/default/pulp_workers, /etc/default/pulp_celerybeat
Starting pulp_celerybeat...
[root@mgmt3 yum.repos.d]# service mongod status
mongod is stopped
[root@mgmt3 yum.repos.d]# service httpd start
Starting httpd: [  OK  ]
[root@mgmt3 yum.repos.d]#

Comment 7 Randy Barlow 2014-08-09 06:55:45 UTC
This has been fixed in Pulp 2.4.0-1.