Bug 1098734
Summary: | No available Queue exception when running pulp commands | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [Retired] Pulp | Reporter: | Preethi Thomas <pthomas> | ||||||
Component: | z_other | Assignee: | Brian Bouterse <bmbouter> | ||||||
Status: | CLOSED DUPLICATE | QA Contact: | pulp-qe-list | ||||||
Severity: | high | Docs Contact: | |||||||
Priority: | high | ||||||||
Version: | 2.4 Beta | CC: | bmbouter, ipanova, mhrivnak | ||||||
Target Milestone: | --- | Keywords: | Triaged | ||||||
Target Release: | 2.4.0 | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2014-05-25 18:20:47 UTC | Type: | Bug | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Created attachment 896668 [details]
log
I already encountered this issue but in the older builds, since build 2.4.0-0.10.beta have never seen it again... maybe it can be related to this bug https://bugzilla.redhat.com/show_bug.cgi?id=1088060 babysit() was removed from the pulp code with beta 16, so this can't be related to the issue: https://bugzilla.redhat.com/show_bug.cgi?id=1088060 I'm investigating the root cause today. I'll post my findings back on this ticket. In experimenting on the system, I believe that the mongo exception below is occurring some of the time and is causing celerybeat to die. If celerybeat dies then celery events stop being processed, which causes Pulp to believe (in 5 minutes) that workers have gone offline because heartbeats stop arriving. Once new work arrives at the system, the resource manager believes there are no workers, and raises a the exception in the but report. May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: beat raised exception <class 'pymongo.errors.AutoReconnect'>: AutoReconnect('[Errno 9] Bad file descriptor',) May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: Traceback (most recent call last): May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib/python2.6/site-packages/celery/apps/beat.py", line 112, in start_scheduler May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: beat.start() May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib/python2.6/site-packages/celery/beat.py", line 462, in start May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: interval = self.scheduler.tick() May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib/python2.6/site-packages/pulp/server/async/scheduler.py", line 303, in tick May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: ret = super(Scheduler, self).tick() May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib/python2.6/site-packages/celery/beat.py", line 219, in tick May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: for entry in values(self.schedule): May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib/python2.6/site-packages/pulp/server/async/scheduler.py", line 366, in schedule May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: if self._schedule is None or self.schedule_changed: May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib/python2.6/site-packages/pulp/server/async/scheduler.py", line 348, in schedule_changed May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: if utils.get_enabled().count() != self._loaded_from_db_count: May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib64/python2.6/site-packages/pymongo/cursor.py", line 566, in count May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: **command) May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib64/python2.6/site-packages/pymongo/database.py", line 388, in command May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: result = self["$cmd"].find_one(command, **extra_opts) May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib64/python2.6/site-packages/pymongo/collection.py", line 596, in find_one May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: for result in self.find(spec_or_id, *args, **kwargs).limit(-1): May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib64/python2.6/site-packages/pymongo/cursor.py", line 814, in next May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: if len(self.__data) or self._refresh(): May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib64/python2.6/site-packages/pymongo/cursor.py", line 763, in _refresh May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: self.__uuid_subtype)) May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib64/python2.6/site-packages/pymongo/cursor.py", line 700, in __send_message May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: **kwargs) May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib/python2.6/site-packages/pulp/server/db/connection.py", line 166, in _with_end_request May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: return method(*args, **kwargs) May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: File "/usr/lib64/python2.6/site-packages/pymongo/mongo_client.py", line 994, in _send_message_with_response May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: raise AutoReconnect(str(e)) May 19 10:37:10 dell-pe2950-02 pulp: celery.beat:CRITICAL: AutoReconnect: [Errno 9] Bad file descriptor PR available at: https://github.com/pulp/pulp/pull/979 Merged Fixed in 2.4.0-0.17.beta. Closing this one because 1100005 is the same issue and has more details. *** This bug has been marked as a duplicate of bug 1100005 *** |
Created attachment 896667 [details] admin.log Description of problem: Not sure what is causing this. I have run into this exception a couple of times on 2 different servers. May 17 15:47:52 dell-pe2950-02 pulp: celery.worker.job:ERROR: NoAvailableQueues: There are no available queues in the system for reserved task work. [root@dell-pe2950-02 ~]# pulp-admin rpm repo sync run --repo-id errata +----------------------------------------------------------------------+ Synchronizing Repository [errata] +----------------------------------------------------------------------+ An internal error occurred on the Pulp server. More information can be found in the client log file ~/.pulp/admin.log. [root@dell-pe2950-02 ~]# [root@pulp-24-server ~]# pulp-admin rpm repo copy all -f zoo -t zoo-copy An internal error occurred on the Pulp server. More information can be found in the client log file ~/.pulp/admin.log. Version-Release number of selected component (if applicable): [root@dell-pe2950-02 ~]# rpm -qa pulp-server pulp-server-2.4.0-0.16.beta.el6.noarch [root@dell-pe2950-02 ~]# How reproducible: Random Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info: