Bug 689573
| Field | Value |
|---|---|
| Summary | pulp-agent running with near 100% cpu |
| Product | [Retired] Pulp |
| Component | user-experience |
| Status | CLOSED CURRENTRELEASE |
| Severity | high |
| Priority | medium |
| Version | unspecified |
| Target Release | Sprint 22 |
| Hardware | Unspecified |
| OS | Unspecified |
| Keywords | Triaged |
| Doc Type | Bug Fix |
| Reporter | Preethi Thomas <pthomas> |
| Assignee | Jeff Ortel <jortel> |
| QA Contact | Preethi Thomas <pthomas> |
| CC | skarmark |
| Bug Blocks | 647488 |
| Last Closed | 2011-08-16 12:07:12 UTC |
| Attachments | bkearney agent.log (attachment 486698) |
Description (Preethi Thomas, 2011-03-21 19:48:43 UTC)
From the agent.log:
2011-03-21 17:19:20,359 [ERROR][Actions] __call__() @ action.py:117 - Enqueue capacity threshold exceeded on queue "2f074ce1-199a-42b9-acd9-4940e969ad05:0.0". (JournalImpl.cpp:587)(501)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/gofer/agent/action.py", line 115, in __call__
self.target()
File "/usr/lib/gofer/plugins/pulp.py", line 75, in heartbeat
p.send(topic, ttl=delay, agent=myid, next=delay)
File "/usr/lib/python2.7/site-packages/gofer/messaging/producer.py", line 56, in send
sender = self.session().sender(address)
File "<string>", line 6, in sender
File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 579, in sender
sender._ewait(lambda: sender.linked)
File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 786, in _ewait
result = self.session._ewait(lambda: self.error or predicate(), timeout)
File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 553, in _ewait
result = self.connection._ewait(lambda: self.error or predicate(), timeout)
File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 196, in _ewait
self.check_error()
File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 189, in check_error
raise self.error
ConnectionError: Enqueue capacity threshold exceeded on queue "2f074ce1-199a-42b9-acd9-4940e969ad05:0.0". (JournalImpl.cpp:587)(501)
2011-03-21 17:19:30,377 [ERROR][Actions] __call__() @ action.py:117 - Enqueue capacity threshold exceeded on queue "2f074ce1-199a-42b9-acd9-4940e969ad05:0.0". (JournalImpl.cpp:587)(501)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/gofer/agent/action.py", line 115, in __call__
self.target()
File "/usr/lib/gofer/plugins/pulp.py", line 75, in heartbeat
p.send(topic, ttl=delay, agent=myid, next=delay)
File "/usr/lib/python2.7/site-packages/gofer/messaging/producer.py", line 56, in send
sender = self.session().sender(address)
File "<string>", line 6, in sender
File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 579, in sender
sender._ewait(lambda: sender.linked)
File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 786, in _ewait
result = self.session._ewait(lambda: self.error or predicate(), timeout)
File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 553, in _ewait
result = self.connection._ewait(lambda: self.error or predicate(), timeout)
File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 196, in _ewait
self.check_error()
File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 189, in check_error
raise self.error
ConnectionError: Enqueue capacity threshold exceeded on queue "2f074ce1-199a-42b9-acd9-4940e969ad05:0.0". (JournalImpl.cpp:587)(501)
Created attachment 486698: bkearney agent.log

bkearney's agent.log

After a bit of code profiling and collecting metrics (using timers), I found that the gofer Producer was leaking qpid senders. As a result, performance decreased and CPU usage increased with every message sent. Although I haven't reproduced the more commonly reported cases of goferd using 100% of the CPU, the metrics collected before and after applying this patch suggest this was a likely culprit. In this case, I believe the filesystem being full caused errors when sending heartbeat messages, which were greatly exacerbated by the leaking senders. This caused goferd to use 100% CPU. Related to: 689573

Fixed in gofer 0.26.

Considering what I am seeing on my box with gofer 0.26, I am moving it back to assigned.

Seems to be related to qpid-cpp-server-store. The 0.158 build removes this dependency. Pulp users should:

* uninstall qpid-cpp-server-store
* restart qpidd and goferd

Build: 0.158

I haven't seen this issue since uninstalling qpid-cpp-server-store. I am on:

[root@preethi unit]# rpm -q pulp
pulp-0.0.172-1.fc14.noarch

Moving to verified.

Closing with Community Release 15 pulp-0.0.223-4.
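For illustration, here is a minimal, hypothetical Python sketch of the sender-leak pattern described in the analysis above: creating a new qpid sender for every message accumulates client and broker resources, while creating the sender once and reusing it keeps usage flat. This is not the actual gofer 0.26 code; the broker URL, address string, and function names are assumptions, and it requires a running qpidd plus the python qpid.messaging bindings.

```python
# Hypothetical sketch only -- not the gofer Producer implementation.
# Illustrates why creating a qpid sender per send() leaks resources and
# how reusing a single sender avoids it.  BROKER_URL and ADDRESS are
# placeholders; a local qpidd broker is assumed to be running.
from qpid.messaging import Connection, Message

BROKER_URL = "localhost:5672"                                 # assumed broker
ADDRESS = "heartbeat;{create: always, node: {type: topic}}"   # assumed address

_sender_cache = {}

def leaky_send(session, body):
    # Anti-pattern: a brand new sender for every message, never closed.
    # Each call accumulates state on both the client and the broker.
    sender = session.sender(ADDRESS)
    sender.send(Message(content=body))

def reusing_send(session, body):
    # Fix pattern: create the sender once per address and reuse it,
    # so repeated heartbeats do not grow resource usage.
    sender = _sender_cache.get(ADDRESS)
    if sender is None:
        sender = _sender_cache[ADDRESS] = session.sender(ADDRESS)
    sender.send(Message(content=body))

if __name__ == "__main__":
    connection = Connection.establish(BROKER_URL)
    session = connection.session()
    try:
        for n in range(10):
            reusing_send(session, {"seq": n})
    finally:
        connection.close()
```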