Bug 1127843
Summary: | can't create consumer connection to qpid after HeartbeatTimeout in heavy workload | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Jon Thomas <jthomas> |
Component: | openstack-nova | Assignee: | Russell Bryant <rbryant> |
Status: | CLOSED NOTABUG | QA Contact: | Ami Jeain <ajeain> |
Severity: | urgent | Docs Contact: | |
Priority: | high | ||
Version: | 4.0 | CC: | dyocum, jthomas, ndipanov, yeylon |
Target Milestone: | --- | Flags: | jthomas: needinfo- |
Target Release: | 5.0 (RHEL 7) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2014-08-13 15:07:49 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Jon Thomas
2014-08-07 16:40:18 UTC
What is the qpid_heartbeat configuration option set to?

It's the default. Here's the contents of qpidd.conf:

    cluster-mechanism=ANONYMOUS
    auth=no
    port=5672
    max-connections=4096
    worker-threads=500
    connection-backlog=4096
    max-negotiate-time=30000

OK, so this is likely 1 of 2 things:

1) The nova service that produced these errors was under very high load. In this case, there's nothing we can really do about that in code. It's a deployment issue (need to scale out).

2) The qpid broker was choking for some reason. Do you have any info on what kind of load qpidd is under? One recommendation I provided elsewhere for OS1 is to set qpid_topology_version=2 across the deployment. Without that, there is a bad leak that nova is causing in qpidd, which will cause it to fall over after some period of time. I suggest deploying this update and then seeing if this happens again.

(In reply to Russell Bryant from comment #6)
> OK, so this is likely 1 of 2 things:
>
> 1) The nova service that produced these errors was under very high load. In
> this case, there's nothing we can really do about that in code. It's a
> deployment issue (need to scale out).
>
> 2) The qpid broker was choking for some reason. Do you have any info on
> what kind of load qpidd is under? One recommendation I provided elsewhere
> for OS1 is to set qpid_topology_version=2 across the deployment. Without
> that, there is a bad leak that nova is causing in qpidd, which will cause it
> to fall over after some period of time. I suggest deploying this update and
> then seeing if this happens again.

With that said, I don't think there is much else we can do here. Let me know if we can help further.
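
For anyone checking a deployment against the recommendation above, here is a minimal sketch of a script that reads nova.conf-style text and reports the two options named in this thread, qpid_heartbeat and qpid_topology_version. The function names are hypothetical, the options are assumed to live in the [DEFAULT] section, and the fallback values (60 and 1) are assumed defaults that should be verified against the installed release. It is not part of nova.

```python
import configparser

# Option names come from the comments above. The fallback values are
# ASSUMED defaults; verify them against the release actually deployed.
ASSUMED_DEFAULTS = {"qpid_heartbeat": "60", "qpid_topology_version": "1"}


def read_qpid_options(conf_text):
    """Return the effective qpid options from nova.conf-style text.

    Assumes the options are set in the [DEFAULT] section.
    """
    # interpolation=None keeps literal '%' characters in values intact;
    # strict=False tolerates options that are repeated in the file.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return {
        name: parser.get("DEFAULT", name, fallback=default)
        for name, default in ASSUMED_DEFAULTS.items()
    }


def needs_topology_update(options):
    """True when qpid_topology_version is not the recommended value of 2."""
    return options["qpid_topology_version"].strip() != "2"
```

To use it, pass the text of each service's nova.conf (for example, the result of `open(path).read()`) to `read_qpid_options` and flag any host where `needs_topology_update` returns True. Since the recommendation is to change the setting across the whole deployment, the check would need to be run on every node.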