Bug 1337655
Summary: | Ceilometer exceptions filling up log | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Chris Dearborn <christopher_dearborn> |
Component: | rhosp-director | Assignee: | Eoghan Glynn <eglynn> |
Status: | CLOSED WORKSFORME | QA Contact: | Arik Chernetsky <achernet> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 8.0 (Liberty) | CC: | arkady_kanevsky, cdevine, christopher_dearborn, dbecker, eglynn, gael_rehault, John_walsh, kasmith, kurt_hey, mburns, morazi, pkilambi, randy_perryman, rhel-osp-director-maint, rsussman, smerrow, sreichar, wayne_allen |
Target Milestone: | --- | Keywords: | Triaged |
Target Release: | 8.0 (Liberty) | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2016-12-12 14:29:03 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1261979 |
Description
Chris Dearborn
2016-05-19 18:18:18 UTC
The exception: May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination [-] Error connecting to coordination backend. May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination Traceback (most recent call last): May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/ceilometer/coordination.py", line 73, in start May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination self._coordinator.start() May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/coordination.py", line 197, in start May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination self._start() May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 415, in _start May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination self._server_info = self._client.info() May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__ May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination self.gen.throw(type, value, traceback) May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 50, in _translate_failures May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination cause=e) May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/coordination.py", line 527, in raise_with_cause May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination excutils.raise_with_cause(exc_cls, message, *args, **kwargs) May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 143, in raise_with_cause May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination six.raise_from(exc_cls(message, *args, **kwargs), kwargs.get('cause')) May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/six.py", line 692, in raise_from May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination raise value May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination ToozConnectionError: Error while reading from socket: ('Connection closed by server.',) May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.870 39518 ERROR ceilometer.coordination May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination [-] Error sending a heartbeat to coordination backend. May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination Traceback (most recent call last): May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/ceilometer/coordination.py", line 101, in heartbeat May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination self._coordinator.heartbeat() May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 479, in heartbeat May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination value=self.STILL_ALIVE) May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__ May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination self.gen.throw(type, value, traceback) May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 50, in _translate_failures May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination cause=e) May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/tooz/coordination.py", line 527, in raise_with_cause May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination excutils.raise_with_cause(exc_cls, message, *args, **kwargs) May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 143, in raise_with_cause May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination six.raise_from(exc_cls(message, *args, **kwargs), kwargs.get('cause')) May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination File "/usr/lib/python2.7/site-packages/six.py", line 692, in raise_from May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination raise value May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination ToozConnectionError: Error while reading from socket: ('Connection closed by server.',) May 18 13:39:51 r13-controller-0.localdomain ceilometer-alarm-evaluator[39518]: 2016-05-18 13:39:51.873 39518 ERROR ceilometer.coordination Eoghan, do you have someone who can look at this? Prad, any thoughts? hmm looks like the redis coordination backend is not up or reachable? Few things to check: * Make sure redis coordination url is configured correctly in ceilometer.conf * the ip for the redis backend is reachable from the controller node. Also make sure python-redis is installed. We had a bug iirc where it was not installed automatically. Chris: Is this still an issue? Were you ever able to follow Pradeep's suggestions for debugging the issue? Not sure if this is still an issue or not. I saw the tips above, but thought that eglynn or pkilambi was trying them since the needinfo was set to them. As a result, I didn't try them. I'm working on deploying OSP10 now, so I will check that deployment when it finishes. Just checked an OSP10 GA2 deployment, and this issue is no longer occurring. Closing. |