Bug 1573597
| Summary: | Gnocchi unable to keep up with backlog and tracebacks in metricd log | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Leonid Natapov <lnatapov> |
| Component: | openstack-tripleo-heat-templates | Assignee: | Martin Magr <mmagr> |
| Status: | CLOSED ERRATA | QA Contact: | Leonid Natapov <lnatapov> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 13.0 (Queens) | CC: | apannu, apevec, jamsmith, jjoyce, jschluet, lhh, lnatapov, mabaakou, mburns, mmagr, pkilambi, sclewis |
| Target Milestone: | rc | Keywords: | Triaged |
| Target Release: | 13.0 (Queens) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-tripleo-heat-templates-8.0.2-20.el7ost gnocchi-4.2.3-3.el7ost | Doc Type: | Known Issue |
| Doc Text: | A poorly performing Swift cluster used as a Gnocchi back end can generate 503 errors in the collectd log and "ConnectionError: ('Connection aborted.', CannotSendRequest())" errors in gnocchi-metricd.log. To mitigate the problem, increase the value of the CollectdDefaultPollingInterval parameter or improve the Swift cluster performance. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-06-27 13:54:52 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
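The Doc Text suggests raising CollectdDefaultPollingInterval to reduce load on the back end. As a rough illustration only, the sketch below estimates how many measures per hour reach Gnocchi at the two intervals tried later in this report (120 and 600 seconds). The metric count and the one-measure-per-metric-per-interval model are assumptions, not figures from this bug.

```python
def measures_per_hour(metric_count, interval_seconds):
    """Measures submitted per hour, assuming one measure per metric per interval."""
    return metric_count * 3600 // interval_seconds


# Hypothetical number of collectd metrics; not taken from the report.
METRICS = 1000

for interval in (120, 600):
    print(interval, measures_per_hour(METRICS, interval))
```

Under this simplified model, moving from 120 to 600 seconds cuts the incoming measure rate by a factor of five, which gives a slow Swift cluster correspondingly more time per measure.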
I have seen the backlog increase rapidly, and sometimes the connection to Swift is dropped with the following error messages:
2018-05-02 12:48:11,084 [27] ERROR gnocchi.storage: Error processing new measures
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 505, in process_new_measures
self._compute_and_store_timeseries(metric, measures)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 580, in _compute_and_store_timeseries
before_truncate_callback=_map_add_measures)
File "/usr/lib/python2.7/site-packages/gnocchi/carbonara.py", line 344, in set_values
before_truncate_callback(self)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 576, in _map_add_measures
for aggregation in agg_methods))
File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in parallel_map
return list(executor.map(lambda args: fn(*args), list_of_args))
File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 605, in result_iterator
yield future.result()
File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
return self.__get_result()
File "/usr/lib/python2.7/site-packages/concurrent/futures/thread.py", line 62, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in <lambda>
return list(executor.map(lambda args: fn(*args), list_of_args))
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 413, in _add_measures
oldest_point_to_keep)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 298, in _store_timeserie_split
metric, [key], aggregation)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 235, in _get_measures_and_unserialize
raw_measures = self._get_measures(metric, keys, aggregation)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 133, in _get_measures
for key in keys))
File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in parallel_map
return list(executor.map(lambda args: fn(*args), list_of_args))
File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 605, in result_iterator
yield future.result()
File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
return self.__get_result()
File "/usr/lib/python2.7/site-packages/concurrent/futures/thread.py", line 62, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in <lambda>
return list(executor.map(lambda args: fn(*args), list_of_args))
File "/usr/lib/python2.7/site-packages/gnocchi/storage/swift.py", line 149, in _get_measures_unbatched
key, aggregation, version))
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1799, in get_object
headers=headers)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1691, in _retry
service_token=self.service_token, **kwargs)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1167, in get_object
conn.request(method, path, '', headers)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 439, in request
files=files, **self.requests_args)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 422, in _request
return self.request_session.request(*arg, **kwarg)
File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 518, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 639, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 488, in send
raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', CannotSendRequest())
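The report does not state why CannotSendRequest is raised. For orientation only, the sketch below shows one general way Python's HTTP client produces this exception: a second request is started on a connection whose previous response has not been read yet. It uses Python 3's `http.client` in place of the Python 2.7 `httplib` seen in the traceback, and a silent local listening socket as a stand-in for a slow back end. Both are assumptions made for the demonstration, not a reproduction of the Gnocchi code path.

```python
import http.client
import socket


def reuse_before_response():
    """Start a second request on a connection whose first response is unread."""
    # A listening socket that never answers stands in for a slow back end.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    host, port = listener.getsockname()

    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/first")  # sent; the response is still pending
        try:
            conn.request("GET", "/second")  # the connection is not idle
        except http.client.CannotSendRequest as exc:
            return type(exc).__name__
        return None
    finally:
        conn.close()
        listener.close()


print(reuse_before_response())
```

If something similar happens in a deployment, the usual suspects are a connection object shared between worker threads or reused after an earlier request failed partway, but this bug does not confirm either.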
*** Bug 1577839 has been marked as a duplicate of this bug. ***
Failed QA. Tested with openstack-tripleo-heat-templates-8.0.2-22.el7ost.noarch. Still get "connection pool full".
Changed Interval in collectd.yaml from 120 to 600 and re-deployed. With an interval of 600 it looks OK.
"connection pool full" is not an issue; only "Connection aborted." is.
After a while, even with the interval of 600, I see the following messages in the gnocchi-metricd.log:
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1691, in _retry
service_token=self.service_token, **kwargs)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1167, in get_object
conn.request(method, path, '', headers)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 439, in request
files=files, **self.requests_args)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 422, in _request
return self.request_session.request(*arg, **kwarg)
File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 518, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 639, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 488, in send
raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', CannotSendRequest())
2018-05-22 09:40:41,633 [30] ERROR gnocchi.storage: Error processing new measures
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 505, in process_new_measures
self._compute_and_store_timeseries(metric, measures)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 580, in _compute_and_store_timeseries
before_truncate_callback=_map_add_measures)
File "/usr/lib/python2.7/site-packages/gnocchi/carbonara.py", line 344, in set_values
before_truncate_callback(self)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 576, in _map_add_measures
for aggregation in agg_methods))
File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in parallel_map
return list(executor.map(lambda args: fn(*args), list_of_args))
File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 605, in result_iterator
yield future.result()
File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
return self.__get_result()
File "/usr/lib/python2.7/site-packages/concurrent/futures/thread.py", line 62, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in <lambda>
return list(executor.map(lambda args: fn(*args), list_of_args))
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 413, in _add_measures
oldest_point_to_keep)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 328, in _store_timeserie_split
data, offset=offset)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/swift.py", line 120, in _store_metric_measures
data)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1842, in put_object
response_dict=response_dict)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1691, in _retry
service_token=self.service_token, **kwargs)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1330, in put_object
conn.request('PUT', path, contents, headers)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 439, in request
files=files, **self.requests_args)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 422, in _request
return self.request_session.request(*arg, **kwarg)
File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 518, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 639, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 488, in send
raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', CannotSendRequest())
2018-05-22 09:41:24,963 [33] INFO gnocchi.cli.metricd: 0 measurements bundles across 0 metrics wait to be processed.
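The INFO line above is how metricd reports its backlog. A minimal sketch for pulling the two counters out of such lines, for example to track the backlog over time, could look like the following. The regular expression is derived only from the single sample line in this report, so the wording may differ in other Gnocchi versions, and `parse_backlog` is a hypothetical helper name.

```python
import re

# Pattern based on the one status line quoted in this report.
BACKLOG_RE = re.compile(
    r"(?P<bundles>\d+) measurements bundles across "
    r"(?P<metrics>\d+) metrics wait to be processed"
)


def parse_backlog(line):
    """Return (bundles, metrics) from a metricd status line, or None."""
    match = BACKLOG_RE.search(line)
    if match is None:
        return None
    return int(match.group("bundles")), int(match.group("metrics"))


sample = (
    "2018-05-22 09:41:24,963 [33] INFO gnocchi.cli.metricd: "
    "0 measurements bundles across 0 metrics wait to be processed."
)
print(parse_backlog(sample))
```

Lines that do not match, such as the ERROR lines and traceback frames, return None and can simply be skipped.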
No backtraces in the gnocchi-metricd.log. With one Swift node it is overloaded, but it works. We don't reach 0 in "openstack metric status", but that's OK; the backlog never goes too high.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2018:2086
Description of problem: When both Ceilometer and collectd are writing to Gnocchi, Gnocchi is unable to keep up with the backlog, and tracebacks appear in the metricd log.
-----
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 505, in process_new_measures
self._compute_and_store_timeseries(metric, measures)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 580, in _compute_and_store_timeseries
before_truncate_callback=_map_add_measures)
File "/usr/lib/python2.7/site-packages/gnocchi/carbonara.py", line 344, in set_values
before_truncate_callback(self)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 576, in _map_add_measures
for aggregation in agg_methods))
File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in parallel_map
return list(executor.map(lambda args: fn(*args), list_of_args))
File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 605, in result_iterator
yield future.result()
File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
return self.__get_result()
File "/usr/lib/python2.7/site-packages/concurrent/futures/thread.py", line 62, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in <lambda>
return list(executor.map(lambda args: fn(*args), list_of_args))
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 413, in _add_measures
oldest_point_to_keep)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 298, in _store_timeserie_split
metric, [key], aggregation)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 235, in _get_measures_and_unserialize
raw_measures = self._get_measures(metric, keys, aggregation)
File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 133, in _get_measures
for key in keys))
File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in parallel_map
return list(executor.map(lambda args: fn(*args), list_of_args))
File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 605, in result_iterator
yield future.result()
File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
return self.__get_result()
File "/usr/lib/python2.7/site-packages/concurrent/futures/thread.py", line 62, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in <lambda>
return list(executor.map(lambda args: fn(*args), list_of_args))
File "/usr/lib/python2.7/site-packages/gnocchi/storage/swift.py", line 149, in _get_measures_unbatched
key, aggregation, version))
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1799, in get_object
headers=headers)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1691, in _retry
service_token=self.service_token, **kwargs)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1167, in get_object
conn.request(method, path, '', headers)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 439, in request
files=files, **self.requests_args)
File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 422, in _request