Bug 1573597 - Gnocchi unable to keep up with backlog and tracebacks in metricd log
Summary: Gnocchi unable to keep up with backlog and tracebacks in metricd log
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 13.0 (Queens)
Assignee: Martin Magr
QA Contact: Leonid Natapov
URL:
Whiteboard:
Duplicates: 1577839
Depends On:
Blocks:
Reported: 2018-05-01 18:33 UTC by Leonid Natapov
Modified: 2023-02-22 23:02 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-8.0.2-20.el7ost gnocchi-4.2.3-3.el7ost
Doc Type: Known Issue
Doc Text:
A poorly performing Swift cluster used as a Gnocchi back end can generate 503 errors in the collectd log and "ConnectionError: ('Connection aborted.', CannotSendRequest())" errors in gnocchi-metricd.log. To mitigate the problem, increase the value of the CollectdDefaultPollingInterval parameter or improve the Swift cluster performance.
Clone Of:
Environment:
Last Closed: 2018-06-27 13:54:52 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1771083 0 None None None 2018-05-14 10:40:47 UTC
OpenStack gerrit 568241 0 None MERGED Enable default polling interval override 2020-12-07 10:32:29 UTC
OpenStack gerrit 568677 0 None MERGED Enable default polling interval override 2020-12-07 10:32:56 UTC
Red Hat Product Errata RHEA-2018:2086 0 None None None 2018-06-27 13:55:51 UTC

Description Leonid Natapov 2018-05-01 18:33:44 UTC
Description of problem:

When both Ceilometer and collectd write to Gnocchi, Gnocchi cannot keep up with the backlog, and tracebacks appear in the metricd log.

-----

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 505, in process_new_measures
    self._compute_and_store_timeseries(metric, measures)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 580, in _compute_and_store_timeseries
    before_truncate_callback=_map_add_measures)
  File "/usr/lib/python2.7/site-packages/gnocchi/carbonara.py", line 344, in set_values
    before_truncate_callback(self)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 576, in _map_add_measures
    for aggregation in agg_methods))
  File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in parallel_map
    return list(executor.map(lambda args: fn(*args), list_of_args))
  File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 605, in result_iterator
    yield future.result()
  File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
    return self.__get_result()
  File "/usr/lib/python2.7/site-packages/concurrent/futures/thread.py", line 62, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in <lambda>
    return list(executor.map(lambda args: fn(*args), list_of_args))
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 413, in _add_measures
    oldest_point_to_keep)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 298, in _store_timeserie_split
    metric, [key], aggregation)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 235, in _get_measures_and_unserialize
    raw_measures = self._get_measures(metric, keys, aggregation)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 133, in _get_measures
    for key in keys))
  File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in parallel_map
    return list(executor.map(lambda args: fn(*args), list_of_args))
  File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 605, in result_iterator
    yield future.result()
  File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
    return self.__get_result()
  File "/usr/lib/python2.7/site-packages/concurrent/futures/thread.py", line 62, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in <lambda>
    return list(executor.map(lambda args: fn(*args), list_of_args))
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/swift.py", line 149, in _get_measures_unbatched
    key, aggregation, version))
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1799, in get_object
    headers=headers)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1691, in _retry
    service_token=self.service_token, **kwargs)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1167, in get_object
    conn.request(method, path, '', headers)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 439, in request
    files=files, **self.requests_args)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 422, in _request

Comment 2 Mehdi ABAAKOUK 2018-05-02 12:50:01 UTC
I have seen the backlog increase rapidly, and sometimes the connection to Swift is dropped with the following error messages:

2018-05-02 12:48:11,084 [27] ERROR    gnocchi.storage: Error processing new measures
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 505, in process_new_measures
    self._compute_and_store_timeseries(metric, measures)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 580, in _compute_and_store_timeseries
    before_truncate_callback=_map_add_measures)
  File "/usr/lib/python2.7/site-packages/gnocchi/carbonara.py", line 344, in set_values
    before_truncate_callback(self)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 576, in _map_add_measures
    for aggregation in agg_methods))
  File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in parallel_map
    return list(executor.map(lambda args: fn(*args), list_of_args))
  File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 605, in result_iterator
    yield future.result()
  File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
    return self.__get_result()
  File "/usr/lib/python2.7/site-packages/concurrent/futures/thread.py", line 62, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in <lambda>
    return list(executor.map(lambda args: fn(*args), list_of_args))
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 413, in _add_measures
    oldest_point_to_keep)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 298, in _store_timeserie_split
    metric, [key], aggregation)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 235, in _get_measures_and_unserialize
    raw_measures = self._get_measures(metric, keys, aggregation)
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 133, in _get_measures
    for key in keys))
  File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in parallel_map
    return list(executor.map(lambda args: fn(*args), list_of_args))
  File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 605, in result_iterator
    yield future.result()
  File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
    return self.__get_result()
  File "/usr/lib/python2.7/site-packages/concurrent/futures/thread.py", line 62, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in <lambda>
    return list(executor.map(lambda args: fn(*args), list_of_args))
  File "/usr/lib/python2.7/site-packages/gnocchi/storage/swift.py", line 149, in _get_measures_unbatched
    key, aggregation, version))
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1799, in get_object
    headers=headers)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1691, in _retry
    service_token=self.service_token, **kwargs)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1167, in get_object
    conn.request(method, path, '', headers)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 439, in request
    files=files, **self.requests_args)
  File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 422, in _request
    return self.request_session.request(*arg, **kwarg)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 518, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 639, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 488, in send
    raise ConnectionError(err, request=request)
ConnectionError: ('Connection aborted.', CannotSendRequest())
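
The traceback above passes through concurrent.futures twice because an exception raised inside a worker thread is re-raised in the caller when the result is collected. The following is a minimal sketch of that behaviour, assuming Python 3 (the logs above come from Python 2.7, where ConnectionError is provided by requests instead of being a builtin). Only the executor.map call shape is taken from the traceback; parallel_map here is a stand-in, not Gnocchi's implementation, and fake_get_object and fetch_all are hypothetical names.

```python
from concurrent import futures


def parallel_map(fn, list_of_args, max_workers=4):
    """Stand-in for the helper named in the traceback (not Gnocchi's code)."""
    with futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Same call shape as the utils.py frame shown in the traceback.
        return list(executor.map(lambda args: fn(*args), list_of_args))


def fake_get_object(key):
    """Hypothetical fetch that fails for one key, simulating a dropped connection."""
    if key == "bad":
        raise ConnectionError("Connection aborted.")
    return "data-" + key


def fetch_all(keys):
    """Return (results, error); error is the first worker exception, if any."""
    try:
        return parallel_map(fake_get_object, [(k,) for k in keys]), None
    except ConnectionError as exc:
        # The worker's exception surfaces here, in the calling thread.
        return None, exc
```

In this sketch a single failed fetch discards the results of the whole batch, which is consistent with the "Error processing new measures" message being logged once per failed metric.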

Comment 3 Martin Magr 2018-05-14 10:40:10 UTC
*** Bug 1577839 has been marked as a duplicate of this bug. ***

Comment 15 Leonid Natapov 2018-05-22 06:23:44 UTC
Failed QA.

Tested with openstack-tripleo-heat-templates-8.0.2-22.el7ost.noarch.

Still getting "connection pool full".

Changed Interval in collectd.yaml to 600 instead of 120 and re-deployed.

With an interval of 600 it looks OK.
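
As a rough back-of-the-envelope check on why the longer interval helps, the sketch below estimates how many measures reach Gnocchi per hour for a given polling interval. It assumes each metric produces exactly one measure per interval, which is a simplification; the function name and the metric count of 1000 are hypothetical and not taken from this deployment.

```python
def measures_per_hour(metric_count, interval_seconds):
    """Estimate measures written per hour, assuming one measure per metric per interval."""
    return metric_count * 3600 // interval_seconds


# Hypothetical deployment with 1000 collectd metrics.
at_120 = measures_per_hour(1000, 120)
at_600 = measures_per_hour(1000, 600)
reduction_factor = at_120 // at_600
```

Under these assumptions, moving from 120 to 600 seconds cuts the incoming write rate by a factor of five, at the cost of coarser data.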

Comment 16 Mehdi ABAAKOUK 2018-05-22 09:06:36 UTC
"connection pool full" is not an issue, only "Connection aborted." is.

Comment 17 Leonid Natapov 2018-05-22 10:16:41 UTC
After a while even with the interval of 600, I see the following messages in the gnocchi-metricd.log:

    File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1691, in _retry
        service_token=self.service_token, **kwargs)
      File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1167, in get_object
        conn.request(method, path, '', headers)
      File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 439, in request
        files=files, **self.requests_args)
      File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 422, in _request
        return self.request_session.request(*arg, **kwarg)
      File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 518, in request
        resp = self.send(prep, **send_kwargs)
      File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 639, in send
        r = adapter.send(request, **kwargs)
      File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 488, in send
        raise ConnectionError(err, request=request)
    ConnectionError: ('Connection aborted.', CannotSendRequest())
    2018-05-22 09:40:41,633 [30] ERROR    gnocchi.storage: Error processing new measures
    Traceback (most recent call last):
      File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 505, in process_new_measures
        self._compute_and_store_timeseries(metric, measures)
      File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 580, in _compute_and_store_timeseries
        before_truncate_callback=_map_add_measures)
      File "/usr/lib/python2.7/site-packages/gnocchi/carbonara.py", line 344, in set_values
        before_truncate_callback(self)
      File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 576, in _map_add_measures
        for aggregation in agg_methods))
      File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in parallel_map
        return list(executor.map(lambda args: fn(*args), list_of_args))
      File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 605, in result_iterator
        yield future.result()
      File "/usr/lib/python2.7/site-packages/concurrent/futures/_base.py", line 422, in result
        return self.__get_result()
      File "/usr/lib/python2.7/site-packages/concurrent/futures/thread.py", line 62, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/usr/lib/python2.7/site-packages/gnocchi/utils.py", line 312, in <lambda>
        return list(executor.map(lambda args: fn(*args), list_of_args))
      File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 413, in _add_measures
        oldest_point_to_keep)
      File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 328, in _store_timeserie_split
        data, offset=offset)
      File "/usr/lib/python2.7/site-packages/gnocchi/storage/swift.py", line 120, in _store_metric_measures
        data)
      File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1842, in put_object
        response_dict=response_dict)
      File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1691, in _retry
        service_token=self.service_token, **kwargs)
      File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 1330, in put_object
        conn.request('PUT', path, contents, headers)
      File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 439, in request
        files=files, **self.requests_args)
      File "/usr/lib/python2.7/site-packages/swiftclient/client.py", line 422, in _request
        return self.request_session.request(*arg, **kwarg)
      File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 518, in request
        resp = self.send(prep, **send_kwargs)
      File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 639, in send
        r = adapter.send(request, **kwargs)
      File "/usr/lib/python2.7/site-packages/requests/adapters.py", line 488, in send
        raise ConnectionError(err, request=request)
    ConnectionError: ('Connection aborted.', CannotSendRequest())
    2018-05-22 09:41:24,963 [33] INFO     gnocchi.cli.metricd: 0 measurements bundles across 0 metrics wait to be processed.
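
To watch whether the backlog grows or drains over time, the status lines that metricd logs can be extracted from gnocchi-metricd.log. The sketch below is one way to do that, assuming the line format matches the INFO message shown above; the regular expression and the parse_backlog name are illustrative and not part of Gnocchi.

```python
import re

# Matches the metricd status line format seen in the log excerpt above.
BACKLOG_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ .*"
    r"gnocchi\.cli\.metricd: (?P<bundles>\d+) measurements bundles "
    r"across (?P<metrics>\d+) metrics wait to be processed"
)


def parse_backlog(lines):
    """Return (timestamp, bundles, metrics) tuples for each backlog status line."""
    samples = []
    for line in lines:
        match = BACKLOG_RE.search(line.strip())
        if match:
            samples.append(
                (
                    match.group("ts"),
                    int(match.group("bundles")),
                    int(match.group("metrics")),
                )
            )
    return samples
```

Passing an open log file to parse_backlog yields one sample per status line; traceback lines and other messages are skipped.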

Comment 20 Leonid Natapov 2018-05-29 10:48:30 UTC
No backtraces in the gnocchi-metricd.log.
With one Swift node it's overloaded, but it works.
We don't reach 0 in "openstack metric status", but that's OK; the backlog never goes too high.

Comment 22 errata-xmlrpc 2018-06-27 13:54:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

