Description of problem:
with python-tooz-1.43.0-1.el7ost the following trace is seen from the metricd.log:

2017-06-26 08:51:51.323 141978 WARNING tooz.drivers.redis [-] Unable to heartbeat lock '<tooz.drivers.redis.RedisLock object at 0x1da2b50>'
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis Traceback (most recent call last):
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 506, in heartbeat
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     lock.heartbeat()
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 102, in heartbeat
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     self._lock.extend(self._lock.timeout)
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     self.gen.throw(type, value, traceback)
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 54, in _translate_failures
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     cause=e)
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib/python2.7/site-packages/tooz/coordination.py", line 763, in raise_with_cause
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     excutils.raise_with_cause(exc_cls, message, *args, **kwargs)
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 143, in raise_with_cause
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     six.raise_from(exc_cls(message, *args, **kwargs), kwargs.get('cause'))
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib/python2.7/site-packages/six.py", line 692, in raise_from
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     raise value
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis ToozError: Cannot extend an unlocked lock
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis
2017-06-26 08:52:03.360 141950 WARNING gnocchi.cli [-] Metric processing lagging scheduling rate. It is recommended to increase the number of workers or to lengthen processing interval.

This got fixed upstream with [1].

Version-Release number of selected component (if applicable):
python-tooz-1.43.0-1.el7ost

How reproducible:
unknown

Steps to Reproduce:
1. unknown

Actual results:

Expected results:

Additional info:
[2] https://bugs.launchpad.net/gnocchi/+bug/1557593
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.
python-tooz-1.43.0-2.el7ost doesn't work:

2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage [-] Unexpected error during measures processing
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage Traceback (most recent call last):
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 188, in process_background_tasks
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage     self.process_new_measures(index, metrics, sync)
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/gnocchi/storage/_carbonara.py", line 504, in process_new_measures
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage     if lock.acquire(blocking=sync):
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 85, in acquire
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage     with self._exclusive_access:
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage AttributeError: 'RedisLock' object has no attribute '_exclusive_access'

The backport doesn't work because it depends on another change:
https://github.com/openstack/tooz/commit/486524c37fff2c826f934ac40fd07e2003074569
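For anyone reading the traceback above, here is a minimal sketch of how a partial backport can produce this kind of AttributeError. It is not the tooz code: the class names are hypothetical, and it assumes (unverified here) that the referenced commit is where `_exclusive_access` gets initialised.

```python
import threading


class LockWithoutPrerequisite(object):
    """Hypothetical stand-in for a lock class carrying only the backported method."""

    def __init__(self):
        # The constructor predates the prerequisite change, so it never
        # creates the _exclusive_access attribute.
        self.acquired = False

    def acquire(self):
        # Backported method body: it expects an attribute that was never set,
        # so this line raises AttributeError on first use.
        with self._exclusive_access:
            self.acquired = True
            return True


class LockWithPrerequisite(LockWithoutPrerequisite):
    """Same method, but with the (assumed) prerequisite change applied."""

    def __init__(self):
        super(LockWithPrerequisite, self).__init__()
        # Assumption: the prerequisite change adds a thread lock like this.
        self._exclusive_access = threading.Lock()
```

Calling `acquire()` on the first class fails the same way as the log above, while the second class succeeds, which is why taking a release that contains both changes avoids the error.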
The upstream release 1.43.1 should work instead.
tooz seems to be fixed, no errors in metricd log
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:1748
Thanks for the info Nilesh. I'd need some more information to have a clue where this can come from. Can you check the Redis logs to see if anything in particular is logged there? Do you see any pattern in the errors that are raised? I see they are raised more or less every 30s. Does it happen every 30s or so? Are there gaps? Does this happen on all controller nodes or only a few? All the time? Feel free to send a sosreport + gnocchi logs :)
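To answer the "every 30s / any gaps" question without reading the log by eye, something like the following rough sketch could help. It assumes the log lines look like the "Unable to heartbeat lock" warning quoted earlier in this bug; the function name is made up for illustration.

```python
import re
from datetime import datetime

# Matches the warning line format quoted in this bug, e.g.:
# 2017-06-26 08:51:51.323 141978 WARNING tooz.drivers.redis [-] Unable to heartbeat lock '...'
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \d+ WARNING "
    r"tooz\.drivers\.redis .*Unable to heartbeat lock"
)


def heartbeat_failure_gaps(lines):
    """Return the seconds elapsed between consecutive heartbeat warnings."""
    stamps = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            stamps.append(
                datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            )
    # Pair each timestamp with the next one and compute the difference.
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
```

Passing it the open metricd.log file object (wherever that lives on your controller) returns a list of intervals; values clustering around 30 would confirm the pattern, and much larger values would show the gaps.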
The one in External Tracker: https://review.openstack.org/#/c/478131/