1465385 – gnocchi-metricsd unable to extend redis lock

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	python-tooz
Sub Component:
Version:	10.0 (Newton)
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	z5
Target Release:	10.0 (Newton)
Assignee:	RHOS Maint
QA Contact:	Sasha Smolyak
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1472407

Reported:	2017-06-27 11:01 UTC by Martin Schuppert
Modified:	2022-08-09 14:04 UTC
CC List:	14 users
Fixed In Version:	python-tooz-1.43.1-1.el7ost
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1472407
Environment:
Last Closed:	2017-10-05 07:27:59 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Launchpad	1557593	None	None	None	2017-06-27 11:01:20 UTC
OpenStack gerrit	478131	None	MERGED	redis: fix concurrent access on acquire()	2020-10-08 03:58:48 UTC
Red Hat Issue Tracker	OSP-8582	None	None	None	2022-08-09 14:04:14 UTC
Red Hat Knowledge Base (Solution)	3114671	None	None	None	2017-07-14 09:49:06 UTC
Red Hat Product Errata	RHBA-2017:1748	normal	SHIPPED_LIVE	Red Hat OpenStack Platform 10 Bug Fix and Enhancement Advisory	2017-07-12 18:07:30 UTC

Description Martin Schuppert 2017-06-27 11:01:20 UTC

Description of problem:

With python-tooz-1.43.0-1.el7ost, the following trace appears in metricd.log:

2017-06-26 08:51:51.323 141978 WARNING tooz.drivers.redis [-] Unable to heartbeat lock '<tooz.drivers.redis.RedisLock object at 0x1da2b50>'
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis Traceback (most recent call last):
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 506, in heartbeat
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     lock.heartbeat()
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 102, in heartbeat
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     self._lock.extend(self._lock.timeout)
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     self.gen.throw(type, value, traceback)
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 54, in _translate_failures
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     cause=e)
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib/python2.7/site-packages/tooz/coordination.py", line 763, in raise_with_cause
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     excutils.raise_with_cause(exc_cls, message, *args, **kwargs)
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 143, in raise_with_cause
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     six.raise_from(exc_cls(message, *args, **kwargs), kwargs.get('cause'))
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis   File "/usr/lib/python2.7/site-packages/six.py", line 692, in raise_from
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis     raise value
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis ToozError: Cannot extend an unlocked lock
2017-06-26 08:51:51.323 141978 ERROR tooz.drivers.redis 
2017-06-26 08:52:03.360 141950 WARNING gnocchi.cli [-] Metric processing lagging scheduling rate. It is recommended to increase the number of workers or to lengthen processing interval.
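The last frame of the trace shows the heartbeat calling extend() on a lock that is no longer considered held. As a rough illustration only, here is a minimal in-memory sketch of that failure mode. ToyTTLLock and ToyLockError are hypothetical names invented for this example; they are not the tooz or redis-py API, and the sketch does not reproduce the concurrency issue addressed by the upstream fix.

```python
import time


class ToyLockError(Exception):
    """Hypothetical stand-in for the error a lock library would raise."""


class ToyTTLLock:
    """Hypothetical in-memory lock with an expiry time (not tooz, not redis-py)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self._clock = clock
        self._expires_at = None  # None means "not held"

    def _held(self):
        return self._expires_at is not None and self._clock() < self._expires_at

    def acquire(self):
        if self._held():
            return False
        self._expires_at = self._clock() + self.timeout
        return True

    def release(self):
        self._expires_at = None

    def extend(self, additional_time):
        # A heartbeat can only prolong a lock that is still held.
        if not self._held():
            raise ToyLockError("Cannot extend an unlocked lock")
        self._expires_at += additional_time


if __name__ == "__main__":
    lock = ToyTTLLock(timeout=30)
    lock.acquire()
    lock.extend(30)      # succeeds: the lock is still held
    lock.release()
    try:
        lock.extend(30)  # fails: nothing left to extend
    except ToyLockError as exc:
        print("heartbeat failed:", exc)
```

In this toy model the heartbeat fails whenever the lock was released or has expired before extend() runs, which is the same symptom the warning above reports.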

This got fixed upstream with [1].

Version-Release number of selected component (if applicable):
python-tooz-1.43.0-1.el7ost

How reproducible:
unknown

Steps to Reproduce:
1. unknown

Actual results:


Expected results:


Additional info:
[1] https://review.openstack.org/#/c/478131/
[2] https://bugs.launchpad.net/gnocchi/+bug/1557593

Comment 1 Red Hat Bugzilla Rules Engine 2017-06-27 11:06:13 UTC

This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.

Comment 8 Mehdi ABAAKOUK 2017-07-10 08:33:14 UTC

python-tooz-1.43.0-2.el7ost doesn't work:

2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage [-] Unexpected error during measures processing
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage Traceback (most recent call last):
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/gnocchi/storage/__init__.py", line 188, in process_background_tasks
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage     self.process_new_measures(index, metrics, sync)
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/gnocchi/storage/_carbonara.py", line 504, in process_new_measures
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage     if lock.acquire(blocking=sync):
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage   File "/usr/lib/python2.7/site-packages/tooz/drivers/redis.py", line 85, in acquire
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage     with self._exclusive_access:
2017-07-09 15:21:28.977 131558 ERROR gnocchi.storage AttributeError: 'RedisLock' object has no attribute '_exclusive_access'

The backport doesn't work because it depends on another change: https://github.com/openstack/tooz/commit/486524c37fff2c826f934ac40fd07e2003074569
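The AttributeError above is what a partial backport typically looks like: acquire() uses an attribute that only the prerequisite change creates. A minimal sketch of the pattern follows. Both class names are hypothetical, and the use of threading.Lock for _exclusive_access is an assumption made for illustration, not a statement about the tooz source.

```python
import threading


class PartialBackportLock:
    """Hypothetical: acquire() was backported, the matching __init__ change was not."""

    def __init__(self):
        # The prerequisite change would have created self._exclusive_access here.
        pass

    def acquire(self):
        with self._exclusive_access:  # raises AttributeError at call time
            return True


class CompleteLock:
    """Hypothetical: both changes applied together."""

    def __init__(self):
        # Assumption for this sketch: the guard is a plain threading.Lock.
        self._exclusive_access = threading.Lock()

    def acquire(self):
        with self._exclusive_access:
            return True


if __name__ == "__main__":
    try:
        PartialBackportLock().acquire()
    except AttributeError as exc:
        print("partial backport:", exc)
    print("complete:", CompleteLock().acquire())
```

The failure only shows up when acquire() is first called, which is why the package installs cleanly and then breaks during measures processing.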

Comment 9 Julien Danjou 2017-07-10 15:05:04 UTC

The upstream release 1.43.1 should work instead.

Comment 12 Sasha Smolyak 2017-07-12 13:16:56 UTC

tooz seems to be fixed; there are no errors in the metricd log.

Comment 14 errata-xmlrpc 2017-07-12 14:07:53 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1748

Comment 21 Julien Danjou 2017-07-25 13:38:09 UTC

Thanks for the info, Nilesh. I'd need some more information to get a clue about where this could be coming from.

Can you check Redis logs to see if anything in particular is logged there?

Do you see any pattern in the errors that are raised? They seem to be raised roughly every 30s. Is that consistent, or are there gaps?
Does this happen on all controller nodes or only a few? All the time?
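One way to answer the question about the 30s pattern is to measure the gaps between heartbeat warnings in the log. The sketch below assumes the line format shown in the description (a 23-character timestamp at the start of each line); the function name heartbeat_failure_gaps is made up for this example.

```python
from datetime import datetime

# Text of the warning as it appears in the trace in the description.
MARKER = "Unable to heartbeat lock"


def heartbeat_failure_gaps(lines):
    """Return the seconds elapsed between consecutive heartbeat warnings."""
    stamps = []
    for line in lines:
        if MARKER in line:
            # The first 23 characters look like "2017-06-26 08:51:51.323".
            stamps.append(datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f"))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]
```

Passing it the lines of metricd.log from each controller gives one list of gaps per node; values clustering around 30 would support the periodic pattern, while large outliers would point to gaps.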

Feel free to send a sosreport + gnocchi logs :)

Comment 35 Julien Danjou 2017-10-04 08:58:18 UTC

The one in External Tracker: https://review.openstack.org/#/c/478131/