1203706 – ceilometer group partitioning coordination with tooz+redis+sentinel fails to failover to new master

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	python-tooz
Sub Component:
Version:	6.0 (Juno)
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	async
Target Release:	6.0 (Juno)
Assignee:	Pradeep Kilambi
QA Contact:	Yurii Prokulevych
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1210500
Depends On:
Blocks:	1200129

Reported:	2015-03-19 13:50 UTC by Chris Dent
Modified:	2023-02-22 23:02 UTC
CC List:	9 users
Fixed In Version:	python-tooz-0.12.1-1.el7ost
Doc Type:	Rebase: Bug Fixes and Enhancements
Doc Text:
Clone Of:
Environment:
Last Closed:	2016-07-20 19:52:43 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Launchpad	1434043	0	None	None	None	Never
Red Hat Product Errata	RHBA-2016:1472	0	normal	SHIPPED_LIVE	Red Hat Enterprise Linux OpenStack Platform 6 Bug Fix and Enhancement Advisory	2016-07-20 23:52:07 UTC

Description Chris Dent 2015-03-19 13:50:25 UTC

(More details on the linked tracker)

When tooz is configured with multiple sentinels to coordinate group membership for the ceilometer central (and other) agents, the coordinator fails to switch over to a new master Redis server after a failover.

This is due to a lack of retry logic in tooz.
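
For illustration only, here is a minimal sketch of the kind of retry logic being described. It is not the upstream tooz fix and does not use the tooz or redis APIs. The names `call_with_failover`, `MasterUnavailable`, `fake_discover` and `fake_heartbeat`, and the addresses `redis-a:6379` and `redis-b:6379`, are all hypothetical. The idea is to look up the master again on every attempt instead of reusing an address that may be stale after a failover.

```python
import time


class MasterUnavailable(Exception):
    """Hypothetical error: the master we tried to use is no longer reachable."""


def call_with_failover(operation, discover_master, retries=3, delay=0.0):
    """Run operation(master), re-discovering the master before every attempt.

    operation: callable taking a master address; raises MasterUnavailable on failure.
    discover_master: callable returning the current master address
                     (in a real setup this would ask a sentinel).
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for _ in range(retries):
        master = discover_master()  # never reuse a cached, possibly stale address
        try:
            return operation(master)
        except MasterUnavailable as exc:
            last_error = exc
            time.sleep(delay)  # give the failover time to complete
    raise last_error


# Simulated failover: the first lookup still returns the old master,
# later lookups return the newly promoted one.
_masters = iter(["redis-a:6379", "redis-b:6379", "redis-b:6379"])


def fake_discover():
    return next(_masters)


def fake_heartbeat(master):
    if master == "redis-a:6379":  # the old master is down
        raise MasterUnavailable(master)
    return "heartbeat sent via " + master


result = call_with_failover(fake_heartbeat, fake_discover)
```

Without the retry loop, the first `MasterUnavailable` would propagate and the agent would keep pointing at the dead master, which matches the behaviour reported here.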

Comment 4 Ami Jeain 2015-03-19 14:17:01 UTC

Wasn't this bug targeted for RHOS6 instead of RHOS7?

Comment 5 Chris Dent 2015-03-23 10:48:43 UTC

The upstream fix in tooz that gets failover working has merged: https://git.openstack.org/cgit/openstack/tooz/commit/?id=54d6bb1c94270d2794ecefbcaf3f8832010e3d58

No info yet on when a new tooz release will happen.

Comment 7 Rafael Rosa 2015-04-14 14:04:44 UTC

There was a tooz release, 0.13.2, on April 13th; it should solve this.

Comment 8 Eoghan Glynn 2015-04-14 14:28:07 UTC

Punted to A4 as discussed on IRC.

Comment 15 Lon Hohberger 2015-12-17 21:21:17 UTC

*** Bug 1210500 has been marked as a duplicate of this bug. ***

Comment 17 Lon Hohberger 2015-12-18 16:30:19 UTC

Tooz requires 2 other rebases to work.  It's probably safer to fix this with a backport.

Comment 20 nlevinki 2016-07-19 07:40:46 UTC

Automation passed:
https://rhos-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHOS/view/RHOS6/job/rhos-jenkins-rhos-6.0-puddle-rhel-7.2-all-in-one-packstack-nova-flatdhcp-rabbitmq-tempest-git-all/28/
We verified that the fix is in.
Since the fix is working on RHOS7 and this was a backport job, I am moving this ticket to verified.

Comment 22 errata-xmlrpc 2016-07-20 19:52:43 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1472
