919397 – DIST: ISPN000208: No live owners found for segment at server shutdown

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	JBoss Enterprise Application Platform 6
Classification:	JBoss
Component:	Clustering
Sub Component:
Version:	6.1.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	EAP 6.3.0
Assignee:	Paul Ferraro
QA Contact:	Jitka Kozana
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-03-08 11:01 UTC by Jitka Kozana
Modified:	2014-04-23 02:04 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2013-12-09 14:44:07 UTC
Type:	Bug
Embargoed:

Description Jitka Kozana 2013-03-08 11:01:27 UTC

EAP 6.1.0.ER2:

At server shutdown, we are seeing the following errors:
perf20:
13:34:02,045 ERROR [org.infinispan.statetransfer.StateConsumerImpl] (OOB-60,shared=udp) ISPN000208: No live owners found for segment 17 of cache default-host/clusterbench. Current owners are:  [perf21/web]. Faulty owners: [perf21/web]

perf21 was also shutting down at the time and logging the same error (No live owners found).
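
For triage across many server logs, the fields of this message can be pulled apart mechanically. The following is a minimal sketch in Python. It assumes the message layout is exactly the one in the perf20 line quoted above, which is the only sample available here. The names `ISPN000208_PATTERN` and `parse_ispn000208` are hypothetical helpers, not part of Infinispan or EAP.

```python
import re

# Pattern inferred from the single ISPN000208 line quoted in this report.
# Assumption: other occurrences follow the same layout.
ISPN000208_PATTERN = re.compile(
    r"ISPN000208: No live owners found for segment (?P<segment>\d+) "
    r"of cache (?P<cache>\S+)\. "
    r"Current owners are:\s+\[(?P<owners>[^\]]*)\]\. "
    r"Faulty owners: \[(?P<faulty>[^\]]*)\]"
)


def _split_nodes(text):
    """Split a comma-separated node list such as 'perf21/web' into names."""
    return [part.strip() for part in text.split(",") if part.strip()]


def parse_ispn000208(line):
    """Return the message fields as a dict, or None if the line does not match."""
    match = ISPN000208_PATTERN.search(line)
    if match is None:
        return None
    owners = _split_nodes(match.group("owners"))
    faulty = _split_nodes(match.group("faulty"))
    return {
        "segment": int(match.group("segment")),
        "cache": match.group("cache"),
        "owners": owners,
        "faulty": faulty,
        # True when every current owner is also listed as faulty.
        "all_owners_faulty": bool(owners) and set(owners) <= set(faulty),
    }


sample = (
    "13:34:02,045 ERROR [org.infinispan.statetransfer.StateConsumerImpl] "
    "(OOB-60,shared=udp) ISPN000208: No live owners found for segment 17 of "
    "cache default-host/clusterbench. Current owners are:  [perf21/web]. "
    "Faulty owners: [perf21/web]"
)
print(parse_ispn000208(sample))
```

In the sample, the only current owner of segment 17 (perf21/web) is also the only faulty owner, so `all_owners_faulty` is true for that line.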

See the server logs here:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/EAP6/view/EAP6-Clustering/view/EAP6-Failover/job/eap-6x-failover-http-session-jvmkill-dist-async/35/artifact/report/config/jboss-perf20/server.log
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/EAP6/view/EAP6-Clustering/view/EAP6-Failover/job/eap-6x-failover-http-session-jvmkill-dist-async/35/artifact/report/config/jboss-perf21/server.log

This upstream jira seems to be dealing with something similar: https://issues.jboss.org/browse/ISPN-1239

Comment 1 Jitka Kozana 2013-05-13 14:02:32 UTC

Seen again in 6.1.0.ER8.

Link to server log:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-async/28/artifact/report/config/jboss-perf18/server.log

Link to job:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-async/28/

Comment 2 Ladislav Thon 2013-08-26 12:02:49 UTC

Still seeing this with EAP 6.1.1.ER7. For example:

https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-sync/28/
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-jvmkill-dist-sync/47/

Comment 3 Paul Ferraro 2013-12-09 14:44:07 UTC

This happens when all of the owners of a given segment leave the group, so that the segment's data can no longer be rebalanced.  When using DIST, you need to do one of the following:
* Increase the number of owners according to the expected availability
* Stagger shutdown of nodes according to the number of owners
* Ignore exceptions like this when you plan to shut down everything
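
The first two options can be illustrated with a toy model. The sketch below is not Infinispan's consistent hash: `assign_owners` is a hypothetical round-robin assignment, the segment count of 8 is arbitrary, and perf19 is an invented node name added to make a four-node group. It only shows why a segment is lost when every one of its owners leaves before a rebalance, and how more owners or a staggered shutdown avoids that.

```python
def assign_owners(nodes, num_segments, num_owners):
    """Toy round-robin ownership table (an assumption, not Infinispan's algorithm)."""
    k = min(num_owners, len(nodes))
    return {
        seg: [nodes[(seg + i) % len(nodes)] for i in range(k)]
        for seg in range(num_segments)
    }


def segments_without_live_owners(ownership, members):
    """Return the segments whose owners have all left the group."""
    return sorted(
        seg
        for seg, owners in ownership.items()
        if not any(owner in members for owner in owners)
    )


nodes = ["perf18", "perf19", "perf20", "perf21"]

# Two owners per segment; perf20 and perf21 leave at the same time.
two_owners = assign_owners(nodes, num_segments=8, num_owners=2)
print(segments_without_live_owners(two_owners, {"perf18", "perf19"}))  # [2, 6]

# Option 1: three owners per segment tolerate the same double departure.
three_owners = assign_owners(nodes, num_segments=8, num_owners=3)
print(segments_without_live_owners(three_owners, {"perf18", "perf19"}))  # []

# Option 2: staggered shutdown. perf21 leaves, the group rebalances,
# and only then does perf20 leave.
survivors = [n for n in nodes if n != "perf21"]
rebalanced = assign_owners(survivors, num_segments=8, num_owners=2)
print(segments_without_live_owners(rebalanced, {"perf18", "perf19"}))  # []
```

With two owners, the segments owned only by perf20 and perf21 have no live owner once both are gone, which corresponds to the situation reported in ISPN000208. The third option needs no model: when the whole group is being shut down, the last owners of every segment leave eventually and the message is expected.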

Comment 4 Scott Mumford 2014-04-23 02:04:08 UTC

Marking for exclusion from Release Notes documentation as not a bug.
