1190001 – Avoid invalid topology

Bug 1190001 - Avoid invalid topology

Summary: Avoid invalid topology

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	JBoss Data Grid 6
Classification:	JBoss
Component:	Server
Sub Component:
Version:	6.3.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	high
Target Milestone:	CR1
Target Release:	6.4.1
Assignee:	Galder Zamarreño
QA Contact:	Martin Gencur
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-02-06 03:31 UTC by Takayoshi Kimura
Modified:	2020-06-11 12:39 UTC (History)
CC List:	13 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2015-04-02 12:14:09 UTC
Type:	Bug
Embargoed:

Attachments
Reproducer (917 bytes, text/plain) 2015-03-23 07:44 UTC, Matej Čimbora

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	ISPN-5208	0	Major	Resolved	Avoid invalid topology	2018-05-23 16:52:34 UTC

Description Takayoshi Kimura 2015-02-06 03:31:18 UTC

We've seen an invalid topology propagated to the client, which causes an ArrayIndexOutOfBoundsException:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
  	at org.infinispan.client.hotrod.impl.transport.tcp.RoundRobinBalancingStrategy.getServerByIndex(RoundRobinBalancingStrategy.java:68) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
  	at org.infinispan.client.hotrod.impl.transport.tcp.RoundRobinBalancingStrategy.nextServer(RoundRobinBalancingStrategy.java:44) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
  	at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory.nextServer(TcpTransportFactory.java:220) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
  	at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory.getTransport(TcpTransportFactory.java:194) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
  	at org.infinispan.client.hotrod.impl.operations.FaultTolerantPingOperation.getTransport(FaultTolerantPingOperation.java:27) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
  	at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:48) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
  	at org.infinispan.client.hotrod.impl.RemoteCacheImpl.ping(RemoteCacheImpl.java:535) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
  	at org.infinispan.client.hotrod.RemoteCacheManager.ping(RemoteCacheManager.java:635) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
  	at org.infinispan.client.hotrod.RemoteCacheManager.createRemoteCache(RemoteCacheManager.java:616) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
  	at org.infinispan.client.hotrod.RemoteCacheManager.getCache(RemoteCacheManager.java:527) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]
    at org.infinispan.client.hotrod.RemoteCacheManager.getCache(RemoteCacheManager.java:523) [infinispan-client-hotrod-6.1.0.Final-redhat-4.jar:6.1.0.Final-redhat-4]

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
    at org.infinispan.client.hotrod.impl.consistenthash.SegmentConsistentHash.getServer(SegmentConsistentHash.java:33)
    at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransportFactory.getTransport(TcpTransportFactory.java:204)
    at org.infinispan.client.hotrod.impl.operations.AbstractKeyOperation.getTransport(AbstractKeyOperation.java:40)
    at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:48)
    at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:237)
    at org.infinispan.client.hotrod.impl.RemoteCacheSupport.put(RemoteCacheSupport.java:79)
    at sample.Main.main(Main.java:16)

It happens on both Hot Rod 2 and 1.3 clients.

This state is very hard to reproduce, and we have no consistent way to trigger it. However, a view change is always in progress when it happens, so it appears to be related to view changes.

Judging from the stack trace, the client receives a topology with numOwners=0 or numSegments=0 from the server.
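
To illustrate why an empty topology would surface as "ArrayIndexOutOfBoundsException: 0", here is a minimal sketch of a round-robin balancer. It is a simplified, hypothetical stand-in (the class RoundRobinSketch and its methods are invented for illustration) and is not the actual RoundRobinBalancingStrategy code.

```java
// Hypothetical, simplified stand-in for a round-robin balancer.
// This is NOT the Hot Rod client's implementation.
final class RoundRobinSketch {
    private String[] servers = new String[0];
    private int index = 0;

    // Replaces the server list, as a topology update from the server would.
    void setServers(String[] newServers) {
        servers = newServers.clone();
        index = 0;
    }

    // Returns the next server in rotation.
    // With an empty list, servers[0] throws ArrayIndexOutOfBoundsException.
    String nextServer() {
        String server = servers[index];
        index = (index + 1) % servers.length;
        return server;
    }
}
```

Once an empty server list has been installed, every later call fails in the same way, which would be consistent with the client not recovering on its own.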

We have also been unable to find a way to recover from this situation. Rebooting random nodes doesn't help; the client keeps getting these exceptions.

Until we find the root cause, I think it's better to add a guard that prevents this kind of invalid topology from being stored on the server side and propagated to the clients.
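
A guard of the kind proposed above could look roughly like the following. This is only a sketch under assumptions: the class TopologyGuard, its methods, and the String[][] layout (one array of owner addresses per segment) are hypothetical and do not reflect the real Infinispan data structures or the eventual fix.

```java
// Hypothetical guard; names and data layout are illustrative only.
final class TopologyGuard {
    private TopologyGuard() {}

    // segmentOwners[i] lists the owners of segment i (assumed layout).
    // A topology is usable only if it has at least one segment and
    // every segment has at least one owner.
    static boolean isValid(String[][] segmentOwners) {
        if (segmentOwners == null || segmentOwners.length == 0) {
            return false; // corresponds to numSegments == 0
        }
        for (String[] owners : segmentOwners) {
            if (owners == null || owners.length == 0) {
                return false; // a segment with no owner
            }
        }
        return true;
    }

    // Keeps the current topology when the candidate is invalid,
    // so an invalid topology is never stored or propagated.
    static String[][] accept(String[][] current, String[][] candidate) {
        return isValid(candidate) ? candidate : current;
    }
}
```

Rejecting the update and keeping the last known good topology is one possible policy; whether stale routing information is preferable to failing fast is a design choice this sketch does not settle.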

Comment 6 JBoss JIRA Server 2015-03-13 13:01:55 UTC

Galder Zamarreño <galder.zamarreno> updated the status of jira ISPN-5208 to Coding In Progress

Comment 8 Matej Čimbora 2015-03-23 07:42:43 UTC

The problem persists: I'm still getting the exception described in the linked JIRA.

java.lang.ArrayIndexOutOfBoundsException: 0
	at org.infinispan.client.hotrod.impl.consistenthash.SegmentConsistentHash.getServer(SegmentConsistentHash.java:33)
...

I managed to create a reproducer for this (attached).

1. In clustered.xml, set the numOwners attribute of the 'default' cache to 1.
2. Start 2 servers with the clustered.xml configuration.
3. Start the attached reproducer (client).
4. Kill one of the servers.
5. Start it again. The exception should appear in the client log.

This probably relates to losing segments for a subset of keys.

Comment 9 Matej Čimbora 2015-03-23 07:44:44 UTC

Created attachment 1005229
Reproducer

Comment 10 Sebastian Łaskawiec 2015-03-23 15:07:51 UTC

PR: https://github.com/infinispan/jdg/pull/575
