Bug 745873 (EDG-94) - HotRod should react to server shutdown event - results in TransportException: Problems writing data to stream
Keywords:
Status: CLOSED NEXTRELEASE
Alias: EDG-94
Product: JBoss Data Grid 5
Classification: JBoss
Component: Infinispan
Version: EAP 5.1.0 EDG TP
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: EAP 5.1.0 EDG TP
Assignee: Default User
QA Contact:
URL: http://jira.jboss.org/jira/browse/EDG-94
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-03-25 16:06 UTC by Radoslav Husar
Modified: 2011-10-11 17:06 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-04-04 08:52:15 UTC
Type: Bug




Links
System ID: Red Hat Issue Tracker EDG-94
Private: 0
Priority: Blocker
Status: Closed
Summary: HotRod should react to server shutdown event - results in TransportException: Problems writing data to stream
Last Updated: 2013-05-16 15:13:21 UTC

Description Radoslav Husar 2011-03-25 16:06:44 UTC
project_key: EDG

I have 4 servers; 3 of them are gracefully shut down and the HotRod client doesn't adapt to the topology update.

{code}
2011/03/25 11:11:09:533 EDT [WARN ][Runner - 30] HOST perf01.mw.lab.eng.bos.redhat.com:rootProcess:E - Error sampling data:  <org.infinispan.client.hotrod.exceptions.TransportException:: Problems writing data to stream>
        org.infinispan.client.hotrod.exceptions.TransportException:: Problems writing data to stream
	at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransport.writeByte(TcpTransport.java:101)
	at org.infinispan.client.hotrod.impl.operations.HotRodOperation.writeHeader(HotRodOperation.java:52)
	at org.infinispan.client.hotrod.impl.operations.AbstractKeyValueOperation.sendPutOperation(AbstractKeyValueOperation.java:36)
	at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:27)
	at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:38)
	at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:178)
	at org.infinispan.CacheSupport.put(CacheSupport.java:28)
	at org.jboss.smartfrog.edg.adapter.HotRodAdapter$HotRodRemoteCacheAdapter.put(HotRodAdapter.java:257)
	at org.jboss.smartfrog.edg.loaddriver.CacheRequestProcessorFactory$InfinispanRequestProcessor.processRequest(CacheRequestProcessorFactory.java:163)
	at org.jboss.smartfrog.loaddriver.CompoundRequestProcessorFactoryImpl$CompoundRequestProcessor.processRequest(CompoundRequestProcessorFactoryImpl.java:51)
	at org.jboss.smartfrog.loaddriver.Runner.run(Runner.java:87)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Broken pipe
	at sun.nio.ch.FileDispatcher.write0(Native Method)
	at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:72)
	at sun.nio.ch.IOUtil.write(IOUtil.java:43)
	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
	at java.nio.channels.Channels.writeFullyImpl(Channels.java:59)
	at java.nio.channels.Channels.writeFully(Channels.java:81)
	at java.nio.channels.Channels.access$000(Channels.java:47)
	at java.nio.channels.Channels$1.write(Channels.java:155)
	at java.io.OutputStream.write(OutputStream.java:58)
	at java.nio.channels.Channels$1.write(Channels.java:136)
	at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransport.writeByte(TcpTransport.java:96)
	... 11 more
{code}
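For reference, failover to the remaining servers only works if the client knows about them: the HotRod client is given an initial server list (typically via `hotrod-client.properties`) and then learns the live topology from the servers' topology-update responses. A sketch of such a file, using the lab host names from this run (the file itself is illustrative, not part of this report):

```properties
# hotrod-client.properties (sketch): initial servers the client
# contacts on startup; the actual cluster topology is then learned
# from the servers' topology-update responses.
infinispan.client.hotrod.server_list=perf17:11222;perf18:11222;perf19:11222;perf20:11222
```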

Comment 1 Radoslav Husar 2011-03-25 16:07:06 UTC
Link: Added: This issue is incorporated by JBPAPP-6054


Comment 2 Galder Zamarreño 2011-03-28 10:01:21 UTC
If you got that exception, it means that no other servers are left on the client side to try. TRACE logging on the client is needed here again, to figure out how many nodes the client was aware of.
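The requested TRACE output can be enabled for just the HotRod client package via a Log4j category, roughly like this (a sketch; the appender name and pattern are illustrative):

```properties
# log4j.properties (sketch): TRACE for the HotRod client only, so the
# rest of the application keeps logging at INFO.
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %-5p [%c] (%t) %m%n
log4j.logger.org.infinispan.client.hotrod=TRACE
```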

Comment 3 Radoslav Husar 2011-03-29 12:08:10 UTC
Hi Galder,

TRACE logging is in place; let's start off here, and this might fix other issues as well.

http://hudson.qa.jboss.com/hudson/view/EDG/job/edg-51x-elasticity-client-size4-hotrod/5/

Since the logging is very verbose, it's a few gigs of data, so it can get hard to go through. perf2 to perf8 each have one client logging; the clients can't update their topology when the cluster scales down/in, as seen in the stats:
http://hudson.qa.jboss.com/hudson/view/EDG/job/edg-51x-elasticity-client-size4-hotrod/5/artifact/report/log.txt

Logs: http://reports.qa.atl.jboss.com/~rhusar/ispn-logs/

Comment 4 Galder Zamarreño 2011-03-29 12:52:07 UTC
Ok, I just had a look at some client logs from that run. The client on perf09 shows:

{code}2011-03-29 06:56:47,509 12    TRACE [org.infinispan.client.hotrod.impl.transport.tcp.RoundRobinBalancingStrategy] (RMI TCP Connection(14)-10.16.88.179:) New server list is: [perf18/10.16.90.54:11222, perf19/10.16.90.56:11222, perf17/10.16.90.52:11222, perf20/10.16.90.58:11222]. Resetting index to 0
...
2011-03-29 06:56:48,098 601   INFO  [org.infinispan.client.hotrod.impl.operations.HotRodOperation] (Runner - 0:) New topology: {perf18/10.16.90.54:11222=3439, perf20/10.16.90.58:11222=7122}{code}

Shortly after starting, this client receives from perf18/10.16.90.54 a new topology formed of perf20/10.16.90.58:11222 and perf18/10.16.90.54, so when only perf17 (.52) is left up, no wonder the client starts behaving like this.

The client receives this as part of an initial ping on construction to get all nodes in the cluster, but was the cluster fully formed at the time? Maybe it was not yet fully formed; let's see what the perf18 server log says...
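The balancer behavior visible in the TRACE line above ("New server list is: [...]. Resetting index to 0") can be sketched roughly like this; the class and method names are illustrative, not Infinispan's actual code:

```java
import java.util.List;

// Minimal sketch of a round-robin balancing strategy, as hinted by
// the TRACE output above. Names are illustrative only.
public class RoundRobinSketch {
    private List<String> servers;
    private int index;

    // On a topology update the client installs the new server list
    // and resets the rotation index to 0.
    public void setServers(List<String> newServers) {
        this.servers = newServers;
        this.index = 0;
    }

    // Each request rotates through the currently known servers; if a
    // topology update never arrives (this bug), dead servers stay in
    // the rotation and writes fail with "Broken pipe".
    public String nextServer() {
        String s = servers.get(index);
        index = (index + 1) % servers.size();
        return s;
    }

    public static void main(String[] args) {
        RoundRobinSketch b = new RoundRobinSketch();
        b.setServers(List.of("perf18:11222", "perf20:11222"));
        System.out.println(b.nextServer()); // perf18:11222
        System.out.println(b.nextServer()); // perf20:11222
        b.setServers(List.of("perf17:11222"));
        System.out.println(b.nextServer()); // perf17:11222
    }
}
```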

Comment 5 Galder Zamarreño 2011-03-29 12:59:05 UTC
Indeed, that's your problem; this is what perf18 shows at the beginning:

{code}[JBoss] 06:54:03,263 INFO  [JGroupsTransport] Received new cluster view: [perf18-33748|1] [perf18-33748, perf20-14766]{code}

I didn't download the entire file, but did the cluster eventually merge by any chance? Rado, you guys might want to make sure the clusters are formed before running your tests.
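One way to follow that advice is to check the server log for a JGroups view containing all expected members before the test starts. A rough shell sketch, assuming a `server.log` path and using the view line quoted later in Comment 7 as sample input (both are assumptions for illustration):

```shell
#!/bin/sh
# Sketch: verify the JGroups view in a server log has reached the
# expected member count before starting load tests.
LOG=server.log
EXPECTED=4

# Sample input: the cluster-view line from Comment 7.
echo '[JBoss] 06:54:38,317 INFO  [JGroupsTransport] Received new cluster view: [perf18-33748|3] [perf18-33748, perf20-14766, perf19-36123, perf17-22498]' > "$LOG"

# Count members in the most recent view line (comma-separated names
# inside the trailing brackets).
members=$(grep 'Received new cluster view' "$LOG" | tail -n 1 \
  | sed 's/.*\] \[//; s/\]$//' | tr ',' '\n' | wc -l)
echo "members=$members"

if [ "$members" -ge "$EXPECTED" ]; then
  echo "cluster formed"
else
  echo "still waiting"
fi
```

In a real run the check would be wrapped in a polling loop with a timeout instead of reading a canned line.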

Comment 6 Galder Zamarreño 2011-03-29 13:01:08 UTC
Also, are the clocks of the different perf machines synchronized?

Comment 7 Radoslav Husar 2011-03-29 13:12:15 UTC
The server log says the cluster is formed:
{code}[JBoss] 06:54:38,317 INFO  [JGroupsTransport] Received new cluster view: [perf18-33748|3] [perf18-33748, perf20-14766, perf19-36123, perf17-22498]{code}

But! The HotRod server didn't deploy, so JBPAPP-6132 is the root cause for this issue then :-(
{code}[JBoss] DEPLOYMENTS IN ERROR:
[JBoss]   Deployment "HotRodServer" is in error due to the following reason(s): org.infinispan.loaders.CacheLoaderException: Response length is always 0 or 1, received: {code}

Comment 8 Galder Zamarreño 2011-04-04 08:52:15 UTC
Now that ISPN-1010 (the underlying issue of JBPAPP-6132) is fixed, this should be fixed.

