Bug 745873 (EDG-94) - HotRod should react to server shutdown event - results in TransportException: Problems writing data to stream
Keywords:
Status: CLOSED NEXTRELEASE
Alias: EDG-94
Product: JBoss Data Grid 5
Classification: JBoss
Component: Infinispan
Version: EAP 5.1.0 EDG TP
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: EAP 5.1.0 EDG TP
Assignee: Default User
QA Contact:
URL: http://jira.jboss.org/jira/browse/EDG-94
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-03-25 16:06 UTC by Radoslav Husar
Modified: 2011-10-11 17:06 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-04-04 08:52:15 UTC
Type: Bug




Links
System ID: Red Hat Issue Tracker EDG-94
Private: 0
Priority: Blocker
Status: Closed
Summary: HotRod should react to server shutdown event - results in TransportException: Problems writing data to stream
Last Updated: 2013-05-16 15:13:21 UTC

Description Radoslav Husar 2011-03-25 16:06:44 UTC
project_key: EDG

I have 4 servers; 3 of them are gracefully shut down and the HotRod client doesn't adapt to the topology update.

{code}
2011/03/25 11:11:09:533 EDT [WARN ][Runner - 30] HOST perf01.mw.lab.eng.bos.redhat.com:rootProcess:E - Error sampling data:  <org.infinispan.client.hotrod.exceptions.TransportException:: Problems writing data to stream>
        org.infinispan.client.hotrod.exceptions.TransportException:: Problems writing data to stream
	at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransport.writeByte(TcpTransport.java:101)
	at org.infinispan.client.hotrod.impl.operations.HotRodOperation.writeHeader(HotRodOperation.java:52)
	at org.infinispan.client.hotrod.impl.operations.AbstractKeyValueOperation.sendPutOperation(AbstractKeyValueOperation.java:36)
	at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:27)
	at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:38)
	at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:178)
	at org.infinispan.CacheSupport.put(CacheSupport.java:28)
	at org.jboss.smartfrog.edg.adapter.HotRodAdapter$HotRodRemoteCacheAdapter.put(HotRodAdapter.java:257)
	at org.jboss.smartfrog.edg.loaddriver.CacheRequestProcessorFactory$InfinispanRequestProcessor.processRequest(CacheRequestProcessorFactory.java:163)
	at org.jboss.smartfrog.loaddriver.CompoundRequestProcessorFactoryImpl$CompoundRequestProcessor.processRequest(CompoundRequestProcessorFactoryImpl.java:51)
	at org.jboss.smartfrog.loaddriver.Runner.run(Runner.java:87)
	at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Broken pipe
	at sun.nio.ch.FileDispatcher.write0(Native Method)
	at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
	at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:72)
	at sun.nio.ch.IOUtil.write(IOUtil.java:43)
	at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
	at java.nio.channels.Channels.writeFullyImpl(Channels.java:59)
	at java.nio.channels.Channels.writeFully(Channels.java:81)
	at java.nio.channels.Channels.access$000(Channels.java:47)
	at java.nio.channels.Channels$1.write(Channels.java:155)
	at java.io.OutputStream.write(OutputStream.java:58)
	at java.nio.channels.Channels$1.write(Channels.java:136)
	at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransport.writeByte(TcpTransport.java:96)
	... 11 more
{code}
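For reference, failover to the remaining servers only works if the client knows about them: the HotRod client is given an initial server list (typically via `hotrod-client.properties`) and then learns the live topology from the servers' topology-update responses. A sketch of such a file, using the lab host names from this run (the file itself is illustrative, not part of this report):

```properties
# hotrod-client.properties (sketch): initial servers the client
# contacts on startup; the actual cluster topology is then learned
# from the servers' topology-update responses.
infinispan.client.hotrod.server_list=perf17:11222;perf18:11222;perf19:11222;perf20:11222
```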

Comment 1 Radoslav Husar 2011-03-25 16:07:06 UTC
Link: Added: This issue is incorporated by JBPAPP-6054


Comment 2 Galder Zamarreño 2011-03-28 10:01:21 UTC
If you got that exception, it means that no other servers are left on the client side to try. TRACE logging on the client is needed here again, to figure out how many nodes the client was aware of.
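The requested TRACE output can be enabled for just the HotRod client package via a Log4j category, roughly like this (a sketch; the appender name and pattern are illustrative):

```properties
# log4j.properties (sketch): TRACE for the HotRod client only, so the
# rest of the application keeps logging at INFO.
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %-5p [%c] (%t) %m%n
log4j.logger.org.infinispan.client.hotrod=TRACE
```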

Comment 3 Radoslav Husar 2011-03-29 12:08:10 UTC
Hi Galder,

TRACE logging is in place; let's start off here, and this might fix other issues as well.

http://hudson.qa.jboss.com/hudson/view/EDG/job/edg-51x-elasticity-client-size4-hotrod/5/

Since the logging is very verbose, it's a few gigs of data, so it can get hard to go through. perf2 to perf8 each have one client logging; the clients can't update their topology when the cluster scales down/in, as seen in the stats:
http://hudson.qa.jboss.com/hudson/view/EDG/job/edg-51x-elasticity-client-size4-hotrod/5/artifact/report/log.txt

Logs: http://reports.qa.atl.jboss.com/~rhusar/ispn-logs/

Comment 4 Galder Zamarreño 2011-03-29 12:52:07 UTC
Ok, I just had a look at some client logs from that run. The client on perf09 shows:

{code}2011-03-29 06:56:47,509 12    TRACE [org.infinispan.client.hotrod.impl.transport.tcp.RoundRobinBalancingStrategy] (RMI TCP Connection(14)-10.16.88.179:) New server list is: [perf18/10.16.90.54:11222, perf19/10.16.90.56:11222, perf17/10.16.90.52:11222, perf20/10.16.90.58:11222]. Resetting index to 0
...
2011-03-29 06:56:48,098 601   INFO  [org.infinispan.client.hotrod.impl.operations.HotRodOperation] (Runner - 0:) New topology: {perf18/10.16.90.54:11222=3439, perf20/10.16.90.58:11222=7122}{code}

Shortly after starting, this client receives from perf18/10.16.90.54 a new topology formed of perf20/10.16.90.58:11222 and perf18/10.16.90.54, so when only perf17 (.52) is left up, no wonder the client starts behaving like this.

The client receives this as part of an initial ping on construction to get all nodes in the cluster, but was the cluster fully formed at the time? Maybe it was not yet fully formed; let's see what the perf18 server log says...
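The balancer behavior visible in the TRACE line above ("New server list is: [...]. Resetting index to 0") can be sketched roughly like this; the class and method names are illustrative, not Infinispan's actual code:

```java
import java.util.List;

// Minimal sketch of a round-robin balancing strategy, as hinted by
// the TRACE output above. Names are illustrative only.
public class RoundRobinSketch {
    private List<String> servers;
    private int index;

    // On a topology update the client installs the new server list
    // and resets the rotation index to 0.
    public void setServers(List<String> newServers) {
        this.servers = newServers;
        this.index = 0;
    }

    // Each request rotates through the currently known servers; if a
    // topology update never arrives (this bug), dead servers stay in
    // the rotation and writes fail with "Broken pipe".
    public String nextServer() {
        String s = servers.get(index);
        index = (index + 1) % servers.size();
        return s;
    }

    public static void main(String[] args) {
        RoundRobinSketch b = new RoundRobinSketch();
        b.setServers(List.of("perf18:11222", "perf20:11222"));
        System.out.println(b.nextServer()); // perf18:11222
        System.out.println(b.nextServer()); // perf20:11222
        b.setServers(List.of("perf17:11222"));
        System.out.println(b.nextServer()); // perf17:11222
    }
}
```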

Comment 5 Galder Zamarreño 2011-03-29 12:59:05 UTC
Indeed, that's your problem; this is what perf18 shows at the beginning:

{code}[JBoss] 06:54:03,263 INFO  [JGroupsTransport] Received new cluster view: [perf18-33748|1] [perf18-33748, perf20-14766]{code}

I didn't download the entire file, but did the cluster eventually merge by any chance? Rado, you guys might want to make sure the clusters are formed before running your tests.
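One way to follow that advice is to check the server log for a JGroups view containing all expected members before the test starts. A rough shell sketch, assuming a `server.log` path and using the view line quoted later in Comment 7 as sample input (both are assumptions for illustration):

```shell
#!/bin/sh
# Sketch: verify the JGroups view in a server log has reached the
# expected member count before starting load tests.
LOG=server.log
EXPECTED=4

# Sample input: the cluster-view line from Comment 7.
echo '[JBoss] 06:54:38,317 INFO  [JGroupsTransport] Received new cluster view: [perf18-33748|3] [perf18-33748, perf20-14766, perf19-36123, perf17-22498]' > "$LOG"

# Count members in the most recent view line (comma-separated names
# inside the trailing brackets).
members=$(grep 'Received new cluster view' "$LOG" | tail -n 1 \
  | sed 's/.*\] \[//; s/\]$//' | tr ',' '\n' | wc -l)
echo "members=$members"

if [ "$members" -ge "$EXPECTED" ]; then
  echo "cluster formed"
else
  echo "still waiting"
fi
```

In a real run the check would be wrapped in a polling loop with a timeout instead of reading a canned line.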

Comment 6 Galder Zamarreño 2011-03-29 13:01:08 UTC
Also, are the clocks of the different perf machines synchronized?

Comment 7 Radoslav Husar 2011-03-29 13:12:15 UTC
The server log says the cluster is formed:
{code}[JBoss] 06:54:38,317 INFO  [JGroupsTransport] Received new cluster view: [perf18-33748|3] [perf18-33748, perf20-14766, perf19-36123, perf17-22498]{code}

But! The HotRod server didn't deploy, so JBPAPP-6132 is the root cause for this issue then :-(
{code}[JBoss] DEPLOYMENTS IN ERROR:
[JBoss]   Deployment "HotRodServer" is in error due to the following reason(s): org.infinispan.loaders.CacheLoaderException: Response length is always 0 or 1, received: {code}

Comment 8 Galder Zamarreño 2011-04-04 08:52:15 UTC
Now that ISPN-1010 (the underlying issue of JBPAPP-6132) is fixed, this should be fixed.

