Bug 1298119

Summary: ClientEventsOOMTest.testOOM fails on RHEL 7
Product: [JBoss] JBoss Data Grid 6
Reporter: Roman Macor <rmacor>
Component: Server
Assignee: Galder Zamarreño <galder.zamarreno>
Status: NEW
QA Contact: Martin Gencur <mgencur>
Severity: unspecified
Priority: unspecified
Version: 6.6.0
CC: jdg-bugs, vjuranek
Target Milestone: ---
Target Release: 7.0.0
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug

Description Roman Macor 2016-01-13 10:04:27 UTC
Description of problem:

org.infinispan.client.hotrod.event.ClientEventsOOMTest.testOOM fails with:

Error Message

java.net.SocketTimeoutException

Stacktrace

org.infinispan.client.hotrod.exceptions.TransportException:: java.net.SocketTimeoutException
	at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransport.readByte(TcpTransport.java:184)
	at org.infinispan.client.hotrod.impl.protocol.Codec20.readMagic(Codec20.java:305)
	at org.infinispan.client.hotrod.impl.protocol.Codec20.readHeaderOrEvent(Codec20.java:205)
	at org.infinispan.client.hotrod.impl.operations.AddClientListenerOperation.executeOperation(AddClientListenerOperation.java:92)
	at org.infinispan.client.hotrod.impl.operations.AddClientListenerOperation.executeOperation(AddClientListenerOperation.java:25)
	at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:56)
	at org.infinispan.client.hotrod.impl.RemoteCacheImpl.addClientListener(RemoteCacheImpl.java:516)
	at org.infinispan.client.hotrod.event.ClientEventsOOMTest.testOOM(ClientEventsOOMTest.java:86)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:80)
	at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
	at org.testng.TestRunner.privateRun(TestRunner.java:767)
	at org.testng.TestRunner.run(TestRunner.java:617)
	at org.testng.SuiteRunner.runTest(SuiteRunner.java:334)
	at org.testng.SuiteRunner.access$000(SuiteRunner.java:37)
	at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:368)
	at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException
	at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211)
	at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
	at org.infinispan.client.hotrod.impl.transport.tcp.TcpTransport.readByte(TcpTransport.java:179)
	... 27 more
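
For readers unfamiliar with this failure mode: the root cause in the trace is a blocking read that received no bytes within the socket's read timeout. The sketch below reproduces that condition with plain JDK sockets only. It does not use the Hot Rod client, and the class name, method name, and timeout values are hypothetical, chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutSketch {

    /**
     * Connects to a local server socket that never writes a byte, then
     * performs a blocking read with SO_TIMEOUT set. Returns true if the
     * read ended in a SocketTimeoutException.
     */
    static boolean readTimesOut(int soTimeoutMillis) throws IOException {
        // Port 0 asks the OS for any free port. The connection is completed
        // by the kernel's listen backlog, so accept() is never called.
        try (ServerSocket silentServer = new ServerSocket(0);
             Socket client = new Socket()) {
            client.connect(
                new InetSocketAddress("127.0.0.1", silentServer.getLocalPort()), 1000);
            client.setSoTimeout(soTimeoutMillis); // read timeout, not connect timeout
            InputStream in = client.getInputStream();
            try {
                in.read(); // blocks: the server side stays silent
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

In the trace above the equivalent read happens in TcpTransport.readByte while the client waits for the server's response to the add-listener request, so the exception indicates that no response arrived in time rather than that the connection was refused.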

Comment 3 Vojtech Juranek 2016-01-14 21:06:37 UTC
Here is a quick summary of what I did:
I took one RHEL 7 machine from our Jenkins lab and also provisioned one fresh RHEL 7 machine from Beaker.

* On the Beaker machine the test never failed (10-15 runs with JDK 8; it didn't fail with JDK 7 either).

* On the Jenkins machine it failed in about 80% of runs (again 10-15 runs, all with JDK 8), but it never failed when run with trace logging enabled.

* I suspected an environment issue, so I set the same ulimits on both machines and also synced some kernel parameters (the JDK versions and kernels were already the same), but the result was the same.

* I took thread dumps on both machines while the listeners were being registered, but didn't spot anything suspicious (as far as I can tell, the stack traces from both machines were very similar).
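
The comment does not say how the thread dumps were taken; an external tool such as jstack is the usual choice. As an alternative, a dump can be captured from inside the test JVM at the moment of interest, which makes dumps from two machines easier to line up. The following is a minimal sketch using only the JDK; the class and method names are hypothetical.

```java
import java.util.Map;

public class ThreadDumpSketch {

    /** Formats the name, state, and stack of every live thread in this JVM. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append('"')
              .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("\tat ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Calling dumpAllThreads() from a helper thread shortly after the listener registration starts, and writing the result to a file on each machine, would give two text files that can be compared with a plain diff.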

Comment 4 Martin Gencur 2016-01-15 09:29:28 UTC
Nice analysis, Vojta! It looks like the issue is both random and dependent on the environment setup.

Comment 5 JBoss JIRA Server 2016-03-07 10:17:39 UTC
Sebastian Łaskawiec <slaskawi> updated the status of jira JDG-15 to Coding In Progress

Comment 6 JBoss JIRA Server 2016-03-11 09:55:14 UTC
Vaclav Dedik <vdedik> updated the status of jira JDG-15 to Resolved