Bug 913507 - OutOfMemory in SpecJMS2007 satellite driver
Status: CLOSED CURRENTRELEASE
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: HornetQ
Version: 6.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ER6
Target Release: EAP 6.1.0
Assigned To: Clebert Suconic
Keywords: TestBlocker
Depends On:
Blocks:
Reported: 2013-02-21 06:50 EST by Miroslav Novak
Modified: 2014-05-26 21:28 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
When NIO was enabled on a connector and a very high number of threads was active, an *OutOfMemory* error could occur. This occurred because NioWorker was not shut down correctly, so the threads and the memory they were using were not released. The underlying problem with thread handling has been corrected, reducing the risk of an *OutOfMemory* error. It is recommended that the parameter "use-nio-global-worker-pool" be enabled to minimize the risk of an *OutOfMemory* error. An example connector configuration is as follows:

[source,XML]
----
<netty-connector name="netty" socket-binding="messaging">
   <param key="use-nio" value="true"/>
   <param key="use-nio-global-worker-pool" value="true"/>
</netty-connector>
----
Type: Bug

Attachments
got threadump, heapdump and histogram. (39.82 MB, application/zip)
2013-02-21 15:12 EST, Miroslav Novak
Description Miroslav Novak 2013-02-21 06:50:18 EST
Description of problem:
When running SpecJMS2007 against EAP 6.1.0.DR4, the satellite driver throws an OutOfMemoryError (OOM). It happens when the load is set higher than 150 in the messaging lab. This is a regression against EAP 6.0.1.

Issue is under investigation.

OOM stacktrace:
[satellite(bash)] OUT >      [java] java.lang.OutOfMemoryError: unable to create new native thread
[satellite(bash)] OUT >      [java] 	at java.lang.Thread.start0(Native Method)
[satellite(bash)] OUT >      [java] 	at java.lang.Thread.start(Thread.java:640)
[satellite(bash)] OUT >      [java] 	at java.util.concurrent.ThreadPoolExecutor.addIfUnderMaximumPoolSize(ThreadPoolExecutor.java:727)
[satellite(bash)] OUT >      [java] 	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:657)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.util.VirtualExecutorService.execute(VirtualExecutorService.java:155)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.util.internal.DeadLockProofWorker.start(DeadLockProofWorker.java:38)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:343)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.<init>(AbstractNioSelector.java:95)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.<init>(AbstractNioWorker.java:51)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.channel.socket.nio.NioWorker.<init>(NioWorker.java:45)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:45)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:28)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.newWorker(AbstractNioWorkerPool.java:99)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.init(AbstractNioWorkerPool.java:69)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:39)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:33)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.<init>(NioClientSocketChannelFactory.java:152)
[satellite(bash)] OUT >      [java] 	at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.<init>(NioClientSocketChannelFactory.java:134)
[satellite(bash)] OUT >      [java] 	at org.hornetq.core.remoting.impl.netty.NettyConnector.start(NettyConnector.java:309)
[satellite(bash)] OUT >      [java] 	at org.hornetq.core.client.impl.ClientSessionFactoryImpl.getConnection(ClientSessionFactoryImpl.java:1218)
[satellite(bash)] OUT >      [java] 	at org.hornetq.core.client.impl.ClientSessionFactoryImpl.getConnectionWithRetry(ClientSessionFactoryImpl.java:1071)
[satellite(bash)] OUT >      [java] 	at org.hornetq.core.client.impl.ClientSessionFactoryImpl.connect(ClientSessionFactoryImpl.java:246)
[satellite(bash)] OUT >      [java] 	at org.hornetq.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:826)
[satellite(bash)] OUT >      [java] 	at org.hornetq.jms.client.HornetQConnectionFactory.createConnectionInternal(HornetQConnectionFactory.java:583)
[satellite(bash)] OUT >      [java] 	at org.hornetq.jms.client.HornetQConnectionFactory.createConnection(HornetQConnectionFactory.java:107)
[satellite(bash)] OUT >      [java] 	at org.hornetq.jms.client.HornetQConnectionFactory.createConnection(HornetQConnectionFactory.java:102)
[satellite(bash)] OUT >      [java] 	at org.spec.perfharness.jms.providers.AbstractJMSProvider.getConnection_internal(AbstractJMSProvider.java:191)
[satellite(bash)] OUT >      [java] 	at org.spec.perfharness.jms.providers.AbstractJMSProvider.getConnection(AbstractJMSProvider.java:297)
[satellite(bash)] OUT >      [java] 	at org.spec.jms.agents.connectionpool.JMSConnectionFactory.makeObject(JMSConnectionFactory.java:68)
[satellite(bash)] OUT >      [java] 	at org.spec.jms.agents.connectionpool.SharedKeyedObjectPool.borrowObject(SharedKeyedObjectPool.java:204)
[satellite(bash)] OUT >      [java] 	at org.spec.jms.agents.SPECWorkerThread.getConnection(SPECWorkerThread.java:341)
[satellite(bash)] OUT >      [java] 	at org.spec.jms.eventhandler.sm.SM_ShipArrEH.buildJMSResources(SM_ShipArrEH.java:56)
[satellite(bash)] OUT >      [java] 	at org.spec.jms.agents.SPECWorkerThread.run(SPECWorkerThread.java:731)
[satellite(bash)] OUT >      [java] 	at org.spec.jms.eventhandler.sm.SM_ShipArrEH.run(SM_ShipArrEH.java:74)
Comment 1 Miroslav Novak 2013-02-21 15:12:04 EST
Created attachment 700810
got threadump, heapdump and histogram.
Comment 2 Miroslav Novak 2013-02-21 15:15:05 EST
Threadump, heapdump and histogram attached.
Comment 3 Miroslav Novak 2013-03-01 10:15:19 EST
The same result with JVM_OPTIONS="-Xms15G -Xmx15G -Xss1048K"
Comment 4 Miroslav Novak 2013-03-04 04:33:27 EST
This seems to be a thread leak. The original OutOfMemoryError was caused by the limit on the maximum number of user processes (ulimit -u):
max user processes              (-u) 8192

This value was sufficient for EAP 6.0.1 with a three times higher load.

There are thousands of threads like:

Stack trace:
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
   - locked sun.nio.ch.Util$2@5644c8ca
   - locked java.util.Collections$UnmodifiableSet@3dda7205
   - locked sun.nio.ch.EPollSelectorImpl@5073c5fc
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
org.jboss.netty.channel.socket.nio.SelectorUtil.select(SelectorUtil.java:64)
org.jboss.netty.channel.socket.nio.AbstractNioSelector.select(AbstractNioSelector.java:409)
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:206)
org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
org.jboss.netty.util.VirtualExecutorService$ChildExecutorRunnable.run(VirtualExecutorService.java:175)
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
java.lang.Thread.run(Thread.java:662)
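
To track how many such threads exist over time without taking a full thread dump, something like the following could be called from inside the driver JVM. This is only a minimal sketch: the class and method names (`ThreadStackCounter`, `countThreadsWithFrame`) are hypothetical, it has to run in the same JVM as the JMS client, and the package prefix is simply taken from the stack trace above.

```java
import java.util.Map;

/** Hypothetical helper: counts live threads whose stack contains a frame from a given package or class. */
final class ThreadStackCounter {

    /** Returns the number of live threads with at least one frame whose class name starts with the prefix. */
    static int countThreadsWithFrame(String classNamePrefix) {
        int count = 0;
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        for (StackTraceElement[] stack : traces.values()) {
            if (hasFrame(stack, classNamePrefix)) {
                count++; // each thread is counted once, however many frames match
            }
        }
        return count;
    }

    private static boolean hasFrame(StackTraceElement[] stack, String classNamePrefix) {
        for (StackTraceElement frame : stack) {
            if (frame.getClassName().startsWith(classNamePrefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Default prefix matches the Netty NIO worker frames seen in the leaked threads.
        String prefix = args.length > 0 ? args[0] : "org.jboss.netty.channel.socket.nio.";
        System.out.println(prefix + " : " + countThreadsWithFrame(prefix) + " thread(s)");
    }
}
```

A count that keeps growing while the number of open JMS connections stays flat would be consistent with the leak described here; comparing the count against the `ulimit -u` value shows how close the process is to the limit.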
Comment 5 Norman Maurer 2013-03-04 05:17:11 EST
Sounds to me like the NioWorker is not shut down correctly and so the threads are not released. Will check with Clebert.
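
The suspected mechanism can be illustrated with plain `java.util.concurrent` pools. This is a sketch under stated assumptions, not Netty or HornetQ code: each "connector" is a hypothetical stand-in that owns a one-thread pool, and `WorkerShutdownSketch` and `liveWorkersAfterClose` are made-up names. It shows that worker threads stay alive after the connector is "closed" unless the pool is explicitly shut down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

/** Hypothetical illustration: worker threads outlive a connector unless its pool is shut down. */
final class WorkerShutdownSketch {

    /**
     * Opens and closes the given number of stand-in connectors, each owning a one-thread pool,
     * and returns how many of their worker threads are still alive afterwards.
     */
    static int liveWorkersAfterClose(int connectors, boolean shutdownOnClose) {
        List<Thread> workers = new ArrayList<>();
        List<ExecutorService> pools = new ArrayList<>();
        // Daemon threads, so a leaked worker cannot keep this demo JVM running.
        ThreadFactory factory = task -> {
            Thread t = new Thread(task);
            t.setDaemon(true);
            workers.add(t);
            return t;
        };
        try {
            for (int i = 0; i < connectors; i++) {
                ExecutorService pool = Executors.newFixedThreadPool(1, factory);
                pools.add(pool);
                pool.submit(() -> { }).get();   // "open": forces the worker thread to start
                if (shutdownOnClose) {
                    pool.shutdown();            // "close": lets the idle worker exit
                }
            }
            int alive = 0;
            for (Thread t : workers) {
                if (shutdownOnClose) {
                    t.join(5000);               // bounded wait for the worker to finish exiting
                }
                if (t.isAlive()) {
                    alive++;
                }
            }
            return alive;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            for (ExecutorService pool : pools) {
                pool.shutdown();                // clean up the pools this demo leaked on purpose
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("alive without shutdown: " + liveWorkersAfterClose(50, false));
        System.out.println("alive with shutdown:    " + liveWorkersAfterClose(50, true));
    }
}
```

Without the shutdown call every worker remains parked waiting for work, which is the same shape as the thousands of idle selector threads in comment 4.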
Comment 6 Norman Maurer 2013-03-05 11:57:20 EST
Currently trying to reproduce with Miroslav Novak.
Comment 7 Miroslav Novak 2013-03-27 06:14:52 EDT
Just updating keywords.
Comment 8 Miroslav Novak 2013-04-15 09:25:57 EDT
Can't verify because of bz#951556. Moving to ER5.
Comment 9 Miroslav Novak 2013-04-29 10:36:02 EDT
HornetQ was not updated in EAP 6.1.0.ER5 - will be verified with ER6.
Comment 10 Miroslav Novak 2013-05-04 01:37:33 EDT
Issue is fixed in EAP 6.1.0.ER6 (HQ 2.3.0.Final). Thanks to all involved!

Note for documentation:
This issue was related only to NIO connectors. When NIO is enabled on the connector, it is also recommended to enable the parameter "use-nio-global-worker-pool" to avoid a situation where too many threads are created on a machine, which could possibly lead to OOM. Example configuration of such a connector:

<netty-connector name="netty" socket-binding="messaging">
   <param key="use-nio" value="true"/>
   <param key="use-nio-global-worker-pool" value="true"/>
</netty-connector>
Comment 11 Russell Dickenson 2013-05-08 00:46:36 EDT
I have edited the proposed Release Notes text for this ticket. Please review it for technical accuracy.

I find it strange that although this ticket is marked 'Resolved', we are recommending a parameter be set to minimize the risk of an OOM error. Is the setting of this parameter merely a safety measure or is it *required*?


Attention: Miroslav

Do I understand correctly that for this issue you're recommending this text also appear in the appropriate EAP book, or only in the Release Notes?
Comment 12 Russell Dickenson 2013-05-13 08:55:11 EDT
Note from ataylor:

"ataylor: the new parameter will use a shared pool of workers for netty per connection
[10:49pm] ataylor: rather than a pool per connection"
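
The difference between the two modes can be sketched with a plain `ThreadPoolExecutor` standing in for the Netty worker pool. This is an assumption-laden illustration, not the HornetQ implementation: `SharedWorkerPoolSketch`, `threadsStarted` and the pool sizes are hypothetical, and the eager `prestartAllCoreThreads()` call only mimics the way the stack trace in the description shows workers being created when the pool is constructed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical comparison: one worker pool per connection versus one shared (global) pool. */
final class SharedWorkerPoolSketch {

    /** Returns how many worker threads are started for the given number of connections. */
    static int threadsStarted(int connections, int workersPerPool, boolean shareGlobalPool) {
        AtomicInteger started = new AtomicInteger();
        ThreadFactory factory = task -> {
            Thread t = new Thread(task);
            t.setDaemon(true);              // do not keep the demo JVM alive
            started.incrementAndGet();
            return t;
        };
        List<ThreadPoolExecutor> pools = new ArrayList<>();
        ThreadPoolExecutor global = null;
        try {
            for (int i = 0; i < connections; i++) {
                if (shareGlobalPool && global != null) {
                    continue;               // later connections reuse the existing global pool
                }
                ThreadPoolExecutor pool = new ThreadPoolExecutor(
                        workersPerPool, workersPerPool, 0L, TimeUnit.MILLISECONDS,
                        new LinkedBlockingQueue<>(), factory);
                pool.prestartAllCoreThreads();  // start all workers eagerly
                pools.add(pool);
                global = pool;
            }
            return started.get();
        } finally {
            for (ThreadPoolExecutor pool : pools) {
                pool.shutdown();            // release the demo's threads
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("pool per connection: " + threadsStarted(50, 4, false) + " threads");
        System.out.println("shared global pool:  " + threadsStarted(50, 4, true) + " threads");
    }
}
```

With a pool per connection the thread count grows with the number of connections, while the shared pool keeps it fixed at the pool size, which is why the global pool reduces the chance of hitting the per-user process limit.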
Comment 13 Miroslav Novak 2013-05-17 07:52:51 EDT
@Russell
There are some bad characters in the note for this bz in:
http://documentation-devel.engineering.redhat.com/docs/en-US/JBoss_Enterprise_Application_Platform/6.1/html-single/6.1.0_Release_Notes/index.html#idp1786488

Text:
&lt;netty-connector name="netty" socket-binding="messaging"&gt; &lt;param key="use-nio" value="true"/&gt; &lt;param key="use-nio-global-worker-pool" value="true"/&gt; &lt;/netty-connector&gt; 

Can you take a look at it, please?

Thanks,

Mirek
