Red Hat Bugzilla – Bug 913507
OutOfMemory in SpecJMS2007 satellite driver
Last modified: 2014-05-26 21:28:16 EDT
Description of problem:

When running SpecJMS2007 against EAP 6.1.0.DR4, the satellite driver throws an OutOfMemoryError. It happens when the load is set higher than 150 in the messaging lab. This is a regression against EAP 6.0.1. The issue is under investigation.

OOM stack trace from the satellite driver log:

java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:640)
    at java.util.concurrent.ThreadPoolExecutor.addIfUnderMaximumPoolSize(ThreadPoolExecutor.java:727)
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:657)
    at org.jboss.netty.util.VirtualExecutorService.execute(VirtualExecutorService.java:155)
    at org.jboss.netty.util.internal.DeadLockProofWorker.start(DeadLockProofWorker.java:38)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:343)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.<init>(AbstractNioSelector.java:95)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.<init>(AbstractNioWorker.java:51)
    at org.jboss.netty.channel.socket.nio.NioWorker.<init>(NioWorker.java:45)
    at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:45)
    at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:28)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.newWorker(AbstractNioWorkerPool.java:99)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.init(AbstractNioWorkerPool.java:69)
    at org.jboss.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:39)
    at org.jboss.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:33)
    at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.<init>(NioClientSocketChannelFactory.java:152)
    at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.<init>(NioClientSocketChannelFactory.java:134)
    at org.hornetq.core.remoting.impl.netty.NettyConnector.start(NettyConnector.java:309)
    at org.hornetq.core.client.impl.ClientSessionFactoryImpl.getConnection(ClientSessionFactoryImpl.java:1218)
    at org.hornetq.core.client.impl.ClientSessionFactoryImpl.getConnectionWithRetry(ClientSessionFactoryImpl.java:1071)
    at org.hornetq.core.client.impl.ClientSessionFactoryImpl.connect(ClientSessionFactoryImpl.java:246)
    at org.hornetq.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:826)
    at org.hornetq.jms.client.HornetQConnectionFactory.createConnectionInternal(HornetQConnectionFactory.java:583)
    at org.hornetq.jms.client.HornetQConnectionFactory.createConnection(HornetQConnectionFactory.java:107)
    at org.hornetq.jms.client.HornetQConnectionFactory.createConnection(HornetQConnectionFactory.java:102)
    at org.spec.perfharness.jms.providers.AbstractJMSProvider.getConnection_internal(AbstractJMSProvider.java:191)
    at org.spec.perfharness.jms.providers.AbstractJMSProvider.getConnection(AbstractJMSProvider.java:297)
    at org.spec.jms.agents.connectionpool.JMSConnectionFactory.makeObject(JMSConnectionFactory.java:68)
    at org.spec.jms.agents.connectionpool.SharedKeyedObjectPool.borrowObject(SharedKeyedObjectPool.java:204)
    at org.spec.jms.agents.SPECWorkerThread.getConnection(SPECWorkerThread.java:341)
    at org.spec.jms.eventhandler.sm.SM_ShipArrEH.buildJMSResources(SM_ShipArrEH.java:56)
    at org.spec.jms.agents.SPECWorkerThread.run(SPECWorkerThread.java:731)
    at org.spec.jms.eventhandler.sm.SM_ShipArrEH.run(SM_ShipArrEH.java:74)
Created attachment 700810 [details]: thread dump, heap dump and histogram.
Thread dump, heap dump and histogram are attached.
The same result with JVM_OPTIONS="-Xms15G -Xmx15G -Xss1048K"
This seems to be a thread leak. The original OutOfMemoryError was caused by the limited value of max user processes (ulimit -u):

    max user processes (-u) 8192

This value was sufficient for EAP 6.0.1 with a 3x higher load. There are thousands of threads like:

Stack trace:
    sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
    sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    - locked sun.nio.ch.Util$2@5644c8ca
    - locked java.util.Collections$UnmodifiableSet@3dda7205
    - locked sun.nio.ch.EPollSelectorImpl@5073c5fc
    sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    org.jboss.netty.channel.socket.nio.SelectorUtil.select(SelectorUtil.java:64)
    org.jboss.netty.channel.socket.nio.AbstractNioSelector.select(AbstractNioSelector.java:409)
    org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:206)
    org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
    org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    org.jboss.netty.util.VirtualExecutorService$ChildExecutorRunnable.run(VirtualExecutorService.java:175)
    java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    java.lang.Thread.run(Thread.java:662)
Sounds to me like the NioWorker is not shut down correctly, so the threads are not released. Will check with Clebert.
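For context, a minimal sketch of the pattern the thread dump points at (this is an illustration against the Netty 3 API, not HornetQ's actual connector code; the class and its lifecycle are made up for the example): a client connector that builds its own NioClientSocketChannelFactory spawns its own boss/worker threads, and unless releaseExternalResources() is called when the connector closes, those workers keep parking in Selector.select().

import java.util.concurrent.Executors;
import org.jboss.netty.channel.ChannelFactory;
import org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory;

// Hypothetical per-connection connector: every instance creates its own
// channel factory, and with it a fresh set of NIO boss/worker threads.
public class LeakyConnectorSketch {

    private final ChannelFactory channelFactory =
            new NioClientSocketChannelFactory(
                    Executors.newCachedThreadPool(),   // boss threads
                    Executors.newCachedThreadPool());  // worker threads

    public void close() {
        // Without this call the worker threads created above are never stopped;
        // with thousands of connections the process eventually hits the
        // "ulimit -u" cap and fails with
        // "OutOfMemoryError: unable to create new native thread".
        channelFactory.releaseExternalResources();
    }
}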
Currently trying to reproduce with Miroslav Novak.
Just updating keywords.
Can't verify because of bz#951556. Moving to ER5.
HornetQ was not updated in EAP 6.1.0.ER5 - will be verified with ER6.
The issue is fixed in EAP 6.1.0.ER6 (HQ 2.3.0.Final). Thanks to all involved!

Note for documentation: This issue was related only to NIO connectors. When NIO is enabled on a connector, it is also recommended to enable the parameter "use-nio-global-worker-pool" to avoid a situation where too many threads are created on a machine, which could possibly lead to OOM.

Example configuration of such a connector:

<netty-connector name="netty" socket-binding="messaging">
    <param key="use-nio" value="true"/>
    <param key="use-nio-global-worker-pool" value="true"/>
</netty-connector>
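For reference, the connector snippet above lives in the <connectors> section of the messaging subsystem in standalone*.xml. A sketch of the surrounding configuration follows; the subsystem namespace version, the hornetq-server layout and the socket-binding name are assumptions based on a default EAP 6.1 profile and may differ in a given installation:

<subsystem xmlns="urn:jboss:domain:messaging:1.4">
    <hornetq-server>
        <!-- ... -->
        <connectors>
            <netty-connector name="netty" socket-binding="messaging">
                <param key="use-nio" value="true"/>
                <!-- share one NIO worker pool across all connections instead of
                     creating a worker pool (and its threads) per connection -->
                <param key="use-nio-global-worker-pool" value="true"/>
            </netty-connector>
        </connectors>
        <!-- ... -->
    </hornetq-server>
</subsystem>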
I have edited the proposed Release Notes text for this ticket. Please review it for technical accuracy. I find it strange that although this ticket is marked 'Resolved', we are recommending a parameter be set to minimize the risk of an OOM error. Is setting this parameter merely a safety measure, or is it *required*?

Attention Miroslav: do I understand correctly that for this issue you are recommending this text also appear in the appropriate EAP book, or only in the Release Notes?
Note from ataylor:

"ataylor: the new parameter will use a shared pool of workers for netty per connection
[10:49pm] ataylor: rather than a pool per connection"
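To illustrate what the shared pool buys, here is a rough sketch against the Netty 3 API; it is not HornetQ's actual implementation, and the pool sizes and names are made up for the example. With use-nio-global-worker-pool enabled, all connections draw their NIO workers from one fixed-size pool, whereas the old behaviour effectively gave each connection its own worker pool and thus its own threads.

import java.util.concurrent.Executors;
import org.jboss.netty.channel.socket.nio.NioClientBossPool;
import org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory;
import org.jboss.netty.channel.socket.nio.NioWorkerPool;

public class WorkerPoolSketch {

    // One global worker pool: its thread count stays fixed no matter how many
    // connections are opened (the effect of use-nio-global-worker-pool).
    private static final NioWorkerPool SHARED_WORKERS =
            new NioWorkerPool(Executors.newCachedThreadPool(),
                              Runtime.getRuntime().availableProcessors() * 3);

    static NioClientSocketChannelFactory sharedPoolFactory() {
        return new NioClientSocketChannelFactory(
                new NioClientBossPool(Executors.newCachedThreadPool(), 1),
                SHARED_WORKERS);
    }

    // Pool-per-connection variant: every call builds a brand new worker pool,
    // so the number of NIO threads grows with the number of connections.
    static NioClientSocketChannelFactory perConnectionFactory() {
        return new NioClientSocketChannelFactory(
                Executors.newCachedThreadPool(),   // boss threads
                Executors.newCachedThreadPool());  // worker threads
    }
}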
@Russel: There are some bad characters in the note for this BZ in:
http://documentation-devel.engineering.redhat.com/docs/en-US/JBoss_Enterprise_Application_Platform/6.1/html-single/6.1.0_Release_Notes/index.html#idp1786488

Text:

<netty-connector name="netty" socket-binding="messaging">
    <param key="use-nio" value="true"/>
    <param key="use-nio-global-worker-pool" value="true"/>
</netty-connector>

Can you take a look at it, please?

Thanks,
Mirek