Description of problem:
When running a testsuite that exercises EAP management operations via CLI, I am getting an OOM. I have created heap dumps, which I have put at nfs-01.eng.brq.redhat.com:/exports/scratch/rhatlapa/heapdump-oom-manyJIoEndpoints*.hprof. Most of the memory is taken by org.apache.tomcat.util.net.JIoEndpoint$Poller objects, which are relatively big (1.5 - 2.0 MB each). It seems that the HttpProcessor releases the JIoEndpoint objects after some time and the server continues to run.

Version-Release number of selected component (if applicable):
I have tested it with jbossweb 7.2.2.Final (the version in EAP 6.2.0.CR2) and jbossweb 7.3.0.Final.

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
OOM exception occurs

Expected results:
no OOM exception occurs

Additional info:
This is the error shown when the OOM occurs:

19:04:52,257 ERROR [org.xnio.listener] (Remoting "insignia:MANAGEMENT" read-1) A channel event listener threw an exception: java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method) [rt.jar:1.7.0_45]
    at java.lang.Thread.start(Thread.java:713) [rt.jar:1.7.0_45]
    at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949) [rt.jar:1.7.0_45]
    at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1360) [rt.jar:1.7.0_45]
    at org.xnio.XnioWorker.execute(XnioWorker.java:577) [xnio-api-3.0.7.GA-redhat-1.jar:3.0.7.GA-redhat-1]
    at org.jboss.remoting3.remote.RemoteReadListener.handleEvent(RemoteReadListener.java:218) [jboss-remoting-3.2.18.GA-redhat-1.jar:3.2.18.GA-redhat-1]
    at org.jboss.remoting3.remote.RemoteReadListener.handleEvent(RemoteReadListener.java:45) [jboss-remoting-3.2.18.GA-redhat-1.jar:3.2.18.GA-redhat-1]
    at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72) [xnio-api-3.0.7.GA-redhat-1.jar:3.0.7.GA-redhat-1]
    at org.xnio.channels.TranslatingSuspendableChannel.handleReadable(TranslatingSuspendableChannel.java:189) [xnio-api-3.0.7.GA-redhat-1.jar:3.0.7.GA-redhat-1]
    at org.xnio.channels.TranslatingSuspendableChannel$1.handleEvent(TranslatingSuspendableChannel.java:103) [xnio-api-3.0.7.GA-redhat-1.jar:3.0.7.GA-redhat-1]
    at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:72) [xnio-api-3.0.7.GA-redhat-1.jar:3.0.7.GA-redhat-1]
    at org.xnio.nio.NioHandle.run(NioHandle.java:90)
    at org.xnio.nio.WorkerThread.run(WorkerThread.java:187)
Some connector components use "a lot" of memory, but they last for the lifetime of the connector and are used to keep track of connection state. The test could be doing a lot of stop/start cycles on the connector, which is allowed but obviously not a very common use. The OOM actually occurs in another component (Remoting, which uses XNIO for its IO).
Yes, that will probably be the case, and yes, this is definitely not a very common use case. The scenario is that I run lots of management operations via CLI and verify whether the changed configuration was correctly handled by the server (mostly via a simple deployed app). Many changes require a reload, which is probably where the connector stops/starts happen, leading to the OOM. The question is: is this correct behavior of the connectors and memory when a server reload is done? Or can I somehow influence this behavior to prevent the OOM when running many CLI operations, i.e. many operations through the native interface?
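For illustration, here is a minimal sketch of the kind of loop the test effectively performs, written against the EAP 6 native management client API (ModelControllerClient and ModelNode are the real jboss-as-controller-client/jboss-dmr classes; the host, port, iteration count, and sleep are assumptions for the example, and this is not the actual testsuite code):

import org.jboss.as.controller.client.ModelControllerClient;
import org.jboss.dmr.ModelNode;

public class ReloadLoop {
    public static void main(String[] args) throws Exception {
        // Connect to the native management interface (9999 is the EAP 6 default).
        try (ModelControllerClient client =
                ModelControllerClient.Factory.create("127.0.0.1", 9999)) {
            for (int i = 0; i < 500; i++) {
                ModelNode op = new ModelNode();
                op.get("operation").set("reload"); // equivalent to ':reload' in the CLI
                op.get("address").setEmptyList();  // operation targets the server root
                client.execute(op);
                // Each reload stops and restarts the web connectors; with the
                // leak discussed later in this bug, every cycle strands a few threads.
                Thread.sleep(5000); // crude wait for the server to come back up
            }
        }
    }
}

(Reconnection handling after the reload is omitted for brevity.)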
Now I've got this OOM error [1]; this time it occurred in the web subsystem component. Isn't there an option to limit the number of threads used by the connectors?

[1]
ERROR [org.apache.tomcat.util.net] (http-/127.0.0.1:8080-Acceptor-0) JBWEB003011: Error allocating socket processor: java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method) [rt.jar:1.7.0_45]
    at java.lang.Thread.start(Thread.java:713) [rt.jar:1.7.0_45]
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.start(JIoEndpoint.java:940) [jbossweb-7.2.2.Final-redhat-1.jar:7.2.2.Final-redhat-1]
    at org.apache.tomcat.util.net.JIoEndpoint.newWorkerThread(JIoEndpoint.java:1162) [jbossweb-7.2.2.Final-redhat-1.jar:7.2.2.Final-redhat-1]
    at org.apache.tomcat.util.net.JIoEndpoint.createWorkerThread(JIoEndpoint.java:1141) [jbossweb-7.2.2.Final-redhat-1.jar:7.2.2.Final-redhat-1]
    at org.apache.tomcat.util.net.JIoEndpoint.getWorkerThread(JIoEndpoint.java:1173) [jbossweb-7.2.2.Final-redhat-1.jar:7.2.2.Final-redhat-1]
    at org.apache.tomcat.util.net.JIoEndpoint.processSocket(JIoEndpoint.java:1211) [jbossweb-7.2.2.Final-redhat-1.jar:7.2.2.Final-redhat-1]
    at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:312) [jbossweb-7.2.2.Final-redhat-1.jar:7.2.2.Final-redhat-1]
    at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_45]
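As an aside on the thread-limiting question: the EAP 6 web connector exposes a max-connections attribute, which for the BIO (JIo) connector effectively caps concurrent worker threads while the connector is running. A minimal sketch of setting it through the same native management API as above (the connector name "http" and the value 200 are assumptions for the example; note this caps load-driven thread growth but would not bound threads stranded across connector restarts, so it is not a fix for the leak itself):

import org.jboss.as.controller.client.ModelControllerClient;
import org.jboss.dmr.ModelNode;

public class CapConnectorThreads {
    public static void main(String[] args) throws Exception {
        try (ModelControllerClient client =
                ModelControllerClient.Factory.create("127.0.0.1", 9999)) {
            // CLI equivalent:
            // /subsystem=web/connector=http:write-attribute(name=max-connections, value=200)
            ModelNode op = new ModelNode();
            op.get("operation").set("write-attribute");
            op.get("address").add("subsystem", "web");
            op.get("address").add("connector", "http");
            op.get("name").set("max-connections");
            op.get("value").set(200);
            System.out.println(client.execute(op));
        }
    }
}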
I don't think the connector can be GCed immediately after stopping, as the threads may not all have been interrupted yet, etc. So there could be a limit on what you can do here. How long does it take to GC in practice?
In my case I am getting the OOM after 13 minutes of testing with multiple CLI operations.
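One way to quantify the growth rate (and the GC-timing question above) is to sample the JVM's thread counters between reload cycles using the standard java.lang.management API. A minimal sketch (run it inside the monitored JVM, or adapt it to attach over JMX; the sampling interval is arbitrary):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ThreadWatch {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        while (true) {
            // With the leak, the live count climbs by a few threads per
            // connector restart and never drops, until Thread.start0 fails
            // with "unable to create new native thread".
            System.out.printf("live=%d peak=%d totalStarted=%d%n",
                    threads.getThreadCount(),
                    threads.getPeakThreadCount(),
                    threads.getTotalStartedThreadCount());
            Thread.sleep(10000);
        }
    }
}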
This is most likely related to BZ#1035787. I have applied the patch proposed by lthon and the OOM no longer occurs.
I committed a fix to 7.3 and 7.4. Good catch (it leaks two to three threads per restart of the connector).
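For context, the bug class is easy to state in miniature. A generic sketch (not the actual JIoEndpoint code) of a component that starts a helper thread on start() but fails to stop and join it on stop(); each stop/start cycle then strands one live native thread:

public class Endpoint {
    private Thread poller;
    private volatile boolean running;

    public void start() {
        running = true;
        poller = new Thread(() -> {
            while (running) {
                try {
                    Thread.sleep(100); // stand-in for polling sockets
                } catch (InterruptedException e) {
                    return; // exit promptly when interrupted by stop()
                }
            }
        }, "poller");
        poller.start();
    }

    // Buggy variant: drops the reference but never signals the thread,
    // so the loop keeps spinning and the native thread stays alive.
    public void leakyStop() {
        poller = null;
    }

    // Correct variant: signal the loop to exit, interrupt, and join.
    public void stop() throws InterruptedException {
        running = false;
        Thread t = poller;
        if (t != null) {
            t.interrupt();
            t.join();
            poller = null;
        }
    }
}

In the real bug the stranded threads belonged to the endpoint, which matches the JIoEndpoint$Poller instances dominating the heap dumps above.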
*** Bug 1035787 has been marked as a duplicate of this bug. ***
It was r2313 in 7.4 and r2315 in 7.3.
I have verified that the issue is fixed in EAP 6.3.0.DR4
Adding Release Notes text and marking for inclusion in the 6.3.0 Release Notes document.