Bug 1039585

Summary: [GSS] (6.3.0) Clustered session memory leaking
Product: [JBoss] JBoss Enterprise Application Platform 6 Reporter: Shay Matasaro <smatasar>
Component: Clustering Assignee: Paul Ferraro <paul.ferraro>
Status: CLOSED CURRENTRELEASE QA Contact: Jitka Kozana <jkudrnac>
Severity: unspecified Docs Contact: Russell Dickenson <rdickens>
Priority: unspecified    
Version: 6.1.0 CC: aogburn, bmaxwell, dereed, ihands, kkhan, myarboro, paul.ferraro, rhusar, rjanik, sdodson, smumford
Target Milestone: DR0   
Target Release: EAP 6.3.0   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Previous versions of JBoss EAP 6 contained a bug that could lead to an OutOfMemoryError when using distributed web sessions. The error occurred when web sessions expired without the lock objects created by the session manager being released or destroyed. As web sessions continued to expire, the residual lock objects accumulated in memory until the heap was exhausted. The only recourse was to redeploy the web application. In this release of the product, the lock objects are properly released and the OutOfMemoryError no longer occurs.
Last Closed: 2014-06-28 15:40:37 UTC Type: Bug
Bug Depends On:    
Bug Blocks: 1030681, 1051591, 1130600    

Description Shay Matasaro 2013-12-09 14:38:55 UTC
Stale sessions are not getting removed from memory.
SharedLocalYieldingClusterLockManager.LocalLock instances occupy a steadily increasing amount of heap space.

original case data https://bugzilla.redhat.com/show_bug.cgi?id=1030681

A possible source is https://issues.jboss.org/browse/WFLY-898, which was closed when https://issues.jboss.org/browse/WFLY-406 was submitted.

We are looking to port a fix to EAP 6, but the fix for WFLY-406 is a complete rewrite and involves 321 files.

Is there a smaller scope fix that could be ported or a possible workaround?

Comment 1 Paul Ferraro 2013-12-09 15:09:00 UTC

Might this just be a product of the number of sessions in your application?
The SharedLocalYieldingClusterLockManager holds a LocalLock for every session "owned" by the local node.
However, if a session passivates, the local lock for that session stays in memory.  Thus it is possible to exhaust the heap via the number of sessions even when using passivation.  It might be a good idea to release/destroy the local lock on passivation too.
The LocalLock is meant to be destroyed when a session invalidates, expires, or when some other node takes the lock.  While it's certainly possible that one of these cases is not correctly destroying the LocalLock, I want to rule out the above cause first.
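
The lifecycle described above can be illustrated with a minimal sketch. This is not the SharedLocalYieldingClusterLockManager code; the class SessionLockRegistry and all of its method names are hypothetical, and a plain ReentrantLock stands in for LocalLock. It only shows the intended invariant: one lock per locally owned session, removed on invalidation, expiration, or takeover by another node, plus the release on passivation suggested above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not the EAP lock manager implementation.
class SessionLockRegistry {
    // One local lock per session owned by this node, keyed by session ID.
    private final ConcurrentMap<String, ReentrantLock> localLocks = new ConcurrentHashMap<>();

    // Create the lock on first use and acquire it for the current thread.
    ReentrantLock acquire(String sessionId) {
        ReentrantLock lock = localLocks.computeIfAbsent(sessionId, id -> new ReentrantLock());
        lock.lock();
        return lock;
    }

    // Unlock after a request; the lock object itself stays registered.
    void release(String sessionId) {
        ReentrantLock lock = localLocks.get(sessionId);
        if (lock != null && lock.isHeldByCurrentThread()) {
            lock.unlock();
        }
    }

    // Every event that ends local ownership must also drop the lock object.
    void onInvalidate(String sessionId)     { destroy(sessionId); }
    void onExpire(String sessionId)         { destroy(sessionId); }
    void onRemoteTakeover(String sessionId) { destroy(sessionId); }
    void onPassivate(String sessionId)      { destroy(sessionId); } // the suggested addition

    private void destroy(String sessionId) {
        localLocks.remove(sessionId);
    }

    // Number of lock objects currently retained on the heap.
    int size() {
        return localLocks.size();
    }
}
```

If any one of the four event handlers skipped the destroy step, size() would grow by one for every session that ended through that path.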

Comment 2 dereed 2013-12-31 16:15:18 UTC
This is definitely a bug in EAP.  It can be easily reproduced, and the leak occurs every time a session times out.  (It does not occur if session.invalidate() is called.)
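
The difference between the two paths can be modelled in a few lines. The sketch below is a hypothetical simulation, not EAP code: LeakDemo, expireLeaky, and expireFixed are made-up names, and plain Object instances stand in for sessions and lock objects. It assumes only what is reported here, namely that the timeout path leaves the lock behind while explicit invalidation does not.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the reported symptom; names are illustrative, not EAP classes.
class LeakDemo {
    private final Map<String, Object> sessions = new HashMap<>();
    private final Map<String, Object> locks = new HashMap<>();

    // Creating a session also creates its per-session lock object.
    void create(String id) {
        sessions.put(id, new Object());
        locks.put(id, new Object());
    }

    // Explicit invalidation removes both the session and its lock.
    void invalidate(String id) {
        sessions.remove(id);
        locks.remove(id);
    }

    // Timeout path as reported: the session is removed, the lock is not.
    void expireLeaky(String id) {
        sessions.remove(id);
    }

    // Timeout path with the missing cleanup added.
    void expireFixed(String id) {
        sessions.remove(id);
        locks.remove(id);
    }

    int sessionCount() { return sessions.size(); }
    int lockCount()    { return locks.size(); }
}
```

In this model, creating N sessions and letting them all time out through expireLeaky leaves N lock objects behind with no sessions, which matches the steadily growing LocalLock count seen in the heap.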

Comment 3 Paul Ferraro 2014-01-03 21:48:10 UTC

Comment 4 Shay Matasaro 2014-01-10 16:03:35 UTC
fix checked in for 6.2.0

Comment 5 Richard Janík 2014-02-19 15:21:00 UTC
Verified for 6.3.0.DR0.

Comment 6 Scott Dodson 2014-03-24 14:41:40 UTC
I'm working with the Linux performance teams on their efforts to do proactive performance analysis based on collection of sosreports. One of the things they highlighted is that our nodes are using a large amount of memory in the slab cache, in particular for negative dentry cache entries.

We wrote a systemtap script and found that JBoss is responsible for all of the negative dentry entries. Could this be related to this bug as well? What we see is JBoss attempting to look up thousands of files that do not exist. My thinking is that perhaps it's looking for files associated with expired sessions.

[root@gss-webjava01 ~]# stap dentry.stp

Comment 7 dereed 2014-03-24 14:54:10 UTC
> My thinking is that perhaps it's looking for files associated with expired sessions

That is the directory where the sessions are passivated.
So expired session IDs are the most likely cause.

That wouldn't be related to this BZ though.
That would just be the normal expected behavior for an expired session ID.

Comment 8 Scott Mumford 2014-05-05 05:02:05 UTC
Paul, can you provide some details in the Doc Text field as to what was causing this leak and how you fixed it, so we can get it into the 6.3.0 Release Notes?


Comment 9 Paul Ferraro 2014-05-14 15:24:20 UTC
Added release note text.

Comment 10 Scott Mumford 2014-05-14 23:41:24 UTC
Refactored Doc Text into prose form.

Thanks Paul.