Bug 1110794 - Concurrency problem in UberFireSecurityFilter causes HTTP worker threads to get stuck in an infinite loop, adding to the CPU load
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss BRMS Platform 6
Classification: Retired
Component: Business Central
Version: 6.0.2
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: CR2
Target Release: 6.0.2
Assignee: Alexandre Porcelli
QA Contact: Jiri Locker
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-06-18 13:07 UTC by Jiri Locker
Modified: 2014-09-22 11:50 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-08-06 19:53:33 UTC
Type: Bug
Embargoed:


Attachments
Heap screenshot illustrating the pointer loop in HashMap internal table (167.14 KB, image/png)
2014-06-19 13:38 UTC, Jiri Locker


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1145105 | 0 | high | CLOSED | Kie module build synchronization leak resulting in HashMap infinite loop | 2021-02-22 00:41:40 UTC

Internal Links: 1145105

Description Jiri Locker 2014-06-18 13:07:08 UTC
Description of problem:
When working with Business Central, the EAP Java process sometimes ends up consuming 100% CPU indefinitely, even though no further operations are being performed in Business Central.

When I took a thread dump with the jstack utility, I saw 26 threads busy (RUNNABLE state) at exactly the same point, java.util.HashMap.getEntry(HashMap.java:347):

"http-localhost/127.0.0.1:8080-20" daemon prio=6 tid=0x0000000006783000 nid=0xfb0 runnable [0x0000000012c9e000]
   java.lang.Thread.State: RUNNABLE
	at java.util.HashMap.getEntry(HashMap.java:347)
	at java.util.HashMap.containsKey(HashMap.java:335)
	at java.util.HashSet.contains(HashSet.java:184)
	at org.uberfire.security.server.URLResourceManager.requiresAuthentication(URLResourceManager.java:93)
	at org.uberfire.security.impl.authz.DefaultAuthorizationManager.authorize(DefaultAuthorizationManager.java:62)
	at org.uberfire.security.server.HttpSecurityManagerImpl.authorize(HttpSecurityManagerImpl.java:228)
	at org.uberfire.security.server.UberFireSecurityFilter.authorize(UberFireSecurityFilter.java:327)
	at org.uberfire.security.server.UberFireSecurityFilter.doFilter(UberFireSecurityFilter.java:263)
	at ... (tomcat stuff)
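
The same check can also be done from inside the JVM. The following is only a minimal sketch, not part of the original investigation: it groups RUNNABLE threads by their topmost stack frame, so that many threads sharing one frame (as with HashMap.getEntry above) stand out. The class and method names are hypothetical; Thread.getAllStackTraces() is the standard JDK call.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, named for illustration only.
final class HotFrameReport {

    /** Counts RUNNABLE threads by the frame at the top of their stack. */
    static Map<String, Integer> runnableThreadsByTopFrame() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] stack = entry.getValue();
            // Skip threads that are not RUNNABLE or have no frames to report.
            if (entry.getKey().getState() == Thread.State.RUNNABLE && stack.length > 0) {
                counts.merge(stack[0].toString(), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

A frame with an unusually high count that stays the same across several samples is a candidate for a busy loop. A single sample proves nothing by itself, since RUNNABLE threads may simply be doing legitimate work.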

I took a heap dump to see what was wrong with the HashSet and found a cycle in the HashMap's internal pointers (HashSet is backed by a HashMap, as the stack trace shows). After looking into the HashMap implementation, I realized that there is a *single* URLResourceManager holding a single *thread-unsafe* HashSet, and that this HashSet is accessed by all worker threads serving HTTP requests with no synchronization. This eventually corrupts the HashMap's state, after which some threads end up trapped in an infinite loop inside HashMap.getEntry().


Version-Release number of selected component (if applicable):
6.0.2.CR1

How reproducible:
Rarely; it is a race condition.

Steps to Reproduce:
Unknown.

Actual results:
A number of server-side threads are stuck in an infinite loop, generating high CPU load indefinitely.

Expected results:
Synchronized access to excludeCache HashSet in URLResourceManager (https://github.com/uberfire/uberfire/blob/0.3.1.Final/uberfire-security/uberfire-security-server/src/main/java/org/uberfire/security/server/URLResourceManager.java#L55).
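
One way to get such synchronized access is sketched below. This is an illustration under stated assumptions, not the actual fix from the linked commits: the class ExcludeCacheSketch and its methods are hypothetical stand-ins, and only the excludeCache field name comes from the report.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the shared resource manager; not the real URLResourceManager.
final class ExcludeCacheSketch {

    // The synchronized wrapper serializes every read and write on one lock,
    // so the backing HashMap is never modified while another thread uses it.
    private final Set<String> excludeCache =
            Collections.synchronizedSet(new HashSet<String>());

    /** Returns true if the URL was previously recorded as excluded. */
    boolean isExcluded(String url) {
        return excludeCache.contains(url);
    }

    /** Records the URL as excluded; safe to call from many threads. */
    void markExcluded(String url) {
        excludeCache.add(url);
    }

    /** Number of cached URLs. */
    int size() {
        return excludeCache.size();
    }
}
```

If lock contention on the wrapper becomes a concern, a set backed by a concurrent map, for example Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>()), is an alternative that avoids a single global lock. Note that iterating over a synchronizedSet still requires manually synchronizing on the set.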

Additional info:

Comment 1 Alexandre Porcelli 2014-06-18 14:57:20 UTC
Great catch! Thanks for your report. The following commits aim to fix the issue:

(0.3.x) http://github.com/uberfire/uberfire/commit/7f622d8e8
(master) http://github.com/uberfire/uberfire/commit/6ef82de96

Comment 2 Michael 2014-06-18 18:09:15 UTC
cherry-picked to 6.0.2.ER3: 1d75d0f23ad322878aa348c51cda5134448efed5

Comment 3 Jiri Locker 2014-06-19 13:38:53 UTC
Created attachment 910399
Heap screenshot illustrating the pointer loop in HashMap internal table

