Bug 993041 - RuntimeException in org.jboss.as.web.session.ClusteredSession.access
RuntimeException in org.jboss.as.web.session.ClusteredSession.access
Status: VERIFIED
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: Clustering (Show other bugs)
6.1.1,6.2.0
Unspecified Unspecified
unspecified Severity high
: DR1
: EAP 6.4.0
Assigned To: Radoslav Husar
Jitka Kozana
:
: 1029010 (view as bug list)
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-08-05 09:42 EDT by Ladislav Thon
Modified: 2017-10-09 20:18 EDT (History)
10 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previous versions of JBoss EAP 6, when a distributed web session was being accessed while another node was abruptly leaving the cluster, a lock acquisition could, in some instances, fail. When this occurred it resulted in the following exception: ---- RuntimeException: JBAS018060: Exception acquiring ownership of <session-id> ---- The root cause of this issue was that the lock acquisition did not take into account that a cluster node might leave the cluster at exactly the same time, resulting in the lock acquisition failure. This issue has been addressed and the exception no longer presents.
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Ladislav Thon 2013-08-05 09:42:38 EDT
Seen this NPE during EAP 6.1.1.ER4 testing:

17:47:33,687 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-11,shared=udp) ISPN000094: Received new cluster view: [perf21/ejb|8] [perf21/ejb, perf18/ejb, perf19/ejb]
17:48:03,487 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host]] (ajp-perf19/10.16.90.56:8009-21) JBWEB000233: Exception sending request initialized lifecycle event to listener instance of class org.jboss.weld.servlet.WeldListener: java.lang.RuntimeException: JBAS018060: Exception acquiring ownership of 8mQ3yfXHmqy9n5omqSATdkZ7
	at org.jboss.as.web.session.ClusteredSession.access(ClusteredSession.java:498) [jboss-as-web-7.2.1.Final-redhat-5.jar:7.2.1.Final-redhat-5]
	at org.apache.catalina.connector.Request.doGetSession(Request.java:2616) [jbossweb-7.2.1.Final-redhat-1.jar:7.2.1.Final-redhat-1]
	at org.apache.catalina.connector.Request.getSession(Request.java:2361) [jbossweb-7.2.1.Final-redhat-1.jar:7.2.1.Final-redhat-1]
	at org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:790) [jbossweb-7.2.1.Final-redhat-1.jar:7.2.1.Final-redhat-1]
	at org.jboss.weld.context.beanstore.http.LazySessionBeanStore.getSession(LazySessionBeanStore.java:73) [weld-core-1.1.13.Final-redhat-1.jar:1.1.13.Final-redhat-1]
	at org.jboss.weld.context.beanstore.http.LazySessionBeanStore.<init>(LazySessionBeanStore.java:59) [weld-core-1.1.13.Final-redhat-1.jar:1.1.13.Final-redhat-1]
	at org.jboss.weld.context.http.HttpSessionContextImpl.associate(HttpSessionContextImpl.java:31) [weld-core-1.1.13.Final-redhat-1.jar:1.1.13.Final-redhat-1]
	at org.jboss.weld.context.http.HttpSessionContextImpl.associate(HttpSessionContextImpl.java:16) [weld-core-1.1.13.Final-redhat-1.jar:1.1.13.Final-redhat-1]
	at org.jboss.weld.servlet.WeldListener.requestInitialized(WeldListener.java:134) [weld-core-1.1.13.Final-redhat-1.jar:1.1.13.Final-redhat-1]
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:134) [jbossweb-7.2.1.Final-redhat-1.jar:7.2.1.Final-redhat-1]
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:97) [jbossweb-7.2.1.Final-redhat-1.jar:7.2.1.Final-redhat-1]
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:102) [jbossweb-7.2.1.Final-redhat-1.jar:7.2.1.Final-redhat-1]
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:336) [jbossweb-7.2.1.Final-redhat-1.jar:7.2.1.Final-redhat-1]
	at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:488) [jbossweb-7.2.1.Final-redhat-1.jar:7.2.1.Final-redhat-1]
	at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:420) [jbossweb-7.2.1.Final-redhat-1.jar:7.2.1.Final-redhat-1]
	at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:920) [jbossweb-7.2.1.Final-redhat-1.jar:7.2.1.Final-redhat-1]
	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_45]
Caused by: org.jboss.as.clustering.lock.TimeoutException: JBAS010213: Cannot acquire lock default-host/clusterbench/8mQ3yfXHmqy9n5omqSATdkZ7 from cluster
	at org.jboss.as.clustering.lock.SharedLocalYieldingClusterLockManager.lock(SharedLocalYieldingClusterLockManager.java:439)
	at org.jboss.as.clustering.web.infinispan.DistributedCacheManager.acquireSessionOwnership(DistributedCacheManager.java:400)
	at org.jboss.as.web.session.ClusteredSession.acquireSessionOwnership(ClusteredSession.java:524) [jboss-as-web-7.2.1.Final-redhat-5.jar:7.2.1.Final-redhat-5]
	at org.jboss.as.web.session.ClusteredSession.access(ClusteredSession.java:496) [jboss-as-web-7.2.1.Final-redhat-5.jar:7.2.1.Final-redhat-5]
	... 16 more

This happens on a machine called perf19 while on perf20, the application is being undeployed. This warning from perf20 looks related:

17:48:03,676 WARN  [org.infinispan.transaction.TransactionTable] (MSC service thread 1-3) ISPN000100: Stopping, but there are 1 local transactions and 0 remote transactions that did not finish in time.

Logs:

https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-sync/26/console-perf19/
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-http-session-undeploy-dist-sync/26/console-perf20/

I'm not entirely sure what kind of problem this is (maybe it's expected with DIST?), but given that this happens in a highly exposed part of code (ClusteredSession, search keyword "WeldListener"), I figured I'd rather file a bug.
Comment 1 Jitka Kozana 2013-08-05 10:13:11 EDT
For the sake of the similar looking BZ 958252: The application (un)deployed at this case, is not secured in any case. All access to the methods is unrestricted.
Comment 2 Radoslav Husar 2013-08-05 11:58:04 EDT
Could you check if this is a regression or maybe just a different manifestation of a previous problem?

Quickly looking, the issue is preceeded by [1] which might be hinting at a problem in Infinispan.

[1] 17:48:03,761 WARN  [org.infinispan.topology.CacheTopologyControlCommand] (OOB-445,shared=udp) ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=dist, type=REBALANCE_CONFIRM, sender=perf18/web, joinInfo=null, topologyId=18, currentCH=null, pendingCH=null, throwable=null, viewId=8}: org.infinispan.CacheException: Received invalid rebalance confirmation from perf18/web for cache dist, we don't have a rebalance in progress
Comment 6 Paul Ferraro 2013-12-12 10:57:28 EST
SharedLocalYieldingClusterLockManager needs to be more resilient to timeouts and suspect exceptions.  Lock acquisition should retry in this case.
Comment 7 Paul Ferraro 2013-12-12 11:53:53 EST
*** Bug 1029010 has been marked as a duplicate of this bug. ***
Comment 8 Aleksandar Kostadinov 2013-12-13 07:29:22 EST
removing needinfo request because question has been answered
Comment 11 Ladislav Thon 2014-07-08 09:43:26 EDT
Still an issue (though rare), moving to 6.4.
Comment 13 Ladislav Thon 2014-09-26 09:39:51 EDT
During EAP 6.3 test cycle, this bug most often happened in the eap-6x-failover-http-session-undeploy-repl-sync test. Hence, I ran this test with EAP 6.4.0.DR2 10 times (see [1]) and didn't observe it at all. Marking as verified.

[1] https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/lthon___bz993041___eap-6x-failover-http-session-undeploy-repl-sync/

Note You need to log in before you can comment on or make changes to this bug.