Bug 990102
Summary: | Concurrent access timeout -- could not obtain lock within 5000 MILLISECONDS | ||
---|---|---|---|
Product: | [JBoss] JBoss Enterprise Application Platform 6 | Reporter: | Ladislav Thon <lthon> |
Component: | EJB | Assignee: | David M. Lloyd <david.lloyd> |
Status: | CLOSED EOL | QA Contact: | Michal Vinkler <mvinkler> |
Severity: | high | Docs Contact: | |
Priority: | urgent | ||
Version: | 6.1.1 | CC: | anmiller, cdewolf, dandread, david.lloyd, ehugonne, jawilson, jkudrnac, jmartisk, lthon, myarboro, paul.ferraro, rhusar, thofman |
Target Milestone: | --- | ||
Target Release: | EAP 6.4.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Known Issue | |
Doc Text: |
This release of JBoss EAP 6 carries a bug that can produce a `concurrent access timeout` when an EJB client invokes a method on a stateful bean in a "forwarding" cluster; this bean forwards the call to stateful beans in a "target" cluster, and the response is passed back again. Invocations are serial: the client does not invoke a method on a bean until it has received a response to the previous invocation. The error occurs when one of the servers in the cluster is shut down.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2019-08-19 12:44:29 UTC | Type: | Bug |
Description
Ladislav Thon
2013-07-30 11:38:36 UTC
Is this a regression from 6.1.0 GA?

After some investigation, I think there is a bug in StatefulSessionSynchronizationInterceptor, specifically in the registered Synchronization responsible for releasing the SFSB instance lock. If the Synchronization callbacks throw an exception, the Lock.unlock() happens outside the context of a lock owner. This would cause a subsequent Lock.tryLock(...) for a new owner to fail. This seems to be an EJB bug and doesn't appear to be directly related to clustering.

Paul, thank you for the investigation. Reassigning to Jaikiran to take a look.

@Jaikiran I've opened an upstream JIRA for this as well: https://issues.jboss.org/browse/WFLY-1810

The upstream JIRA (https://issues.jboss.org/browse/WFLY-1810) contains a link to a pull request against WildFly (https://github.com/wildfly/wildfly/pull/4876). I took the patch, applied it to EAP and ran the test again. No good, the issue is still there :-( The run: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-2clusters-ejbremote-shutdown-repl-async-timeout-investigation/13/

Doh! I'm sorry, I accidentally used pure 6.1.1.ER4 without the fix in the build from comment 10. I did another run, this time with the fix for real, but the problem persists. See https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-2clusters-ejbremote-shutdown-repl-async-timeout-investigation/14/

I wonder whether the issue is caused by the server trying to recover the transactions during shutdown.

Assigning jpai EJB issues to david.lloyd. Please re-assign to Cheng or others as needed.

@Láďo, I'm trying to get rid of old open bugzillas for EJB. Can you check whether this still occurs in EAP 6.3? If yes, we can propose it for 6.4.

Yes, we're still seeing this, see e.g. https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/eap-6x-failover-ejb-2clusters-ejbremote-shutdown-repl-async/39/ for EAP 6.3.0.ER9. Proposing for 6.4.
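
The suspected lock-release problem from the investigation comment (an unlock that happens outside the context of the lock owner, followed by a failing tryLock for a new owner) can be illustrated with a small standalone sketch. This is only an illustration under stated assumptions: `OwnerLock`, `OwnerLockSketch` and `secondOwnerAcquires` are hypothetical names, the owner is modelled as a plain object, and the timeout is shortened to 50 ms. It is not the EAP or WildFly code.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; this is not the EAP or WildFly implementation.
final class OwnerLockSketch {

    /** A minimal lock that remembers which owner currently holds it. */
    static final class OwnerLock {
        private Object owner; // null means unlocked; guarded by "this"

        synchronized boolean tryLock(Object requester, long timeout, TimeUnit unit)
                throws InterruptedException {
            long deadline = System.nanoTime() + unit.toNanos(timeout);
            while (owner != null && owner != requester) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    return false; // a caller would report a concurrent access timeout here
                }
                TimeUnit.NANOSECONDS.timedWait(this, remaining);
            }
            owner = requester;
            return true;
        }

        /** Releases the lock only if the requester is the current owner. */
        synchronized boolean unlock(Object requester) {
            if (owner != requester) {
                return false; // wrong context: the lock stays held by the old owner
            }
            owner = null;
            notifyAll();
            return true;
        }
    }

    /** Returns true if a second owner can take the lock after the first one releases it. */
    static boolean secondOwnerAcquires(boolean unlockInOwnerContext) {
        OwnerLock lock = new OwnerLock();
        Object firstOwner = new Object();  // stands in for the first invocation's context
        Object secondOwner = new Object(); // stands in for the next invocation's context
        try {
            lock.tryLock(firstOwner, 50, TimeUnit.MILLISECONDS);
            // Assumption: a failed callback causes the release to run with no owner context.
            Object releaser = unlockInOwnerContext ? firstOwner : null;
            lock.unlock(releaser);
            return lock.tryLock(secondOwner, 50, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("unlock in owner context, next owner acquires: "
                + secondOwnerAcquires(true));
        System.out.println("unlock outside owner context, next owner acquires: "
                + secondOwnerAcquires(false));
    }
}
```

In the second case the release is silently ineffective, so the next caller waits for the full timeout and gives up, which would match the "could not obtain lock" symptom if the hypothesis in the comment holds.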
Ladislav, David, could one of you please provide a draft release note in the Doc Text field above? I'm having trouble parsing the circumstances/consequences of the issue. Much appreciated.

There are 2 clusters, each of them having 2 nodes.