Bug 1204055
Summary: | EJB client regression in 6.4.0.CR1 | ||
---|---|---|---|
Product: | [JBoss] JBoss Enterprise Application Platform 6 | Reporter: | Ladislav Thon <lthon> |
Component: | Clustering, EJB | Assignee: | Tomas Hofman <thofman> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Jitka Kozana <jkudrnac> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | ||
Version: | 6.4.0 | CC: | cdewolf, jmartisk, kkhan, mvinkler, myarboro, rachmato, thofman, wfink |
Target Milestone: | CR2 | Keywords: | Regression |
Target Release: | EAP 6.4.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-08-19 12:43:25 UTC | Type: | Bug |
Bug Blocks: | 1206597 |
Description
Ladislav Thon
2015-03-20 09:44:56 UTC
Been on PTO for the past two days. Let me have a look this afternoon.

Ladislav, could you please run the test again with "debug" level logging turned on for these classes:

org.jboss.ejb.client.remoting.MaxAttemptsReconnectHandler
org.jboss.ejb.client.EJBClientContext
org.jboss.ejb.client.ClusterContext

This should give us some more information on what is happening when the EJBReceiver is being selected. To be safe, add this on both the original client and the forwarder.

The original runs mentioned above already have debug logging enabled for the org.jboss.ejb.client package, but only for the in-server EJB client. Here's a run with the same debug logging enabled for the standalone client as well:

https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/lthon___debug___eap-6x-failover-ejb-2clusters-ejbremote-shutdown-repl-async/1/

Note that I had to run it on different machines: here, perf29 is the standalone client, perf30-31 is the target cluster and perf32-33 is the forwarder cluster (so it's just like the original + 12).

I can see where the problem is. When the nodes {perf30, perf31} are taken down individually, module unavailability reports are sent to the clients {perf32, perf33}. For some reason, the module unavailability reports for clusterbench-ee6, clusterbench-ee6-we, and clusterbench-ee6-web-granular arrive together (on perf32, for example), yet the unavailability report for clusterbench-ee6-ejb (the app receiving the invocations and causing the problems) arrives 33 seconds later. This means that for those 33 seconds perf32 thinks the module is available when it is not: it has been undeployed or is in the process of being undeployed. The perf32 client receives exceptional return results to that effect when the invocation fails and, at this stage, it should try the other node, perf31. But it doesn't; it repeatedly tries perf30. After the unavailability report arrives, things return to normal. Investigating further.
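
To make the expected behaviour concrete, here is a minimal sketch of the retry logic described above, under stated assumptions. It is not the jboss-ejb-client code or its API: the class FailoverSketch, the method invokeWithFailover and the exception ModuleUnavailable are hypothetical names used only for illustration. The client's view still lists perf30 because the unavailability report has not arrived yet, so the first attempt fails; the point is that the failed node is remembered for the rest of the invocation and the next attempt goes to perf31.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical model of node selection with a stale availability view.
// None of these names come from the real EJB client library.
public class FailoverSketch {

    // Raised when every node in the client's view has been tried and failed.
    public static class ModuleUnavailable extends RuntimeException {
        public ModuleUnavailable(String message) {
            super(message);
        }
    }

    // believedAvailable: nodes the client still thinks host the module (may be stale).
    // moduleDeployedOn:  ground truth, true if the node can actually serve the call.
    // attempts:          filled with every node tried, in order, for inspection.
    public static String invokeWithFailover(List<String> believedAvailable,
                                            Predicate<String> moduleDeployedOn,
                                            List<String> attempts) {
        Set<String> excluded = new HashSet<>();
        for (String node : believedAvailable) {
            if (excluded.contains(node)) {
                continue; // already failed during this invocation
            }
            attempts.add(node);
            if (moduleDeployedOn.test(node)) {
                return node; // invocation succeeded on this node
            }
            // The invocation failed: remember the node so it is not picked again.
            // Skipping this step models the reported symptom (perf30 retried forever).
            excluded.add(node);
        }
        throw new ModuleUnavailable("no node left to try: " + believedAvailable);
    }

    public static void main(String[] args) {
        // Stale view on perf32: perf30 is still listed although the module is gone.
        List<String> view = Arrays.asList("perf30", "perf31");
        List<String> attempts = new ArrayList<>();
        String served = invokeWithFailover(view, node -> node.equals("perf31"), attempts);
        System.out.println(served + " after attempts " + attempts);
        // prints "perf31 after attempts [perf30, perf31]"
    }
}
```

The exclusion set is scoped to a single invocation on purpose: the client's longer-lived view is only corrected when the unavailability report finally arrives, which matches the observation that things return to normal at that point.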

Ladislav, I could really benefit from being able to pass in my own build (with changes to the EJB client libraries) and run the test case against that build. Would it be possible to set up a copy of the SmartFrog Jenkins job which runs this test so that I can pass in my own copy of either the whole build or just a replacement EJBClient jar? How busy is the lab these days for running such jobs? If I could get a run in every couple of hours, that would be fine. I'll be up early tomorrow so that I might have more overlap with Europe.

Upgrade to ejb client 1.0.30

BZ1206597 backs out the fix for BZ1192471, so this regression should be gone in CR2.

Verified with EAP 6.4.0.CR2.