Bug 1002957
| Summary: | Hot Rod client doesn't retry operation on RemoteException | ||
|---|---|---|---|
| Product: | [JBoss] JBoss Data Grid 6 | Reporter: | Michal Linhard <mlinhard> |
| Component: | Infinispan | Assignee: | Tristan Tarrant <ttarrant> |
| Status: | CLOSED UPSTREAM | QA Contact: | Nobody <nobody> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 6.2.0 | CC: | jdg-bugs, mgencur, nobody |
| Target Milestone: | ER4 | ||
| Target Release: | 6.2.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Known Issue | |
| Doc Text: |
The Java Hot Rod client does not recover by retrying after certain types of remote-side exceptions, even though retrying should be possible. This also happens when org.infinispan.remoting.RemoteException is thrown on the server. A HotRodClientException is thrown on the client side, the operation is not retried, and the issue must be handled by user code.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2025-02-10 03:28:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1017190 | ||
|
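
The Doc Text above leaves recovery to user code. A minimal sketch of such a client-side retry wrapper follows, assuming the operation is safe to repeat. `RetryingCall`, `withRetry` and the nested `HotRodClientException` are hypothetical stand-ins so the example compiles without the Infinispan client on the classpath; real code would catch the client library's own HotRodClientException instead.

```java
import java.util.function.Supplier;

class RetryingCall {
    // Hypothetical stand-in for the Hot Rod client's HotRodClientException,
    // declared here only so the sketch is self-contained.
    static class HotRodClientException extends RuntimeException {
        HotRodClientException(String msg) { super(msg); }
    }

    // Runs op up to maxAttempts times, retrying only on HotRodClientException.
    // Rethrows the last failure if every attempt fails.
    static <T> T withRetry(Supplier<T> op, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        HotRodClientException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (HotRodClientException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated cache operation: fails twice, then succeeds.
        String value = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new HotRodClientException("simulated remote failure");
            }
            return "value";
        }, 5);
        System.out.println(value + " after " + calls[0] + " attempts");
    }
}
```

Retrying blindly is only reasonable for idempotent operations. A real application would probably also bound the total time spent and add a short delay between attempts.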
Description
Michal Linhard
2013-08-30 11:01:55 UTC
Is this a GA blocker? There's no data loss here and there's a workaround: the client can retry.

Galder Zamarreño <galder.zamarreno> updated the status of jira ISPN-3454 to Coding In Progress

Michal Linhard <mlinhard> made a comment on jira ISPN-3454
TRACE logs
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/RESILIENCE/job/jdg-func-resilience-dist-4-3/90/artifact/report/clientlogs.zip
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/RESILIENCE/job/jdg-func-resilience-dist-4-3/90/artifact/report/serverlogs.zip

Galder Zamarreño <galder.zamarreno> made a comment on jira ISPN-3454
This is a side effect of ISPN-2737, where we introduced RemoteException in order to report application-level exceptions (e.g. lock timeout). What's happening here is that on some occasions [SuspectException is treated as a RemoteException|https://gist.github.com/galderz/bd1a71a06fa24efa7792], and on others it just bubbles up as [SuspectException|https://gist.github.com/galderz/a121b1727a7f4ac1b78c]. The bug here is that SuspectExceptions should not be treated as RemoteException. RemoteException instances should be reported back to the client, as happens here, since they are application-level errors.

Galder Zamarreño <galder.zamarreno> made a comment on jira ISPN-3454
The reason this is happening is that node02 has been suspected. When a plain SuspectException is returned, node01 has suspected node02 [directly|https://gist.github.com/galderz/a121b1727a7f4ac1b78c]. SuspectException is wrapped in a RemoteException when node01 receives a response from node04 saying that node02 has been [suspected|https://gist.github.com/galderz/bd1a71a06fa24efa7792]. RemoteException handling logic should check the type of the exception returned in the response. Fixing this should be relatively straightforward. What's much harder is having a unit test to verify this. I'll see what can be done here.
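
The check proposed in the comment above, looking at the type of the exception carried inside a RemoteException, could be sketched roughly as follows. This is not the actual ISPN-3454 fix: `RemoteCauseCheck`, `causedBySuspect` and `shouldRetry` are hypothetical names, and the two nested exception classes are simplified stand-ins for Infinispan's RemoteException and SuspectException so that the example compiles on its own.

```java
class RemoteCauseCheck {
    // Hypothetical stand-ins for Infinispan's RemoteException and
    // SuspectException, declared here only to keep the sketch self-contained.
    static class RemoteException extends RuntimeException {
        RemoteException(String msg, Throwable cause) { super(msg, cause); }
    }
    static class SuspectException extends RuntimeException {
        SuspectException(String msg) { super(msg); }
    }

    // Returns true if t, or anything in its cause chain, is a SuspectException.
    static boolean causedBySuspect(Throwable t) {
        int depth = 0;
        while (t != null && depth++ < 32) { // bound guards against cyclic cause chains
            if (t instanceof SuspectException) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }

    // A suspected node is a topology-level failure, so the operation can be
    // retried; any other RemoteException is treated as application-level
    // and reported to the caller.
    static boolean shouldRetry(Throwable t) {
        return causedBySuspect(t);
    }

    public static void main(String[] args) {
        Throwable direct = new SuspectException("node02 suspected by node01");
        Throwable wrapped = new RemoteException("response from node04", direct);
        Throwable appLevel = new RemoteException("lock timeout",
                new IllegalStateException("timed out acquiring lock"));

        System.out.println(shouldRetry(direct));   // true
        System.out.println(shouldRetry(wrapped));  // true
        System.out.println(shouldRetry(appLevel)); // false
    }
}
```

Walking the whole cause chain rather than checking only the immediate cause keeps the check working if the SuspectException ends up wrapped more than once on its way back to node01.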

Galder Zamarreño <galder.zamarreno> made a comment on jira ISPN-3454
Try [branch|https://github.com/galderz/infinispan/tree/t_3454] with possible fix in resilience jobs.

Michal Linhard <mlinhard> made a comment on jira ISPN-3454
[~galder.zamarreno] I've got a compilation failure on both Jenkins and my laptop. Do you know what the problem could be?
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-qe-build-infinispan60-specific/6/console
{code}
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 18.679s
[INFO] Finished at: Wed Nov 06 11:17:09 CET 2013
[INFO] Final Memory: 45M/513M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile) on project infinispan-core: Compilation failure
[ERROR] /home/mlinhard/dev/source/jdg-branch/t_3454/infinispan/core/src/main/java/org/infinispan/util/logging/Log.java:[1025,21]
[ERROR] -> [Help 1]
[ERROR]
{code}
When I open the file in Eclipse I can't see any error, but the Maven build fails.

Galder Zamarreño <galder.zamarreno> made a comment on jira ISPN-3454
There were a couple of errors in the message logging part, which were detected by the annotation processor. I've updated the branch.

Michal Linhard <mlinhard> made a comment on jira ISPN-3454
I ran the 4-node dist mode resilience test twice with your branch and didn't spot the RemoteException.
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/RESILIENCE/job/jdg-func-resilience-dist-4-3/91/artifact/report/loganalysis/client/index.html
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/JDG/view/RESILIENCE/job/jdg-func-resilience-dist-4-3/97/artifact/report/loganalysis/client/index.html

Galder Zamarreño <galder.zamarreno> made a comment on jira ISPN-3454
Thanks [~mlinhard] for running these tests. Glad to hear that the issue appears to be gone. I'll submit a PR asap.

Galder Zamarreño <galder.zamarreno> made a comment on jira ISPN-3454
ISPN-2737 caused this issue.

Verified for 6.2.0.ER4

This product has been discontinued or is no longer tracked in Red Hat Bugzilla.