Bug 881080 - Silence SuspectExceptions
Summary: Silence SuspectExceptions
Keywords:
Status: ASSIGNED
Alias: None
Product: JBoss Data Grid 6
Classification: JBoss
Component: Infinispan
Version: 6.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 6.2.0
Assignee: Tristan Tarrant
QA Contact: Nobody
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-11-28 15:32 UTC by Michal Linhard
Modified: 2023-03-02 08:27 UTC

Fixed In Version:
Doc Type: Known Issue
Doc Text:
In Red Hat JBoss Data Grid, <literal>SuspectExceptions</literal> are routinely raised when nodes shut down, because the departing nodes become unresponsive during shutdown. As a result, <literal>SuspectException</literal> errors are added to the logs. These exceptions do not affect data integrity. This is a known issue in JBoss Data Grid 6.4, and no workaround is currently available.
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | ISPN-2577 | 0 | Critical | Resolved | Silence SuspectExceptions | 2018-01-22 05:24:23 UTC

Description Michal Linhard 2012-11-28 15:32:22 UTC
There are lots of SuspectExceptions in the resilience test run:

https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/EDG6/view/EDG-REPORTS-RESILIENCE/job/edg-60-resilience-dist-4-3/54/artifact/report/serverlogs.zip

We managed to keep these under cover for the 5.1.x branch.
This is of course a cosmetic problem, but it pollutes the log with unnecessary ERRORs.

Parsed logs here:
http://www.qa.jboss.com/~mlinhard/test_results/run54-parsed/

Comment 1 Michal Linhard 2012-12-03 08:37:01 UTC
Created ISPN JIRA

Comment 2 JBoss JIRA Server 2013-01-15 15:46:59 UTC
Mircea Markus <mmarkus> made a comment on jira ISPN-2577

This has more of a cosmetic impact, but it would be nice to have for the final release.

Comment 3 JBoss JIRA Server 2013-01-25 11:10:37 UTC
Michal Linhard <mlinhard> made a comment on jira ISPN-2577

I didn't want to create a new JIRA for this kind of cosmetic task, but if we're doing this SuspectException silencing, let's sync it with the behaviour of the hotrod client. If we have a RetryOnFailureOperation, should it log an error while it is still in retry mode? My proposal would be to switch this to a warning.

{code}
09:25:19,154 ERROR [org.jboss.smartfrog.jdg.loaddriver.DriverThread] (DriverThread-378) Error doing: PUT key655878 to node node0003, took 1395 ms
org.infinispan.client.hotrod.exceptions.RemoteNodeSuspecException:Request for message id[1981607] returned server error (status=0x85): org.infinispan.remoting.transport.jgroups.SuspectException: One or more nodes have left the cluster while replicating command SingleRpcCommand{cacheName='testCache', command=PutKeyValueCommand{key=ByteArrayKey{data=ByteArray{size=12, hashCode=727c8fa, array=0x033e096b65793635..}}, value=CacheValue{data=ByteArray{size=1029, array=0x034304002106020a..}, version=9007207845383422}, flags=[IGNORE_RETURN_VALUES], putIfAbsent=false, lifespanMillis=-1, maxIdleTimeMillis=-1, successful=true}}
	at org.infinispan.client.hotrod.impl.protocol.Codec10.checkForErrorsInResponseStatus(Codec10.java:153)
	at org.infinispan.client.hotrod.impl.protocol.Codec10.readHeader(Codec10.java:110)
	at org.infinispan.client.hotrod.impl.operations.HotRodOperation.readHeaderAndValidate(HotRodOperation.java:78)
	at org.infinispan.client.hotrod.impl.operations.AbstractKeyValueOperation.sendPutOperation(AbstractKeyValueOperation.java:72)
	at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:52)
	at org.infinispan.client.hotrod.impl.operations.PutOperation.executeOperation(PutOperation.java:41)
	at org.infinispan.client.hotrod.impl.operations.RetryOnFailureOperation.execute(RetryOnFailureOperation.java:68)
	at org.infinispan.client.hotrod.impl.RemoteCacheImpl.put(RemoteCacheImpl.java:231)
	at org.infinispan.CacheSupport.put(CacheSupport.java:53)
	at org.jboss.qa.jdg.adapter.HotRodAdapter$HotRodRemoteCacheAdapter.put(HotRodAdapter.java:247)
	at org.jboss.qa.jdg.adapter.HotRodAdapter$HotRodRemoteCacheAdapter.put(HotRodAdapter.java:232)
	at org.jboss.smartfrog.jdg.loaddriver.DriverThreadImpl.makeRequest(DriverThreadImpl.java:236)
	at org.jboss.smartfrog.jdg.loaddriver.DriverThreadImpl.run(DriverThreadImpl.java:331)

{code}
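
The proposal above (warn while a retry is still possible, report an error only once retries are exhausted) could be sketched roughly as follows. This is a minimal illustration and not the Hot Rod client's actual RetryOnFailureOperation: the class RetryLoggingSketch, the exception type TransientNodeFailure, and the method executeWithRetry are all hypothetical names, and java.util.logging levels are used only as stand-ins for WARN and ERROR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.logging.Level;

// Hypothetical sketch; NOT the Hot Rod client's RetryOnFailureOperation.
final class RetryLoggingSketch {

    /** Hypothetical stand-in for a transient "node suspected" failure. */
    static final class TransientNodeFailure extends RuntimeException {
        TransientNodeFailure(String message) {
            super(message);
        }
    }

    /** Records the level of every logged message so the behaviour can be inspected. */
    static final List<Level> loggedLevels = new ArrayList<>();

    static void log(Level level, String message) {
        loggedLevels.add(level);
        System.out.println(level + " " + message);
    }

    /**
     * Runs the operation up to maxAttempts times. A transient failure is logged
     * as WARNING while another attempt remains, and as SEVERE only when the
     * last attempt has failed and the exception is rethrown to the caller.
     */
    static <T> T executeWithRetry(Supplier<T> operation, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.get();
            } catch (TransientNodeFailure e) {
                if (attempt >= maxAttempts) {
                    log(Level.SEVERE, "Giving up after " + attempt + " attempts: " + e.getMessage());
                    throw e;
                }
                log(Level.WARNING, "Attempt " + attempt + " failed, retrying: " + e.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // The first two calls fail, the third succeeds.
        String result = executeWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new TransientNodeFailure("node left the cluster");
            }
            return "OK";
        }, 5);
        System.out.println(result + " " + loggedLevels);
    }
}
```

With this shape, a load driver that eventually succeeds sees only warnings in its log, and an error appears only when the operation genuinely fails.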

Comment 4 JBoss JIRA Server 2013-01-28 16:35:56 UTC
Galder Zamarreño <galder.zamarreno> made a comment on jira ISPN-2577

We should also silence situations like ISPN-2752. In that JIRA, a node is trying to establish a new view sending a cache topology control view, but the node is stopping.

Comment 8 Misha H. Ali 2013-05-07 03:25:22 UTC
Nominated for 6.2 release notes.

Comment 10 Michal Linhard 2013-08-27 07:05:18 UTC
Still present in JDG 6.2.0.DR3

Comment 11 JBoss JIRA Server 2013-11-21 09:47:47 UTC
Anuj Shah <anujshahwork> made a comment on jira ISPN-2577

There's a discussion here:
https://community.jboss.org/message/846155

This issue may be more than cosmetic

Comment 12 JBoss JIRA Server 2013-11-23 10:07:50 UTC
Michal Linhard <mlinhard> made a comment on jira ISPN-2577

This issue is about silencing the SuspectExceptions - not displaying them as error-level messages, since it's not an exceptional situation that a node is suspected during topology changes.

Of course currently SuspectExceptions may accompany other serious errors, but those should be logged separately.
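
One way to picture the distinction drawn here is a small helper that picks the log level from the exception's cause chain. This is only a sketch under assumptions, not Infinispan code: SuspectLogLevelSketch, NodeSuspected, isCausedBySuspect, and levelFor are hypothetical names, and the topologyChanging flag stands in for however the server would know that a topology change is under way.

```java
import java.util.logging.Level;

// Hypothetical sketch; none of these names exist in Infinispan.
final class SuspectLogLevelSketch {

    /** Hypothetical stand-in for a "node suspected" exception. */
    static final class NodeSuspected extends RuntimeException {
        NodeSuspected(String message) {
            super(message);
        }
    }

    /** Returns true if the throwable or any of its causes is a NodeSuspected. */
    static boolean isCausedBySuspect(Throwable t) {
        // The depth bound keeps a cyclic cause chain from looping forever.
        for (int depth = 0; t != null && depth < 32; depth++, t = t.getCause()) {
            if (t instanceof NodeSuspected) {
                return true;
            }
        }
        return false;
    }

    /**
     * A suspected node during a topology change is expected, so it is logged
     * at a low level. Everything else stays at SEVERE so that serious errors
     * are still reported.
     */
    static Level levelFor(Throwable t, boolean topologyChanging) {
        return (topologyChanging && isCausedBySuspect(t)) ? Level.FINE : Level.SEVERE;
    }

    public static void main(String[] args) {
        Throwable wrapped = new RuntimeException("replication failed", new NodeSuspected("node0003 left"));
        System.out.println(levelFor(wrapped, true));
        System.out.println(levelFor(new IllegalStateException("cache corrupted"), true));
    }
}
```

An unrelated failure that happens at the same time keeps its error level, which matches the requirement that serious errors be logged separately.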

Comment 13 Michal Linhard 2013-11-23 10:08:20 UTC
Still present in 6.2.0.ER4

Comment 16 JBoss JIRA Server 2015-05-15 09:24:10 UTC
Dan Berindei <dberinde> updated the status of jira ISPN-2577 to Resolved

