Bug 847809 - Cluster with non-shared JDBC cache store has too many entries after node failure
Status: ASSIGNED
Product: JBoss Data Grid 6
Classification: JBoss
Component: Infinispan
Version: unspecified
Hardware: Unspecified Unspecified
Priority: unspecified Severity: high
Target Milestone: GA
Target Release: 7.0.0
Assigned To: Tristan Tarrant
QA Contact: Martin Gencur
Depends On:
Blocks:
Reported: 2012-08-13 11:48 EDT by Radim Vansa
Modified: 2015-06-03 19:10 EDT

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
During a node restart, Red Hat JBoss Data Grid may not start correctly because of duplicate entries in a cache. As a workaround, use a shared cache store instead of local cache stores. Using this workaround, JBoss Data Grid works correctly across restarts and the cache store does not contain duplicate entries.
Type: Bug


Attachments
JDG standalone configuration file (14.34 KB, text/xml)
2012-08-13 11:48 EDT, Radim Vansa
Trace output (1.02 MB, text/plain)
2012-08-14 10:21 EDT, Radim Vansa


External Trackers
Tracker: JBoss Issue Tracker
ID: ISPN-2198
Priority: Major
Status: Resolved
Summary: Cluster with non-shared JDBC cache store has too much entries after node failure
Last Updated: 2016-03-31 13:59 EDT

Description Radim Vansa 2012-08-13 11:48:49 EDT
Created attachment 604033
JDG standalone configuration file

Description of problem:

In a resilience test with a 4-node cluster in which one node is killed, a strange situation appears. Before the node is killed, the nodes hold this number of entries:

210602;215820;209400;203038 = 838860 entries

After the kill the number of entries changes for a while:

210602;null;209400;203038
250602;null;269400;243038
290602;null;269400;273038
300602;null;289400;293038
300602;null;289400;293038
321218;null;296035;293038

But then it stabilizes at

326899;null;305039;314165 = 946103 entries

When node02 is restarted, it complains about duplicate entries:

ERROR [org.infinispan.loaders.jdbc.stringbased.JdbcStringBasedCacheStore] (OOB-124,null) ISPN008024: Error while storing string key to database; key: '8Az4Ia2V5NzYzNDI=', buffer size of value: 1050 bytes: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '?8Az4Ia2V5NzYzNDI=' for key 'PRIMARY'

Is this a bug or a wrong configuration?


Version-Release number of selected component (if applicable):

6.0.1.ER2
Comment 1 JBoss JIRA Server 2012-08-14 07:34:14 EDT
Galder Zamarreño <galder.zamarreno@redhat.com> made a comment on jira ISPN-2198

Hmmm, odd. The key should be '8Az4Ia2V5NzYzNDI=', but the exception says that the key is 'PRIMARY'?

@Radim, can you replicate this on a smaller scale and generate some logs with TRACE on for the org.infinispan package?

Tristan/Mircea, can either of you have a look at this?
Comment 2 JBoss JIRA Server 2012-08-14 10:06:05 EDT
Radim Vansa <rvansa@redhat.com> made a comment on jira ISPN-2198

I have used only 26 entries (with 2 owners each) and one client asking for the entries (there's still enough jabber).
The sfout.txt contains the org.infinispan TRACE log (together with the test log); the cache_entries shows that originally the cluster had 18+10+13+11=52 entries and after the kill it has 22+16+20=58.
Comment 3 JBoss JIRA Server 2012-08-14 10:06:51 EDT
Radim Vansa <rvansa@redhat.com> made a comment on jira ISPN-2198

I have used only 26 entries (with 2 owners each) and one client asking for the entries (there's still enough jabber).
The sfout.txt contains the org.infinispan TRACE log (together with the test log); the cache_entries.csv shows that originally the cluster had 18+10+13+11=52 entries and after the kill it has 22+16+20=58.
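
The counts above can be checked with a little arithmetic. The following is only a sketch: the class and method names are hypothetical, and it assumes that the reported per-node count includes one entry per stored copy and that no new keys are added during the test. Under those assumptions, 26 keys with 2 owners each should total 52 copies both before and after the kill, so the observed 58 would mean 6 surplus copies.

```java
import java.util.stream.IntStream;

// Hypothetical helper, not part of Infinispan or the test suite.
final class EntryCountCheck {

    // Sum of the per-node entry counts reported for the cluster.
    static int total(int... perNode) {
        return IntStream.of(perNode).sum();
    }

    public static void main(String[] args) {
        int keys = 26;    // from the comment above
        int owners = 2;   // from the comment above

        // Assumption: each owner's copy is counted once.
        int expected = keys * owners;

        int before = total(18, 10, 13, 11); // four nodes before the kill
        int after = total(22, 16, 20);      // three surviving nodes

        System.out.println("expected=" + expected
                + " before=" + before
                + " after=" + after
                + " surplus=" + (after - expected));
        // prints: expected=52 before=52 after=58 surplus=6
    }
}
```

The same comparison on the original run gives 838860 entries before the kill and 946103 after it stabilizes, a growth of 107243 entries.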
Comment 4 Radim Vansa 2012-08-14 10:21:56 EDT
Created attachment 604320
Trace output
Comment 5 JBoss JIRA Server 2012-08-17 09:52:29 EDT
Mircea Markus <mmarkus@redhat.com> made a comment on jira ISPN-2198

Couldn't reproduce the issue locally through a unit test. Waiting for Radim to upload the trace log files from his environment.
Comment 6 JBoss JIRA Server 2012-08-17 10:12:27 EDT
Radim Vansa <rvansa@redhat.com> made a comment on jira ISPN-2198

As requested by mmarkus, I enclose more logs.
Comment 7 JBoss JIRA Server 2012-08-20 06:25:57 EDT
Mircea Markus <mmarkus@redhat.com> made a comment on jira ISPN-2198

@Radim - the attached log files were produced with DEBUG level enabled. This is not enough for me, as it doesn't show the individual keys added to the cache. Can you please reproduce with TRACE level?
Comment 8 JBoss JIRA Server 2012-08-20 19:08:19 EDT
Mircea Markus <mmarkus@redhat.com> made a comment on jira ISPN-2198

Looking at the attached logs I can see that a put takes place on node 4 *at the same time* as the other server (number 2) is shut down (time 12:51:17,827).
My understanding of the problem is that there's no (put) activity *during* and after the shutdown - otherwise the increasing number of entries might simply be explained by the addition of more entries to the system. Can you please confirm this?
I also didn't see any size() being invoked on all the caches in the cluster (e.g. on node 3) - how was the size of each individual cache obtained?
Comment 9 JBoss JIRA Server 2012-08-21 03:16:48 EDT
Radim Vansa <rvansa@redhat.com> made a comment on jira ISPN-2198

The client thread is doing puts and gets all the time, however, the set of keys it uses is static and therefore no other keys should be added to the cache.
The statistics are obtained through JMX from jboss.infinispan:type=Cache,name="testCache(dist_sync)",manager="default",component=Statistics, by querying the attribute numberOfEntries.
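
A JMX read of that attribute could be sketched roughly as follows, using only the standard javax.management API. The class and method names are hypothetical, and the JMX service URLs are taken from the command line because this report does not say how the test connected to each node; depending on the server, a vendor-specific JMX protocol and client library may be needed on the classpath (an assumption, not something stated here).

```java
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical probe, not part of the test suite described in this bug.
final class EntryCountProbe {

    // Builds the Statistics MBean name quoted in the comment above.
    static ObjectName statisticsName(String cacheName, String managerName) {
        try {
            return new ObjectName("jboss.infinispan:type=Cache"
                    + ",name=" + ObjectName.quote(cacheName)
                    + ",manager=" + ObjectName.quote(managerName)
                    + ",component=Statistics");
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Reads the numberOfEntries attribute over an existing connection.
    static Object numberOfEntries(MBeanServerConnection conn, ObjectName name)
            throws Exception {
        return conn.getAttribute(name, "numberOfEntries");
    }

    public static void main(String[] args) throws Exception {
        ObjectName name = statisticsName("testCache(dist_sync)", "default");
        if (args.length == 0) {
            // No URL given: just show which MBean would be queried.
            System.out.println(name);
            return;
        }
        // Each argument is assumed to be a JMX service URL of one node.
        for (String url : args) {
            try (JMXConnector connector =
                         JMXConnectorFactory.connect(new JMXServiceURL(url))) {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                System.out.println(url + " -> " + numberOfEntries(conn, name));
            }
        }
    }
}
```

Running it once per node and summing the results gives cluster-wide totals of the kind quoted in the description.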
Comment 10 mark yarborough 2012-08-22 09:37:24 EDT
Moving to 6.1 since this is not a regression.
Comment 11 mark yarborough 2012-08-22 09:37:24 EDT
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
CCFR from mmarkus
Comment 12 Mircea Markus 2012-08-22 10:55:52 EDT
The root cause of this problem still needs to be analysed. As a workaround, using a shared cache store (vs local cache stores) should work.
Comment 15 Misha H. Ali 2013-05-06 23:44:03 EDT
Set flag to nominate for 6.2 release notes.
Comment 17 Misha H. Ali 2014-07-14 05:17:03 EDT
Not required for release notes.
Comment 18 JBoss JIRA Server 2015-06-03 09:22:58 EDT
Dan Berindei <dberinde@redhat.com> updated the status of jira ISPN-2198 to Resolved
