Created attachment 920324 [details]
mvn projects and server.logs
I am running a 2 node DIST Cache (also tested 4 nodes) in which each node has access to its own JDBC Cache Store.
node 1 has 3.5M entries with the key range (1,3500000)
node 2 has 3.5M entries with the key range (3500001,7000000)
Each node starts and preloads 3.5M keys - correct
Sometimes the nodes do slight rebalancing and sometimes they do not rebalance.
However, after the startup, cache.size() on each node returns 3500000 not 7000000.
I have a servlet that gets the first 100 entries (key range 1-100).
If I run the servlet on the first node it will work. If I run the servlet on the second node it will return null for all 100 entries since both nodes only seem to see the entries they preloaded.
It's acting as if each node is its own cache, even though its in DIST mode and sometimes they do some sort of rebalance.
I am attaching the code/configs and log files. I have this running on a VM inside the RH network, so please ask to see a demo of this issue.
Not sure if this is a docs issue. Copying in Divya. Divya, could you comment on whether this should be a dev issue (which is what it looks like to me) rather than just something we need to fix in docs.
the description suggests that each node has its own JDBC cache store defined. Does it mean that they use a "shared" JDBC cache store and all nodes load the data from a single database? Or are there two databases (each node loading from a different database)? In that case, the shared="true" attribute (which I've seen in the configuration) would not make sense.
John, after taking another look at the configuration, I think I found the problem.
You have fetchInMemoryState="false" which means that a node that is started does not fetch in-memory state from the other nodes. So it only has what was loaded from the JDBC cache store. And since cache.size() only returns the number of elements stored on that single node, you see only entries which you loaded from the database. However, when you call a get() operation, it should load the entry from the other nodes, you should not get null. Can you confirm that setting fetchInMemoryState back to default "true" fixes your problem?
Note that keySet() operation in library mode returns only the locally stored entries. This operation does not fetch (AFAIK) entries from other nodes.
I'll check this week and update the case. Sorry for the delay.
In the test I just ran, I had two nodes, each loading 3.5M entries.
When I set fetchInMemoryState="true" the node entries start transferring.
On node 1 none of the nodes transferred and then I received timeout ERRORs, despite increasing the timeouts by 10-100x from the defaults.
On node 2 they were transferring but they are very slow. I am receiving roughly 12.5K entries (each entry is 1K so 12.5 MB) about every 45 seconds. That is transfer rate of about 277KB/s. The VMs are on the same host and if I run scp I get transfer rate of about 120MB/s.
How quickly should I see entries transferred?
What should my timeout values be when running several GBs of data?
After 30 minutes. Only 1M entries (1GB) had been transferred to node 2.
Node 1 was not working properly and the logs are filled with timeout errors.
I have a couple of questions for you about this report.
1) Can you explain what you mean by "the entries start transferring"? Do you mean that you are able to call egt() on entries that exist on the other node or do you mean that the number of entries on each node is changing?
2) When you stop and restart the EAP nodes, do you do them in the same order each time?
OK, you can ignore the questions in Comment 10: https://bugzilla.redhat.com/show_bug.cgi?id=1122662#c10
I think I have determined that the configuration you are using is not going to work well. (i.e. using a shared JDBC cache store with preloading enabled on each node and separate database tables for each node) Let me describe the problem that I ran into. Here are the steps I used:
1) Start two EAP nodes that have no tables in Postgress
2) Add 100 entries on each node
3) Traverse the entries on both nodes (200 entries are found on both nodes.)
4) Shut down both EAP instances
5) Start them back up and make sure they are clustered
6) Repeat step (3) (On the node that is started first, only the number of keys in the table for that node are found. On the second node, all 200 keys are found.)
When each node starts, it preloads the entries from the database into a local mode cache. When the second node starts, state transfer sends entries from node 1 to node 2. That's why the second node can traverse all entries in the cache. Infinispan expects that in a cluster configured with a shared cache store that one node in the cluster will load all entries when preload is enabled. The other issue with this configuration is that if a node fails, then the entries from that node will not be available to the other nodes in the cluster.
I'm going to talk to engineering about whether this configuration should be legal, but I wanted to let you know what I had found out so far.
This is the response that I got from Dan on the engineering team:
"Nope, it's not legal. If shared="true" then it means both nodes should have access to the same data in the cache store. However, John's problems are not really related to using shared="true", the behaviour with shared="false" would be the same.
When node 2 starts, it will become an owner for roughly half of the keys/hash segments.
Node 1 assumes that it had access to all the entries from the beginning. So it never does a remote get on node 2, and returns null if it doesn't find the key in memory or in the cache store.
Node 2 will receive the entries of node 1 through state transfer (assuming fetchInMemoryState = true), because it thinks it doesn't have any data. It assumes the preloaded entries are stale, so it will override any preloaded entry if it receives an entry with the same key from node 1. It does not delete the preloaded entries however, so it will have all the entries locally, which adds more confusion."
I'm going to close this as NOTABUG, and I'm going to open a JIRA that logs a warning if Infinispan detects this configuration.