Bug 1086180

Summary: REST Rolling upgrades (since 6.4) can't fetch data from REST store which has disabled compatibility mode
Product: [JBoss] JBoss Data Grid 6 Reporter: Jakub Markos <jmarkos>
Component: Server Assignee: Tristan Tarrant <ttarrant>
Status: CLOSED CURRENTRELEASE QA Contact: Martin Gencur <mgencur>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.4.0, 6.3.1 CC: afield, jdg-bugs, slaskawi, tsykora, ttarrant
Target Milestone: CR1   
Target Release: 6.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-01-28 13:27:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jakub Markos 2014-04-10 10:00:31 UTC
See the linked JIRA, please.

Comment 4 Martin Gencur 2014-08-27 10:36:57 UTC
I've put a comment in the JIRA. I don't think we should be fixing this. It's a special use case where there's a REST cache store but the data in the "source" and "target" clusters is written/read via HotRod. The normal use case, where there are REST clients, works fine, and the REST cache store was created to enable rolling upgrades for REST clients.
A proper solution would be complex, with low benefit.

Comment 5 Alan Field 2014-09-03 08:41:46 UTC
Removing from 6.3.1, since there is a discussion about the fix right now

Comment 6 Tomas Sykora 2015-01-07 13:11:25 UTC
I did some new investigation with regard to this issue.

I have found out that since 6.4 (for the NEW server/cluster) the source (old) node needs to be started with <compatibility enabled="true">.

ExampleConfig test is using standalone-compatibility-mode.xml for mimicking an OLD server.

When we try to use plain standalone.xml, which has disabled compatibility mode, we meet:

14:00:18,048 SEVERE [org.jboss.resteasy.core.SynchronousDispatcher] (http-/127.0.0.1:8180-2) Failed executing GET /rest/default/: org.jboss.resteasy.spi.WriterException: java.lang.ClassCastException: [B cannot be cast to java.lang.String

again. 
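
For readers unfamiliar with the message: "[B" is the JVM's name for byte[], so the log says a byte array was cast to String. The following is only a minimal stand-alone sketch of that failure, not the actual RESTEasy or server code path. The class name ByteArrayCastSketch and both helper methods are hypothetical, and the idea that the stored value arrives as raw bytes is an assumption drawn from the exception text alone.

```java
import java.nio.charset.StandardCharsets;

public class ByteArrayCastSketch {

    /**
     * Casts the value to String the way a String-only consumer would.
     * Returns the ClassCastException message, or null if the cast succeeds.
     */
    static String castFailure(Object storedValue) {
        try {
            String text = (String) storedValue; // fails when storedValue is a byte[]
            return null;
        } catch (ClassCastException e) {
            return e.getMessage();
        }
    }

    /** Decodes raw bytes explicitly instead of casting (UTF-8 is an assumption). */
    static String decode(Object storedValue) {
        if (storedValue instanceof byte[]) {
            return new String((byte[]) storedValue, StandardCharsets.UTF_8);
        }
        return String.valueOf(storedValue);
    }

    public static void main(String[] args) {
        Object stored = "data".getBytes(StandardCharsets.UTF_8);
        // The exact wording of the message varies between JDK versions.
        System.out.println(castFailure(stored));
        System.out.println(decode(stored));
    }
}
```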

NOTE: don't forget to disable security elements for rest endpoint in standalone.xml when trying to rerun tests.

Interestingly, it works, for example, for REST rolling upgrades from 6.2.1 (standalone.xml, compatibility disabled) to 6.3.2 (standalone-rest-rolling-upgrades.xml, compatibility explicitly!! disabled [it's enabled by default in that config]).

What changed in the server and/or rest remote store from 6.3.2 to 6.4.0? 

Why does the new server need its rest-remote cache store to have compatibility mode enabled in order to fetch/understand entries correctly?

One more thing: I've tried to explicitly disable compatibility ( <compatibility enabled="false"/> ) in the standalone-rest-rolling-upgrades.xml example config file and to run the "old" server with plain standalone.xml. No luck, same problem.

Simply put, since 6.4.0 the one cache involved (the one with the REST cache store on the new server) needs to have compatibility mode enabled.

Tristan? Please? Any ideas, input? Am I missing something? Or is that desired behaviour? 
Thank you!

Comment 7 Martin Gencur 2015-01-07 13:28:13 UTC
How do we store data in the old cluster?  Via HotRod or REST? I made a comment a long time ago wrt what I think about storing via HotRod and using REST rolling upgrades.

Comment 8 Tomas Sykora 2015-01-07 15:54:58 UTC
Storing via REST, tests were explicitly adjusted to do it this way:

RESTHelper.addServer(s2.server.getRESTEndpoint().getInetAddress().getHostName(), s2.server.getRESTEndpoint().getContextPath());

post(fullPathKey(0, DEFAULT_CACHE_NAME, "key1", PORT_OFFSET), "data", "text/html");
get(fullPathKey(0, DEFAULT_CACHE_NAME, "key1", PORT_OFFSET), "data");

using RESTHelper: org.infinispan.server.test.client.rest.RESTHelper

I will dig more into it. I just wanted to know whether there might be some "hidden" related change between 6.3.2 and 6.4.0 that I am not aware of. Will see.

Comment 9 Tomas Sykora 2015-01-08 13:47:14 UTC
We (with mgencur's 1st-class help) found the root cause of our problem.

It seems like there is a kind of pollution between migrators. 
(For reference: https://issues.jboss.org/browse/ISPN-3823)

As a workaround we have to explicitly OMIT the call to the recordKnownGlobalKeyset operation when doing REST rolling upgrades. This operation does not do anything here, is ambiguous, and can (in fact has to) be omitted.

Then, without it, the whole process is running smoothly. 

We need to document this workaround: simply remove that particular step from the _REST_ rolling upgrade process in our documentation.
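
The workaround could be sketched over JMX roughly as follows. This is a hedged illustration, not the test code from the PR. The class RestRollingUpgradeSketch is hypothetical. The operation names synchronizeData and disconnectSource, their single String parameter, and the migrator name "rest" are assumptions based on the rolling upgrade procedure as I recall it from the documentation. The connection and the ObjectName of the rolling upgrade MBean on the target cluster are left to the caller.

```java
import java.util.Arrays;
import java.util.List;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class RestRollingUpgradeSketch {

    /** Migrator name passed to each operation (assumed to be "rest"). */
    static final String MIGRATOR = "rest";

    /**
     * Operations invoked on the target cluster, in order.
     * recordKnownGlobalKeyset is deliberately absent, as the workaround requires.
     */
    static List<String> plannedOperations() {
        return Arrays.asList("synchronizeData", "disconnectSource");
    }

    /** Invokes each planned operation on the caller-supplied MBean. */
    static void run(MBeanServerConnection connection, ObjectName rollingUpgradeMBean)
            throws Exception {
        for (String operation : plannedOperations()) {
            connection.invoke(
                    rollingUpgradeMBean,
                    operation,
                    new Object[] {MIGRATOR},
                    new String[] {String.class.getName()});
        }
    }
}
```

Keeping the step list in one place makes it easy to check that the omitted operation never creeps back into the REST procedure.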

Comment 10 Tomas Sykora 2015-01-08 14:45:20 UTC
PR sent: https://github.com/infinispan/jdg/pull/428

Setting milestone to CR1; if we can't make it into CR1, nothing happens. This is a test fix (workaround).

Comment 11 Tomas Sykora 2015-01-12 11:08:16 UTC
This workaround works like a charm.

Thanks for the quick integration, PROD guys!

I've also created a BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1181082 to reflect current state in the documentation. Maybe we would like to also asynchronously release previous docs to warn users.

However, we need to make sure that the effort of including one warning message + deleting one step from the REST rolling upgrade process is not too much for the last 4 releases.

Will discuss with Misha.