Michal Abaffy <mabaffy> updated the status of jira ISPN-2186 to Reopened
Michal Abaffy <mabaffy> made a comment on jira ISPN-2186 I have run the elasticity 2-4 benchmark test with RadarGun on my localhost against infinispan-core:5.1.7-Final-redhat-1 several times, and the problem still seems to be there. Example log:
11:25:13,355 WARN [org.infinispan.transaction.TransactionTable] (pool-1-thread-1) ISPN000100: Stopping, but there are 5 local transactions and 0 remote transactions that did not finish in time.
11:25:13,363 DEBUG [org.infinispan.cacheviews.CacheViewsManagerImpl] (CacheViewInstaller-1,mabaffy-14775) Installing new view CacheView{viewId=8, members=[mabaffy-5803, mabaffy-30789]} for cache x
11:25:13,371 DEBUG [org.infinispan.cacheviews.CacheViewsManagerImpl] (CacheViewInstaller-1,mabaffy-14775) Cache x view CacheView{viewId=8, members=[mabaffy-5803, mabaffy-30789]} installation was interrupted because the coordinator is shutting down
I have also run this test on the Jenkins machines, so you can see the full logs and the test configuration. See Build Artifacts->report->stdout.zip->slave2.log at time 05:09:35 in https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/EDG6/view/EDG-REPORTS-PERF/job/jdg60-benchmark-elasticity-02-04-radargun/2/ The warning also appears in slave4.log but is missing from slave1 and slave3 (a minimal reproduction sketch follows this comment).
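For context on the ISPN000100 warning above: it is logged when a cache manager stops while local transactions are still in flight. A minimal sketch that would trigger it with the embedded Infinispan 5.1 API; the cache name and the use of the test-oriented DummyTransactionManagerLookup are illustrative assumptions, not taken from the benchmark configuration:

    import org.infinispan.Cache;
    import org.infinispan.configuration.cache.ConfigurationBuilder;
    import org.infinispan.manager.DefaultCacheManager;
    import org.infinispan.transaction.TransactionMode;
    import org.infinispan.transaction.lookup.DummyTransactionManagerLookup;

    import javax.transaction.TransactionManager;

    public class ShutdownWithPendingTx {
        public static void main(String[] args) throws Exception {
            // A transactional cache; the dummy lookup avoids needing a JTA server.
            ConfigurationBuilder builder = new ConfigurationBuilder();
            builder.transaction()
                   .transactionMode(TransactionMode.TRANSACTIONAL)
                   .transactionManagerLookup(new DummyTransactionManagerLookup());

            DefaultCacheManager manager = new DefaultCacheManager(builder.build());
            Cache<String, String> cache = manager.getCache("x");

            // Begin a transaction and deliberately leave it uncommitted.
            TransactionManager tm = cache.getAdvancedCache().getTransactionManager();
            tm.begin();
            cache.put("key", "value");

            // Stopping now should log ISPN000100: "Stopping, but there are
            // 1 local transactions and 0 remote transactions that did not
            // finish in time."
            manager.stop();
        }
    }

The benchmark reaches the same state when a node is stopped while its in-flight transactions have not yet completed.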
Dan Berindei <dberinde> made a comment on jira ISPN-2186 I see, I misread the description of the bug a bit. I fixed the part where cache view installation commands from the old coordinator reach the new coordinator and break the new coordinator's cache view installation (potentially making it hang). I did not fix the old coordinator attempting to install a new cache view, because CacheViewsManagerImpl only finds out about the shutdown after all the local caches have already been stopped.
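A rough illustration of the first half of that fix (all field and method names here are hypothetical; this is not the actual patch): the new coordinator can drop view installation commands that arrive from a node it no longer considers the coordinator, or that carry an already-installed view id, instead of letting them interrupt its own installation.

    import org.infinispan.cacheviews.CacheView;
    import org.infinispan.remoting.transport.Address;

    // Hypothetical sketch, not the actual ISPN-2186 patch: the new coordinator
    // ignores stale view installation commands from the old, shutting-down
    // coordinator instead of letting them break its own installation.
    class StaleViewGuard {
        private final Object viewLock = new Object();
        private Address currentCoordinator; // updated on every JGroups view change
        private int lastInstalledViewId = -1;

        void handleInstallView(Address sender, CacheView newView) {
            synchronized (viewLock) {
                // A command from a node that is no longer the coordinator is stale.
                if (!sender.equals(currentCoordinator)) {
                    return;
                }
                // A view at or below the last installed id is stale as well.
                if (newView.getViewId() <= lastInstalledViewId) {
                    return;
                }
                lastInstalledViewId = newView.getViewId();
                // ... proceed with the real installation here ...
            }
        }
    }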
Dan Berindei <dberinde> updated the status of jira ISPN-2186 to Resolved
Dan Berindei <dberinde> made a comment on jira ISPN-2186 A proper fix would require the global components to know that they will be stopped before we start shutting down the caches. But I don't think it's worth doing just to avoid a DEBUG log message. We should instead focus on adding a method to gracefully shut down the entire cluster: ISPN-1239
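A minimal sketch of the ordering such a fix would need (hypothetical interfaces, not Infinispan's actual component API): announce the stop to the global components first, so a component like CacheViewsManagerImpl can stop installing views before the first cache shuts down.

    // Hypothetical two-phase shutdown sketch; Infinispan's real shutdown path
    // differs, this only illustrates the ordering described in the comment above.
    class CacheManagerShutdown {
        private final java.util.List<GlobalComponent> globalComponents;
        private final java.util.List<Stoppable> caches;

        CacheManagerShutdown(java.util.List<GlobalComponent> globalComponents,
                             java.util.List<Stoppable> caches) {
            this.globalComponents = globalComponents;
            this.caches = caches;
        }

        void stop() {
            // Phase 1: tell the global components a stop is coming, so e.g. a
            // view manager can suppress further view installations right away.
            for (GlobalComponent c : globalComponents) {
                c.stopAnnounced();
            }
            // Phase 2: stop the caches, then the global components themselves.
            for (Stoppable cache : caches) {
                cache.stop();
            }
            for (GlobalComponent c : globalComponents) {
                c.stop();
            }
        }

        interface Stoppable { void stop(); }
        interface GlobalComponent extends Stoppable { void stopAnnounced(); }
    }

A cluster-wide graceful shutdown (ISPN-1239) would presumably extend the same announce-then-stop idea across all nodes rather than a single cache manager.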
Not a regression and does not affect data integrity. Setting severity to LOW and moving to 6.1.
Set flag to nominate this bug for 6.2 release notes.