Created attachment 741837 [details] engine.log

* DC with 3 hosts, 2 domains on different storage servers.
* Block connections between 2 HSMs and the non-master domain.
* 2 HSMs should become Non Operational, but only 1 becomes Non Operational.
* The following exception is seen:

2013-04-30 13:55:28,112 ERROR [org.ovirt.engine.core.bll.eventqueue.EventQueueMonitor] (pool-7-thread-50) Exception during process of events for pool 5849b030-626e-47cb-ad90-3ce782d831b3, error is java.util.concurrent.ExecutionException: java.util.ConcurrentModificationException: java.util.concurrent.ExecutionException: java.util.ConcurrentModificationException
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252) [rt.jar:1.7.0_b147-icedtea]
    at java.util.concurrent.FutureTask.get(FutureTask.java:111) [rt.jar:1.7.0_b147-icedtea]
    at org.ovirt.engine.core.bll.eventqueue.EventQueueMonitor$InternalEventQueueThread.run(EventQueueMonitor.java:157) [engine-bll.jar:]
    at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:71) [engine-utils.jar:]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_b147-icedtea]
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) [rt.jar:1.7.0_b147-icedtea]
    at java.util.concurrent.FutureTask.run(FutureTask.java:166) [rt.jar:1.7.0_b147-icedtea]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) [rt.jar:1.7.0_b147-icedtea]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) [rt.jar:1.7.0_b147-icedtea]
    at java.lang.Thread.run(Thread.java:722) [rt.jar:1.7.0_b147-icedtea]
Caused by: java.util.ConcurrentModificationException
    at java.util.HashMap$HashIterator.nextEntry(HashMap.java:806) [rt.jar:1.7.0_b147-icedtea]
    at java.util.HashMap$KeyIterator.next(HashMap.java:841) [rt.jar:1.7.0_b147-icedtea]
    at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData.ProcessDomainRecovery(IrsBrokerCommand.java:1284) [engine-vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData.access$600(IrsBrokerCommand.java:121) [engine-vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData$6.call(IrsBrokerCommand.java:1222) [engine-vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData$6.call(IrsBrokerCommand.java:1216) [engine-vdsbroker.jar:]
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) [rt.jar:1.7.0_b147-icedtea]
    at java.util.concurrent.FutureTask.run(FutureTask.java:166) [rt.jar:1.7.0_b147-icedtea]
    ... 6 more

* Block the same domain from the 3rd host (SPM).
* The domain does not become Inactive, remains Active.
Created attachment 741838 [details] vdsm.log
The bug is simple. This is not a race; the code is simply wrong: we iterate over a collection and modify it at the same time. Basic Java. I will provide a patch soon.
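For readers unfamiliar with this failure mode, here is a minimal sketch of the pattern described above. It is not the actual code from IrsBrokerCommand.ProcessDomainRecovery; the class name, the map contents, and the "remove entries with a positive count" rule are all hypothetical and chosen only to reproduce the same HashMap key-iterator failure seen in the trace, along with one common way to avoid it.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical illustration only; names and data are not taken from oVirt.
class DomainRecoverySketch {

    // Stand-in data: domain id -> number of hosts reporting a problem.
    static Map<String, Integer> sampleDomains() {
        Map<String, Integer> domains = new HashMap<>();
        domains.put("domain-a", 2);
        domains.put("domain-b", 0);
        domains.put("domain-c", 1);
        return domains;
    }

    // Wrong: removes from the map while a for-each loop iterates its key set.
    // The next call to the iterator's next() detects the change and throws.
    static boolean brokenRecoveryThrows() {
        Map<String, Integer> domains = sampleDomains();
        try {
            for (String id : domains.keySet()) {
                if (domains.get(id) > 0) {
                    domains.remove(id); // structural modification during iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // One possible fix: remove through the iterator itself, which keeps the
    // iterator's bookkeeping consistent with the map.
    static Map<String, Integer> fixedRecovery() {
        Map<String, Integer> domains = sampleDomains();
        Iterator<Map.Entry<String, Integer>> it = domains.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() > 0) {
                it.remove();
            }
        }
        return domains;
    }

    public static void main(String[] args) {
        System.out.println("broken loop throws: " + brokenRecoveryThrows());
        System.out.println("after fixed loop:   " + fixedRecovery());
    }
}
```

Note that no second thread is involved: despite its name, ConcurrentModificationException is raised here by a single thread, which matches the "not a race" diagnosis. Iterating over a snapshot such as new ArrayList<>(domains.keySet()) would be another way to avoid it; which approach the real patch takes is not stated here.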
After connectivity to storage was lost, both HSMs became Non Operational. After that, when the storage was blocked from the SPM, the domain became Inactive.

Verified on RHEVM-3.2 - SF16:
rhevm-3.2.0-10.25.beta3.el6ev.noarch
vdsm-4.10.2-18.0.el6ev.x86_64
3.2 has been released