Bug 1518598 - Cannot synchronize a storage domain's LUN if its storage domain contains a shared disk between two VMs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Backend.Core
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.1.9
Target Release: ---
Assignee: Idan Shaby
QA Contact: Lilach Zitnitski
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-11-29 09:39 UTC by Idan Shaby
Modified: 2018-01-24 10:40 UTC (History)

Fixed In Version: ovirt-engine-4.1.9
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-01-24 10:40:50 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: ovirt-4.2+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 84862 | 0 | master | MERGED | backend: avoid inserting a duplicate key to a Map | 2020-04-27 22:15:13 UTC
oVirt gerrit | 84876 | 0 | ovirt-engine-4.1 | MERGED | backend: avoid inserting a duplicate key to a Map | 2020-04-27 22:15:13 UTC

Description Idan Shaby 2017-11-29 09:39:56 UTC
Description of problem:
When SyncLunsInfoForBlockStorageDomainCommand calls GetDeviceList and detects either a new LUN in the storage that is part of the storage domain, or an update to an existing LUN that is part of the storage domain, it tries to update the DB.
However, if the storage domain contains a disk that is shared between two VMs, an exception is thrown and the storage domain is not synchronized (the DB is not updated).

Version-Release number of selected component (if applicable):
9bb30a5c970c79ca58c26625f0999417ea2bf05d

How reproducible:
100%

Steps to Reproduce:
This corner case occurs when a LUN in the storage has a property, for example its size, that needs to be updated in the DB.
Here's one way to reproduce it:
1. Have a block storage domain with a shared disk between two VMs.
2. Take it down to maintenance.
3. Update the luns table in the DB so that at least one of the storage domain's LUNs has a wrong size.
This query sets the size to 0 for all of the storage domain's LUNs:
UPDATE luns SET device_size = 0 WHERE volume_group_id = '<sd_vg_id>';
In my env, the LUNs with the following IDs were updated - 3514f0c5a51600489, 3514f0c5a5160048a.
4. Activate the storage domain.

Actual results:
2017-11-29 08:52:10,524+02 ERROR [org.ovirt.engine.core.bll.storage.domain.SyncLunsInfoForBlockStorageDomainCommand] (EE-ManagedThreadFactory-engine-Thread-50) [7a674124] Command 'org.ovirt.engine.core.bll.storage.domain.SyncLunsInfoForBlockStorageDomainCommand' failed: Failed managing transaction
2017-11-29 08:52:10,525+02 ERROR [org.ovirt.engine.core.bll.storage.domain.SyncLunsInfoForBlockStorageDomainCommand] (EE-ManagedThreadFactory-engine-Thread-50) [7a674124] Exception: java.lang.RuntimeException: Failed managing transaction
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInNewTransaction(TransactionSupport.java:224) [utils.jar:]
	at org.ovirt.engine.core.bll.storage.domain.SyncLunsInfoForBlockStorageDomainCommand.executeCommand(SyncLunsInfoForBlockStorageDomainCommand.java:86) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1205) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1345) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1987) [bll.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:164) [utils.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:103) [utils.jar:]
	at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1405) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:412) [bll.jar:]
	at org.ovirt.engine.core.bll.executor.DefaultBackendActionExecutor.execute(DefaultBackendActionExecutor.java:13) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:509) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:491) [bll.jar:]
	at org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:434) [bll.jar:]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_151]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_151]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_151]
	at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_151]
	at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
	at org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:78)
	at org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:88)
	at org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:101)
	at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ee.concurrent.ConcurrentContextInterceptor.processInvocation(ConcurrentContextInterceptor.java:45) [wildfly-ee-11.0.0.Final.jar:11.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:40)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
	at org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:52)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:53) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:264) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
	at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:379) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
	at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:244) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
	at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
	at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:89)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:47) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:100) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.deployment.processors.StartupAwaitInterceptor.processInvocation(StartupAwaitInterceptor.java:22) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:67) [wildfly-ejb3-11.0.0.Final.jar:11.0.0.Final]
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.ContextClassLoaderInterceptor.processInvocation(ContextClassLoaderInterceptor.java:60)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.InterceptorContext.run(InterceptorContext.java:438)
	at org.wildfly.security.manager.WildFlySecurityManager.doChecked(WildFlySecurityManager.java:609)
	at org.jboss.invocation.AccessCheckingInterceptor.processInvocation(AccessCheckingInterceptor.java:57)
	at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
	at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
	at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:198)
	at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:185)
	at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:81)
	at org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view2.runInternalAction(Unknown Source) [bll.jar:]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [rt.jar:1.8.0_151]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [rt.jar:1.8.0_151]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_151]
	at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_151]
	at org.jboss.weld.util.reflection.Reflections.invokeAndUnwrap(Reflections.java:433) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
	at org.jboss.weld.bean.proxy.EnterpriseBeanProxyMethodHandler.invoke(EnterpriseBeanProxyMethodHandler.java:127) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
	at org.jboss.weld.bean.proxy.EnterpriseTargetBeanInstance.invoke(EnterpriseTargetBeanInstance.java:56) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
	at org.jboss.weld.bean.proxy.InjectionPointPropagatingEnterpriseTargetBeanInstance.invoke(InjectionPointPropagatingEnterpriseTargetBeanInstance.java:67) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
	at org.jboss.weld.bean.proxy.ProxyMethodHandler.invoke(ProxyMethodHandler.java:100) [weld-core-impl-2.4.3.Final.jar:2.4.3.Final]
	at org.ovirt.engine.core.bll.BackendCommandObjectsHandler$BackendInternal$BackendLocal$2049259618$Proxy$_$$_Weld$EnterpriseProxy$.runInternalAction(Unknown Source) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.connection.ISCSIStorageHelper.syncDomainInfo(ISCSIStorageHelper.java:311) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.domain.ActivateStorageDomainCommand.syncStorageDomainInfo(ActivateStorageDomainCommand.java:86) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.domain.ActivateStorageDomainCommand.executeCommand(ActivateStorageDomainCommand.java:115) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1205) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1345) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1987) [bll.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:164) [utils.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:103) [utils.jar:]
	at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1405) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:412) [bll.jar:]
	at org.ovirt.engine.core.bll.PrevalidatingMultipleActionsRunner.executeValidatedCommand(PrevalidatingMultipleActionsRunner.java:204) [bll.jar:]
	at org.ovirt.engine.core.bll.PrevalidatingMultipleActionsRunner.runCommands(PrevalidatingMultipleActionsRunner.java:176) [bll.jar:]
	at org.ovirt.engine.core.bll.PrevalidatingMultipleActionsRunner.lambda$invokeCommands$3(PrevalidatingMultipleActionsRunner.java:182) [bll.jar:]
	at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:96) [utils.jar:]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_151]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_151]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_151]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_151]
	at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_151]
	at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250) [javax.enterprise.concurrent-1.0.jar:]
	at org.jboss.as.ee.concurrent.service.ElytronManagedThreadFactory$ElytronManagedThread.run(ElytronManagedThreadFactory.java:78)
Caused by: java.lang.IllegalStateException: Duplicate key org.ovirt.engine.core.common.businessentities.storage.DiskVmElement@410a2573
	at java.util.stream.Collectors.lambda$throwingMerger$0(Collectors.java:133) [rt.jar:1.8.0_151]
	at java.util.HashMap.merge(HashMap.java:1254) [rt.jar:1.8.0_151]
	at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1320) [rt.jar:1.8.0_151]
	at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) [rt.jar:1.8.0_151]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1380) [rt.jar:1.8.0_151]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) [rt.jar:1.8.0_151]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) [rt.jar:1.8.0_151]
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) [rt.jar:1.8.0_151]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) [rt.jar:1.8.0_151]
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) [rt.jar:1.8.0_151]
	at org.ovirt.engine.core.bll.storage.utils.BlockStorageDiscardFunctionalityHelper.vmDiskWithPassDiscardAndWadExists(BlockStorageDiscardFunctionalityHelper.java:174) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.utils.BlockStorageDiscardFunctionalityHelper.getLunsThatBreakPassDiscardSupport(BlockStorageDiscardFunctionalityHelper.java:189) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.utils.BlockStorageDiscardFunctionalityHelper.logIfLunsBreakStorageDomainDiscardFunctionality(BlockStorageDiscardFunctionalityHelper.java:130) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.domain.SyncLunsInfoForBlockStorageDomainCommand.updateLunsInDb(SyncLunsInfoForBlockStorageDomainCommand.java:219) [bll.jar:]
	at org.ovirt.engine.core.bll.storage.domain.SyncLunsInfoForBlockStorageDomainCommand.lambda$executeCommand$0(SyncLunsInfoForBlockStorageDomainCommand.java:87) [bll.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInNewTransaction(TransactionSupport.java:202) [utils.jar:]
	... 99 more
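
The "Caused by" frames point at the two-argument Collectors.toMap, which throws IllegalStateException as soon as two stream elements map to the same key. The following is a minimal sketch of that failure mode, not the engine code: the Attachment class is a hypothetical stand-in for DiskVmElement, and keying by disk ID is an assumption about what the helper does. On Java 8 (the trace shows 1.8.0_151) the "Duplicate key" message prints the colliding value rather than the key, which is consistent with a DiskVmElement appearing in it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class DuplicateKeyDemo {
    // Hypothetical stand-in for DiskVmElement: one entry per (disk, VM) attachment.
    static final class Attachment {
        final String diskId;
        final String vmId;

        Attachment(String diskId, String vmId) {
            this.diskId = diskId;
            this.vmId = vmId;
        }
    }

    // Assumption: the map is keyed by disk ID alone. The two-argument toMap
    // throws IllegalStateException when two elements produce the same key.
    static Map<String, Attachment> indexByDisk(List<Attachment> attachments) {
        return attachments.stream()
                .collect(Collectors.toMap(a -> a.diskId, Function.identity()));
    }

    public static void main(String[] args) {
        // A shared disk is attached to two VMs, so its ID appears twice.
        List<Attachment> shared = Arrays.asList(
                new Attachment("disk-1", "vm-a"),
                new Attachment("disk-1", "vm-b"));
        try {
            indexByDisk(shared);
        } catch (IllegalStateException e) {
            System.out.println("Collection failed: " + e.getMessage());
        }
    }
}
```

With a non-shared disk the same call succeeds, which matches the report that the failure only appears once a shared disk exists in the domain.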

Expected results:
The following log should appear:
2017-11-29 11:26:27,435+02 INFO  [org.ovirt.engine.core.bll.storage.domain.SyncLunsInfoForBlockStorageDomainCommand] (EE-ManagedThreadFactory-engine-Thread-211) [16178042] Updated LUNs information, IDs '3514f0c5a51600489, 3514f0c5a5160048a'.

Comment 1 Yaniv Kaul 2017-11-29 13:17:25 UTC
Severity?

Comment 2 Idan Shaby 2017-11-30 06:07:02 UTC
It's a corner case, but the patch is really small and not risky, so it can be easily backported.

Comment 3 Allon Mureinik 2017-11-30 07:32:09 UTC
(In reply to Idan Shaby from comment #2)
> It's a corner case, but the patch is really small and not risky, so it can
> be easily backported.
And the failure is nasty when it happens...

Comment 4 Lilach Zitnitski 2018-01-11 12:45:45 UTC
--------------------------------------
Tested with the following code:
----------------------------------------
rhevm-4.1.9-0.2.el7.noarch
vdsm-4.19.44-1.el7ev.x86_64

Tested with the following scenario:

Steps to Reproduce:
1. Have a block storage domain with a shared disk between two VMs.
2. Take it down to maintenance.
3. Update the luns table in the DB so that at least one of the storage domain's LUNs will have a wrong size.

Actual results:
2018-01-11 14:43:36,048+02 INFO  [org.ovirt.engine.core.bll.storage.domain.SyncLunsInfoForBlockStorageDomainCommand] (org.ovirt.thread.pool-7-thread-5) [3e6285f4] Updated LUNs information, IDs '3514f0c5a5160048c'.

Expected results:

Moving to VERIFIED!

Comment 5 Sandro Bonazzola 2018-01-24 10:40:50 UTC
This bug is included in the oVirt 4.1.9 release, published on Jan 24th 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.1.9 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

