Bug 1447912 - Attaching/migrating a V3 (4.0) storage domain to a 4.1 DC fails -> SD locked forever
Keywords:
Status: CLOSED DUPLICATE of bug 1446878
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.1.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.1.2
Target Release: ---
Assignee: Maor
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-05-04 08:31 UTC by Avihai
Modified: 2017-05-04 11:34 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-05-04 11:34:34 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+


Attachments
engine & vdsm logs (862.40 KB, application/x-gzip)
2017-05-04 08:31 UTC, Avihai

Description Avihai 2017-05-04 08:31:49 UTC
Created attachment 1276181
engine & vdsm logs

Description of problem:
Attaching a V3 (4.0) storage domain to a 4.1 DC fails, and the SD then stays locked indefinitely.

Version-Release number of selected component (if applicable):
Engine:
ovirt-engine-restapi-4.1.2-0.1.el7.noarch

VDSM:
4.19.10.1-1

How reproducible:
Attempted once; the issue occurred on that first attempt.


Steps to Reproduce:
1. Create DC1 and cluster c1 with compatibility level 4.0.
2. Create two NFS storage domains, SD1 and SD2, on DC1
   (SD1 is the master domain; both SDs are V3).
3. Create a VM with a disk and a snapshot on SD1.
4. Detach SD1 (SD2 becomes master).
5. Upgrade c1 and DC1 to compatibility level 4.1.
6. Try to attach SD1 to DC1.

Actual results:
Attaching the V3 storage domain to the 4.1 DC fails, and the SD stays locked (for about an hour so far).

Expected results:
Attaching a V3 storage domain to a 4.1 DC should succeed.


Additional info:
Timeline:
- Detach SD1 (SD2 becomes master) - May 4, 2017 09:04
- Upgrade cluster                 - May 4, 2017 09:07
- Upgrade DC                      - May 4, 2017 10:23
- Try to attach SD1 to DC1        - May 4, 2017 10:25
- Issue seen                      - May 4, 2017 10:30



Events:
May 4, 2017 10:30:29 AM Failed to attach Storage Domain sd1 to Data Center dc1. (User: admin@internal-authz)

Engine log exception:
2017-05-04 10:30:29,886+03 ERROR [org.ovirt.engine.core.bll.job.ExecutionHandler] (org.ovirt.thread.pool-6-thread-1) [8e15d24] Exception: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000460: Error checking for a transaction
        at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80) [spring-jdbc.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:615) [spring-jdbc.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:680) [spring-jdbc.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:712) [spring-jdbc.jar:4.2.4.RELEASE]
        at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:762) [spring-jdbc.jar:4.2.4.RELEASE]
        at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.executeCallInternal(PostgresDbEngineDialect.java:152) [dal.jar:]
        at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.doExecute(PostgresDbEngineDialect.java:118) [dal.jar:]
        at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:198) [spring-jdbc.jar:4.2.4.RELEASE]
        at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:135) [dal.jar:]
        at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeReadList(SimpleJdbcCallsHandler.java:105) [dal.jar:]
        at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeRead(SimpleJdbcCallsHandler.java:97) [dal.jar:]
        at org.ovirt.engine.core.dao.JobDaoImpl.checkIfJobHasTasks(JobDaoImpl.java:149) [dal.jar:]
        at org.ovirt.engine.core.bll.job.ExecutionHandler.checkIfJobHasTasks(ExecutionHandler.java:893) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1474) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:397) [bll.jar:]
        at org.ovirt.engine.core.bll.executor.DefaultBackendActionExecutor.execute(DefaultBackendActionExecutor.java:13) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:511) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:493) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:697) [bll.jar:]
        at sun.reflect.GeneratedMethodAccessor128.invoke(Unknown Source) [:1.8.0_121]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_121]
        at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_121]
        at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:437)
        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:70) [wildfly-weld-7.0.5.GA-redhat-2.jar:7.0.5.GA-redhat-2]
        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:80) [wildfly-weld-7.0.5.GA-redhat-2.jar:7.0.5.GA-redhat-2]
        at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:93) [wildfly-weld-7.0.5.GA-redhat-2.jar:7.0.5.GA-redhat-2]
        at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43) [wildfly-ejb3-7.0.5.GA-redhat-2.jar:7.0.5.GA-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:437)
        at org.jboss.weld.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:73) [weld-core-impl.jar:2.3.3.Final-redhat-1]
        at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:83) [wildfly-weld-7.0.5.GA-redhat-2.jar:7.0.5.GA-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.as.ee.concurrent.ConcurrentContextInterceptor.processInvocation(ConcurrentContextInterceptor.java:45) [wildfly-ee-7.0.5.GA-redhat-2.jar:7.0.5.GA-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:21)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:340)
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61)
        at org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:52)
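
When searching the attached engine log for related entries, it can help to split each header line into its fields. The following is only a rough sketch: the regular expression is inferred from the single header line quoted above, the names `ENGINE_LINE`, `parse_engine_line`, and `SAMPLE` are hypothetical, and reading the bracketed `8e15d24` token as a correlation ID is an assumption.

```python
import re

# Layout inferred from the one header line in this report (an assumption):
#   <timestamp><offset> <LEVEL> [<logger>] (<thread>) [<id>] <message>
ENGINE_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"(?:\[(?P<correlation_id>[^\]]*)\]\s+)?"  # assumed to be a correlation ID
    r"(?P<message>.*)$"
)


def parse_engine_line(line):
    """Return the header fields as a dict, or None for non-header lines
    such as stack frames."""
    match = ENGINE_LINE.match(line)
    return match.groupdict() if match else None


SAMPLE = (
    "2017-05-04 10:30:29,886+03 ERROR "
    "[org.ovirt.engine.core.bll.job.ExecutionHandler] "
    "(org.ovirt.thread.pool-6-thread-1) [8e15d24] "
    "Exception: org.springframework.jdbc.CannotGetJdbcConnectionException: "
    "Could not get JDBC Connection"
)

if __name__ == "__main__":
    fields = parse_engine_line(SAMPLE)
    for name in ("timestamp", "level", "logger", "thread", "correlation_id"):
        print(name, "=", fields[name])
```

Filtering the full log on the extracted ID would then be one way to collect every line that belongs to the failed attach flow.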
        
From VDSM:
2017-05-04 10:24:31,477+0300 ERROR (jsonrpc/6) [storage.HSM] Could not disconnect from storageServer (hsm:2476)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2472, in disconnectStorageServer
    conObj.disconnect()
  File "/usr/share/vdsm/storage/storageServer.py", line 387, in disconnect
    return self._mountCon.disconnect()
  File "/usr/share/vdsm/storage/storageServer.py", line 185, in disconnect
    self._mount.umount(True, True)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/mount.py", line 198, in umount
    timeout=timeout)
  File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 53, in __call__
    return callMethod()
  File "/usr/lib/python2.7/site-packages/vdsm/supervdsm.py", line 51, in <lambda>
    **kwargs)
  File "<string>", line 2, in umount
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
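
The log timestamps in this report carry a +03 offset, while the Bugzilla timestamps are in UTC, so the two need to be normalized before they can be compared. The sketch below does that with the Python standard library only; `parse_log_timestamp` is a hypothetical helper, and the timestamp layout it accepts is inferred from the two log lines quoted in this report.

```python
import re
from datetime import datetime, timezone

# Matches 'YYYY-MM-DD HH:MM:SS,mmm+HH' (engine) and '...+HHMM' (VDSM).
_STAMP = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})([+-]\d{2})(\d{2})?"
)


def parse_log_timestamp(stamp):
    """Parse an engine or VDSM log timestamp into an aware datetime."""
    match = _STAMP.fullmatch(stamp)
    if match is None:
        raise ValueError("unrecognized timestamp: %r" % stamp)
    base, millis, hours, minutes = match.groups()
    # strptime's %z needs the minutes part, so pad '+03' to '+0300'.
    normalized = "%s.%s%s%s" % (base, millis, hours, minutes or "00")
    return datetime.strptime(normalized, "%Y-%m-%d %H:%M:%S.%f%z")


# Timestamps copied from this report.
vdsm_error = parse_log_timestamp("2017-05-04 10:24:31,477+0300")
engine_error = parse_log_timestamp("2017-05-04 10:30:29,886+03")
reported = datetime(2017, 5, 4, 8, 31, 49, tzinfo=timezone.utc)

if __name__ == "__main__":
    print("VDSM error (UTC):  ", vdsm_error.astimezone(timezone.utc))
    print("Engine error (UTC):", engine_error.astimezone(timezone.utc))
    print("Engine error to report:", reported - engine_error)
```

Under these assumptions the engine error falls roughly an hour before the report was filed, which matches the "about an hour so far" note in the actual results, and the VDSM disconnect error precedes the engine error by about six minutes.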

Comment 1 Avihai 2017-05-04 10:50:24 UTC
A similar issue occurs in another scenario: importing a V3 storage domain into a V4 DC.

The attached logs also cover that scenario (the issue occurred at 2017-05-04 13:39:22).

Scenario 2 details:
1. Create DC1 and cluster c1 with compatibility level 4.0.
2. Create two NFS storage domains, SD1 and SD2, on DC1
   (SD1 is the master domain; both SDs are V3).
3. Create a VM with a disk and a snapshot on SD1.
4. Detach SD1 (SD2 becomes master) and remove it (without formatting).
5. Upgrade c1 and DC1 to compatibility level 4.1.
6. Try to import SD1 back into DC1.

Comment 2 Tal Nisan 2017-05-04 11:34:34 UTC

*** This bug has been marked as a duplicate of bug 1446878 ***

