Bug 1001584 - After restarting “ovirt-engine” service, Storage Domain enters “Inactive” mode during DetachStorageDomain command
Status: CLOSED WORKSFORME
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.3.0
Hardware: x86_64 Linux
Priority: unspecified  Severity: medium
Target Milestone: ovirt-3.6.0-rc3
Target Release: 3.6.0
Assigned To: Liron Aravot
QA Contact: Aharon Canan
storage
Depends On:
Blocks:
 
Reported: 2013-08-27 06:47 EDT by vvyazmin@redhat.com
Modified: 2016-02-10 13:11 EST
CC List: 9 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-08 06:47:16 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
## Logs rhevm, vdsm, libvirt, thread dump, superVdsm (5.62 MB, application/x-gzip)
2013-08-27 06:47 EDT, vvyazmin@redhat.com
no flags
## Logs rhevm, vdsm, libvirt, thread dump, superVdsm (8.22 MB, application/x-gzip)
2013-09-17 04:07 EDT, vvyazmin@redhat.com
no flags
engine.log (143.10 KB, application/x-gzip)
2015-11-03 07:21 EST, Natalie Gavrielov
no flags

Description vvyazmin@redhat.com 2013-08-27 06:47:02 EDT
Created attachment 790886 [details]
## Logs rhevm, vdsm, libvirt, thread dump, superVdsm

Description of problem:
After restarting the “ovirt-engine” service, the Storage Domain enters “Inactive” mode during the DetachStorageDomain command.

Version-Release number of selected component (if applicable):
RHEVM 3.3 - IS11 environment:

RHEVM:  rhevm-3.3.0-0.16.master.el6ev.noarch
PythonSDK:  rhevm-sdk-python-3.3.0.11-1.el6ev.noarch
VDSM:  vdsm-4.12.0-72.git287bb7e.el6ev.x86_64
LIBVIRT:  libvirt-0.10.2-18.el6_4.9.x86_64
QEMU & KVM:  qemu-kvm-rhev-0.12.1.2-2.355.el6_4.5.x86_64
SANLOCK:  sanlock-2.8-1.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a Data Center with one host and two Storage Domains (SDs).
2. Move the non-master SD to maintenance and detach it (DetachStorageDomain); a hedged SDK sketch of this step follows below.
3. While the DetachStorageDomain command is running, restart the “ovirt-engine” service.
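A minimal sketch of steps 2-3, assuming the 3.x Python SDK (ovirtsdk, the rhevm-sdk-python package listed above); the engine URL, credentials, data center name and SD name are placeholders, not taken from this environment:

from ovirtsdk.api import API

# Placeholder connection details -- adjust to the environment under test.
api = API(url='https://rhevm.example.com/api',
          username='admin@internal',
          password='secret',
          insecure=True)

dc = api.datacenters.get(name='Default')        # data center from step 1
sd = dc.storagedomains.get(name='SD-e-02')      # the non-master SD

sd.deactivate()   # move the SD to maintenance
sd.delete()       # detach it from the data center (DetachStorageDomain)

# While the detach above is still in progress, restart the engine service
# on the RHEV-M host, e.g.:  service ovirt-engine restart

api.disconnect()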

Actual results:
The SD enters “Inactive” mode.

Expected results:
Moving the SD to maintenance (detach) completes successfully.

Impact on user:
Moving the SD to maintenance fails.

Workaround:
Activate and then deactivate the same SD again (a hedged SDK sketch follows below).
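A sketch of the workaround, under the same assumptions as above (3.x Python SDK, placeholder connection details and names):

from ovirtsdk.api import API

api = API(url='https://rhevm.example.com/api',
          username='admin@internal',
          password='secret',
          insecure=True)

sd = api.datacenters.get(name='Default').storagedomains.get(name='SD-e-02')

sd.activate()     # bring the "Inactive" domain back up
sd.deactivate()   # move it to maintenance again

api.disconnect()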

Additional info:

/var/log/ovirt-engine/engine.log

2013-08-26 16:26:56,238 INFO  [org.ovirt.engine.core.bll.storage.DetachStorageDomainFromPoolCommand] (pool-5-thread-47) [77ebff34] Running command: DetachStorageDomainFromPoolCommand internal: false. Entities affected :  ID: 5aa0e6b6-6969-4c81-b676-db85d548249a Type: Storage
2013-08-26 16:26:56,239 INFO  [org.ovirt.engine.core.bll.storage.DetachStorageDomainFromPoolCommand] (pool-5-thread-47) [77ebff34] Start detach storage domain
2013-08-26 16:26:56,294 INFO  [org.ovirt.engine.core.bll.storage.DetachStorageDomainFromPoolCommand] (pool-5-thread-47) [77ebff34]  Detach storage domain: before connect
2013-08-26 16:26:56,307 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-5-thread-48) [77ebff34] START, ConnectStorageServerVDSCommand(HostName = tigris01.scl.lab.tlv.redhat.com, HostId = 9576d8ca-4466-46e6-bebc-ccd922075ac6, storagePoolId = 29479ada-c628-410c-8705-808beb06e92f, storageType = ISCSI, connectionList = [{ id: f7e66fe5-e840-4987-a339-03234a63d57a, connection: 10.35.160.7, iqn: iqn.2008-05.com.xtremio:001e675b8ee0, vfsType: null, mountOptions: null, nfsVersion: null, nfsRetrans: null, nfsTimeo: null };]), log id: 2e312b1a
2013-08-26 16:26:56,998 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-5-thread-48) [77ebff34] FINISH, ConnectStorageServerVDSCommand, return: {f7e66fe5-e840-4987-a339-03234a63d57a=0}, log id: 2e312b1a
2013-08-26 16:26:56,999 INFO  [org.ovirt.engine.core.bll.storage.DetachStorageDomainFromPoolCommand] (pool-5-thread-47) [77ebff34]  Detach storage domain: after connect
2013-08-26 16:26:57,000 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.DetachStorageDomainVDSCommand] (pool-5-thread-47) [77ebff34] START, DetachStorageDomainVDSCommand( storagePoolId = 29479ada-c628-410c-8705-808beb06e92f, ignoreFailoverLimit = false, storageDomainId = 5aa0e6b6-6969-4c81-b676-db85d548249a, masterDomainId = 00000000-0000-0000-0000-000000000000, masterVersion = 1, force = false), log id: 1d1b473a
2013-08-26 16:26:58,710 ERROR [org.ovirt.engine.core.utils.timer.SchedulerUtilQuartzImpl] (DefaultQuartzScheduler_Worker-6) Failed to invoke scheduled method OnTimer: java.lang.reflect.InvocationTargetException
        at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source) [:1.7.0_25]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_25]
        at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_25]
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:60) [scheduler.jar:]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz.jar:]
Caused by: org.jboss.as.ejb3.component.EJBComponentUnavailableException: JBAS014559: Invocation cannot proceed as component is shutting down
        at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:59) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:59) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]


2013-08-27 10:33:24,093 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (pool-5-thread-50) Domain 5aa0e6b6-6969-4c81-b676-db85d548249a:SD-e-02 was reported by all hosts in status UP as problematic. Moving the domain to NonOperational.

vdsClient -s 0 getStorageDomainInfo 5aa0e6b6-6969-4c81-b676-db85d548249a 

	uuid = 5aa0e6b6-6969-4c81-b676-db85d548249a
	vguuid = PGGe3n-bhe5-f4iR-uBe0-5eRf-DSmI-4eq9KP
	lver = -1
	state = OK
	version = 3
	role = Regular
	pool = ['29479ada-c628-410c-8705-808beb06e92f']
	spm_id = -1
	type = ISCSI
	class = Data
	master_ver = 0
	name = SD-e-02




/var/log/vdsm/vdsm.log
Comment 1 vvyazmin@redhat.com 2013-09-17 04:05:39 EDT
Failed, tested on RHEVM 3.3 - IS14 environment:
Tested on FCP Data Centers

Host OS: RHEL 6.5

RHEVM:  rhevm-3.3.0-0.21.master.el6ev.noarch
PythonSDK:  rhevm-sdk-python-3.3.0.13-1.el6ev.noarch
VDSM:  vdsm-4.12.0-127.gitedb88bf.el6ev.x86_64
LIBVIRT:  libvirt-0.10.2-23.el6.bz964359.eblake.1.x86_64
QEMU & KVM:  qemu-kvm-rhev-0.12.1.2-2.401.el6.x86_64
SANLOCK:  sanlock-2.8-1.el6.x86_64
Comment 2 vvyazmin@redhat.com 2013-09-17 04:07:06 EDT
Created attachment 798659 [details]
## Logs rhevm, vdsm, libvirt, thread dump, superVdsm
Comment 5 Ayal Baron 2013-12-18 04:23:33 EST
Tal, update on this one?
Comment 7 Allon Mureinik 2015-07-07 09:46:14 EDT
Aharon - does QA have a test case for this? Does it still happen in 3.6.0?
Comment 8 Aharon Canan 2015-07-07 11:17:18 EDT
Probably we do, but I don't think we have run it in the last year.
We can re-test.
Comment 9 Yaniv Lavi (Dary) 2015-10-22 04:18:44 EDT
Did you retest?
Comment 10 Aharon Canan 2015-11-01 10:38:37 EST
Natalie - please do.
Comment 11 Natalie Gavrielov 2015-11-03 07:20:31 EST
Ran the following scenario (a few times):

1. Moving an SD (not a master) to maintenance.
2. During the "locked state" (during the maintenance operation) perform engine restart.

Configuration: 
2 hosts (one in "maintenance"), a few SDs; the one that was put into maintenance was not a master.

Environment: 
rhevm-3.6.0.2-0.1.el6.noarch

Result:
After the restart, the SD was in maintenance mode.
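A minimal sketch (same assumptions as above: 3.x Python SDK, placeholder connection details and names) of checking the domain state programmatically after the engine restart:

from ovirtsdk.api import API

api = API(url='https://rhevm.example.com/api',
          username='admin@internal',
          password='secret',
          insecure=True)

sd = api.datacenters.get(name='Default').storagedomains.get(name='SD-e-02')
print(sd.get_status().get_state())   # expected: 'maintenance'

api.disconnect()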
Comment 12 Natalie Gavrielov 2015-11-03 07:21 EST
Created attachment 1088953 [details]
engine.log
Comment 13 Allon Mureinik 2015-11-08 03:04:36 EST
(In reply to Natalie Gavrielov from comment #11)
> Ran the following scenario (a few times):
> 
> 1. Moving an SD (not a master) to maintenance.
> 2. During the "locked state" (during the maintenance operation) perform
> engine restart.
> 
> Configuration: 
> 2 hosts (one is in "maintenance", a few SD's, the one that was put in
> maintenance state was not a master.
> 
> Environment: 
> rhevm-3.6.0.2-0.1.el6.noarch
> 
> Result:
> After the restart SD was in maintenance mode.

To sum up - you moved a domain to maintenance, restarted the engine, and the domain still went to maintenance.
Doesn't this mean the BZ should be VERIFIED on the version you tested it with?
Comment 14 Aharon Canan 2015-11-08 06:35:46 EST
(In reply to Allon Mureinik from comment #13)
> (In reply to Natalie Gavrielov from comment #11)
> > Ran the following scenario (a few times):
> > 
> > 1. Moving an SD (not a master) to maintenance.
> > 2. During the "locked state" (during the maintenance operation) perform
> > engine restart.
> > 
> > Configuration: 
> > 2 hosts (one is in "maintenance", a few SD's, the one that was put in
> > maintenance state was not a master.
> > 
> > Environment: 
> > rhevm-3.6.0.2-0.1.el6.noarch
> > 
> > Result:
> > After the restart SD was in maintenance mode.
> 
> To sum up - you moved a domain to maintenance, restarted the engine, and the
> domain still went to maintenance.
> Doesn't this mean the BZ should be VERIFIED on the version you tested it
> with?

For sure not verified as no patch here, 
Can be Works for me or something...
Comment 15 Allon Mureinik 2015-11-08 06:47:16 EST
(In reply to Aharon Canan from comment #14)
> (In reply to Allon Mureinik from comment #13)
> > (In reply to Natalie Gavrielov from comment #11)
> > > Ran the following scenario (a few times):
> > > 
> > > 1. Moving an SD (not a master) to maintenance.
> > > 2. During the "locked state" (during the maintenance operation) perform
> > > engine restart.
> > > 
> > > Configuration: 
> > > 2 hosts (one is in "maintenance", a few SD's, the one that was put in
> > > maintenance state was not a master.
> > > 
> > > Environment: 
> > > rhevm-3.6.0.2-0.1.el6.noarch
> > > 
> > > Result:
> > > After the restart SD was in maintenance mode.
> > 
> > To sum up - you moved a domain to maintenance, restarted the engine, and the
> > domain still went to maintenance.
> > Doesn't this mean the BZ should be VERIFIED on the version you tested it
> > with?
> 
> For sure not verified as no patch here, 
> Can be Works for me or something...
That's a more appropriate course of action, agreed.
