Bug 1247957 - Misleading audit log when deactivating a storage domain
Summary: Misleading audit log when deactivating a storage domain
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: General
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-3.6.2
: 3.6.2
Assignee: Liron Aravot
QA Contact: Aharon Canan
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-07-29 10:47 UTC by Carlos Mestre González
Modified: 2016-03-10 15:02 UTC (History)
11 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1256841 (view as bug list)
Environment:
Last Closed: 2016-02-18 11:04:51 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-3.6.z+
ylavi: planning_ack+
tnisan: devel_ack+
acanan: testing_ack+


Attachments (Terms of Use)
engine.lgo (1.83 MB, text/plain)
2015-07-29 10:49 UTC, Carlos Mestre González
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 45320 0 master MERGED core: more accurate audit log when domain is deactivated Never
oVirt gerrit 48920 0 ovirt-engine-3.6 MERGED core: more accurate audit log when domain is deactivated Never

Description Carlos Mestre González 2015-07-29 10:47:29 UTC
Description of problem:
The "Deactivating storage domain" job passes, but the storage domain is in the 'Preparing For Maintenance' status. This seems to be a false positive.

This happens because one of the hosts is in a non-operational state. Not sure if this is also a different bug (the storage domain should be deactivated properly even when a host is non-operational).

Version-Release number of selected component (if applicable):
ovirt-engine-3.6.0-0.0.master.20150726172446.git65db93d.el6.noarch

How reproducible:
100%

Steps to Reproduce:
1. Have a setup with multiple hosts and storage domains, type doesn't matter
2. One of the hosts is non-operational (in my case there's an issue accessing one of the gluster domains)
3. Try to deactivate any of the storage domains from the data center

Actual results:
The "Deactivating storage domain" job appears to finish successfully (PASS), but the storage domains remain in the 'Preparing For Maintenance' status.

Expected results:
The "Deactivating storage domain" job status should be FAILED (or, if the storage domain should still be deactivated even with one host non-operational, the storage domain should end up in the proper state).

Additional info:
engine RHEL 6.7
hosts RHEL 7.1

Comment 1 Carlos Mestre González 2015-07-29 10:49:57 UTC
Created attachment 1057299 [details]
engine.lgo

Comment 2 Carlos Mestre González 2015-07-29 10:50:47 UTC
Posting part of engine.log for a quick look:

2015-07-29 13:21:32,562 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (org.ovirt.thread.pool-8-thread-10) [24576b6f] FINISH, DisconnectStorageServerVDSCommand, return: {88bca80f-7ce0-4768-8283-de6387e24464=0}, log id: 483509d0
2015-07-29 13:21:32,564 INFO  [org.ovirt.engine.core.bll.storage.DeactivateStorageDomainCommand] (org.ovirt.thread.pool-8-thread-10) [24576b6f] Domain 'e7843c77-73bf-4866-af2e-5fb1ebe8d4b4' will remain in 'PreparingForMaintenance' status until deactivated on all hosts
2015-07-29 13:21:32,569 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-10) [24576b6f] Correlation ID: 2931eb50, Job ID: d67655d9-cd35-4cbf-852a-80d7aa33f563, Call Stack: null, Custom Event ID: -1, Message: Storage Domain test_4831_exp (Data Center golden_env_mixed) was deactivated.
2015-07-29 13:21:32,580 WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (org.ovirt.thread.pool-8-thread-10) [24576b6f] Trying to release exclusive lock which does not exist, lock key: 'e7843c77-73bf-4866-af2e-5fb1ebe8d4b4STORAGE'
2015-07-29 13:21:32,580 INFO  [org.ovirt.engine.core.bll.storage.DeactivateStorageDomainWithOvfUpdateCommand] (org.ovirt.thread.pool-8-thread-10) [24576b6f] Lock freed to object 'EngineLock:{exclusiveLocks='[e7843c77-73bf-4866-af2e-5fb1ebe8d4b4=<STORAGE, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='null'}'
2015-07-29 13:21:32,882 ERROR [org.ovirt.engine.core.dao.jpa.TransactionalInterceptor] (default task-10) [] Failed to run operation in a new transaction: javax.persistence.PersistenceException: org.hibernate.HibernateException: A collection with cascade="all-delete-orphan" was no longer referenced by the owning entity instance: org.ovirt.engine.core.common.job.Job.steps
    at org.hibernate.jpa.spi.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1763) [hibernate-entitymanager-4.3.7.Final.jar:4.3.7.Final]
    at org.hibernate.jpa.spi.AbstractEntityManagerImpl.convert(AbstractEntityManagerImpl.java:1677) [hibernate-entitymanager-4.3.7.Final.jar:4.3.7.Final]
    at org.hibernate.jpa.internal.QueryImpl.getResultList(QueryImpl.java:458) [hibernate-entitymanager-4.3.7.Final.jar:4.3.7.Final]
    at org.ovirt.engine.core.dao.jpa.AbstractJpaDao.multipleResults(AbstractJpaDao.java:89) [dal.jar:]
    at org.ovirt.engine.core.dao.JobDaoImpl$Proxy$_$$_WeldSubclass.multipleResults(Unknown Source) [dal.jar:]
    at org.ovirt.engine.core.dao.JobDaoImpl.getJobsByOffsetAndPageSize(JobDaoImpl.java:41) [dal.jar:]
    at org.ovirt.engine.core.dao.JobDaoImpl$Proxy$_$$_WeldSubclass.getJobsByOffsetAndPageSize(Unknown Source) [dal.jar:]
    at sun.reflect.GeneratedMethodAccessor875.invoke(Unknown Source) [:1.7.0_79]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_79]
    at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_79]
    at org.jboss.weld.interceptor.proxy.SimpleInterceptionChain.interceptorChainCompleted(SimpleInterceptionChain.java:51) [weld-core-impl-2.2.6.Final.jar:2014-10-03 10:05]
    at org.jboss.weld.interceptor.chain.AbstractInterceptionChain.finish(AbstractInterceptio

Comment 3 Allon Mureinik 2015-07-29 14:19:10 UTC
Liron, could you take a look please?

Comment 4 Liron Aravot 2015-07-30 08:59:28 UTC
The implemented behavior is that when a host is non-operational, its reported data won't be collected, and therefore the domain will remain in 'Preparing For Maintenance'.

Two possible action items here (perhaps #2 can be handled in a different BZ):
1. Add an audit log message when the domain moves to "Preparing For Maintenance", and use the current one only when the domain moves to Maintenance status.

2. Look into improving the current situation: if a host is non-operational (as a host can be non-operational for various reasons) but has a domain report, use that report for moving unseen domains to maintenance.
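Action item #1 amounts to branching the audit log message on the domain's actual resulting state. A minimal sketch of that selection logic, with illustrative names only (not the actual ovirt-engine classes or AuditLogType constants from the merged gerrit patches):

```java
// Hypothetical sketch of the audit-log selection for domain deactivation.
// Names are illustrative assumptions, not the real ovirt-engine code.
public class DeactivateAuditSketch {

    enum AuditMessage {
        // Intermediate state: some host may still be accessing the domain.
        STORAGE_DOMAIN_MOVED_TO_PREPARING_FOR_MAINTENANCE,
        // Final state: no host of the Data Center accesses the domain anymore.
        STORAGE_DOMAIN_MOVED_TO_MAINTENANCE
    }

    static AuditMessage selectAuditMessage(int hostsStillAccessingDomain) {
        // If any host (e.g. a non-operational one whose reports are not
        // collected) may still be accessing the domain, log the intermediate
        // "Preparing For Maintenance" message instead of claiming the domain
        // was fully deactivated.
        if (hostsStillAccessingDomain > 0) {
            return AuditMessage.STORAGE_DOMAIN_MOVED_TO_PREPARING_FOR_MAINTENANCE;
        }
        return AuditMessage.STORAGE_DOMAIN_MOVED_TO_MAINTENANCE;
    }

    public static void main(String[] args) {
        System.out.println(selectAuditMessage(1));
        System.out.println(selectAuditMessage(0));
    }
}
```

With this split, the misleading "was deactivated" message would only appear once the domain actually reaches Maintenance, matching the two distinct messages seen in the verification in comment 9.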

Comment 5 Red Hat Bugzilla Rules Engine 2015-10-19 10:52:08 UTC
Target release should be set once a package build is known to fix an issue. Since this bug is not in MODIFIED status, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.

Comment 6 Yaniv Lavi 2015-10-29 12:23:21 UTC
In oVirt, testing is done on a single release by default. Therefore I'm removing the 4.0 flag. If you think this bug must be tested in 4.0 as well, please re-add the flag. Please note we might not have testing resources to handle the 4.0 clone.

Comment 7 Allon Mureinik 2015-11-16 13:44:33 UTC
Not merged to the 3.6 branch - moving back to POST.

Comment 8 Sandro Bonazzola 2015-12-23 13:41:58 UTC
oVirt 3.6.2 RC1 has been released for testing, moving to ON_QA

Comment 9 Aharon Canan 2015-12-29 12:13:50 UTC
Verified according to https://gerrit.ovirt.org/#/c/48920/ description 

rhevm-3.6.2-0.1.el6.noarch

2015-12-29 12:06:07,266 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-16) [544bae] Correlation ID: 544bae, Call Stack: null, Custom Event ID: -1, Message: Storage Domain ISCSI (Data Center Default) was deactivated and has moved to 'Preparing for maintenance' until it will no longer be accessed by any Host of the Data Center.

2015-12-29 12:06:08,867 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-22) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Storage Domain ISCSI (Data Center Default) successfully moved to Maintenance as it's no longer accessed by any Host of the Data Center.

