Bug 1277558 - [Cinder] creation of a snapshot with multiple cinder disks is reported as failed
[Cinder] creation of a snapshot with multiple cinder disks is reported as failed
Status: CLOSED CURRENTRELEASE
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage (Show other bugs)
3.6.0.2
Unspecified Unspecified
unspecified Severity medium (vote)
: ovirt-3.6.1
: 3.6.1
Assigned To: Daniel Erez
Ori Gofen
storage
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2015-11-03 09:43 EST by Ori Gofen
Modified: 2016-05-25 21:49 EDT (History)
8 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-16 07:16:52 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
tnisan: ovirt‑3.6.z?
ogofen: planning_ack?
rule-engine: devel_ack+
ogofen: testing_ack?


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 48054 master MERGED core: snapshot with multiple cinder disks reported as failed Never
oVirt gerrit 48063 ovirt-engine-3.6 MERGED core: snapshot with multiple cinder disks reported as failed Never

  None (edit)
Description Ori Gofen 2015-11-03 09:43:47 EST
Description of problem:

Almost Every cinder command flow is using, apparently, api from externalprovider.foreman and it's not working, like never, this cause every cinder action to fail on logs, but more importantly, this causes a failure on the engine-webadmin when attempting a creation of a snapshot of a Vm with 2 cinder virtual disks(or more).


2015-11-03 16:05:51,770 INFO  [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (ajp-/127.0.0.1:8702-6) [3265c76c] Lock Acquired to object 'EngineLock:{exclusiveLoc
ks='[db59cd36-3e20-4771-b7ac-37fddceca45d=<VM, ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM$VmName vm_cin>]', sharedLocks='null'}'
2015-11-03 16:05:52,374 INFO  [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (org.ovirt.thread.pool-7-thread-12) [3265c76c] Running command: CreateAllSnapshotsFr
omVmCommand internal: false. Entities affected :  ID: db59cd36-3e20-4771-b7ac-37fddceca45d Type: VMAction group MANIPULATE_VM_SNAPSHOTS with role type USER
2015-11-03 16:05:52,556 INFO  [org.ovirt.engine.core.bll.storage.CreateCinderSnapshotCommand] (pool-6-thread-2) [30a04640] Running command: CreateCinderSnapshotCommand intern
al: true. Entities affected :  ID: 00000000-0000-0000-0000-000000000000 Type: StorageAction group CONFIGURE_VM_STORAGE with role type USER
2015-11-03 16:05:52,904 ERROR [org.ovirt.engine.core.bll.host.provider.foreman.SystemProviderFinder] (ajp-/127.0.0.1:8702-2) [] Failed to find host on any provider by host na
me 'ovirt-gofen-2.scl.lab.tlv.redhat.com'
2015-11-03 16:05:53,208 ERROR [org.ovirt.engine.core.bll.host.provider.foreman.SystemProviderFinder] (ajp-/127.0.0.1:8702-3) [] Failed to find host on any provider by host na
me 'ovirt-gofen-2.scl.lab.tlv.redhat.com'
2015-11-03 16:05:53,423 INFO  [org.ovirt.engine.core.bll.storage.CreateCinderSnapshotCommand] (pool-6-thread-3) [2d0d946c] Running command: CreateCinderSnapshotCommand intern
al: true. Entities affected :  ID: 00000000-0000-0000-0000-000000000000 Type: StorageAction group CONFIGURE_VM_STORAGE with role type USER
2015-11-03 16:05:54,860 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-12) [] Correlation ID: 3265c76c, Job ID:
aca8b20c-3a68-4711-b996-26245a7ea27b, Call Stack: null, Custom Event ID: -1, Message: Snapshot 't1' creation for VM 'vm_cin' was initiated by admin@internal.
2015-11-03 16:05:54,873 INFO  [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (org.ovirt.thread.pool-7-thread-12) [] Lock freed to object 'EngineLock:{exclusiveLo
cks='[db59cd36-3e20-4771-b7ac-37fddceca45d=<VM, ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM$VmName vm_cin>]', sharedLocks='null'}'
2015-11-03 16:05:56,136 ERROR [org.ovirt.engine.core.bll.host.provider.foreman.SystemProviderFinder] (ajp-/127.0.0.1:8702-3) [] Failed to find host on any provider by host na
me 'ovirt-gofen-2.scl.lab.tlv.redhat.com'
2015-11-03 16:05:56,877 INFO  [org.ovirt.engine.core.bll.storage.CreateCinderSnapshotCommand] (DefaultQuartzScheduler_Worker-19) [2d0d946c] Ending command 'org.ovirt.engine.c
ore.bll.storage.CreateCinderSnapshotCommand' successfully.
2015-11-03 16:05:57,001 INFO  [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (DefaultQuartzScheduler_Worker-19) [2d0d946c] Ending command 'org.ovirt.engine.core.
bll.CreateAllSnapshotsFromVmCommand' successfully.
2015-11-03 16:05:57,001 INFO  [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (DefaultQuartzScheduler_Worker-19) [2d0d946c] There are still running CoCo tasks
2015-11-03 16:05:57,002 WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (DefaultQuartzScheduler_Worker-19) [2d0d946c] Trying to release exclusive lock which does n
ot exist, lock key: 'db59cd36-3e20-4771-b7ac-37fddceca45dVM'
2015-11-03 16:05:57,002 INFO  [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (DefaultQuartzScheduler_Worker-19) [2d0d946c] Lock freed to object 'EngineLock:{excl
usiveLocks='[db59cd36-3e20-4771-b7ac-37fddceca45d=<VM, ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM$VmName vm_cin>]', sharedLocks='null'}'
2015-11-03 16:05:57,031 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-19) [2d0d946c] Correlation ID: 3265c76c, C
all Stack: null, Custom Event ID: -1, Message: Failed to complete snapshot 't1' creation for VM 'vm_cin'.
2015-11-03 16:05:58,007 ERROR [org.ovirt.engine.core.bll.host.provider.foreman.SystemProviderFinder] (ajp-/127.0.0.1:8702-9) [] Failed to find host on any provider by host name 'ovirt-gofen-2.scl.lab.tlv.redhat.com'

So to sum things up every Vm with x>1 cinder disks will report 'operation failed' when trying to create a snapshot. 

Version-Release number of selected component (if applicable):
3.6.0.2

How reproducible:
100%

Steps to Reproduce:
1.have a Vm+2 cinder disks
2.create a snapshot

Actual results:
The operation fails

Expected results:
The operation should not fail (UI wise)

Additional info:
This behavior is new and was not encountered during pre-integration, thus, marking it as a reGRession
Comment 1 Tal Nisan 2015-11-03 10:19:34 EST
Daniel, please have a look asap.

BTW Ori, you can't mark a bug that marks a regression from pre-integration as a regression... Only official versions are in this game.
Comment 2 Daniel Erez 2015-11-03 14:57:22 EST
@Moti:
* When SystemProviderFinder should be invoked?
* Could it affect other operations by taking locks for example?

@Ori:
* Can you please attach full engine and vdsm logs?
* Did you navigate to Errata tab in the webadmin before reproducing the issue?
Comment 3 Moti Asayag 2015-11-04 00:01:47 EST
(In reply to Daniel Erez from comment #2)
> @Moti:
> * When SystemProviderFinder should be invoked?

It depends. In 3.6.1 the SystemProviderFinder will be called only when clicked on the 'Errata' on the system tree or calling via the api /api/katelloerrata resource.

In 3.6.0, before fixing bug 1260499, it will be called when SyncSearch() is called.

> * Could it affect other operations by taking locks for example?
> 

No, it shouldn't gain any lock.

> @Ori:
> * Can you please attach full engine and vdsm logs?
> * Did you navigate to Errata tab in the webadmin before reproducing the
> issue?
Comment 4 Oved Ourfali 2015-11-04 01:25:29 EST
(In reply to Moti Asayag from comment #3)
> (In reply to Daniel Erez from comment #2)
> > @Moti:
> > * When SystemProviderFinder should be invoked?
> 
> It depends. In 3.6.1 the SystemProviderFinder will be called only when
> clicked on the 'Errata' on the system tree or calling via the api
> /api/katelloerrata resource.
> 
> In 3.6.0, before fixing bug 1260499, it will be called when SyncSearch() is
> called.

And this fix will only be part of 3.6.1.

> 
> > * Could it affect other operations by taking locks for example?
> > 
> 
> No, it shouldn't gain any lock.
> 
> > @Ori:
> > * Can you please attach full engine and vdsm logs?
> > * Did you navigate to Errata tab in the webadmin before reproducing the
> > issue?

Moving to storage to examine the cinder issue.
It has nothing to do with Foreman. You'll see these Foreman errors regardless of doing any operations on cinder....

Also, setting "storage | infra" whiteboard is nice.... however, it isn't supported in our queries, and that's not the way we work.
Comment 5 Daniel Erez 2015-11-04 05:21:48 EST
To sum up, the recurring error on SystemProviderFinder seems unrelated and been already addressed by bug 1260499. The error in snapshot creation reproduces on master, however, it's merely a logging issue; i.e. operation is reported as failed and then as succeeded (suggested a patch for it). Hence, lowering severity.
Comment 6 Ori Gofen 2015-12-16 03:29:49 EST
Verified on rhevm-3.6.1.3-0.1.el6.noarch
Comment 7 Sandro Bonazzola 2015-12-16 07:16:52 EST
According to verification status and target milestone this issue should be fixed in oVirt 3.6.1. Closing current release.

Note You need to log in before you can comment on or make changes to this bug.