Bug 1105211 - Executing multiple "template.delete" commands in parallel with "vm.delete" commands creates a race condition which causes the Blank template to be removed from the Data Center
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
: 3.5.0
Assignee: Shahar Havivi
QA Contact: Lukas Svaty
Whiteboard: virt
Depends On: 1113256 1118249
Blocks: 1130887 rhev3.5beta3
Reported: 2014-06-05 15:26 UTC by Ori Gofen
Modified: 2016-05-26 01:48 UTC

Fixed In Version: org.ovirt.engine-root-3.5.0-13
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1130887
Last Closed: 2015-02-17 08:26:24 UTC
oVirt Team: ---

Attachments
images and logs (1.45 MB, application/x-gzip)
2014-06-05 15:26 UTC, Ori Gofen

System | ID | Priority | Status | Summary | Last Updated
oVirt gerrit | 30760 | ovirt-engine-3.4 | MERGED | engine: set template id in ctor | Never
oVirt gerrit | 31180 | master | MERGED | engine: Add AddVmTemplateCommand to command executer framework | Never
oVirt gerrit | 32896 | ovirt-engine-3.5 | MERGED | engine: Add AddVmTemplateCommand to command executer framework | Never

Internal Links: 1118249

Description Ori Gofen 2014-06-05 15:26:11 UTC
Created attachment 902576
images and logs

Description of problem:

Running multiple "vm.delete" operations in parallel with multiple "template.delete" and "disk.remove" operations somehow causes a race condition that ends with the Blank template being removed.

In the middle of the procedure, while the async_task table still has tasks waiting for their turn to be sent to vdsm, we restart the engine.

When the UI comes back, the Blank template has been deleted (see the attached image).

vm_static does not report the existence of any template (the query returns no rows):

engine=# SELECT * FROM vm_static;
 vm_guid | vm_name | mem_size_mb | vmt_guid | os | description | vds_group_id | creation_date | num_of_monitors | is_initialized | is_auto_suspend | num_of_sockets | cpu_per_socket | usb_policy | time_zone | is_stateless | fail_back | _create_date | _update_date | dedicated_vm_for_vds | auto_startup | vm_type | nice_level | default_boot_sequence | default_display_type | priority | iso_path | origin | initrd_url | kernel_url | kernel_params | migration_support | userdefined_properties | predefined_properties | min_allocated_mem | entity_type | child_count | template_status | quota_id | allow_console_reconnect | cpu_pinning | is_smartcard_enabled | host_cpu_flags | db_generation | is_delete_protected | is_disabled | is_run_and_pause | created_by_user_id | tunnel_migration | free_text_comment | single_qxl_pci | cpu_shares | vnc_keyboard_layout | instance_type_id | image_type_id | sso_method | original_template_id | original_template_name | migration_downtime | template_version_number | template_version_name
(0 rows)

Version-Release number of selected component (if applicable):


How reproducible:
36.36% (4 out of 11)

Steps to Reproduce:
1. Create 8 VMs with disks on NFS and iSCSI.
2. Make templates from all of them.
3. Copy all templates to a block/file domain.
4. Create 8 VMs from the templates.
5. Remove all the VMs by selecting all of them.
6. Remove all disks by selecting all of them.
7. Remove all templates by selecting all of them (if selecting the Blank template greys out the Remove button, select all templates except the Blank template).
8. Wait a minute and restart the engine.

Actual results:

When the engine comes back, the Blank template is missing.

Expected results:

Removal of the Blank template should not be possible.
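
A minimal sketch of the expected behavior, written in Python for illustration rather than taken from the engine code. It assumes the Blank template is identified by the all-zero GUID. The names remove_template, RemoveTemplateError, and the templates dict are hypothetical stand-ins, not ovirt-engine APIs. This report does not state the root cause, so the sketch only shows the kind of validation that would reject a removal request that has no template ID or that targets the Blank template.

```python
import uuid

# Assumption: the Blank template is identified by the all-zero GUID.
BLANK_TEMPLATE_ID = uuid.UUID(int=0)


class RemoveTemplateError(Exception):
    """Raised when a template removal request fails validation."""


def remove_template(templates, template_id):
    """Validate a removal request, then remove the template.

    templates is a hypothetical in-memory stand-in for the template
    table, mapping uuid.UUID -> template name.
    """
    if template_id is None:
        # A request restored without its template ID must not fall back
        # to a default value that could match the Blank template.
        raise RemoveTemplateError("template id is missing")
    if template_id == BLANK_TEMPLATE_ID:
        raise RemoveTemplateError("the Blank template cannot be removed")
    if template_id not in templates:
        raise RemoveTemplateError("template does not exist")
    return templates.pop(template_id)
```

With this check in place, a request that arrives without a template ID fails validation instead of silently resolving to the Blank template.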

Additional info:

Comment 1 Shahar Havivi 2014-06-17 10:52:18 UTC
Can you attach the vdsm and engine logs from the relevant time?

Comment 2 Ori Gofen 2014-06-17 15:27:44 UTC
They are updated as far as I know,
Unfortunately I don't have anything else to offer right now,
I'll reproduce and attach new logs soon.

Comment 4 Michal Skrivanek 2014-07-08 08:33:39 UTC
after discussion with Oved there's a bigger infra issue to be addressed, possibly in 3.5; we also need a workaround in 3.4.z as the consequence might be really bad

Comment 5 Oved Ourfali 2014-07-08 09:03:46 UTC
(In reply to Michal Skrivanek from comment #4)
> after discussion with Oved there's a bigger infra issue to be addressed,
> possibly in 3.5; we also need a workaround in 3.4.z as the consequence might
> be really bad


You're right.
However, the z-stream fix shouldn't be the same one as the 3.5.0 one.
We've already suggested a z-stream fix to Shahar.
I think that the right way here is to split this into two bugs:
One on infra, for 3.5.0, that will handle this.
The other one on virt, for 3.4.Z, that will do the workaround.

Michal - thoughts on that?

Comment 6 Michal Skrivanek 2014-07-08 15:06:36 UTC
For the workaround: it's still an infra bug, but it may be easier or more suitable for someone from virt to fix it. That's up to Omer's best judgment, based on the complexity of the implementation.

Comment 8 Oved Ourfali 2014-07-28 08:20:22 UTC
Hi Michal

We've provided the infra for this one, as part of handling Bug 1118249.
Would be best if you sync through it and do the relevant logic for this bug.
Changing to virt.
Please consult Ravi if you have questions regarding the infra work done here.

Comment 9 Eyal Edri 2014-08-04 11:09:12 UTC
These bugs are candidates for z-stream, but are not ready yet.
They were not included in the 3.4.2 bug tracker [1] for critical bugs by GSS,
and are out of scope for the 3.4.2 build.
Moving to 3.4.3.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1123858

Comment 11 Eyal Edri 2014-09-28 11:29:34 UTC
This bug was moved to MODIFIED before the vt4 build date, thus moving to ON_QA.
If you believe this bug isn't in vt4, please report to rhev-integ@redhat.com

Comment 12 Lukas Svaty 2014-10-02 13:19:22 UTC
Tested this 10 times and was unable to reproduce, so it seems to be fixed. If you have better verification steps than those in the Description, please provide them and I'll re-verify.

Comment 13 Omer Frenkel 2015-02-17 08:26:24 UTC
RHEV-M 3.5.0 has been released
