Bug 1395924 - [Scale] Creating VM from Template dialog box blocking UI for 2 minutes during concurrent VM deletes
Summary: [Scale] Creating VM from Template dialog box blocking UI for 2 minutes during...
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Frontend.WebAdmin
Version: 4.0.5.1
Hardware: Unspecified
OS: Unspecified
unspecified
high vote
Target Milestone: ---
: ---
Assignee: bugs@ovirt.org
QA Contact: Pavel Stehlik
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-11-17 00:24 UTC by mlehrer
Modified: 2016-11-30 09:06 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-30 09:06:34 UTC
oVirt Team: Virt


Attachments (Terms of Use)

Description mlehrer 2016-11-17 00:24:03 UTC
Description of problem:
VM creation from template was attempted from Webadmin UI, while concurrent VM delete operations were running.  During this time the UI was blocked for 2 minutes by a blank dialog box.  Its unclear what is happening to the VM creation request from the user during that time as the dialog box is blocking the screen.


Version-Release number of selected component (if applicable):
RHEL 7.3
vdsm-4.18.15.2-1	
RHEVM 4.0.5

Dataset: 200 VM, with 5 disk per VM

How reproducible:
Reproduces

Steps to Reproduce:
1. Load dataset of 1000 Disks and 200 VMs
2. Attempt to delete multiple VMs
3. Issue VM creation from template

Actual results:
UI blocked for 2 minutes by white dialog box

Expected results:
No blocking dialogs

Additional info:
Possibly related to: https://bugzilla.redhat.com/show_bug.cgi?id=1395922

Comment 1 Yaniv Kaul 2016-11-17 07:19:05 UTC
logs?

Comment 2 Greg Sheremeta 2016-11-17 12:58:04 UTC
Sounds like an engine issue, not UI.

Is the UI waiting on a server response when this is happening? (check dev tools)

Comment 3 Oved Ourfali 2016-11-17 13:18:04 UTC
Moving to virt to examine the issue.

Comment 4 mlehrer 2016-11-17 17:50:17 UTC
(In reply to Yaniv Kaul from comment #1)
> logs?

https://drive.google.com/open?id=0B8V1DXeGhPPWXzZHWXJnVjFQX1U

(In reply to Greg Sheremeta from comment #2)
> Sounds like an engine issue, not UI.
> 
> Is the UI waiting on a server response when this is happening? (check dev
> tools)

The longest request is 34s, a POST but unfortunately GWT debug is not on so I can't see much more than GenericApiGWTService
I also see several "reports-interface-proxy?command=status" GETs that return in 404, likely unrelated but just sharing.


Per Engine.log

Deletes began here
2016-11-17 19:16:55,607 INFO  [org.ovirt.engine.core.bll.RemoveVmCommand] (org.ovirt.thread.pool-6-thread-30) [6cd3ed09] Running command: RemoveVmCommand internal: false. Entities affected :  ID: d049a4fe-edff-4343-bc08-7f17d63eff30 Type: VMAction group DELETE_VM with role type USER

Template Creation began here
2016-11-17 19:18:08,178 INFO  [org.ovirt.engine.core.bll.AddVmFromTemplateCommand] (default task-63) [2f494013] Lock Acquired to object 'EngineLock:{exclusiveLocks='[test1=<VM_NAME, ACTION_TYPE_FAILED_OBJECT_LOCKED>]', sharedLocks='[95c31e8b-663d-4842-938e-55863afe0daf=<DISK, ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName test1>, eac7a463-92b7-441a-bbb3-27fa8385adc3=<DISK, ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName test1>, f9d3c73b-8375-4f96-b813-eb21568a2032=<DISK, ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName test1>, 5faf956f-4341-489e-b770-e1c640123cd4=<DISK, ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName test1>, a0b4bf0a-73dd-456d-843b-7392ec866b7d=<TEMPLATE, ACTION_TYPE_FAILED_TEMPLATE_IS_USED_FOR_CREATE_VM$VmName test1>, 89e264d9-ef3a-41e3-95c8-36da9876e53f=<DISK, ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName test1>]'}'

Again, logs, and video of test are found here: https://drive.google.com/open?id=0B8V1DXeGhPPWXzZHWXJnVjFQX1U

Comment 5 Yaniv Kaul 2016-11-20 15:22:02 UTC
(In reply to mlehrer from comment #4)
> (In reply to Yaniv Kaul from comment #1)
> > logs?
> 
> https://drive.google.com/open?id=0B8V1DXeGhPPWXzZHWXJnVjFQX1U
> 
> (In reply to Greg Sheremeta from comment #2)
> > Sounds like an engine issue, not UI.
> > 
> > Is the UI waiting on a server response when this is happening? (check dev
> > tools)
> 
> The longest request is 34s, a POST but unfortunately GWT debug is not on so
> I can't see much more than GenericApiGWTService
> I also see several "reports-interface-proxy?command=status" GETs that return
> in 404, likely unrelated but just sharing.

This is bug 1371570 and indeed, unrelated.
> 
> 
> Per Engine.log
> 
> Deletes began here
> 2016-11-17 19:16:55,607 INFO  [org.ovirt.engine.core.bll.RemoveVmCommand]
> (org.ovirt.thread.pool-6-thread-30) [6cd3ed09] Running command:
> RemoveVmCommand internal: false. Entities affected :  ID:
> d049a4fe-edff-4343-bc08-7f17d63eff30 Type: VMAction group DELETE_VM with
> role type USER
> 
> Template Creation began here
> 2016-11-17 19:18:08,178 INFO 
> [org.ovirt.engine.core.bll.AddVmFromTemplateCommand] (default task-63)
> [2f494013] Lock Acquired to object
> 'EngineLock:{exclusiveLocks='[test1=<VM_NAME,
> ACTION_TYPE_FAILED_OBJECT_LOCKED>]',
> sharedLocks='[95c31e8b-663d-4842-938e-55863afe0daf=<DISK,
> ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName test1>,
> eac7a463-92b7-441a-bbb3-27fa8385adc3=<DISK,
> ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName test1>,
> f9d3c73b-8375-4f96-b813-eb21568a2032=<DISK,
> ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName test1>,
> 5faf956f-4341-489e-b770-e1c640123cd4=<DISK,
> ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName test1>,
> a0b4bf0a-73dd-456d-843b-7392ec866b7d=<TEMPLATE,
> ACTION_TYPE_FAILED_TEMPLATE_IS_USED_FOR_CREATE_VM$VmName test1>,
> 89e264d9-ef3a-41e3-95c8-36da9876e53f=<DISK,
> ACTION_TYPE_FAILED_DISK_IS_USED_FOR_CREATE_VM$VmName test1>]'}'
> 
> Again, logs, and video of test are found here:
> https://drive.google.com/open?id=0B8V1DXeGhPPWXzZHWXJnVjFQX1U

Comment 6 Michal Skrivanek 2016-11-21 13:51:20 UTC
ok, i see it does take most of the time during storage operation on those disks. There seem to be ~25 tasks on SPM at that time so kicking them off takes some time. We should probably make it more async than it is today (it would likely be possible) but I do not think the current behavior is so bad. It is not nicer, but you would need to wait for finishing the template before being able to use it anyway.

Tomas/Arik, any more thoughts?

Comment 7 Yaniv Kaul 2016-11-22 12:18:08 UTC
Note that some operations (such as cloning) is moving to HSM soon, and will be sent in round-robin fashion to the HSMs. This should both reduce the load on the SPM (task-wise) as well as storage-wise (if its pipes are full with traffic / storage actions).

Comment 8 Tomas Jelinek 2016-11-23 09:12:30 UTC
@Michal - I agree it is not that bad. We could close the dialog also sooner, much sooner actually, but we would sacrifice usability in case the environment is not so loaded and some issue occurred the user wants to fix. 
Also, we do not block the user from closing the dialog immediately, it will not cancel the operation.

I would propose based on this and Comment 7 to close this as wont fix.


Note You need to log in before you can comment on or make changes to this bug.