Bug 1860284 - VM can not be taken from pool when no prestarted VM's are available
Summary: VM can not be taken from pool when no prestarted VM's are available
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Backend.Core
Version: 4.4.1.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.4.2
Target Release: 4.4.2.2
Assignee: Arik
QA Contact: Tamir
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-07-24 08:48 UTC by Rik Theys
Modified: 2020-09-18 09:04 UTC (History)

Fixed In Version: rhv-4.4.2-3, ovirt-engine-4.4.2.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-18 07:12:53 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 110632 | 0 | master | MERGED | core: fix locks handling when taking VMs from pools | 2020-11-11 06:37:42 UTC

Description Rik Theys 2020-07-24 08:48:08 UTC
Description of problem:

When a pool is created with no prestarted VM's and a user with user permissions on the pool takes a VM using the user portal, it works exactly once.

When a user (doesn't have to be the same user) powers down the VM and then tries to take a VM again from the pool, it fails with the error:

Cannot allocate and run VM from VM-Pool. There are no available VMs in the VM-Pool

This is incorrect as there are lots of VM's available in the pool.

The engine log shows:

2020-07-24 10:33:04,385+02 INFO  [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-341) [9c744031-028e-4b31-aaf9-8a09863d16fc] START, IsVmDuringInitiatingVDSCommand( IsVmDuringInitiatingVDSCommandParameters:{vmId='68328d2a-4dfb-40d2-ad91-95f186c217cb'}), log id: 6de6f3a4
2020-07-24 10:33:04,385+02 INFO  [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-341) [9c744031-028e-4b31-aaf9-8a09863d16fc] FINISH, IsVmDuringInitiatingVDSCommand, return: false, log id: 6de6f3a4
2020-07-24 10:33:04,416+02 INFO  [org.ovirt.engine.core.bll.VmPoolHandler] (default task-341) [9c744031-028e-4b31-aaf9-8a09863d16fc] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[68328d2a-4dfb-40d2-ad91-95f186c217cb=VM]', sharedLocks=''}'
2020-07-24 10:33:04,425+02 INFO  [org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand] (default task-341) [9c744031-028e-4b31-aaf9-8a09863d16fc] Lock Acquired to object 'EngineLock:{exclusiveLocks='[d2db69d8-2f0c-4666-8fc6-2d79b8466328=USER_VM_POOL]', sharedLocks=''}'
2020-07-24 10:33:04,425+02 WARN  [org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand] (default task-341) [9c744031-028e-4b31-aaf9-8a09863d16fc] Validation of action 'AttachUserToVmFromPoolAndRun' failed for user u0045469.be-authz. Reasons: VAR__ACTION__ALLOCATE_AND_RUN,VAR__TYPE__VM_FROM_VM_POOL,ACTION_TYPE_FAILED_NO_AVAILABLE_POOL_VMS
2020-07-24 10:33:04,425+02 INFO  [org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand] (default task-341) [9c744031-028e-4b31-aaf9-8a09863d16fc] Lock freed to object 'EngineLock:{exclusiveLocks='[d2db69d8-2f0c-4666-8fc6-2d79b8466328=USER_VM_POOL]', sharedLocks=''}'
2020-07-24 10:33:04,433+02 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-341) [] Operation Failed: [Cannot allocate and run VM from VM-Pool. There are no available VMs in the VM-Pool.]
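
To check whether other requests hit the same failure, the log excerpt above can be filtered for "Failed to Acquire Lock" entries. The following is only a rough sketch: the regular expressions are inferred from the handful of lines quoted here and may not cover every engine.log format, and the function name failed_locks is made up for this example.

```python
import re

# Line layout inferred from the excerpt above (an assumption, not a spec):
# timestamp, level, [logger], (thread), [correlation id], message
LINE_RE = re.compile(
    r"^(?P<ts>\S+ \S+)\s+(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[^\]]*)\]\s+(?P<msg>.*)$"
)
# Matches a single entry of the form exclusiveLocks='[<id>=<TYPE>]'
LOCK_RE = re.compile(r"exclusiveLocks='\[(?P<id>[0-9a-f-]+)=(?P<type>[A-Z_]+)\]'")


def failed_locks(lines):
    """Return (timestamp, correlation id, lock id, lock type) per failed acquisition."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m or "Failed to Acquire Lock" not in m.group("msg"):
            continue
        lock = LOCK_RE.search(m.group("msg"))
        if lock:
            hits.append((m.group("ts"), m.group("corr"),
                         lock.group("id"), lock.group("type")))
    return hits


# Two lines copied from the excerpt above
sample = [
    "2020-07-24 10:33:04,416+02 INFO  [org.ovirt.engine.core.bll.VmPoolHandler] "
    "(default task-341) [9c744031-028e-4b31-aaf9-8a09863d16fc] Failed to Acquire Lock "
    "to object 'EngineLock:{exclusiveLocks='[68328d2a-4dfb-40d2-ad91-95f186c217cb=VM]', "
    "sharedLocks=''}'",
    "2020-07-24 10:33:04,425+02 INFO  [org.ovirt.engine.core.bll.AttachUserToVmFromPoolAndRunCommand] "
    "(default task-341) [9c744031-028e-4b31-aaf9-8a09863d16fc] Lock Acquired "
    "to object 'EngineLock:{exclusiveLocks='[d2db69d8-2f0c-4666-8fc6-2d79b8466328=USER_VM_POOL]', "
    "sharedLocks=''}'",
]

for hit in failed_locks(sample):
    print(hit)
```

Grouping the hits by lock id would show whether the same VM id keeps failing across requests, which is what one would expect if a lock on that VM is never released.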

I believe this to still be a locking issue. See my comment 9 in bug 1462236.

Version-Release number of selected component (if applicable):
ovirt-engine-4.4.1.8-1.el8.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a pool and set the number of prestarted VM's to 0. Assign a group of users the user-role to this pool
2. Log in to the user portal with one of the users and take a VM from the pool. The VM will start. Power off the VM again
3. Try to take a VM from the same pool again. This will no longer work. Not even as a different user. The pool becomes useless at this point.

4. Restart ovirt-engine
5. A user can take a VM again, once.

A way to mitigate the issue somewhat seems to be to make sure enough VM's are prestarted.

Actual results:
VM's can not be taken from a pool more than once.

Expected results:
VM's can be taken from the pool, even if the pool has no prestarted VM's

Additional info:
See my comment 9 in bug 1462236.

Comment 1 Arik 2020-07-28 07:15:18 UTC
Is your pool set as "Manual" or "Automatic"?

Comment 2 RHEL Program Management 2020-07-28 07:15:24 UTC
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.

Comment 3 Rik Theys 2020-07-28 07:25:45 UTC
Hi Arik,

(In reply to Arik from comment #1)
> Is your pool set as "Manual" or "Automatic"?

My pool type is set to Automatic, and it is a stateful pool.

Regards,
Rik

Comment 4 Arik 2020-07-30 17:30:14 UTC
Thanks Rik.
Yeah, I reproduced this and I see the problem in the log - should be fairly easy to fix.

Comment 5 Tamir 2020-08-20 10:02:09 UTC
Verified on RHV 4.4.2-3. All looks good to me.

Env:
  - Engine instance with RHV 4.4.2-3 and RHEL 8.2.1 installed.
  - Host with RHV 4.4.2-3 and RHEL 8.2.1, vdsm-4.40.25-1.el8ev, ovirt-engine-4.4.2.2-0.3.el8ev

Steps:

In Admin Portal:

1. Create a 4.4 data center and a 4.4 cluster.
2. Install the host, add a storage domain.
3. Create a VM pool with no prestarted VM's, 5 VMs in the pool, a maximum of 1 VM per user, and the pool type set to automatic and stateful.
4. Assign permissions for the VM pool to User1 and User2.

In VM portal:

5. Log in to VM Portal as User1.
6. Take a VM and wait for the VM to start.
7. After the VM has started, shut down the VM.
8. After the VM has stopped, take a VM again.
9. After the VM has started, shut down the VM.
10. Log in as User2 and repeat steps 6 - 9.

Results (As Expected):
1. A 4.4 data center and a 4.4 cluster were created.
2. The host installed correctly, the storage domain was created and the VM was created.
3. The VM pool was created.
4. The permissions for the VM were set.
5. Logged in as User1.
6. A VM from the pool was taken and ran correctly.
7. The VM from the pool has been stopped.
8. A VM was taken again from the pool and ran correctly.
9. The VM from the pool has been stopped.
10. Logged in as User2 and all the steps from 6-9 were executed correctly.

Comment 6 Sandro Bonazzola 2020-09-18 07:12:53 UTC
This bugzilla is included in oVirt 4.4.2 release, published on September 17th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

