Bug 1053742

Summary: Faulty storage check when adding a VM with disks
Product: [Retired] oVirt Reporter: Allon Mureinik <amureini>
Component: ovirt-engine-core Assignee: Vered Volansky <vered>
Status: CLOSED CURRENTRELEASE QA Contact: Ori Gofen <ogofen>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.4 CC: acanan, amureini, bugs, dfediuck, gklein, iheim, jkt, lpeer, nlevinki, obasan, ogofen, rbalakri, Rhev-m-bugs, scohen, vered, yeylon
Target Milestone: ---   
Target Release: 3.5.0   
Hardware: Unspecified   
OS: All   
Whiteboard: storage
Fixed In Version: ovirt-3.5.0-alpha1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1053704
: 1053744 (view as bug list) Environment:
Last Closed: 2014-10-17 12:19:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 960934, 1053744, 1178011    
Attachments:
  vdsm+engine logs (Flags: none)
  vdsm+engine correct logs (Flags: none)

Description Allon Mureinik 2014-01-15 17:17:41 UTC
+++ This bug was initially created as a clone of Bug #1053704 +++

+++ This bug was initially created as a clone of Bug #960934 +++

Description of problem:

For each new empty disk of the VM, the required space validations should be:
      | File Domain                             | Block Domain
 -----|-----------------------------------------|-------------
 qcow | 1M (header size)                        | 1G
 -----|-----------------------------------------|-------------
 raw  | preallocated: disk capacity (getSize()) | disk capacity
      | thin (sparse): 1M                       | (there is no raw sparse on 
                                                   block domains)
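
For illustration only, here is a minimal Java sketch of the rule in the table above. The class, enums and method names are hypothetical (they are not taken from the engine code); the constants simply mirror the 1M / 1G / capacity values from the table.

public class NewDiskSpaceCheck {

    private static final long MB = 1024L * 1024;
    private static final long GB = 1024L * MB;

    enum VolumeFormat { RAW, QCOW }
    enum AllocationPolicy { PREALLOCATED, SPARSE }
    enum DomainType { FILE, BLOCK }

    /** Required free space for a new, empty disk, per the table above. */
    static long requiredSpace(VolumeFormat format, AllocationPolicy policy,
                              DomainType domain, long diskCapacity) {
        if (format == VolumeFormat.QCOW) {
            // qcow: ~1M header on file domains, 1G initial extent on block domains.
            return domain == DomainType.FILE ? MB : GB;
        }
        // raw: sparse (thin) exists only on file domains and costs ~1M up front;
        // any other raw disk is charged its full capacity.
        if (domain == DomainType.FILE && policy == AllocationPolicy.SPARSE) {
            return MB;
        }
        return diskCapacity;
    }

    public static void main(String[] args) {
        long tenGb = 10 * GB;
        System.out.println(requiredSpace(VolumeFormat.QCOW, AllocationPolicy.SPARSE,
                DomainType.FILE, tenGb));   // 1048576 (1M)
        System.out.println(requiredSpace(VolumeFormat.RAW, AllocationPolicy.PREALLOCATED,
                DomainType.BLOCK, tenGb));  // 10737418240 (full capacity)
    }
}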

Comment 1 Vered Volansky 2014-04-08 08:31:35 UTC
Allon, assuming this relates to AddVmFromScratchCommand?
It's only allowed to do that through REST, BTW. Through the UI it's first add a VM, then add disks to it. Different commands.

Comment 2 Allon Mureinik 2014-04-10 12:22:46 UTC
(In reply to Vered Volansky from comment #1)
> Allon, assuming this relates to AddVmFromScratchCommand?
> It's only allowed to do that through REST, BTW. Through the UI it's first
> add a VM, then add disks to it. Different commands.
You can clone the BZ and have a different patch for REST and GUI.

Comment 3 Allon Mureinik 2014-04-13 12:14:13 UTC
(In reply to Allon Mureinik from comment #2)
> (In reply to Vered Volansky from comment #1)
> > Allon, assuming this relates to AddVmFromScratchCommand?
> > It's only allowed to do that through REST, BTW. Through the UI it's first
> > add a VM, then add disks to it. Different commands.
> You can clone the BZ and have a different patch for REST and GUI.

Misunderstood the question.
Yes, this BZ is specific to REST. The GUI flow was handled in bug 1053704.

Comment 4 Vered Volansky 2014-04-29 02:59:03 UTC
This bug was fixed as part of bz917682, which is targeted for 3.4.1.
Therefore targeting this one to 3.4.1 as well. Will change to MODIFIED when backported.

Comment 5 Vered Volansky 2014-07-10 07:31:18 UTC
The following is the full data on the storage allocation checks:

For new empty disks (unchanged from the original table):
      | File Domain                             | Block Domain
 -----|-----------------------------------------|-------------
 qcow | 1M (header size)                        | 1G
 -----|-----------------------------------------|-------------
 raw  | preallocated: disk capacity (getSize()) | disk capacity
      | thin (sparse): 1M                       | (there is no raw sparse on 
                                                   block domains)


For cloned disks:
      | File Domain                             | Block Domain
 -----|-----------------------------------------|-------------
 qcow | preallocated: 1.1 * disk capacity       | 1.1 * min(used, capacity)
      | sparse: 1.1 * min(used, capacity)       |
 -----|------------------------------------------|-------------
 raw  | preallocated: disk capacity              | disk capacity
      | sparse: min(used, capacity)              |

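As a hedged sketch only (again, not engine code), the cloned-disk rule from this table can be written as one method, reusing the hypothetical enums from the sketch in the description; 1.1 is the safety factor from the table.

static long requiredSpaceForClonedDisk(VolumeFormat format, AllocationPolicy policy,
                                       DomainType domain, long capacity, long used) {
    if (format == VolumeFormat.QCOW) {
        // Only a preallocated qcow clone on a file domain is charged its full
        // capacity; every other qcow clone is charged its used size, capped at
        // the capacity, times the 1.1 safety factor.
        boolean chargeFullCapacity = domain == DomainType.FILE
                && policy == AllocationPolicy.PREALLOCATED;
        long base = chargeFullCapacity ? capacity : Math.min(used, capacity);
        return (long) (1.1 * base);
    }
    // raw: sparse exists only on file domains.
    if (domain == DomainType.FILE && policy == AllocationPolicy.SPARSE) {
        return Math.min(used, capacity);
    }
    return capacity; // raw preallocated, on a file or block domain
}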

Some more info on how to reproduce:
Create one of each of the disk types in the tables above, with sufficient and then insufficient storage space, and verify success and failure respectively (the failure should be a CDA failure).

Comment 6 Ori Gofen 2014-08-04 15:46:05 UTC
Created attachment 923925 [details]
vdsm+engine logs

Reproduced on beta.2; creation of a raw preallocated file disk failed with several exceptions:

2014-08-04 18:25:44,613 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-75) Failed in HSMGetAllTasksStatusesVDS method
2014-08-04 18:25:44,614 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-75) SPMAsyncTask::PollTask: Polling task ec8bfc50-7f59-4a09-b059-dfef5d3f1427 (Parent Command AddDisk, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) returned status finished, result 'cleanSuccess'.
2014-08-04 18:25:44,629 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-75) BaseAsyncTask::LogEndTaskFailure: Task ec8bfc50-7f59-4a09-b059-dfef5d3f1427 (Parent Command AddDisk, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) ended with failure:
-- Result: cleanSuccess
-- Message: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot zero out volume, code = 374,
-- Exception: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot zero out volume, code = 374
2014-08-04 18:25:44,646 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler_Worker-75) CommandAsyncTask::EndActionIfNecessary: All tasks of command 5c5a7ac1-ff29-4389-9059-020c44f43aaa has ended -> executing endAction
2014-08-04 18:25:44,654 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler_Worker-75) CommandAsyncTask::endAction: Ending action for 1 tasks (command ID: 5c5a7ac1-ff29-4389-9059-020c44f43aaa): calling endAction .
2014-08-04 18:25:44,662 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-8-thread-23) CommandAsyncTask::EndCommandAction [within thread] context: Attempting to endAction AddDisk, executionIndex: 0
2014-08-04 18:25:44,672 ERROR [org.ovirt.engine.core.bll.AddDiskCommand] (org.ovirt.thread.pool-8-thread-23) [421c4c60] Ending command with failure: org.ovirt.engine.core.bll.AddDiskCommand

Comment 7 Ori Gofen 2014-08-04 15:48:24 UTC
Created attachment 923928 [details]
vdsm+engine correct logs

Comment 8 Vered Volansky 2014-08-06 10:04:49 UTC
After a discussion with Ori, it turns out the verification was too strict. This BZ's fix was intended as a preliminary solution for the previously non-existent allocation checks. I opened bz1127121 to handle the scenario Ori used.

The verification needed on this bug is as follows:
1. Create a VM and add disks to it.
2. Create a template of the VM.
3. Add a thinly-provisioned VM over that template USING REST.

Verify step 3 against a storage domain with and without enough space, according to the table in comment #0. With enough space the operation should succeed, and without it, it should fail with the appropriate CDA.
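
For reference, a hypothetical sketch of step 3 in plain Java (no SDK). The engine host, credentials, cluster and template names are placeholders, TLS trust setup is omitted, and the XML follows the 3.x-era form in which POSTing a <vm> that references a <template> creates a thinly provisioned VM.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class AddThinVmFromTemplate {
    public static void main(String[] args) throws Exception {
        String body =
                "<vm>"
              +   "<name>thin_vm_from_template</name>"
              +   "<cluster><name>Default</name></cluster>"
              +   "<template><name>my_template</name></template>"
              + "</vm>";

        // Placeholder engine URL and credentials; certificate handling is omitted.
        URL url = new URL("https://engine.example.com/ovirt-engine/api/vms");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/xml");
        conn.setRequestProperty("Authorization", "Basic "
                + Base64.getEncoder().encodeToString(
                        "admin@internal:password".getBytes(StandardCharsets.UTF_8)));
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }

        // With enough free space the request should be accepted (2xx);
        // without it, it should fail on the CanDoAction (CDA) validation.
        System.out.println(conn.getResponseCode());
    }
}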

Note this BZ was opened to track an issue specific to the REST API.

Changing back to MODIFIED.

Comment 9 Eyal Edri 2014-08-06 10:38:34 UTC
If no code change was done, it should go back to ON_QA and be re-verified.

Comment 10 Ori Gofen 2014-08-07 14:51:01 UTC
Vered, the bug fails QA on a file domain.

Comment 11 Gil Klein 2014-08-07 17:18:24 UTC
Moving back to assigned per Ori's comment #10

Comment 12 Allon Mureinik 2014-08-08 15:19:49 UTC
(In reply to Gil Klein from comment #11)
> Moving back to assigned per Ori's comment #10
Gil, I don't understand this decision.

As Vered explained in comment #8, Ori's verification was beyond the scope of this fix, and the issue he found will be tracked in a separate BZ (see bug 1127121 for details):

(In reply to Vered Volansky from comment #8)
> After a discussion with Ori, it turns out the verification was too strict.
> This bz fix was intended to give a preliminary solution to non-existing
> allocation checks. I Opened bz1127121 in order to handle the scenario used
> by Ori.

There's nothing else to do with /this/ BZ within its defined scope (unless further verification uncovers a different issue, of course).

Please advise.

Comment 13 Allon Mureinik 2014-08-10 15:50:32 UTC
(In reply to Allon Mureinik from comment #12)
> (In reply to Gil Klein from comment #11)
> > Moving back to assigned per Ori's comment #10
> Gil, I don't understand this decision.
> 
> As Vered explained in comment #8, Ori's verification was beyond the scope of
> this fix, and the issue he found will be tracked in a separate BZ (see bug
> 1127121 for details):

Completely missed this comment:
(In reply to Gil Klein from comment #11)
> Moving back to assigned per Ori's comment #10

Vered - we need to see what happens on a file SD.
Thanks!

Comment 14 Vered Volansky 2014-08-11 06:07:59 UTC
(In reply to Ori from comment #10)
> Vered,the bug fails qa on File domain

OK, please provide steps to reproduce, new logs, and a description of the issue using the validation expected for this bug, not the strict one.

Thanks.

Comment 15 Ori Gofen 2014-08-11 10:46:59 UTC
Steps to reproduce:
Setup: have a file domain with 7G of free space.

1. Create a VM + disk (3G, preallocated).
2. Create a template of the VM.
3. Add a thinly-provisioned VM over that template USING REST.

Results:
The operation is successful, because preallocated file disks (created from a template) have an actual size of 24K rather than the virtual size.

The expected result is to block this scenario with a CDA.

Supplying logs for these steps will take some time.
I'll attach them as soon as I can.

Comment 16 Vered Volansky 2014-08-11 11:52:06 UTC
After a discussion with Ori and with his agreement:
Comment #15 actually verifies this specific bug. Moving back to ON QA.

Comment 17 Ori Gofen 2014-08-11 13:44:00 UTC
(In reply to Vered Volansky from comment #16)
> After a discussion with Ori and with his agreement:
> Comment #15 actually verifies this specific bug. Moving back to ON QA.

Agreed. The problem with this bug is that the subject is too general and hence covers too many scenarios, in some of which the actual result is far from the expected one.

After a discussion with Vered, we agreed that the steps to verify this bug are only those mentioned in comment #8, against all relevant disk types.

All other flows that contain bugs, such as cloning a VM (on a file domain) or multi-process issues, will be opened as separate bugs.

Moving to VERIFIED on RC1.

Comment 18 Sandro Bonazzola 2014-10-17 12:19:57 UTC
oVirt 3.5 has been released and should include the fix for this issue.