Bug 1302780

Summary: Can't clone vm from template as thin copy to an imported domain with a copy of the template disk
Product: [oVirt] ovirt-engine Reporter: Raz Tamir <ratamir>
Component: BLL.Storage Assignee: Maor <mlipchuk>
Status: CLOSED CURRENTRELEASE QA Contact: Raz Tamir <ratamir>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.6.2.6 CC: amureini, bugs, eshenitz, ratamir, sbonazzo, tjelinek, tnisan, ylavi
Target Milestone: ovirt-4.0.1 Keywords: Automation, AutomationBlocker
Target Release: 4.0.1.1 Flags: rule-engine: ovirt-4.0.z+
rule-engine: planning_ack+
rule-engine: devel_ack+
acanan: testing_ack+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: When a Template with a copied disk was imported before the second Storage Domain holding that copy was imported, the Template disk was recorded as residing on only one Storage Domain, even after the second Storage Domain was later imported to the DC.
Consequence: Copying the template disk to SD1, which already held an unregistered copy of that template disk, was impossible.
Fix: When a disk copy is requested and an unregistered copy of the disk already exists on the target Storage Domain, the copy operation is not issued; instead, a DB-only operation adds a new mapping to the unregistered disk.
Result: In a recovery environment where the user imported the Template before all of the Storage Domains holding copies of its disks were imported, the user can import those Storage Domains and copy the template disk again. The operation is performed only in the DB and no additional copy is created.
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-07-19 06:26:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1358291    
Attachments:
Description             Flags
engine and vdsm logs    none
new logs                none

Description Raz Tamir 2016-01-28 15:39:04 UTC
Created attachment 1119183 [details]
engine and vdsm logs

Description of problem:
The basic flow has a few issues:
1) It is impossible to copy a template disk to a storage domain after importing that domain, if the domain used to contain a copy of the template disk.
2) It is impossible to select the imported storage domain as a target domain for the cloned vm disk as thin copy because the target SD is not listed as containing a copy of the template disk.
1 + 2 = $Summary

From engine.log:

2016-01-28 16:36:28,937 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-48) [] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM host_mixed_3 command failed: Cannot create Logical Volume
2016-01-28 16:36:28,937 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-48) [] SPMAsyncTask::PollTask: Polling task 'c9bb35ff-5189-4ef3-9e77-4be9819d4d38' (Parent Command 'MoveOrCopyDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'.
2016-01-28 16:36:28,944 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler_Worker-48) [] BaseAsyncTask::logEndTaskFailure: Task 'c9bb35ff-5189-4ef3-9e77-4be9819d4d38' (Parent Command 'MoveOrCopyDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot create Logical Volume, code = 550',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot create Logical Volume, code = 550'


From vdsm.log:

c9bb35ff-5189-4ef3-9e77-4be9819d4d38::ERROR::2016-01-28 16:36:22,633::volume::479::Storage.Volume::(create) Failed to create volume /rhev/data-center/d2144d19-e41b-4fad-b904-1ad871b2b4b9/65133c8b-076d-441c-8483-34c2e7a70c26/images/f0969223-959a-4b87-a12d-810ed49c5fcb/80f2767a-30c6-4d55-ab31-32754cf2ab95: Cannot create Logical Volume: ('65133c8b-076d-441c-8483-34c2e7a70c26', u'80f2767a-30c6-4d55-ab31-32754cf2ab95')
c9bb35ff-5189-4ef3-9e77-4be9819d4d38::ERROR::2016-01-28 16:36:22,633::volume::515::Storage.Volume::(create) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/volume.py", line 476, in create
    initialSize=initialSize)
  File "/usr/share/vdsm/storage/blockVolume.py", line 133, in _create
    initialTag=TAG_VOL_UNINIT)
  File "/usr/share/vdsm/storage/lvm.py", line 1096, in createLV
    raise se.CannotCreateLogicalVolume(vgName, lvName)
CannotCreateLogicalVolume: Cannot create Logical Volume: ('65133c8b-076d-441c-8483-34c2e7a70c26', u'80f2767a-30c6-4d55-ab31-32754cf2ab95')
c9bb35ff-5189-4ef3-9e77-4be9819d4d38::ERROR::2016-01-28 16:36:22,634::image::844::Storage.Image::(copyCollapsed) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 829, in copyCollapsed
    srcVolUUID=volume.BLANK_UUID)
  File "/usr/share/vdsm/storage/sd.py", line 488, in createVolume
    initialSize=initialSize)
  File "/usr/share/vdsm/storage/volume.py", line 476, in create
    initialSize=initialSize)
  File "/usr/share/vdsm/storage/blockVolume.py", line 133, in _create
    initialTag=TAG_VOL_UNINIT)
  File "/usr/share/vdsm/storage/lvm.py", line 1096, in createLV
    raise se.CannotCreateLogicalVolume(vgName, lvName)
CannotCreateLogicalVolume: Cannot create Logical Volume: ('65133c8b-076d-441c-8483-34c2e7a70c26', u'80f2767a-30c6-4d55-ab31-32754cf2ab95')



Version-Release number of selected component (if applicable):
rhevm-3.6.2.6-0.1.el6.noarch
vdsm-4.17.18-0.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Copy a template disk to SD1 (make sure you can create a vm from the template as thin copy on SD1)
2. remove SD1 without formatting it
3. Import SD1 back to the environment
4. Try to create vm from the template as thin copy on SD1 - IMPOSSIBLE
5. Try to copy the template disk to SD1 to be able to perform (4) - IMPOSSIBLE

Actual results:


Expected results:


Additional info:

Comment 1 Maor 2016-01-31 21:03:55 UTC
This is a known issue which is described at https://bugzilla.redhat.com/1108904:

  * Register the remaining disks to the existing VM in the setup
  ....
  * Register a template with part of the storage domains.

Basically the Disaster Recovery process should register entities only after all the existing Storage Domains are active and valid in the setup (see [1]).
The reason is that we assume that, after a setup is destroyed, the admin will first activate all of its Storage Domains and only then try to register the rest of the VMs/Templates. Otherwise one might import a partial VM, use it, and then the remaining disks, which were not yet registered, would be out of sync with the VM's snapshots.

Can you please describe the automation test steps?
Can the automation test register the Template only after all the relevant Storage Domains are imported to the engine?

Reducing this to medium since this is a known issue.

[1] http://www.ovirt.org/Features/ImportStorageDomain#Restrictions:
  "Currently all the Storage Domains which are related to the VMs/Templates disks must exist and be active in the Data Center once the entity get registred."

Comment 2 Raz Tamir 2016-01-31 22:08:47 UTC
Hi Maor,
I think you didn't understand me, so I'll try to be more clear following the steps to reproduce:
I'm not trying to register anything, I just want to be able to clone vm from template as thin copy to the imported SD.

1) To be able to clone a vm from template as thin, I first need to make sure the template disk is located in the target SD (the domain I want to clone the vm to).
After copying the template's disk to the target domain, I can do the thin clone (the target SD will now appear in the target domain drop down list).

2) At that point, if I remove this domain, the target domain, without formatting it, I expect that the template's disk copy will still be there after I import this domain back.

3) So after importing the domain, I'm trying again to:

4) Clone a vm from template as thin but this time I can't select the same target domain I just imported to the environment.

5) So I'm trying, again, to copy the template's disk to this domain but now I fail on the error:
error = Cannot create Logical Volume, code = 550
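
As a rough illustration of steps 1) to 4), the sketch below models why the re-imported domain is no longer offered as a thin-clone target. It is a toy model, not engine code: `TemplateDisk`, `registered_domains` and `thin_clone_targets` are hypothetical names, and the assumption is that a domain is only offered when the engine DB maps a copy of the template disk to it.

```python
from dataclasses import dataclass, field


@dataclass
class TemplateDisk:
    disk_id: str
    # Storage domains that the (modelled) engine DB maps this disk to.
    registered_domains: set = field(default_factory=set)


def thin_clone_targets(disk, active_domains):
    """Return the active domains that are offered as thin-clone targets."""
    return sorted(d for d in active_domains if d in disk.registered_domains)


disk = TemplateDisk("tmpl-disk", {"SD0", "SD1"})
before = thin_clone_targets(disk, {"SD0", "SD1"})  # SD1 is offered

# Removing SD1 without formatting drops the DB mapping; the volume itself
# stays on storage. Re-importing SD1 does not restore the mapping.
disk.registered_domains.discard("SD1")
after = thin_clone_targets(disk, {"SD0", "SD1"})  # SD1 is missing
```

In this model the copy in step 5) then fails because the volume is still physically present on SD1, which matches the "Cannot create Logical Volume" error above.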

* Please raise the severity again to urgent in case it is not a known issue

Comment 3 Maor 2016-02-01 15:58:45 UTC
That is basically what I meant: import of a Storage Domain is mainly used for the recovery flow; partial use (in this example, importing a Storage Domain which has a copied volume of a template disk that already exists in the setup) is not yet supported.
It is part of 
  "* Register the remaining disks to the existing VM in the setup"

Can you please describe what the automation test is testing for?
Maybe the test should be changed (at least until the RFE is resolved).

Comment 4 Raz Tamir 2016-02-01 16:25:25 UTC
Hi Maor,
The test flow is what is described in comment #2.

Comment 5 Maor 2016-04-13 13:56:28 UTC
The solution I suggest is that when a copy is attempted, the engine will show a message saying that there is already a copied disk on this Storage Domain and that the user should register it directly from the Import Disks sub tab (see https://bugzilla.redhat.com/1138139)

Comment 6 Maor 2016-04-14 13:14:17 UTC
Eventually the solution will be much simpler:
When a disk copy is requested, if an unregistered copy of the disk already exists in the Storage Domain, the copy operation will not be issued; only a DB operation will add a new mapping to the unregistered disk.
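
The decision described here can be sketched roughly as follows. This is a minimal model under stated assumptions, not the engine implementation: `copy_template_disk`, `db_mappings` and `unregistered` are hypothetical names, and both the DB and the unregistered-disk list are reduced to in-memory sets of (disk ID, storage domain ID) pairs.

```python
def copy_template_disk(disk_id, target_sd, db_mappings, unregistered):
    """Hypothetical model of a copy request after the fix.

    db_mappings  -- set of (disk_id, sd_id) pairs the engine DB knows about
    unregistered -- set of (disk_id, sd_id) pairs found on imported domains
                    but not yet mapped in the DB
    Returns a string naming the action taken.
    """
    key = (disk_id, target_sd)
    if key in db_mappings:
        return "already mapped"
    if key in unregistered:
        # The volume is already on storage: skip the copy, update the DB only.
        unregistered.remove(key)
        db_mappings.add(key)
        return "db mapping only"
    # No copy exists on the target: a real storage copy would be issued here.
    db_mappings.add(key)
    return "storage copy"


db = {("tmpl-disk", "SD0")}
found_on_import = {("tmpl-disk", "SD1")}
first = copy_template_disk("tmpl-disk", "SD1", db, found_on_import)
second = copy_template_disk("tmpl-disk", "SD2", db, found_on_import)
```

Here `first` takes the DB-only path because SD1 already holds an unregistered copy, while `second` falls through to a regular copy because SD2 holds none.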

Comment 7 Raz Tamir 2016-06-15 07:32:14 UTC
Verified on ovirt-engine-4.0.0.4-0.1.el7ev.noarch using the steps to reproduce.
- Copy the template's disk to the imported domain - passed
- Cloning a vm as thin copy to the imported domain - passed

Comment 8 Raz Tamir 2016-06-16 15:26:32 UTC
Seems like this is still happening.
New logs attached

Comment 9 Red Hat Bugzilla Rules Engine 2016-06-16 15:26:42 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 10 Raz Tamir 2016-06-16 15:27:17 UTC
Created attachment 1168765 [details]
new logs

Comment 11 Maor 2016-06-18 20:08:35 UTC
Hi,

Your vdsm log does not include the error which engine reports at 
2016-06-16 18:24:34,071 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler6) [39c1ad6e] BaseAsyncTask::logEndTaskFailure: Task '0a4b9ca0-1230-4b0f-8066-55ec114ef475' (Parent Command 'MoveOrCopyDisk', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot create Logical Volume, code = 550',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Cannot create Logical Volume, code = 550'

Can you please attach it so I can analyze it?

Comment 12 Maor 2016-06-19 05:49:19 UTC
(In reply to Maor from comment #11)
> Hi,
> 
> Your vdsm log does not include the error which engine reports at 
> 2016-06-16 18:24:34,071 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask]
> (DefaultQuartzScheduler6) [39c1ad6e] BaseAsyncTask::logEndTaskFailure: Task
> '0a4b9ca0-1230-4b0f-8066-55ec114ef475' (Parent Command 'MoveOrCopyDisk',
> Parameters Type
> 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with
> failure:
> -- Result: 'cleanSuccess'
> -- Message: 'VDSGenericException: VDSErrorException: Failed to
> HSMGetAllTasksStatusesVDS, error = Cannot create Logical Volume, code = 550',
> -- Exception: 'VDSGenericException: VDSErrorException: Failed to
> HSMGetAllTasksStatusesVDS, error = Cannot create Logical Volume, code = 550'
> 
> Can you please attach it so I can analyze it

Also, what were the steps to reproduce? I can see that there was a copy operation from storageDomainId '0f0215c9-478e-4749-8a25-742f6ae22d1b' to '21eca906-6332-40b9-990e-499bc54df353' but I can't seem to find their attach operation in the engine log.

Comment 13 Maor 2016-06-19 07:56:37 UTC
Looks like there is a bug when a template disk has more than one unregistered copy.
As part of the unregistered disks flow, the second attach of the storage domain will keep only one copy of the unregistered disk.
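
One way such a loss can happen is sketched below. The mechanism is purely an assumption for illustration (the comment does not say how the engine stores unregistered disks): if records were keyed by disk ID alone, attaching a second domain with a copy of the same disk would replace the first record. `attach_lossy` and `attach_per_domain` are hypothetical helpers.

```python
def attach_lossy(records, disk_id, sd_id):
    # Keyed by disk ID alone: a later attach replaces the earlier entry.
    records[disk_id] = sd_id


def attach_per_domain(records, disk_id, sd_id):
    # Keyed by (disk ID, domain ID): each copy keeps its own entry.
    records[(disk_id, sd_id)] = True


lossy, per_domain = {}, {}
for sd in ("SD1", "SD2"):
    attach_lossy(lossy, "tmpl-disk", sd)
    attach_per_domain(per_domain, "tmpl-disk", sd)
```

After both attaches, `lossy` only remembers the copy on SD2, while `per_domain` keeps one entry per storage domain.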

Comment 14 Raz Tamir 2016-07-03 11:59:52 UTC
Verified on rhevm-4.0.2-0.2.rc1.el7ev.noarch

Comment 15 Maor 2016-07-04 08:20:59 UTC
It appears that the oVirt automation moved the bug to MODIFIED although not all the patches were merged downstream.
I opened a ticket about it at : https://projects.engineering.redhat.com/browse/BUGZILLA-591

Comment 16 Maor 2016-07-04 09:12:51 UTC
(In reply to Maor from comment #15)
> It appears that automation ovirt moved the bug to modify although not all
> the patches were merged downstream.
> I opened a ticket about it at :
> https://projects.engineering.redhat.com/browse/BUGZILLA-591

Changing the bug to MODIFIED so it can be re-verified with change 60125

Comment 17 Raz Tamir 2016-07-17 18:00:57 UTC
Verified on rhevm-4.0.1.1-0.1.el7ev.noarch

Comment 18 Sandro Bonazzola 2016-07-19 06:26:40 UTC
Since the problem described in this bug report should be
resolved in oVirt 4.0.1 released on July 19th 2016, it has been closed with a
resolution of CURRENT RELEASE.

For information on the release, and how to update to this release, follow the link below.

If the solution does not work for you, open a new bug report.

http://www.ovirt.org/release/4.0.1/

Comment 19 Maor 2016-11-15 00:39:49 UTC
*** Bug 1358171 has been marked as a duplicate of this bug. ***