1767549 – Run the preflight check of migration task before waiting for a conversion host

Summary: Run the preflight check of migration task before waiting for a conversion host

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat CloudForms Management Engine
Classification:	Red Hat
Component:	V2V
Sub Component:
Version:	5.10.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	GA
Target Release:	5.11.1
Assignee:	Fabien Dupont
QA Contact:	Ilanit Stein
Docs Contact:	Red Hat CloudForms Documentation
URL:
Whiteboard:
Depends On:	1726939
Blocks:

Reported:	2019-10-31 16:29 UTC by Satoe Imaishi
Modified:	2022-07-09 10:56 UTC
CC List:	6 users
Fixed In Version:	5.11.1.0
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	1726939
Environment:
Last Closed:	2019-12-13 00:35:36 UTC
Category:	---
Cloudforms Team:	V2V
Target Upstream Version:
Embargoed:
Flags:	pm-rhel: cfme-5.11.z+ simaishi: mirror+

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2019:4201	0	None	None	None	2019-12-13 00:35:46 UTC

Comment 2 CFME Bot 2019-10-31 21:35:41 UTC

New commit detected on ManageIQ/manageiq/ivanchuk:

https://github.com/ManageIQ/manageiq/commit/0afd8056370baa7bea37b27e4f3767b5ecae0b21
commit 0afd8056370baa7bea37b27e4f3767b5ecae0b21
Author:     Adam Grare <agrare>
AuthorDate: Wed Aug 14 10:58:43 2019 -0400
Commit:     Adam Grare <agrare>
CommitDate: Wed Aug 14 10:58:43 2019 -0400

    Merge pull request #19146 from fdupont-redhat/v2v_job_run_preflight_check_earlier

    Move preflight check before conversion host assignment

    (cherry picked from commit 6900df7b770e6bd7bd0b53cabd4b860fb159a37c)

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1767549

 app/models/infra_conversion_job.rb | 12 +-
 app/models/service_template_transformation_plan_task.rb | 4 +-
 lib/infra_conversion_throttler.rb | 19 +-
 spec/lib/infra_conversion_throttler_spec.rb | 7 +
 spec/models/infra_conversion_job_spec.rb | 8 +-
 spec/models/service_template_transformation_plan_task_spec.rb | 4 +-
 6 files changed, 27 insertions(+), 27 deletions(-)

Comment 3 Ilanit Stein 2019-12-09 13:31:01 UTC

Fabien,

In order to verify this bug the following actions were run on CFME-5.11.1.1:

1. When migrating 20 VMs with one conversion host configured for a maximum of 10 VMs:
First, the CFME UI showed the overall data size for 10 VMs, and after a few seconds
the overall data size was updated to the total size.

So now, unlike previous CFME versions, the overall total disk size is displayed from an early stage, which is good.
The only problem is that it is shown in 2 phases.
* Is this problem resolvable, please?

2. We tried to make a single-VM migration plan fail on the preflight check, as mentioned in this bug's description.
However, the preflight check did not fail as we expected it to. Our trials:
1. Remove the cluster via the Rails console before running the migration plan.
2. Try to migrate a VM whose VMware source storage differs from the one configured in the migration plan.
3. Try to migrate a VM without a network, although the mapping plan specifies a source network. (We later understood that this behavior is by design.)

For all 3, the Transformation Plan task status was OK, and not ERROR:
irb> vm = Vm.find_by(:name => 'v2v_migration_vm_0', :vendor => 'vmware')
irb> task = ServiceTemplateTransformationPlanTask.where(:source => vm).last
irb> task.status
=> "Ok"

The UI shows that the plan is successfully approved, and then it keeps trying to allocate a conversion host, endlessly.
Although there is a valid conversion host, it is not picked.
At first I thought this was bug 1778749. However, I am not sure, since the task status here is OK, while in bug 1778749 the status is 'Stalled'.
* Is this a known bug, or should we open a new one?

Note that the CFME UI does filter out VMs that do not match the Storage/Cluster/Network, so they cannot be picked for a migration plan.
The preflight check does not seem to verify that the VM's Storage/Cluster/Network match the mapping plan.
* Can you please confirm that this is a bug, as it seems?

Comment 4 Ilanit Stein 2019-12-09 15:32:34 UTC

Moving bug to VERIFIED, based on the above 20 VMs test.

The questions in comment #3 still need to be addressed.

Comment 5 Fabien Dupont 2019-12-10 10:06:48 UTC

@Mike, do you think point #1 in comment #3 can be solved? I agree that displaying it in 2 phases isn't ideal.

Comment 6 Fabien Dupont 2019-12-10 10:10:29 UTC

For point #2, some comments:

1. In test 1, did you remove the cluster or the mapping for the cluster?
2. In test 2, you mention the source storage being part of the migration plan. I guess that you mean the infrastructure mapping, right?
3. For test 3, this is normal.

I can run similar tests in our lab.

Comment 7 Fabien Dupont 2019-12-10 13:00:52 UTC

I have been able to make the preflight check fail:
  1. Create an infrastructure mapping with one cluster mapping
  2. Create a migration plan with the previously created infrastructure mapping
  3. Delete the cluster mapping item from Rails console
  4. Start the migration plan

To delete the cluster mapping item, I proceeded like this:

irb> mapping = TransformationMapping.find_by(:name => 'My Mapping')
irb> mapping.transformation_mapping_items.select { |i| i.source_type == 'EmsCluster' }.each { |i| i.destroy }

In log/evm.log, you can see the following error message:

[----] E, [2019-12-10T07:50:29.903842 #23938:1206f60] ERROR -- : Q-task_id([job_dispatcher]) /var/www/miq/vmdb/app/models/service_template_transformation_plan_task.rb:68:in `destination_cluster'

The problem then is that in the UI, the plan doesn't fail. This is due to the current implementation with Automate.
We expect that https://bugzilla.redhat.com/show_bug.cgi?id=1740510 will fix it. So, please recheck with the upstream build.

Comment 8 Mike Turley 2019-12-10 20:51:21 UTC

For this point above:

> 1. When migrating 20 VMs with one conversion host configured for a maximum of 10 VMs:
> First, the CFME UI showed the overall data size for 10 VMs, and after a few seconds
> the overall data size was updated to the total size.

@Ilanit, would it be possible to capture the list of tasks from the plan request (miq_request_tasks property of the ServiceTemplateTransformationPlanRequest) before and after the data size jumps?

@Fabien, the logic for displaying the total disk space is just based on the data in those miq_request_tasks. If we don't want to display the incorrect total, we'll need some way of inferring from that data that the sum is still being prepared and shouldn't be displayed yet. If you can help me identify those criteria, I can maybe just have it display "Calculating..." or something there. Otherwise, we would just need to make sure the totals in there are always accurate.

The code just sums the values of miq_request_tasks[].virtv2v_disks[].size across all disks of all tasks in the request: https://github.com/ManageIQ/manageiq-v2v/blob/master/app/javascript/react/screens/App/Overview/components/Migrations/helpers/inProgressHelpers.js#L91-L106
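
For illustration only, the summation described above can be sketched in Ruby over plain hashes (the linked UI helper itself is JavaScript). The `total_disk_space` name and the sample data are assumptions made for this sketch, not taken from manageiq-v2v; treating a task without `virtv2v_disks` as contributing zero is also an assumption, chosen to show how a partial total could appear.

```ruby
# Hypothetical helper: sums virtv2v_disks[].size across all tasks.
# Tasks are plain hashes standing in for miq_request_tasks entries.
def total_disk_space(tasks)
  tasks.sum do |task|
    # Assumption: a task without virtv2v_disks contributes nothing yet.
    (task[:virtv2v_disks] || []).sum { |disk| disk[:size].to_i }
  end
end

# Made-up sample data: two tasks with disks, one without.
tasks = [
  { :virtv2v_disks => [{ :size => 10 }, { :size => 20 }] },
  { :virtv2v_disks => [{ :size => 5 }] },
  {}
]

puts total_disk_space(tasks)
```

With this sample, the third task adds nothing to the sum, so the displayed total would grow once its disks are filled in.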

Comment 9 Fabien Dupont 2019-12-11 08:21:36 UTC

I think I understand what happens.
When we create the ServiceTemplateTransformationPlanRequest, it creates the tasks for the virtual machines that have been successfully migrated.
Once the tasks have been created, it creates the InfraConversionJob's that handle the workflow: https://github.com/ManageIQ/manageiq/blob/master/app/models/service_template_transformation_plan_request.rb#L49.
The miq_request_task.virtv2v_disks array is populated during the preflight check, which happens when the throttler tries to allocate a conversion host. It takes a few seconds to happen and the tasks are treated serially, so if you look at the tasks before they have all been throttled at least once, you will have an incomplete data set.

Even if we move the preflight check earlier in the process, it might still take some time to process, and you will still see the total appear in several phases.
IMO, the UI should check that all the tasks have a virtv2v_disks attribute and display "Calculating..." in the meantime.
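
A minimal sketch of that check, written in Ruby over plain hashes rather than in the UI's JavaScript. The `disk_space_label` name, the hash layout, and the sample data are assumptions for illustration only.

```ruby
# Hypothetical helper: returns "Calculating..." until every task carries
# a virtv2v_disks array, then returns the summed disk size as a string.
def disk_space_label(tasks)
  ready = tasks.all? { |task| task[:virtv2v_disks].is_a?(Array) }
  return "Calculating..." unless ready

  tasks.sum { |task| task[:virtv2v_disks].sum { |disk| disk[:size].to_i } }.to_s
end

# Made-up sample data: the second task has not been through preflight yet.
partial  = [{ :virtv2v_disks => [{ :size => 10 }] }, {}]
complete = [{ :virtv2v_disks => [{ :size => 10 }] }, { :virtv2v_disks => [{ :size => 5 }] }]

puts disk_space_label(partial)
puts disk_space_label(complete)
```

The total is only shown once no task is missing its disk list, so a partial sum is never displayed.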

@Ilanit, it deserves a BZ to track that. For me, the severity is medium.

Comment 10 Ilanit Stein 2019-12-11 11:12:49 UTC

In reply to comment #9, opened 
Bug 1782184 - [v2v][UI] Migration plan Datastores total size is displayed in phases.

Mike,
Marking the need info as +, as I understand there is no need for my input on this.
Please let me know if I misunderstood.

Comment 11 Mike Turley 2019-12-11 15:29:39 UTC

@Ilanit, that's correct, thank you!

@Fabien, I'll ask a followup question in the new BZ.

Comment 13 errata-xmlrpc 2019-12-13 00:35:36 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:4201

Comment 14 Maayan Hadasi 2020-01-07 13:17:37 UTC

(In reply to Fabien Dupont from comment #7)
> I have been able to make the preflight check fail:
>   1. Create an infrastructure mapping with one cluster mapping
>   2. Create a migration plan with the previously created infrastructure mapping
>   3. Delete the cluster mapping item from Rails console
>   4. Start the migration plan
> 
> To delete the cluster mapping item, I proceeded like this:
> 
> irb> mapping = TransformationMapping.find_by(:name => 'My Mapping')
> irb> mapping.transformation_mapping_items.select { |i| i.source_type == 'EmsCluster' }.each { |i| i.destroy }
> 
> In log/evm.log, you can see the following error message:
> 
> [----] E, [2019-12-10T07:50:29.903842 #23938:1206f60] ERROR -- : Q-task_id([job_dispatcher]) /var/www/miq/vmdb/app/models/service_template_transformation_plan_task.rb:68:in `destination_cluster'
> 
> The problem then is that in the UI, the plan doesn't fail. This is due to the current implementation with Automate.
> We expect that https://bugzilla.redhat.com/show_bug.cgi?id=1740510 will fix it. So, please recheck with the upstream build.


The preflight check was tested with a ManageIQ nightly (CFME upstream) build, version: master.20191217224046_ebd4ebf
The test scenario was as described above, but the preflight check didn't fail as expected.
Opened:
bug 1788505
bug 1788523
