Bug 1600152 - [v2v] Migration balance is not working as expected
Summary: [v2v] Migration balance is not working as expected
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: UI - OPS
Version: 5.9.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: GA
Target Release: 5.10.0
Assignee: Fabien Dupont
QA Contact: Kedar Kulkarni
URL:
Whiteboard: v2v
Duplicates: 1601433
Depends On:
Blocks: 1564236 1610054 1613048
Reported: 2018-07-11 14:27 UTC by Mor
Modified: 2019-02-11 14:04 UTC (History)

Fixed In Version: 5.10.0.8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1610054
Environment:
Last Closed: 2019-02-11 14:04:46 UTC
Category: ---
Cloudforms Team: CFME Core
Target Upstream Version:
Embargoed:


Attachments
evm and automation logs (8.27 MB, application/zip)
2018-07-11 14:35 UTC, Mor
pending task error (61.33 KB, image/png)
2018-07-15 09:20 UTC, Mor


Links
System ID Private Priority Status Summary Last Updated
Github ManageIQ manageiq-content pull 358 0 None None None 2018-07-11 16:40:54 UTC

Description Mor 2018-07-11 14:27:02 UTC
Description of problem:
When running 4 migration plans, each with 10 VMs with 10 GB disks, on a total of 4 tagged migration hosts to migrate VMs from VMware to RHV, CFME sometimes selects one host for migration and sometimes two. We expected CFME to assign 10 VMs to each host.

Version-Release number of selected component (if applicable):
CFME 5.9.3.4.20180702181921_afd03d7

How reproducible:
100%

Steps to Reproduce:
1. Create 4 plans of 10 VMs on CFME environment with 4 conversion hosts.

Actual results:
Migrations are not balanced: CFME uses only one or two of the four conversion hosts.

Expected results:
Conversions should be balanced: 10 VMs per conversion host.

Additional info:

Comment 2 Mor 2018-07-11 14:35:14 UTC
Created attachment 1458131 [details]
evm and automation logs

Comment 4 Mor 2018-07-15 09:19:47 UTC
After applying the fix, I'm unable to run migration plans because of a new error about a pending task (see the attached screenshot).

Comment 5 Mor 2018-07-15 09:20:11 UTC
Created attachment 1458977 [details]
pending task error

Comment 6 Brett Thurber 2018-07-17 03:05:11 UTC
*** Bug 1601433 has been marked as a duplicate of this bug. ***

Comment 7 Brett Thurber 2018-07-17 03:06:36 UTC
Duplicate BZ:  https://bugzilla.redhat.com/show_bug.cgi?id=1601433

Comment 8 mlehrer 2018-07-17 11:07:54 UTC
Mor ran with Fabien's patch, which resulted in an endless progress status (see below). The issue is not currently resolved.

I tried to run it again, but now I get a weird error, which leaves the tasks stuck in an endless progress status:
"Error In Request: 75. Setting Pending Task: 534 To Finished"

This is what I found in the log:

[----] E, [2018-07-12T22:16:07.643366 #7703:85de064] ERROR -- : <AEMethod vmtransform_vmwarews2rhevm_vddk> The following error occurred during method evaluation:
[----] E, [2018-07-12T22:16:07.645408 #7703:85de064] ERROR -- : <AEMethod vmtransform_vmwarews2rhevm_vddk>   ArgumentError: wrong number of arguments (given 1, expected 2)
[----] E, [2018-07-12T22:16:07.649094 #7703:85de064] ERROR -- : <AEMethod vmtransform_vmwarews2rhevm_vddk>   (druby://127.0.0.1:39173) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-ec1e41d2b579/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service.rb:69:in `log'
(druby://127.0.0.1:39173) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1624:in `perform_without_block'
(druby://127.0.0.1:39173) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1584:in `perform'
(druby://127.0.0.1:39173) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1657:in `block (2 levels) in main_loop'
(druby://127.0.0.1:39173) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1653:in `loop'
(druby://127.0.0.1:39173) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1653:in `block in main_loop'
/ManageIQ/Transformation/TransformationHosts/ovirt_host/VMTransform_vmwarews2rhevm_vddk:37:in `main'
[----] E, [2018-07-12T22:16:07.663541 #7703:85de064] ERROR -- : Method STDERR: (druby://127.0.0.1:39173) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-ec1e41d2b579/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service.rb:69:in `log': wrong number of arguments (given 1, expected 2) (ArgumentError)
[----] E, [2018-07-12T22:16:07.665071 #7703:85de064] ERROR -- : Method STDERR:  from (druby://127.0.0.1:39173) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1624:in `perform_without_block'
[----] E, [2018-07-12T22:16:07.666618 #7703:85de064] ERROR -- : Method STDERR:  from (druby://127.0.0.1:39173) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1584:in `perform'
[----] E, [2018-07-12T22:16:07.668090 #7703:85de064] ERROR -- : Method STDERR:  from (druby://127.0.0.1:39173) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1657:in `block (2 levels) in main_loop'
[----] E, [2018-07-12T22:16:07.669620 #7703:85de064] ERROR -- : Method STDERR:  from (druby://127.0.0.1:39173) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1653:in `loop'
[----] E, [2018-07-12T22:16:07.671107 #7703:85de064] ERROR -- : Method STDERR:  from (druby://127.0.0.1:39173) /opt/rh/rh-ruby23/root/usr/share/ruby/drb/drb.rb:1653:in `block in main_loop'
[----] E, [2018-07-12T22:16:07.672934 #7703:85de064] ERROR -- : Method STDERR:  from /ManageIQ/Transformation/TransformationHosts/ovirt_host/VMTransform_vmwarews2rhevm_vddk:37:in `main'
[----] E, [2018-07-12T22:16:07.674597 #7703:85de064] ERROR -- : Method STDERR:  from /ManageIQ/Transformation/TransformationHosts/ovirt_host/VMTransform_vmwarews2rhevm_vddk:103:in `<main>'

Comment 9 mlehrer 2018-07-17 12:34:08 UTC
Summarizing additional issues related to this BZ, as advised in BZ 1601433:

- The concurrent migration rate limit is currently not set; the fix for this BZ sets it to 10 concurrent migrations.

- Round robin over all hosts should be the default behavior, rather than queuing up to the max concurrent limit on one host before using additional hosts (originally https://bugzilla.redhat.com/show_bug.cgi?id=1601433).

- Whether using a single plan or multiple plans, the behavior of 'balance' should be clearly defined and understood by the user; the current intended behavior of balance doesn't lead to 'balanced' migrations across all hosts.
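
To illustrate the difference between the two strategies named in the second point, here is a minimal Ruby sketch. It is not taken from manageiq-content: the method names `fill_first` and `round_robin` and the constant `MAX_PER_HOST` are hypothetical, and only the limit of 10 comes from this comment.

```ruby
# Hypothetical sketch: neither method is taken from manageiq-content.
MAX_PER_HOST = 10 # concurrent migration limit mentioned in this comment

# Fill one host up to the limit before touching the next one.
def fill_first(vm_count, host_count, limit = MAX_PER_HOST)
  loads = Array.new(host_count, 0)
  vm_count.times do
    idx = loads.index { |n| n < limit }
    break if idx.nil? # every host is full; the remaining VMs would wait
    loads[idx] += 1
  end
  loads
end

# Cycle over all hosts, skipping any host that is already at the limit.
def round_robin(vm_count, host_count, limit = MAX_PER_HOST)
  loads = Array.new(host_count, 0)
  cursor = 0
  vm_count.times do
    tried = 0
    while tried < host_count && loads[cursor] >= limit
      cursor = (cursor + 1) % host_count
      tried += 1
    end
    break if tried == host_count # every host is full
    loads[cursor] += 1
    cursor = (cursor + 1) % host_count
  end
  loads
end

p fill_first(12, 4)  # => [10, 2, 0, 0]
p round_robin(12, 4) # => [3, 3, 3, 3]
```

With 12 VMs and 4 hosts, filling one host first leaves two hosts idle, while round robin spreads the load evenly. Both strategies converge to the same result once every host is at the limit.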

Comment 10 Fabien Dupont 2018-07-18 17:00:06 UTC
The error pointed out by Mor ("[----] E, [2018-07-12T22:16:07.674597 #7703:85de064] ERROR -- : Method STDERR:  from /ManageIQ/Transformation/TransformationHosts/ovirt_host/VMTransform_vmwarews2rhevm_vddk:103:in `<main>'") has been fixed in both CloudForms and ManageIQ appliances. It was due to a typo in a @handle.log call (missing ':info' as an argument).
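
The kind of typo described here can be reproduced with a small stand-in object. This is only a sketch: `FakeHandle` and the message text are hypothetical, and the only detail taken from this bug is that `log` expects two arguments (a level such as `:info`, then the message).

```ruby
# Hypothetical stand-in for the automate service object; only the arity of
# `log` (level, message) is taken from the ArgumentError in the trace.
class FakeHandle
  attr_reader :entries

  def initialize
    @entries = []
  end

  # Records the level and the message instead of writing to a real log.
  def log(level, msg)
    @entries << [level, msg]
  end
end

handle = FakeHandle.new

begin
  handle.log("Starting conversion") # level omitted, as in the typo
rescue ArgumentError => e
  puts e.message # wrong number of arguments (given 1, expected 2)
end

handle.log(:info, "Starting conversion") # corrected call with the level
```

Because the failure is an arity error, it only shows up when the line is actually executed, which is consistent with it surfacing during a migration run rather than when the method was saved.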

Comment 11 Brett Thurber 2018-07-18 18:05:53 UTC
RFE BZ for this feature:  1594196

Comment 12 Brett Thurber 2018-07-23 15:49:07 UTC
Please provide the latest testing status after the last fix for the automate typo.

Comment 15 Daniel Gur 2018-07-26 13:33:38 UTC
When do we expect a build with the fix?

Comment 16 Brett Thurber 2018-07-26 14:46:06 UTC
@Daniel, PR is merged and should be in the next 5.9.4 build.  Moving to POST.

https://github.com/ManageIQ/manageiq-content/pull/358

Comment 17 Fabien Dupont 2018-07-28 22:05:06 UTC
The endless loop is caused by a wrong nil-to-integer comparison.
Fixed in https://github.com/ManageIQ/manageiq-content/pull/379
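
The thread does not show the code that was changed, so the following Ruby sketch only illustrates the general class of problem, not the actual fix. The helper `slot_available?`, the host names, and the counts are all hypothetical; the assumption is that a host with no recorded conversions can yield nil rather than 0.

```ruby
# Hypothetical helper, not the code changed in the pull request.
# A host with no recorded conversions may report nil instead of 0.
def slot_available?(running, max_per_host)
  running.to_i < max_per_host # nil.to_i is 0, so nil counts as an idle host
end

counts = { "host1" => 10, "host2" => nil } # illustrative data

begin
  counts["host2"] < 10
rescue NoMethodError
  puts "comparing nil with an integer raises instead of returning true or false"
end

puts slot_available?(counts["host1"], 10) # false
puts slot_available?(counts["host2"], 10) # true
```

Coercing with `to_i` is one common way to make such a comparison safe; whether the real fix used this or an explicit nil check is not stated in this bug.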

Comment 18 Daniel Gur 2018-07-29 14:43:21 UTC
Hello @Fabien,
You wrote that you had set the maximum number of parallel migrations on one host to 10.
My questions:
1. What should happen with a plan that has 11 VMs if I have only one host? Should the 11th VM wait until one of the 10 VMs finishes migrating and then migrate, or should the 11th VM fail migration?

2. What should happen with a plan that has 11 VMs if I have 2 hosts?
Should all 10 go to the first host and the remaining one to the second host, or should the division be more balanced, something like 5 to one host and 6 to the second?

Comment 19 Fabien Dupont 2018-07-29 15:10:49 UTC
Hi @Daniel,

My answers:
1 - If you have only 1 host and 11 VMs in your plan, the 11th VM will wait until 1 of the 10 running migrations is finished.

2 - If you have 2 hosts and 11 VMs in your plan, you should have 6 migrations on host #1 and 5 migrations on host #2.
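
The two answers above can be checked with a small simulation. This is a sketch under stated assumptions, not the automate code: `distribute` and `MAX_RUNNING_PER_HOST` are hypothetical names, each VM is assumed to go to the currently least-loaded host, and a VM waits when every host is at the limit of 10.

```ruby
# Hypothetical sketch of the behaviour described in the answers above.
MAX_RUNNING_PER_HOST = 10

# Assigns each VM to the least-loaded host; a VM waits when all hosts are full.
def distribute(vm_count, host_names, limit = MAX_RUNNING_PER_HOST)
  running = host_names.map { |name| [name, 0] }.to_h
  waiting = 0
  vm_count.times do
    name, count = running.min_by { |_, n| n } # first host wins a tie
    if count.nil? || count >= limit
      waiting += 1 # no host has a free slot
    else
      running[name] += 1
    end
  end
  { running: running, waiting: waiting }
end

p distribute(11, ["host1"])          # 10 running on host1, 1 waiting
p distribute(11, ["host1", "host2"]) # 6 on host1, 5 on host2, none waiting
```

The simulation assigns all VMs at one instant and does not model migrations finishing, so the waiting count only says how many VMs would have to queue at the start.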

Comment 20 Daniel Gur 2018-07-30 11:53:15 UTC
Thank you for the detailed answers @Fabien - they are very helpful!
Is this fix already part of the CFME 5.9.4.1 drop?
Should it move to "On-QA"?

Comment 21 Fabien Dupont 2018-07-30 12:54:09 UTC
No, it's not in 5.9.4.1. If everything goes well, I am hopeful it will be in 5.9.4.2. So, it's a bit early to move it to ON_QA.

Comment 23 CFME Bot 2018-07-31 01:01:29 UTC
New commit detected on ManageIQ/manageiq-content/gaprindashvili:

https://github.com/ManageIQ/manageiq-content/commit/80623ba43ec9a1c78323f4649861297a1593ba95
commit 80623ba43ec9a1c78323f4649861297a1593ba95
Author:     Greg McCullough <gmccullo>
AuthorDate: Tue Jul 17 11:58:31 2018 -0400
Commit:     Greg McCullough <gmccullo>
CommitDate: Tue Jul 17 11:58:31 2018 -0400

    Merge pull request #358 from fdupont-redhat/v2v_fix_transformation_host_acquisition

    Fix computation of currently running conversions by host
    (cherry picked from commit 5ec8138d45278d3cda4ef9c95b244b8ac30ec9fb)

    https://bugzilla.redhat.com/show_bug.cgi?id=1600152

 content/automate/ManageIQ/Transformation/TransformationHosts/Common.class/__methods__/utils.rb | 4 +-
 content/automate/ManageIQ/Transformation/TransformationHosts/ovirt_host.class/__methods__/vmtransform_vmwarews2rhevm_vddk.rb | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

Comment 24 Fabien Dupont 2018-08-20 06:18:32 UTC
This PR was wrongly linked and belongs to this BZ:
https://github.com/ManageIQ/manageiq-content/pull/395

Comment 25 Kedar Kulkarni 2018-08-23 19:36:28 UTC
With CFME 5.10.0.12, migrating multiple VMs distributes them over the available conversion hosts.

