Bug 1478518 - CFME reports VM migration passed when it fails on RHV side
Status: ON_DEV
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: GA
Target Release: 5.8.3
Assigned To: Piotr Kliczewski
QA Contact: Ilanit Stein
Keywords: ZStream
Depends On: 1448023
Reported: 2017-08-04 14:08 EDT by Satoe Imaishi
Modified: 2017-10-24 02:51 EDT
CC List: 11 users

Clone Of: 1448023
Cloudforms Team: RHEVM

External Trackers
Tracker  ID                                  Priority  Status  Summary  Last Updated
Github   ManageIQ/manageiq-content/pull/197  None      None    None     2017-10-11 13:28 EDT

Comment 2 CFME Bot 2017-08-04 14:13:46 EDT
New commit detected on ManageIQ/manageiq-content/fine:

commit c511fc7b1b0380fc5de858045b2e5f8667aebcae
Author:     Madhu Kanoor <mkanoor@redhat.com>
AuthorDate: Fri Jun 23 14:51:02 2017 -0400
Commit:     Satoe Imaishi <simaishi@redhat.com>
CommitDate: Fri Aug 4 14:09:09 2017 -0400

    Merge pull request #135 from billfitzgerald0120/check_migrate
    Updated vm migration to report when an error occurs.
    (cherry picked from commit 61793a6a2a1e7aa0b7061d6986a27af84d4052ff)

 .../Methods.class/__methods__/checkmigration.rb    | 68 ++++++++++++-----
 .../__methods__/checkmigration_spec.rb             | 88 ++++++++++++++++++++++
 2 files changed, 139 insertions(+), 17 deletions(-)
 create mode 100644 spec/content/automate/ManageIQ/Infrastructure/VM/Migrate/StateMachines/Methods.class/__methods__/checkmigration_spec.rb
Comment 3 Ilanit Stein 2017-09-03 07:16:55 EDT
Tested on CFME-5.8.2/RHV-4.1.3:

Started VM migration from the CFME side.
After the RHV UI showed that the VM was migrating, the destination host was rebooted,
and the RHV events reported that the migration failed.

On the CFME side,
shortly after the VM migration was started, the Request state changed to "Migrated", which is not true,
and a few minutes later the request failed on sending the email.
Request details:

Status: Ok
Request State: Migrated
Requester: Administrator
Request Type: VM Migrate
Description: VM Migrate for: <vm name> - Host: host_mixed_2
Last Message: Server [EVM] VM [cfme_5815_bug_1478108] Step [EmailOwner] Status [Error Emailing Owner] Message [Emailing Owner]
Created On: Sun, 03 Sep 2017 10:34:07 +0000
Last Update: Sun, 03 Sep 2017 10:36:37 +0000
Approval State: Approved
Approved/Denied by: admin (Administrator)
Approved/Denied on: Sun, 03 Sep 2017 10:34:28 +0000
Reason: Auto-Approved

Since the migration failed on the RHV side because of a problem that occurred during the VM migration,
the expected behavior is that the VM Migrate request fails on the CFME side
and that the migration error is propagated from the RHV side to the CFME side.
Thus, moving the bug back to assigned.
Comment 4 Oved Ourfali 2017-09-18 03:51:41 EDT
William, can you take a look?
Comment 5 William Fitzgerald 2017-09-18 10:30:54 EDT

Can you re-create this and let me have access to your appliance? 


Comment 6 Ilanit Stein 2017-09-26 08:10:30 EDT
Recreated the migration failure and sent the details privately.
Comment 7 William Fitzgerald 2017-09-26 09:29:07 EDT

You are getting an error trying to send the email.

You need this fix: https://github.com/ManageIQ/manageiq-content/pull/177


Comment 8 Ilanit Stein 2017-09-26 10:03:03 EDT
The Request state was updated to Migrated, with status Ok,
while the VM was still migrating (as seen in the RHV UI),
before the last message about the email error appeared.

Therefore, I suspect that even with the email send fix
we will still see the same behavior: on the CFME side it looks as if the VM migrated, while the migration actually failed.
Comment 9 William Fitzgerald 2017-09-26 12:30:53 EDT
I added a billy domain and commented out the email. Can you try it again?

This should have the same effect as applying the fix: https://github.com/ManageIQ/manageiq-content/pull/177


Comment 10 William Fitzgerald 2017-09-26 17:39:33 EDT
Automate thinks the migration is good:

[----] I, [2017-09-26T13:32:36.197916 #18149:1269340]  INFO -- : Q-task_id([vm_migrate_task_46]) <AEMethod checkmigration> CheckMigration returned <ok> for state <migrated> and status <Ok>

And 19 seconds later, I see this error in evm.log:

log/evm.log:[----] I, [2017-09-26T13:32:55.620861 #28993:126f13c]  INFO -- : MIQ(MiqQueue.put) Message id: [253685],  id: [], Zone: [default], Role: [event], Server: [], Ident: [ems], Target id: [14], Instance id: [], Task id: [], Command: [EmsEvent.add], Timeout: [600], Priority: [100], State: [ready], Deliver On: [], Data: [], Args: [{:event_type=>"VM_MIGRATION_FAILED_FROM_TO", :source=>"RHEVM", :message=>"Migration failed  (VM: cfme_5.8.2_Juan, Source: host_mixed_2, Destination: host_mixed_1).", :timestamp=>2017-09-26 13:32:39 -0400, :username=>"admin@internal-authz", :full_data=>{:id=>"966647", :href=>"/ovirt-engine/api/events/966647", :cluster=>{:id=>"e8271a4c-ad88-4bd7-8c46-f7c9efde5e9c", :href=>"/ovirt-engine/api/clusters/e8271a4c-ad88-4bd7-8c46-f7c9efde5e9c"}, :data_center=>{:id=>"56099573-9402-4da4-8c84-c8decdc94a95", :href=>"/ovirt-engine/api/datacenters/56099573-9402-4da4-8c84-c8decdc94a95"}, :host=>{:id=>"3b2f783a-0168-4629-b8ef-721bd3517b2b", :href=>"/ovirt-engine/api/hosts/3b2f783a-0168-4629-b8ef-721bd3517b2b"}, :template=>{:id=>"0d514194-19a9-426b-a123-36962c1de240", :href=>"/ovirt-engine/api/templates/0d514194-19a9-426b-a123-36962c1de240"}, :user=>{:id=>"5947acbb-0172-0324-016a-000000000217", :href=>"/ovirt-engine/api/users/5947acbb-0172-0324-016a-000000000217"}, :vm=>{:id=>"f1ce76e5-bb73-4aa9-82f4-7473418cfab5", :href=>"/ovirt-engine/api/vms/f1ce76e5-bb73-4aa9-82f4-7473418cfab5"}, :description=>"Migration failed  (VM: cfme_5.8.2_Juan, Source: host_mixed_2, Destination: host_mixed_1).", :severity=>"error", :code=>120, :time=>2017-09-26 13:32:39 -0400, :name=>"VM_MIGRATION_FAILED_FROM_TO"}, :ems_id=>14, :vm_ems_ref=>"/api/vms/f1ce76e5-bb73-4aa9-82f4-7473418cfab5", :host_ems_ref=>"/api/hosts/3b2f783a-0168-4629-b8ef-721bd3517b2b", :ems_cluster_ems_ref=>"/api/clusters/e8271a4c-ad88-4bd7-8c46-f7c9efde5e9c"}]

It appears that we are not waiting for the migration to finish successfully.
Still looking into this ...
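
To make the "not waiting" point concrete, here is a minimal sketch of a check step that only reports success once the VM is actually on the requested host. This is not the contents of checkmigration.rb: the method name, its parameters, and the 600-second default window are all assumptions for illustration, and the 'retry' / 'error' values are assumed to mirror the <ok> result seen in the log line above.

```ruby
# Hypothetical helper, NOT the real checkmigration.rb.
# Decides the result of one poll of an in-flight migration.
# timeout_seconds is an illustrative default, not a CFME setting.
def migration_check(current_host:, destination_host:, elapsed_seconds:, timeout_seconds: 600)
  # Success only once the VM is actually running on the requested host.
  return { result: 'ok' } if current_host == destination_host

  # Still inside the allowed window: ask the caller to poll again later.
  return { result: 'retry' } if elapsed_seconds < timeout_seconds

  # Window exhausted and the VM never arrived: report a failure.
  { result: 'error',
    reason: "VM still on #{current_host} after #{elapsed_seconds}s" }
end

# Example poll while the VM is still on its source host.
migration_check(current_host: 'host_mixed_2',
                destination_host: 'host_mixed_1',
                elapsed_seconds: 30)[:result] # => "retry"
```

The point of the sketch is only that the request state alone is not enough; the check has to compare against something observable on the provider side.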


Comment 11 William Fitzgerald 2017-10-02 15:04:12 EDT
The state machine is now correct, but if the migration fails, the state machine doesn't know it failed. This probably needs to go to the providers group.
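
One way the failure could be made visible, sketched under assumptions: scan the provider events for the VM_MIGRATION_FAILED_FROM_TO event seen in evm.log. The hash keys below (:event_type, :vm_ems_ref, :timestamp, :message) are taken from the Args in that log entry; the helper name and the idea of passing events in as plain hashes are hypothetical and not an existing CFME API.

```ruby
require 'time'

# Event type copied from the evm.log entry in comment 10.
FAILED_EVENT = 'VM_MIGRATION_FAILED_FROM_TO'

# Hypothetical helper: returns the message of the first migration-failure
# event for the given VM at or after `since`, or nil if there is none.
def migration_failure_message(events, vm_ems_ref:, since:)
  hit = events.find do |e|
    e[:event_type] == FAILED_EVENT &&
      e[:vm_ems_ref] == vm_ems_ref &&
      e[:timestamp] >= since
  end
  hit && hit[:message]
end

# Sample data shaped like the logged event (values from comment 10).
events = [
  { event_type: 'VM_MIGRATION_FAILED_FROM_TO',
    vm_ems_ref: '/api/vms/f1ce76e5-bb73-4aa9-82f4-7473418cfab5',
    timestamp:  Time.parse('2017-09-26 13:32:39 -0400'),
    message:    'Migration failed  (VM: cfme_5.8.2_Juan, Source: host_mixed_2, Destination: host_mixed_1).' }
]

puts migration_failure_message(
  events,
  vm_ems_ref: '/api/vms/f1ce76e5-bb73-4aa9-82f4-7473418cfab5',
  since: Time.parse('2017-09-26 13:30:00 -0400')
)
```

Note that in the logs the failure event is queued after the check already returned ok, so a lookup like this would only help if the check keeps polling rather than finishing on the first pass.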
Comment 13 Ilanit Stein 2017-10-24 02:51:41 EDT
For the record, adding the answer to the needinfo in comment #9,
which was answered offline:

I tested RHV VM migration with the email fix mentioned in comment #9.
Now it passes the email stage and ends up with Last Message: "[EVM] VM Migrated Successfully".
On the RHV side, the VM migration failed, since I rebooted the destination host while the VM was migrating.
It should also fail on the CFME side.
