New commit detected on ManageIQ/manageiq-content/fine: https://github.com/ManageIQ/manageiq-content/commit/c511fc7b1b0380fc5de858045b2e5f8667aebcae commit c511fc7b1b0380fc5de858045b2e5f8667aebcae Author: Madhu Kanoor <mkanoor> AuthorDate: Fri Jun 23 14:51:02 2017 -0400 Commit: Satoe Imaishi <simaishi> CommitDate: Fri Aug 4 14:09:09 2017 -0400 Merge pull request #135 from billfitzgerald0120/check_migrate Updated vm migration to report when an error occurs. (cherry picked from commit 61793a6a2a1e7aa0b7061d6986a27af84d4052ff) https://bugzilla.redhat.com/show_bug.cgi?id=1478518 .../Methods.class/__methods__/checkmigration.rb | 68 ++++++++++++----- .../__methods__/checkmigration_spec.rb | 88 ++++++++++++++++++++++ 2 files changed, 139 insertions(+), 17 deletions(-) create mode 100644 spec/content/automate/ManageIQ/Infrastructure/VM/Migrate/StateMachines/Methods.class/__methods__/checkmigration_spec.rb
Tested on CFME-5.8.2/RHV-4.1.3: Started VM migration from CFME side. After RHV UI showed VM is migrating, destination host was rebooted, and in RHV events, it was reported that migration failed. On CFME side, Shortly after VM migration was started, the Request state turned into "Migrated" - Which is not true. and the request few minutes later failed on email send. Request details: ====================================== Status Ok Request State Migrated Requester Administrator Request Type VM Migrate Description VM Migrate for: <vm name> - Host: host_mixed_2 Last Message Server [EVM] VM [cfme_5815_bug_1478108] Step [EmailOwner] Status [Error Emailing Owner] Message [Emailing Owner] Created On Sun, 03 Sep 2017 10:34:07 +0000 Last Update Sun, 03 Sep 2017 10:36:37 +0000 Approval State Approved Approved/Denied by admin (Administrator) Approved/Denied on Sun, 03 Sep 2017 10:34:28 +0000 Reason Auto-Approved ====================================== As Migration failed on RHV side, on a problem occurred during the VM migration, the expected behavior is that on CFME side, the Migration VM request will fail, and the Migration error will be propagated from RHV side to CFME side. Thus, moving bug back to assigned.
William, can you take a look?
Oved, Can you re-create this and let me have access to your appliance? Thanks Billy
Recreated migration failure, and sent details in private.
llanit, You are getting an error trying to send the email. You need this fix. https://github.com/ManageIQ/manageiq-content/pull/177 Thanks Billy
The Request state is updated to Migrated, status: Ok, while VM was still migrating (seen on RHV UI), before getting the last message on email error. Therefore, I would suspect that even if we'll have that email send fix, We'll still have the same behavior: From CFME side seem as if VM Migrated, while migration actually failed.
I added a billy domain and I commented out the email. Can you try it again ? This should have the same affect as applying the fix: https://github.com/ManageIQ/manageiq-content/pull/177 Thanks Billy
Automate thinks the migration is good [----] I, [2017-09-26T13:32:36.197916 #18149:1269340] INFO -- : Q-task_id([vm_migrate_task_46]) <AEMethod checkmigration> CheckMigration returned <ok> for state <migrated> and status <Ok> And 29 seconds later, I see this Error in the evm.log log/evm.log:[----] I, [2017-09-26T13:32:55.620861 #28993:126f13c] INFO -- : MIQ(MiqQueue.put) Message id: [253685], id: [], Zone: [default], Role: [event], Server: [], Ident: [ems], Target id: [14], Instance id: [], Task id: [], Command: [EmsEvent.add], Timeout: [600], Priority: [100], State: [ready], Deliver On: [], Data: [], Args: [{:event_type=>"VM_MIGRATION_FAILED_FROM_TO", :source=>"RHEVM", :message=>"Migration failed (VM: cfme_5.8.2_Juan, Source: host_mixed_2, Destination: host_mixed_1).", :timestamp=>2017-09-26 13:32:39 -0400, :username=>"admin@internal-authz", :full_data=>{:id=>"966647", :href=>"/ovirt-engine/api/events/966647", :cluster=>{:id=>"e8271a4c-ad88-4bd7-8c46-f7c9efde5e9c", :href=>"/ovirt-engine/api/clusters/e8271a4c-ad88-4bd7-8c46-f7c9efde5e9c"}, :data_center=>{:id=>"56099573-9402-4da4-8c84-c8decdc94a95", :href=>"/ovirt-engine/api/datacenters/56099573-9402-4da4-8c84-c8decdc94a95"}, :host=>{:id=>"3b2f783a-0168-4629-b8ef-721bd3517b2b", :href=>"/ovirt-engine/api/hosts/3b2f783a-0168-4629-b8ef-721bd3517b2b"}, :template=>{:id=>"0d514194-19a9-426b-a123-36962c1de240", :href=>"/ovirt-engine/api/templates/0d514194-19a9-426b-a123-36962c1de240"}, :user=>{:id=>"5947acbb-0172-0324-016a-000000000217", :href=>"/ovirt-engine/api/users/5947acbb-0172-0324-016a-000000000217"}, :vm=>{:id=>"f1ce76e5-bb73-4aa9-82f4-7473418cfab5", :href=>"/ovirt-engine/api/vms/f1ce76e5-bb73-4aa9-82f4-7473418cfab5"}, :description=>"Migration failed (VM: cfme_5.8.2_Juan, Source: host_mixed_2, Destination: host_mixed_1).", :severity=>"error", :code=>120, :time=>2017-09-26 13:32:39 -0400, :name=>"VM_MIGRATION_FAILED_FROM_TO"}, :ems_id=>14, :vm_ems_ref=>"/api/vms/f1ce76e5-bb73-4aa9-82f4-7473418cfab5", :host_ems_ref=>"/api/hosts/3b2f783a-0168-4629-b8ef-721bd3517b2b", :ems_cluster_ems_ref=>"/api/clusters/e8271a4c-ad88-4bd7-8c46-f7c9efde5e9c"}] It appears that we are not waiting for the migration to finish successfully. Still looking into this ... Thanks Billy
The state machine is now correct but if the migration fails the state machine doesn't know it failed. This probably needs to go to the providers group.
for the record adding answer to the need info in comment #9, that was answered offline: I tested RHV VM migration with the email fix, mentioned in comment #9. Now it is passing the email stage, and end up with Last message: "[EVM] VM Migrated Successfully". On RHV side, the VM migration failed, since I rebooted the destination host, while VM was migrating. It should fail also on CFME side.
New commit detected on ManageIQ/manageiq/fine: https://github.com/ManageIQ/manageiq/commit/c69473642892a0ffb30fc0c797196e99177dee0f commit c69473642892a0ffb30fc0c797196e99177dee0f Author: Greg McCullough <gmccullo> AuthorDate: Thu Dec 21 09:30:51 2017 -0500 Commit: Satoe Imaishi <simaishi> CommitDate: Wed Jan 10 09:25:15 2018 -0500 Merge pull request #16705 from bzwei/migrate_with_error Rescue migration error and update status (cherry picked from commit 805e02333a8bfa3b697b691057d1d0a506b43037) https://bugzilla.redhat.com/show_bug.cgi?id=1478518 app/models/vm_migrate_task.rb | 14 ++++++++++---- spec/models/vm_migrate_task_spec.rb | 18 ++++++++++++++++++ 2 files changed, 28 insertions(+), 4 deletions(-) create mode 100644 spec/models/vm_migrate_task_spec.rb
New commit detected on ManageIQ/manageiq/fine: https://github.com/ManageIQ/manageiq/commit/323eabede56ccd804ab5ef35fcb859ba72f0e16d commit 323eabede56ccd804ab5ef35fcb859ba72f0e16d Author: Moti Asayag <masayag> AuthorDate: Wed Jan 10 16:05:35 2018 +0200 Commit: Satoe Imaishi <simaishi> CommitDate: Mon Jan 15 11:03:29 2018 -0500 Merge pull request #182 from pkliczewski/master Block migration call (cherry picked from commit bc7972b4ba1c85bb8b66340e7c7d656e2f35dded) Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1478518 .../manageiq/providers/redhat/infra_manager.rb | 18 +++++++++++++- .../providers/redhat/infra_manager_spec.rb | 29 ++++++++++++++++++++++ 2 files changed, 46 insertions(+), 1 deletion(-)
Tested on CFME-5.8.3.1/rhv-4.1.8 From CFME side: Run VM migration. Reboot destination host, while migration is running. The migration Request failed on: "Server [EVM] VM [test-scat-uz4w] Step [CheckMigration] Status [Error Migrating VM] Message [Error: Validation failed: Status is not included in the list]" Piotr, The VM migration failure on RHV side, is now reflected on CFME, however the error contain "Status is not included in the list" - Can that be changed? Thanks.
I am not sure from where this message is coming. Will check whether it is possible.
Retested on CFME-5.8.3.1/rhv-4.1.8: Case 1: ====== Test: While VM migration was running, destination host was powered OFF. On RHV UI, it was seen that VM migration actually started, and the VM's host was already switched to the destination host. This is actually interrupting the VM migration in a __later__ stage (where VM was already passed to destination host). Results: On RHV logs, there wasn't a VM migration failure, but (destination) host failure (that was powered off). On CFME, after ~10 min, this error appeared in the VM migration request (under Services-> Requests): "Server [EVM] VM [golden_env_mixed_virtio_0] Step [CheckMigration] Status [Error Migrating VM] Message [Error: timed out after 600.008622942 seconds. Timeout threshold [600]]" Case 2: ====== Test: While VM migration was running, destination host was powered OFF, and then powered back ON. On RHV UI, it was seen that VM migration actually started, BUT the VM's host was yet switched to the destination host. This is actually interrupting the VM migration in an __early__ stage (where VM was not yet passed to destination host). Results: On RHV logs, there was a VM migration failure. On CFME, after ~10 min, this error appeared in the VM migration request (under Services-> Requests): "Server [EVM] VM [cfme_57_vm_0] Step [CheckMigration] Status [Error Migrating VM] Message [Error: Validation failed: Status is not included in the list]" ** Logs attached. Piotr, From the testing detailed on comment 16, RHV didn't reflect VM migration even started, so I was under the assumption that we got the unclear error message: "Error: Validation failed: Status is not included in the list" since the VM migration didn;t even start, However, case 2, above, show that even when on RHV side, VM migration was STARTED and failed, we still get on CFME side, this SAME unclear error. Would you please check if this error message can be found, and maybe changed to something more clear? thanks.
Created attachment 1384815 [details] engine.log See "case 1" VM migration started January 23 @12:08 of VM "golden_env_mixed_virtio_0", and ended @12:18. See "case 2" VM migration started January 23 @12:22 of VM "cfme_57_vm_0", and ended @12:24
Created attachment 1384816 [details] evm.log See "case 1" VM migration of VM "golden_env_mixed_virtio_0". See "case 2" VM migration of VM "cfme_57_vm_0".
Created attachment 1384818 [details] automation.log See "case 1" VM migration of VM "golden_env_mixed_virtio_0". See "case 2" VM migration of VM "cfme_57_vm_0".
Created attachment 1384819 [details] rhevm.log
Moving bug to Verified as on CFME-5.8.3.1/rhv-4.1.8 VM migration that fail on RHV side, generate a failure on CFME side. Opened a separate bug 1537953, for the unclear error mentioned above: "[Error: Validation failed: Status is not included in the list]"
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:0374