Bug 1692892

Summary: [v2v][OSP] Migration stuck in refresh inventory state

Product: Red Hat CloudForms Management Engine
Component: V2V
Version: 5.10.2
Status: CLOSED CURRENTRELEASE
Severity: urgent
Priority: high
Reporter: Yadnyawalk Tale <ytale>
Assignee: Marek Aufart <maufart>
QA Contact: Yadnyawalk Tale <ytale>
Docs Contact: Red Hat CloudForms Documentation <cloudforms-docs>
CC: bthurber, dmetzger, fdupont, maufart, mnadeem, nperic, pvauter, simaishi, smallamp, sshveta
Target Milestone: GA
Target Release: 5.11.0
Fixed In Version: 5.11.0.23
Hardware: Unspecified
OS: Unspecified
Whiteboard: v2v
Keywords: TestOnly, ZStream
Type: Bug
Cloudforms Team: V2V
Clones: 1694190 (view as bug list)
Bug Depends On: 1743438
Bug Blocks: 1694190, 1712049, 1734338
Last Closed: 2019-12-13 14:57:13 UTC
Description (Yadnyawalk Tale, 2019-03-26 16:12:53 UTC)
I confirm the same problem when using VDDK. This only happens when migrating to OpenStack, not RHV. @marek, any idea of what may have changed since https://github.com/ManageIQ/manageiq-providers-openstack/pull/433 ?

I have done some extra tests in our lab to try to identify the root cause. I used two providers, VMware and OpenStack, to compare the behaviours provider-wise, and two CloudForms appliances, 5.10.1.1 and 5.10.2.2, to compare the behaviours CloudForms-wise. Both CloudForms appliances are connected to the two providers. As the current investigation led me to think the problem was linked to OpenStack events, I did the following manipulations directly on the providers and looked at the CloudForms logs:

1. Create an instance on OpenStack
   => The instance successfully starts on OpenStack
   => CFME 5.10.1.1 catches the instance.create event and runs a refresh
   => CFME 5.10.2.2 doesn't catch any event within the 5 minutes following the instance launch
   => CFME 5.10.2.2 finds the instance after a manual refresh of the OpenStack provider

2. Start a powered-off VM on VMware
   => The VM successfully starts on VMware
   => CFME 5.10.1.1 catches the VM power-on event
   => CFME 5.10.2.2 catches the VM power-on event

From these tests, I'm inclined to think that the problem is that CloudForms is not able to catch the OpenStack provider events. This has an impact on V2V, but I doubt it is caused by V2V.

Just to add one more version with the same problem: I tried migrating 20 VMs to OSP using CFME 5.10.2.1 (cfme-rhos-5.10.2.1-1.x86_64.qcow2) and am hitting the same issue. I will start downgrading CFME to find the first version on which the migration completes. A dangerous effect of this failure is that the environment ended up with the same VMs powered on, on both VMware AND OpenStack.

New commit detected on ManageIQ/manageiq-providers-openstack/master:
https://github.com/ManageIQ/manageiq-providers-openstack/commit/3091bbdf8ff21124cff04b3c0730f4be89ec8a0e

commit 3091bbdf8ff21124cff04b3c0730f4be89ec8a0e
Author: Marek Aufart <maufart>
AuthorDate: Thu Mar 28 11:36:29 2019 -0400
Commit: Marek Aufart <maufart>
CommitDate: Thu Mar 28 11:36:29 2019 -0400

    Parse OpenStack Event Timestamps as UTC

    Timestamps returned in Events from the Ceilometer/Panko service don't
    contain timezone information, but OpenStack expects them to be UTC [1].
    They should be parsed as UTC to ensure they are not interpreted in the
    MIQ machine's local timezone, which could break time-based OpenStack
    Event queries.

    Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1692892

    [1] https://specs.openstack.org/openstack/api-wg/guidelines/time.html

 lib/manageiq/providers/openstack/legacy/events/openstack_ceilometer_event_monitor.rb | 10 +-
 1 file changed, 7 insertions(+), 3 deletions(-)

Created attachment 1594259 [details]: expected_throttling.png
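The commit above parses Ceilometer/Panko event timestamps as UTC. A minimal Ruby sketch of the pitfall it addresses (illustrative timestamp value; plain stdlib `Time`, not the actual event-monitor code):

```ruby
require 'time'

# Ceilometer/Panko return event timestamps without a timezone offset:
raw = "2019-03-28T11:36:29.000000"   # illustrative value

# Naive parsing interprets the string in the appliance's local timezone,
# shifting the instant on any non-UTC appliance:
local = Time.parse(raw)

# Per the OpenStack API guideline the value is UTC, so force UTC on parse;
# otherwise time-based "events since X" queries drift by the UTC offset:
utc = Time.parse(raw + " UTC")

puts local.utc_offset   # appliance-dependent (non-zero off UTC)
puts utc.utc_offset     # always 0
```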
Migration is still failing with the refresh inventory issue, with both VDDK and SSH mode. Appliance: https://10.8.197.205, plan: plan_HVIt80G6dT

Created attachment 1605998 [details]: Plan has no mapping
I connected to the appliance, and the migration plan "plan_HVIt80G6dT" has an error message: "Infrastructure mapping doesn't exist" (see screenshot).

So I checked using the Rails console, and it's confirmed:

```ruby
irb> ServiceTemplateTransformationPlan.find_by(:name => 'plan_HVIt80G6dT').transformation_mapping
=> nil
```

And the automation.log is consistent with that error:
```
[----] I, [2019-08-20T01:29:47.449163 #7863:2ad4283545b8] INFO -- : Q-task_id([r1_service_template_transformation_plan_task_1]) <AEMethod [/ManageIQ/Transformation/Infrastructure/VM/Common/CheckVmInInventory]> Starting
[----] E, [2019-08-20T01:29:47.901814 #7863:2ad4360959b8] ERROR -- : The following error occurred during instance method <destination_ems> for AR object <#<ServiceTemplateTransformationPlanTask id: 1, description: "Transforming VM [test-v2v-a-cdf6]", state: "migrate", request_type: "transformation_plan", userid: "admin", options: {:dialog=>nil, :workflow_settings=>{:resource_action_id=>44}, :initiator=>nil, :src_id=>1, :request_options=>{:submit_workflow=>true, :init_defaults=>false}, :cart_state=>"ordered", :requester_group=>"EvmGroup-super_administrator", :executed_on_servers=>[1], :delivered_on=>2019-08-20 04:58:47 UTC, :infra_conversion_job_id=>1, :progress=>{"current_state"=>"/State11/State8", "current_description"=>"Refresh inventory", "percent"=>96.34, "states"=>{"/State5"=>{"status"=>"finished", "weight"=>1, "description"=>"Waiting for PreflightCheck", "message"=>"State5 is finished.", "started_on"=>2019-08-20 04:59:20 UTC, "percent"=>100.0, "updated_on"=>2019-08-20 04:59:21 UTC}, "/State8"=>{"status"=>"finished", "weight"=>2, "description"=>"Pre Transform VM", "message"=>"State8 is finished.", "started_on"=>2019-08-20 04:59:18 UTC, "percent"=>100.0, "updated_on"=>2019-08-20 04:59:57 UTC}, "/State8/State2"=>{"status"=>"finished", "weight"=>30, "description"=>"Run pre-migration playbook", "message"=>"State2 is finished.", "started_on"=>2019-08-20 04:59:18 UTC, "percent"=>100.0, "updated_on"=>2019-08-20 04:59:22 UTC}, "/State2"=>{"status"=>"finished", "weight"=>20, "description"=>"Launch pre migration playbook", "message"=>"State2 is finished.", "started_on"=>2019-08-20 04:59:19 UTC, "percent"=>100.0, "updated_on"=>2019-08-20 04:59:20 UTC}, "/State8/State5"=>{"status"=>"finished", "weight"=>40, "description"=>"Power off", "message"=>"State5 is finished.", "started_on"=>2019-08-20 04:59:22 UTC, "percent"=>100.0, "updated_on"=>2019-08-20 04:59:55 UTC}, "/State8/State8"=>{"status"=>"finished", "weight"=>40, "description"=>"Collapse Snapshots", "message"=>"State8 is finished.", "started_on"=>2019-08-20 04:59:55 UTC, "percent"=>100.0, "updated_on"=>2019-08-20 04:59:57 UTC}, "/State11"=>{"status"=>"active", "weight"=>85, "description"=>"Transform VM", "message"=>"State11 is not finished yet [225/0 retries].", "started_on"=>2019-08-20 04:59:58 UTC, "percent"=>Infinity, "updated_on"=>2019-08-20 05:29:27 UTC}, "/State11/State2"=>{"status"=>"finished", "weight"=>5, "description"=>"Convert disks", "message"=>"State2 is finished.", "started_on"=>2019-08-20 04:59:58 UTC, "percent"=>100.0, "updated_on"=>2019-08-20 05:00:00 UTC}, "/State11/State5"=>{"status"=>"finished", "weight"=>80, "description"=>"Convert disks", "message"=>"Disks transformation succeeded.", "started_on"=>2019-08-20 05:00:01 UTC, "percent"=>100.0, "updated_on"=>2019-08-20 05:24:31 UTC}, "/State11/State8"=>{"status"=>"active", "weight"=>15, "description"=>"Refresh inventory", "message"=>"State8 is not finished yet [14/200 retries].", "started_on"=>2019-08-20 05:24:31 UTC, "percent"=>7.000000000000001, "updated_on"=>2019-08-20 05:29:26 UTC}}}, :conversion_host_name=>"auto-conv-instance2", :source_vm_power_state=>"on", :virtv2v_disks=>[{:path=>"[NFS_Datastore_1] test-v2v-a-cdf6/test-v2v-a-cdf6.vmdk", :size=>8589934592, :percent=>100, :weight=>100.0}], :network_mappings=>[{:source=>"VM Network", :destination=>"fff214ec-87eb-4814-992d-75ffcc6115dd", :mac_address=>"00:50:56:be:b6:61"}], :virtv2v_wrapper=>{"wrapper_log"=>"/var/log/virt-v2v/v2v-import-20190820T010028-11079-wrapper.log", "v2v_log"=>"/var/log/virt-v2v/v2v-import-20190820T010028-11079.log", "state_file"=>"/tmp/v2v-import-20190820T010028-11079.state", "throttling_file"=>"/tmp/v2v-import-20190820T010028-11079.throttle"}, :virtv2v_started_on=>"2019-08-20 04:59:59", :virtv2v_status=>"succeeded", :virtv2v_pid=>11095, :virtv2v_finished_on=>"2019-08-20 05:24:28"}, created_on: "2019-08-20 04:58:47", updated_on: "2019-08-20 05:29:27", message: "Migrating", status: "Ok", type: "ServiceTemplateTransformationPlanTask", miq_request_id: 1, source_id: 63, source_type: "VmOrTemplate", destination_id: nil, destination_type: nil, miq_request_task_id: nil, phase: nil, phase_context: {}, tenant_id: 1, cancelation_status: nil, conversion_host_id: 6>>
[----] E, [2019-08-20T01:29:47.902148 #7863:2ad4360959b8] ERROR -- : MiqAeServiceModelBase.ar_method raised: <RuntimeError>: <[test-v2v-a-cdf6] Cluster has no mapping.>
[----] E, [2019-08-20T01:29:47.902247 #7863:2ad4360959b8] ERROR -- : /var/www/miq/vmdb/app/models/service_template_transformation_plan_task.rb:68:in `destination_cluster'
/var/www/miq/vmdb/app/models/service_template_transformation_plan_task.rb:77:in `destination_ems'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:333:in `public_send'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:333:in `block in object_send'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:352:in `ar_method'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:362:in `ar_method'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:331:in `object_send'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:182:in `block (3 levels) in expose'
/var/www/miq/vmdb/app/models/user.rb:283:in `with_user'
/opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:182:in `block (2 levels) in expose'
/usr/share/ruby/drb/drb.rb:1630:in `perform_without_block'
/usr/share/ruby/drb/drb.rb:1590:in `perform'
/usr/share/ruby/drb/drb.rb:1674:in `block (2 levels) in main_loop'
/usr/share/ruby/drb/drb.rb:1670:in `loop'
/usr/share/ruby/drb/drb.rb:1670:in `block in main_loop'
[----] E, [2019-08-20T01:29:47.906632 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) <AEMethod checkvmininventory> The following error occurred during method evaluation:
[----] E, [2019-08-20T01:29:47.907095 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) <AEMethod checkvmininventory> RuntimeError: [test-v2v-a-cdf6] Cluster has no mapping.
[----] E, [2019-08-20T01:29:47.908134 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) <AEMethod checkvmininventory> (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /var/www/miq/vmdb/app/models/service_template_transformation_plan_task.rb:68:in `destination_cluster'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /var/www/miq/vmdb/app/models/service_template_transformation_plan_task.rb:77:in `destination_ems'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:333:in `public_send'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:333:in `block in object_send'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:352:in `ar_method'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:362:in `ar_method'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:331:in `object_send'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:182:in `block (3 levels) in expose'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /var/www/miq/vmdb/app/models/user.rb:283:in `with_user'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:182:in `block (2 levels) in expose'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /usr/share/ruby/drb/drb.rb:1630:in `perform_without_block'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /usr/share/ruby/drb/drb.rb:1590:in `perform'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /usr/share/ruby/drb/drb.rb:1674:in `block (2 levels) in main_loop'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /usr/share/ruby/drb/drb.rb:1670:in `loop'
(drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /usr/share/ruby/drb/drb.rb:1670:in `block in main_loop'
/ManageIQ/Transformation/Infrastructure/VM/Common/CheckVmInInventory:15:in `main'
[----] E, [2019-08-20T01:29:47.911973 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /var/www/miq/vmdb/app/models/service_template_transformation_plan_task.rb:68:in `destination_cluster': [test-v2v-a-cdf6] Cluster has no mapping. (RuntimeError)
[----] E, [2019-08-20T01:29:47.912463 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /var/www/miq/vmdb/app/models/service_template_transformation_plan_task.rb:77:in `destination_ems'
[----] E, [2019-08-20T01:29:47.912821 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:333:in `public_send'
[----] E, [2019-08-20T01:29:47.913201 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:333:in `block in object_send'
[----] E, [2019-08-20T01:29:47.913573 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:352:in `ar_method'
[----] E, [2019-08-20T01:29:47.914040 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:362:in `ar_method'
[----] E, [2019-08-20T01:29:47.914446 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:331:in `object_send'
[----] E, [2019-08-20T01:29:47.914820 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:182:in `block (3 levels) in expose'
[----] E, [2019-08-20T01:29:47.915223 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /var/www/miq/vmdb/app/models/user.rb:283:in `with_user'
[----] E, [2019-08-20T01:29:47.915705 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /opt/rh/cfme-gemset/bundler/gems/cfme-automation_engine-960830d3f106/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:182:in `block (2 levels) in expose'
[----] E, [2019-08-20T01:29:47.916200 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /usr/share/ruby/drb/drb.rb:1630:in `perform_without_block'
[----] E, [2019-08-20T01:29:47.916731 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /usr/share/ruby/drb/drb.rb:1590:in `perform'
[----] E, [2019-08-20T01:29:47.917101 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /usr/share/ruby/drb/drb.rb:1674:in `block (2 levels) in main_loop'
[----] E, [2019-08-20T01:29:47.917473 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /usr/share/ruby/drb/drb.rb:1670:in `loop'
[----] E, [2019-08-20T01:29:47.917846 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from (drbunix:///tmp/automation_engine20190820-7863-17ch7hr) /usr/share/ruby/drb/drb.rb:1670:in `block in main_loop'
[----] E, [2019-08-20T01:29:47.918399 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from /ManageIQ/Transformation/Infrastructure/VM/Common/CheckVmInInventory:15:in `main'
[----] E, [2019-08-20T01:29:47.918916 #7863:2ad4350689c8] ERROR -- : Q-task_id([r1_service_template_transformation_plan_task_1]) Method STDERR: from /ManageIQ/Transformation/Infrastructure/VM/Common/CheckVmInInventory:34:in `<main>'
[----] I, [2019-08-20T01:29:47.939016 #7863:2ad4283545b8] INFO -- : Q-task_id([r1_service_template_transformation_plan_task_1]) <AEMethod [/ManageIQ/Transformation/Infrastructure/VM/Common/CheckVmInInventory]> Ending
```
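The trace shows destination_ems failing inside destination_cluster because the plan's transformation mapping had been deleted. A hedged reconstruction of that failure path (simplified for illustration; the method names come from the trace, but the bodies and the transformation_destination helper are assumptions, not the shipped code in service_template_transformation_plan_task.rb):

```ruby
# Simplified sketch: the destination lookup walks
# task -> plan -> transformation_mapping -> mapped destination cluster,
# so a deleted (nil) mapping surfaces as the "Cluster has no mapping"
# RuntimeError captured in the log above.
def transformation_destination(source_obj)
  # Safe navigation: returns nil when the mapping record is gone.
  miq_request.source.transformation_mapping&.destination(source_obj)
end

def destination_cluster
  dst = transformation_destination(source.ems_cluster)
  raise "[#{source.name}] Cluster has no mapping." if dst.nil?
  dst
end
```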
Fabien, the mapping was deleted by some automation after the test failed. Please check plan_x68MIiG24k, which has a mapping and failed at the refresh inventory state. Thanks, Shveta

Thanks @shveta. Indeed, it failed with a timeout on inventory refresh. Can you reset the environment? I mean remove the instance in OSP and recreate the migration plans. I'd like to troubleshoot it live, so I'll start the plan myself.

You can start the plan 'test_plan' at https://10.8.197.205.

What I can see on the reproducer appliance: when I connected, the migration was still waiting for inventory refresh for 'test-v2v-a-262h', and the VMDB was consistent with that observation:

```ruby
irb> Vm.find_by(:name => 'test-v2v-a-262h', :vendor => 'openstack')
PostgreSQLAdapter#log_after_checkout, connection_pool: size: 5, connections: 1, in use: 1, waiting_in_queue: 0
=> nil

irb> Vm.where(:name => 'test-v2v-a-262h').length
=> 1
```

When looking at evm.log, there was a very high number of events from the OpenStack provider. I manually refreshed the provider from the UI, and in a few seconds the VM was in the inventory and the migration resumed and succeeded. The VMDB was consistent:

```ruby
irb> Vm.find_by(:name => 'test-v2v-a-262h', :vendor => 'openstack')
=> #<ManageIQ::Providers::Openstack::CloudManager::Vm id: 88, vendor: "openstack", format: nil, version: nil, name: "test-v2v-a-262h", description: nil, location: "unknown", config_xml: nil, autostart: nil, host_id: nil, last_sync_on: nil, created_on: "2019-08-28 15:02:36", updated_on: "2019-08-28 15:03:08", storage_id: nil, guid: "e5759b2a-e055-4bd0-a22b-7fccff4fb23d", ems_id: 37, last_scan_on: nil, last_scan_attempt_on: nil, uid_ems: "105e4f5c-418c-41c4-8e8f-1474a141f4e2", retires_on: nil, retired: nil, boot_time: nil, tools_status: nil, standby_action: nil, power_state: "on", state_changed_on: "2019-08-28 15:02:36", previous_state: nil, connection_state: "connected", last_perf_capture_on: nil, registered: nil, busy: nil, smart: nil, memory_reserve: nil, memory_reserve_expand: nil, memory_limit: nil, memory_shares: nil, memory_shares_level: nil, cpu_reserve: nil, cpu_reserve_expand: nil, cpu_limit: nil, cpu_shares: nil, cpu_shares_level: nil, cpu_affinity: nil, ems_created_on: nil, template: false, evm_owner_id: nil, ems_ref_obj: nil, miq_group_id: 1, linked_clone: nil, fault_tolerance: nil, type: "ManageIQ::Providers::Openstack::CloudManager::Vm", ems_ref: "105e4f5c-418c-41c4-8e8f-1474a141f4e2", ems_cluster_id: nil, retirement_warn: nil, retirement_last_warn: nil, vnc_port: nil, flavor_id: 4, availability_zone_id: 1, cloud: true, retirement_state: nil, cloud_network_id: nil, cloud_subnet_id: nil, cloud_tenant_id: 1, raw_power_state: "ACTIVE", publicly_available: nil, orchestration_stack_id: nil, retirement_requester: nil, tenant_id: 1, resource_group_id: nil, deprecated: nil, storage_profile_id: nil, cpu_hot_add_enabled: nil, cpu_hot_remove_enabled: nil, memory_hot_add_enabled: nil, memory_hot_add_limit: nil, memory_hot_add_increment: nil, hostname: nil>

irb> Vm.where(:name => 'test-v2v-a-262h').length
=> 2
```

What I had already seen in the MigEng lab is that adding an OpenStack provider generates an event storm, because all events from the history are processed by CloudForms. The inventory refresh generated by an event seems to be queued with all the other event actions, so it will only happen once the event storm is finished, while a manual refresh is queued in another queue and processed faster. @marek, is my assumption correct?

In the MigEng lab, I simply waited for the event storm to finish before launching my migrations. Could QE monitor the events queue to postpone the migration plan run? (A sketch of such a check follows.)
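A minimal Rails-console sketch of the kind of backlog check suggested above (hedged: MiqQueue is the ManageIQ work queue table, but the exact queue_name/role values that carry event processing vary by configuration and release):

```ruby
# Count pending queue items grouped by queue and role; a large backlog on
# the event-related rows means the historical-event storm is still being
# drained, and an automatic event-driven refresh would wait behind it.
MiqQueue.where(:state => 'ready').group(:queue_name, :role).count
# => {["generic", nil] => 3, ["ems", "event"] => 12847, ...}  # illustrative
```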
@Fabien, you are right! The events catch-up time on the current environment looks to take even more than an hour. I discussed with Loic that it would be okay to add an option to Settings that allows skipping events from before OpenStack was added as a provider to CF. QE can set this option as part of the scenario. Would that work?

@Marek, that would be a great option, also for demos ;)

(In reply to Marek Aufart from comment #22)
> @Fabien, you are right! The events catch-up time on the current environment
> looks to take even more than an hour. I discussed with Loic that it would be
> okay to add an option to Settings that allows skipping events from before
> OpenStack was added as a provider to CF. QE can set this option as part of
> the scenario. Would that work?

Marek, what if we set this option by default in Settings, meaning CFME ships with this option set to true?

This option is (for consistency with the original behaviour) called :event_skip_history, and it is false by default. Setting it to true will load only events generated after OpenStack was added as a provider to CF (so all old events are skipped). Note: this works only for providers added _after_ the :event_skip_history option was set to true. If a provider already existed with events before, its event timeline will continue from the point where it stopped last time.

New commit detected on ManageIQ/manageiq-providers-openstack/master:
https://github.com/ManageIQ/manageiq-providers-openstack/commit/d2f8b96ee4d3fd2f2c411d0bba8d4ae4d6efc16d

commit d2f8b96ee4d3fd2f2c411d0bba8d4ae4d6efc16d
Author: Marek Aufart <maufart>
AuthorDate: Wed Aug 28 13:10:06 2019 -0400
Commit: Marek Aufart <maufart>
CommitDate: Wed Aug 28 13:10:06 2019 -0400

    Allow skip historical Events from OpenStack

    Multiple issues related to not processing new Events appeared, partially
    caused by slow processing of historical Events from OpenStack. This adds
    the settings option :event_skip_history to allow skipping all events
    generated in OpenStack before OpenStack was added as a provider to
    ManageIQ. The event backread was decreased from 15 to 5 seconds to speed
    up the catch-up of OpenStack Events.

    Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1692892

 config/settings.yml | 3 +-
 lib/manageiq/providers/openstack/legacy/events/openstack_ceilometer_event_monitor.rb | 8 +-
 spec/legacy/events/openstack_ceilometer_event_monitor_spec.rb | 40 +
 3 files changed, 49 insertions(+), 2 deletions(-)

New commit detected on ManageIQ/manageiq-providers-openstack/ivanchuk:
https://github.com/ManageIQ/manageiq-providers-openstack/commit/b2b97186890f65509354105ea90c09f09eb55cdd

commit b2b97186890f65509354105ea90c09f09eb55cdd
Author: Ladislav Smola <lsmola>
AuthorDate: Fri Aug 30 12:52:11 2019 -0400
Commit: Ladislav Smola <lsmola>
CommitDate: Fri Aug 30 12:52:11 2019 -0400

    Merge pull request #503 from aufi/event_history_catchup

    Allow skip historical Events from OpenStack

    (cherry picked from commit 1a8bc1a679038e81f476e2d25c346c7a6f3cdf1d)

    https://bugzilla.redhat.com/show_bug.cgi?id=1692892

 config/settings.yml | 3 +-
 lib/manageiq/providers/openstack/legacy/events/openstack_ceilometer_event_monitor.rb | 8 +-
 spec/legacy/events/openstack_ceilometer_event_monitor_spec.rb | 40 +
 3 files changed, 49 insertions(+), 2 deletions(-)

We initially tried event_skip_history=false, which did show the older events in the provider's event history, and the migration was also stuck with the refresh inventory issue, as expected. Then we set event_skip_history=true: the provider no longer showed the older events, and the migration worked without the refresh inventory error. This is going to be a really useful feature, at least for us, going forward. Thanks to @Shveta for her help in testing it, and to @Marek and @Fabien for their inputs.

Verified on 5.11.0.23.20190904213640_d113674.
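For reference, a hedged Rails-console sketch for inspecting the option verified above (Settings is ManageIQ's configuration accessor; the exact key path under the OpenStack provider settings is an assumption inferred from the commit touching config/settings.yml, so verify it on your build):

```ruby
# Inspect the effective value of the option added by the commits above.
# The nesting under ems/ems_openstack/event_handling is assumed, not
# confirmed against the shipped settings.yml.
Settings.ems.ems_openstack.event_handling.event_skip_history
# => false   # default; per the note above, enable it (e.g. via Advanced
#             Settings in the UI) BEFORE adding the OpenStack provider.
```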