Description of problem:

Have CFME-5.9 connected to a scale RHV environment with ~1800 VMs and ~400 hosts. On the RHV side, move one host to maintenance. After ~15 minutes the host is not displayed at all in the CFME UI under Compute->Infrastructure->Hosts. Later, after moving a second host to maintenance, the first host appeared in the CFME UI in maintenance status. Moving both hosts back up => the CFME UI displayed those 2 hosts in UP state quite quickly. Then, moving the same 2 hosts to maintenance on the RHV side => even after ~20 minutes, both hosts were not displayed at all on the CFME side. Moving both hosts back up => the CFME UI displayed those 2 hosts in UP state.

There are no errors in evm.log/automation.log. automation.log reports an UNKNOWN event:

[----] I, [2017-12-24T10:37:06.637397 #22706:120f138]  INFO -- : Instantiating [/System/Process/Event?EventStream%3A%3Aevent_stream=32507&Host%3A%3Ahost=1410&MiqServer%3A%3Amiq_server=1&User%3A%3Auser=1&event_id=32507&event_stream_id=32507&event_type=UNKNOWN&host_id=1410&object_name=Event&vmdb_object_type=event_stream]
[----] I, [2017-12-24T10:37:06.660628 #22706:120f138]  INFO -- : Updated namespace [/System/Process/Event?EventStream%3A%3Aevent_stream=32507&Host%3A%3Ahost=1410&MiqServer%3A%3Amiq_server=1&User%3A%3Auser=1&event_id=32507&event_stream_id=32507&event_type=UNKNOWN&host_id=1410&object_name=Event&vmdb_object_type=event_stream  ManageIQ/System]
[----] I, [2017-12-24T10:37:06.724341 #22706:120f138]  INFO -- : Following Relationship [miqaedb:/System/Event/EmsEvent/RHEVM/UNKNOWN#create]
[----] I, [2017-12-24T10:37:06.747575 #22706:120f138]  INFO -- : Updated namespace [miqaedb:/System/Event/EmsEvent/RHEVM/UNKNOWN#create  ManageIQ/System/Event/EmsEvent]
[----] I, [2017-12-24T10:37:06.806212 #22706:120f138]  INFO -- : Instance [/ManageIQ/System/Event/EmsEvent/RHEVM/UNKNOWN] not found in MiqAeDatastore - trying [.missing]

and this is reflected in evm.log as well:

[----] I, [2017-12-24T10:41:48.250751 #22706:120f138]  INFO -- : MIQ(MiqAeEngine.deliver) Delivering {:event_id=>32512, :event_stream_id=>32512, :event_type=>"UNKNOWN", "Host::host"=>1410, :host_id=>1410} for object [EmsEvent.32512] with state [] to Automate

Version-Release number of selected component (if applicable):
CFME-5.9.0.14, RHV-4.2.1-0.0.master.20171206161426.git88e9120.el7.centos

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
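For context on the [.missing] fallback in the automate log: the event type reaching automate is the literal string "UNKNOWN", which has no matching instance under /System/Event/EmsEvent/RHEVM, so the engine falls through to [.missing]. This is consistent with the provider's event parser emitting "UNKNOWN" for an RHV event it cannot map. A minimal sketch of that kind of fallback (all names and codes here are illustrative, not the actual manageiq-providers-ovirt code):

```ruby
# Illustrative sketch only: an event catalog that falls back to "UNKNOWN"
# for unmapped RHV event codes, which then resolves to the [.missing]
# automate instance. The codes and instance list are hypothetical.
EVENT_NAMES = {
  600 => "USER_VDS_MAINTENANCE", # hypothetical mapping for "host to maintenance"
  601 => "VDS_DETECTED",
}.freeze

def event_type_for(code)
  EVENT_NAMES.fetch(code, "UNKNOWN") # unmapped codes become "UNKNOWN"
end

def automate_path(event_type, known_instances)
  base = "/System/Event/EmsEvent/RHEVM"
  known_instances.include?(event_type) ? "#{base}/#{event_type}" : "#{base}/.missing"
end
```

Under this sketch, `event_type_for(999)` yields "UNKNOWN" and `automate_path` then resolves it to ".../.missing", matching the log above.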
Created attachment 1371911 [details]
evm.log

First host moved to maintenance: hera10.eng.lab.tlv.redhat.com
Second host moved to maintenance: nested_host_from_cluster_nested_10.35.51.157
Searching for "UNKNOWN" in the log shows the "unknown" event.
Created attachment 1371912 [details]
automation.log

First host moved to maintenance: hera10.eng.lab.tlv.redhat.com
Second host moved to maintenance: nested_host_from_cluster_nested_10.35.51.157
Searching for "UNKNOWN" in the log shows the "unknown" event.
Created attachment 1371913 [details] rhevm.log
Can you also attach the RHV logs?
Created attachment 1372071 [details] RHV engine.log
Important info: the bug was reported for CFME-5.9.0.14 with RHV graph refresh ON. The problem that follows moving a host to maintenance on the RHV side (the host vanishes from CFME and is not displayed in maintenance state for a very long time) is specifically related to RHV graph refresh. For CFME-5.9.0.14 with RHV graph refresh OFF, changing a host to maintenance on the RHV side is followed after ~1 minute by an update of the host's state to maintenance on the CFME side (tested 3 times).
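For reference, the graph refresh toggle referred to above is controlled through the appliance's advanced settings. To the best of my knowledge the relevant key for the RHV provider in this release is the one below, but the exact key name is an assumption and should be verified against the build:

```yaml
# Advanced settings fragment (assumed key name; verify on your build):
:ems_refresh:
  :rhevm:
    :inventory_object_refresh: false   # false = legacy refresh, true = graph refresh
```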
Checking the logs, I see no issues during refresh, nor does the host appear to be deleted. I only found:

[----] E, [2017-12-24T09:02:32.772102 #22784:148de8c] ERROR -- : MIQ(host_controller-show): Can't access selected records

I followed the steps defined in comment #1 and was not able to reproduce this on the latest master. Ilanit, please retest and provide new logs.
Reproduced on a small-scale RHV:

1. On the RHV side, move a host to maintenance.
2. On the CFME side, check Compute->Hosts: the host is no longer displayed. See screenshot attachment "host_mixed_2_is_gone.png".
3. Run a provider full refresh and check Compute->Hosts again: the host is now displayed in maintenance status. See screenshot attachment "host_mixed_2_is_displayed_in_maintenance.png".
4. On the RHV side, activate the host.
5. On the CFME side, check Compute->Hosts: the host is no longer displayed.
6. Run a provider full refresh and check Compute->Hosts again: the host is now displayed in UP status.

evm/rhevm/automation logs attached. In evm.log, host_mixed_2 is moved to maintenance at 2018-01-04T09:16.
Created attachment 1376990 [details] host_mixed_2_is_displayed_in_maintenance.png
Created attachment 1376992 [details] host_mixed_2_is_gone.png
Created attachment 1376993 [details] reproduce_evm.log
Created attachment 1376994 [details] reproduce_automation.log
Created attachment 1376995 [details] reproduce_rhevm.log
In evm.log (attached reproduce_evm.log), there is an error (which also repeats later on), but I am not sure what it is related to:

[----] E, [2018-01-04T09:14:11.523087 #60277:12fb13c] ERROR -- : MIQ(MiqQueue#deliver) Message id: [256292], Error: [no implicit conversion of Symbol into Integer]
[----] E, [2018-01-04T09:14:11.523282 #60277:12fb13c] ERROR -- : [TypeError]: no implicit conversion of Symbol into Integer  Method:[block in method_missing]
[----] E, [2018-01-04T09:14:11.523450 #60277:12fb13c] ERROR -- :
/var/www/miq/vmdb/app/models/ems_event.rb:81:in `[]'
/var/www/miq/vmdb/app/models/ems_event.rb:81:in `add'
/var/www/miq/vmdb/app/models/miq_queue.rb:449:in `block in dispatch_method'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:91:in `block in timeout'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `block in catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:33:in `catch'
/opt/rh/rh-ruby23/root/usr/share/ruby/timeout.rb:106:in `timeout'
/var/www/miq/vmdb/app/models/miq_queue.rb:448:in `dispatch_method'
/var/www/miq/vmdb/app/models/miq_queue.rb:425:in `block in deliver'
/var/www/miq/vmdb/app/models/user.rb:253:in `with_user_group'
/var/www/miq/vmdb/app/models/miq_queue.rb:425:in `deliver'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:104:in `deliver_queue_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:134:in `deliver_message'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:152:in `block in do_work'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:146:in `loop'
/var/www/miq/vmdb/app/models/miq_queue_worker_base/runner.rb:146:in `do_work'
/var/www/miq/vmdb/app/models/miq_event_handler/runner.rb:24:in `do_work'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:329:in `block in do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:326:in `loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:326:in `do_work_loop'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:153:in `run'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:127:in `start'
/var/www/miq/vmdb/app/models/miq_worker/runner.rb:22:in `start_worker'
/var/www/miq/vmdb/app/models/miq_worker.rb:357:in `block in start_runner_via_fork'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.3/lib/nakayoshi_fork.rb:24:in `fork'
/opt/rh/cfme-gemset/gems/nakayoshi_fork-0.0.3/lib/nakayoshi_fork.rb:24:in `fork'
/var/www/miq/vmdb/app/models/miq_worker.rb:355:in `start_runner_via_fork'
/var/www/miq/vmdb/app/models/miq_worker.rb:349:in `start_runner'
/var/www/miq/vmdb/app/models/miq_worker.rb:396:in `start'
/var/www/miq/vmdb/app/models/miq_worker.rb:266:in `start_worker'
/var/www/miq/vmdb/app/models/miq_worker.rb:153:in `block in sync_workers'
/var/www/miq/vmdb/app/models/miq_worker.rb:153:in `times'
/var/www/miq/vmdb/app/models/miq_worker.rb:153:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:53:in `block in sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `each'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:50:in `sync_workers'
/var/www/miq/vmdb/app/models/miq_server/worker_management/monitor.rb:22:in `monitor_workers'
/var/www/miq/vmdb/app/models/miq_server.rb:329:in `block in monitor'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-d865ee63c000/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-d865ee63c000/lib/gems/pending/util/extensions/miq-benchmark.rb:28:in `realtime_block'
/var/www/miq/vmdb/app/models/miq_server.rb:329:in `monitor'
/var/www/miq/vmdb/app/models/miq_server.rb:351:in `block (2 levels) in monitor_loop'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-d865ee63c000/lib/gems/pending/util/extensions/miq-benchmark.rb:11:in `realtime_store'
/opt/rh/cfme-gemset/bundler/gems/manageiq-gems-pending-d865ee63c000/lib/gems/pending/util/extensions/miq-benchmark.rb:35:in `realtime_block'
/var/www/miq/vmdb/app/models/miq_server.rb:351:in `block in monitor_loop'
/var/www/miq/vmdb/app/models/miq_server.rb:350:in `loop'
/var/www/miq/vmdb/app/models/miq_server.rb:350:in `monitor_loop'
/var/www/miq/vmdb/app/models/miq_server.rb:234:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:27:in `start'
/var/www/miq/vmdb/lib/workers/evm_server.rb:48:in `start'
/var/www/miq/vmdb/lib/workers/bin/evm_server.rb:4:in `<main>'

Piotr, would you please take a look? Thanks.
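For anyone triaging: "no implicit conversion of Symbol into Integer" is the TypeError Ruby raises when Array#[] is given a Symbol index. The frame at ems_event.rb:81 in `add` therefore suggests an Array reached code that expected a Hash with symbol keys. A minimal reproduction of the error class (hypothetical sketch, not the actual ems_event.rb code; `event_type_of` is an illustrative name):

```ruby
# Hypothetical sketch reproducing the TypeError from the backtrace above.
# Symbol indexing works on a Hash but raises on an Array.
def event_type_of(event)
  event[:event_type] # fine for a Hash; TypeError for an Array
end

event_type_of(:event_type => "UNKNOWN") # returns "UNKNOWN"

begin
  event_type_of(["UNKNOWN"])            # Array#[] needs an Integer index
rescue TypeError => e
  e.message # "no implicit conversion of Symbol into Integer"
end
```

If the real `add` at ems_event.rb:81 receives its payload from the queue, the likely fix is ensuring the caller enqueues a Hash rather than an Array (or unwrapping it before indexing).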
I do not see this failure in my log; it could be the cause of the issue. I will investigate.
PR: https://github.com/ManageIQ/manageiq-providers-ovirt/pull/155