Description of problem: The following BZ fixed an issue in which a VM task invoked from a global region could fail and get repeatedly requeued if the VM was removed from the remote region's database before the task is processed by the remote region. Bug 1686035 - Multiple user logins logged every second from Global DB UI to RegionDB https://bugzilla.redhat.com/show_bug.cgi?id=1686035 The repeated requeueing is now fixed, but there is still an issue: 1.) The first failed attempt logs a traceback. 2.) The task is requeued once. 3.) The requeued task correctly logs an error, but an AuditSuccess message gets logged as well. Version-Release number of selected component (if applicable): 5.10.2.0 How reproducible: 100% Steps to Reproduce: 1.) Set up a global appliance and one remote appliance, with an infrastructure provider on the remote appliance. 2.) On the global appliance, edit /var/www/miq/vmdb/app/models/mixins/process_tasks_mixin.rb, and insert a sleep between the lookup of the VM object and the task invocation, e.g., **** if resource_ids.present? resource_ids.each do |id| begin (1) --> obj = collection.find(id) rescue ManageIQ::API::Client::ResourceNotFound => err _log.error(err.message) else _log.info("Invoking task #{action} on collection #{collection_name}, object #{obj.id}, with args #{post_args}") begin (2) --> sleep 20 (3) --> obj.send(action, post_args) rescue NoMethodError => err _log.error(err.message) end end end **** 3.) Restart evmserverd on the global appliance. 4.) Submit a VM power on/off action from the global appliance UI. 5.) A couple seconds later, in the remote appliance UI, remove the VM from the database. 6.) Check the evm.log on the global appliance. Actual results: A traceback is logged from the first attempt, then the task is requeued, then the task logs an error message but an AuditSuccess message is also logged: **** [----] I, [2019-03-14T10:49:57.361978 #38723:97cf5c] INFO -- : MIQ(Vm.invoke_api_tasks) Invoking task start on collection vms, object 18, with args {} [----] E, [2019-03-14T10:50:17.439667 #38723:97cf5c] ERROR -- : An error occurred while invoking remote tasks...Requeueing for 1 minute from now. [----] E, [2019-03-14T10:50:17.440128 #38723:97cf5c] ERROR -- : [ManageIQ::API::Client::ResourceNotFound]: Couldn't find Vm with 'id'=18 [WHERE "vms"."type" IN ('Vm', 'ManageIQ::Providers::CloudManager::Vm', 'ManageIQ::Providers::InfraManager::Vm', 'VmServer', 'ManageIQ::Providers::Amazon::CloudManager::Vm', 'ManageIQ::Providers::Azure::CloudManager::Vm', 'ManageIQ::Providers::Google::CloudManager::Vm', 'ManageIQ::Providers::Openstack::CloudManager::Vm', 'ManageIQ::Providers::Vmware::CloudManager::Vm', 'VmXen', 'ManageIQ::Providers::Kubevirt::InfraManager::Vm', 'ManageIQ::Providers::Redhat::InfraManager::Vm', 'ManageIQ::Providers::Microsoft::InfraManager::Vm', 'ManageIQ::Providers::Vmware::InfraManager::Vm') AND "vms"."template" = $1] Method:[rescue in block in invoke_tasks_remote] [----] E, [2019-03-14T10:50:17.440467 #38723:97cf5c] ERROR -- : /opt/rh/cfme-gemset/gems/manageiq-api-client-0.3.3/lib/manageiq/api/client/connection.rb:108:in `check_response' /opt/rh/cfme-gemset/gems/manageiq-api-client-0.3.3/lib/manageiq/api/client/connection.rb:102:in `send_request' /opt/rh/cfme-gemset/gems/manageiq-api-client-0.3.3/lib/manageiq/api/client/connection.rb:34:in `post' /opt/rh/cfme-gemset/gems/manageiq-api-client-0.3.3/lib/manageiq/api/client/client.rb:65:in `post' /opt/rh/cfme-gemset/gems/manageiq-api-client-0.3.3/lib/manageiq/api/client/resource.rb:60:in `exec_action' /opt/rh/cfme-gemset/gems/manageiq-api-client-0.3.3/lib/manageiq/api/client/resource.rb:46:in `method_missing' /var/www/miq/vmdb/app/models/mixins/process_tasks_mixin.rb:130:in `block in invoke_api_tasks' /var/www/miq/vmdb/app/models/mixins/process_tasks_mixin.rb:121:in `each' /var/www/miq/vmdb/app/models/mixins/process_tasks_mixin.rb:121:in `invoke_api_tasks' /var/www/miq/vmdb/app/models/mixins/process_tasks_mixin.rb:70:in `block in invoke_tasks_remote' /var/www/miq/vmdb/app/models/mixins/process_tasks_mixin.rb:65:in `each' /var/www/miq/vmdb/app/models/mixins/process_tasks_mixin.rb:65:in `invoke_tasks_remote' /var/www/miq/vmdb/app/models/mixins/process_tasks_mixin.rb:38:in `invoke_tasks' /var/www/miq/vmdb/app/models/miq_queue.rb:455:in `block in dispatch_method' /usr/share/ruby/timeout.rb:93:in `block in timeout' /usr/share/ruby/timeout.rb:33:in `block in catch' /usr/share/ruby/timeout.rb:33:in `catch' /usr/share/ruby/timeout.rb:33:in `catch' /usr/share/ruby/timeout.rb:108:in `timeout' /var/www/miq/vmdb/app/models/miq_queue.rb:453:in `dispatch_method' /var/www/miq/vmdb/app/models/miq_queue.rb:430:in `block in deliver' /var/www/miq/vmdb/app/models/user.rb:268:in `with_user' /var/www/miq/vmdb/app/models/user.rb:283:in `with_user_group' /var/www/miq/vmdb/app/models/miq_queue.rb:430:in `deliver' [...] [----] I, [2019-03-14T10:50:17.454548 #38723:97cf5c] INFO -- : MIQ(MiqQueue.put) Message id: [99000000000127], id: [], Zone: [default], Role: [], Server: [], MiqTask id: [], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [Vm.invoke_tasks_remote], Timeout: [600], Priority: [100], State: [ready], Deliver On: [2019-03-14 14:51:17 UTC], Data: [], Args: [{:ids=>[18], :task=>"start", :userid=>"admin", :miq_task_id=>nil}] [----] I, [2019-03-14T10:50:17.454983 #38723:97cf5c] INFO -- : MIQ(MiqQueue#delivered) Message id: [99000000000120], State: [ok], Delivered in [22.701902075] seconds [...] [----] I, [2019-03-14T10:51:17.573208 #38723:97cf5c] INFO -- : MIQ(MiqGenericWorker::Runner#get_message_via_drb) Message id: [99000000000127], MiqWorker id: [99000000000013], Zone: [default], Role: [], Server: [], MiqTask id: [], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [Vm.invoke_tasks_remote], Timeout: [600], Priority: [100], State: [dequeue], Deliver On: [2019-03-14 14:51:17 UTC], Data: [], Args: [{:ids=>[18], :task=>"start", :userid=>"admin", :miq_task_id=>nil}], Dequeued in: [60.125051915] seconds [----] I, [2019-03-14T10:51:17.573327 #38723:97cf5c] INFO -- : MIQ(MiqQueue#deliver) Message id: [99000000000127], Delivering... [----] E, [2019-03-14T10:51:17.964066 #38723:97cf5c] ERROR -- : MIQ(Vm.invoke_api_tasks) Couldn't find resource with 'id' [18] [----] I, [2019-03-14T10:51:17.983566 #38723:97cf5c] INFO -- : <AuditSuccess> MIQ(ProcessTasksMixin.task_audit_event) userid: [admin] - 'start' successfully initiated for remote VMs: [18] [----] I, [2019-03-14T10:51:17.983841 #38723:97cf5c] INFO -- : MIQ(MiqQueue#delivered) Message id: [99000000000127], State: [ok], Delivered in [0.410428068] seconds **** Expected results: The first failed attempt logs a clean error message, and there is no requeueing of the task or spurious AuditSuccess message. Additional info:
https://github.com/ManageIQ/manageiq/pull/18565
New commit detected on ManageIQ/manageiq/master: https://github.com/ManageIQ/manageiq/commit/b3532af44c579e4733ae9c2933dd997cf6f821a8 commit b3532af44c579e4733ae9c2933dd997cf6f821a8 Author: Yuri Rudman <yrudman> AuthorDate: Mon Mar 18 15:10:15 2019 -0400 Commit: Yuri Rudman <yrudman> CommitDate: Mon Mar 18 15:10:15 2019 -0400 fix for hypothetical situation when remote object exists and reference to it found on global region, after that object got deleted on remote region but global region attempting to use previously found reference for Centralized Administration. What happened is: attempt to execute queue item raising 'resource not found' and message requeud. Desired behavior: if resource not found then do not requeue message Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1689330 app/models/mixins/process_tasks_mixin.rb | 11 +- 1 file changed, 3 insertions(+), 8 deletions(-)
Verified on 5.11.0.2.